Python etc
Regular tips about Python and programming in general

Owner — @pushtaev

© CC BY-SA 4.0 — mention if repost
LookupError is a base class for IndexError and KeyError:

LookupError.__subclasses__()
# [IndexError, KeyError, encodings.CodecRegistryError]

KeyError.mro()
# [KeyError, LookupError, Exception, BaseException, object]

IndexError.mro()
# [IndexError, LookupError, Exception, BaseException, object]


The main purpose of this intermediate exception is to simplify lookups in deeply nested structures a bit, when either of these two exceptions may occur:

try:
    username = resp['posts'][-1]['authors'][0]['name']
except LookupError:
    username = None
The operator is checks if the two given objects are the same object in memory:

{} is {}  # False
d = {}
d is d # True

Since types are also objects, you can use it to compare types:

type(1) is int        # True
type(1) is float      # False
type(1) is not float  # True

And you can also use == for comparing types:

type(1) == int  # True

So, when to use is and when to use ==? There are some best practices:

+ Use is to compare with None: var is None.

+ Use is to compare with True and False. However, don't explicitly check for True and False in conditions, prefer just if user.admin instead of if user.admin is True. Still, the latter can be useful in tests: assert actual is True.

+ Use isinstance to compare types: if isinstance(user, LoggedInUser). The big difference is that it allows subclasses. So if you have a class Admin which is a subclass of LoggedInUser, it will pass the isinstance check.

+ Use is in some rare cases when you explicitly want to allow only the given type without subclasses: type(user) is Admin. Keep in mind that mypy will narrow the type only for isinstance, not for type is.

+ Use is to compare enum members: color is Color.RED.

+ Use == in ORMs and query builders like sqlalchemy: session.query(User).filter(User.admin == True). The reason is that is behavior cannot be redefined using magic methods but == can (using __eq__).

+ Use == in all other cases. In particular, always use == to compare values: answer == 42.
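To illustrate the isinstance vs type is difference from above, here is a small sketch (LoggedInUser and Admin are hypothetical classes used only for this example):

```python
class LoggedInUser:
    pass

class Admin(LoggedInUser):
    pass

user = Admin()

# isinstance allows subclasses
isinstance(user, LoggedInUser)  # True

# `type(...) is` demands the exact type, no subclasses
type(user) is LoggedInUser      # False
type(user) is Admin             # True
```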
The del statement is used to delete things. It has a few distinct behaviors, depending on the specified target.

If a variable is specified, it will be removed from the scope in which it is defined:

a = []
del a
a
# NameError: name 'a' is not defined

If the target has a form target[index], target.__delitem__(index) will be called. It is defined for built-in collections to remove items from them:

a = [1, 2, 3]
del a[0]
a # [2, 3]

d = {1: 2, 3: 4}
del d[3]
d # {1: 2}

Slices are also supported:

a = [1, 2, 3, 4]
del a[2:]
a # [1, 2]

And the last behavior, if target.attr is specified, target.__delattr__(attr) will be called. It is defined for object:

class A:
    b = 'default'
a = A()
a.b = 'overwritten'
a.b # 'overwritten'
del a.b
a.b # 'default'
del a.b # AttributeError
The method __del__ is called on the object by the garbage collector when the last reference to the object is removed:

class A:
    def __del__(self):
        print('destroying')

a = b = A()
del a
del b
# destroying

def f():
    a = A()

f()
# destroying


The method is used by Python's file object to close the descriptor when you don't need it anymore:

def f():
    a_file = open('a_file.txt')
    ...


However, you cannot safely rely on the destructor (as it's called in other languages, like C++) ever being called. For instance, it may never run in PyPy or MicroPython, or if the garbage collector is disabled using gc.disable().

The rule of thumb is to use the destructor only for unimportant things. For example, aiohttp.ClientSession uses __del__ to warn about an unclosed session:

def __del__(self) -> None:
    if not self.closed:
        warnings.warn(
            f"Unclosed client session {self!r}", ResourceWarning
        )
By using __del__ and global variables, it is possible to keep a reference to the object after it was "destroyed":

runner = None

class Lazarus:
    def __del__(self):
        print('destroying')
        global runner
        runner = self

lazarus = Lazarus()
print(lazarus)
# <__main__.Lazarus object at 0x7f853df0a790>
del lazarus
# destroying
print(runner)
# <__main__.Lazarus object at 0x7f853df0a790>


In the example above, runner points to the same object lazarus did, and the object is not destroyed. If you remove this reference, the object will stay in memory forever because it's not tracked by the garbage collector anymore:

del runner  # it will NOT produce "destroying" message


This can lead to a strange situation when you have an object that escapes tracking and will never be collected.

In Python 3.9, the gc.is_finalized function was introduced; it tells you whether the given object has already been finalized, like runner above:

import gc
lazarus = Lazarus()
gc.is_finalized(lazarus) # False
del lazarus
gc.is_finalized(runner) # True


It's hard to imagine a situation when you'll need it, though. The main takeaway is that you can break things with a destructor, so don't overuse it.
The warnings module allows you to print, you've guessed it, warnings. Most often, it is used to warn users of a library that the module, function, or argument they use is deprecated.

import warnings

def g():
    return 2

def f():
    warnings.warn(
        "f is deprecated, use g instead",
        DeprecationWarning,
    )
    return g()

f()

The output:

example.py:7: DeprecationWarning: f is deprecated, use g instead
  warnings.warn(

Note that DeprecationWarning, as well as other warning categories, is built-in and doesn't need to be imported from anywhere.

When running tests, pytest will collect all warnings and report them at the end. If you want to get the full traceback to the warning or enter there with a debugger, the easiest way to do so is to turn all warnings into exceptions:

warnings.filterwarnings("error")

In production, you can suppress warnings. Or, better, turn them into proper log records, so they are collected wherever you collect logs:

import logging
logging.captureWarnings(True)
The string.Template class allows you to do $-style substitutions:

from string import Template
t = Template('Hello, $channel!')

t.substitute(dict(channel='@pythonetc'))
# 'Hello, @pythonetc!'

t.safe_substitute(dict())
# 'Hello, $channel!'

Initially, it was introduced to simplify translation of strings. However, PO files now natively support the python-format flag. It indicates to translators that the string has str.format-style substitutions. And on top of that, str.format is much more powerful and flexible.

Nowadays, the main purpose of Template is to confuse newbies with one more way to format a string. Jokes aside, there are a few more cases when it can come in handy:

+ Template.safe_substitute can be used when the template might have variables that aren't defined and should be ignored.
+ The substitution format is similar to the string substitution in bash (and other shells), which is useful in some cases. For instance, if you want to write your own dotenv.
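A sketch of the latter idea: expanding a shell-style $VAR reference in a dotenv-like line. The line format and the env mapping here are made up for illustration; a real dotenv would read os.environ and handle quoting:

```python
from string import Template

env = {'HOME': '/home/gram'}  # stand-in for os.environ
line = 'DATA_DIR=$HOME/data'

key, _, value = line.partition('=')
# safe_substitute leaves unknown $VARs untouched instead of raising
expanded = Template(value).safe_substitute(env)
print(key, '=', expanded)  # DATA_DIR = /home/gram/data
```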
A long time ago, we already covered the chaining of comparison operations:
https://tttttt.me/pythonetc/411

A quick summary is that the right value of each comparison gets passed into the next one:

13 > 2 > 1  # same as `13 > 2 and 2 > 1`
# True

13 > 2 > 3 # same as `13 > 2 and 2 > 3`
# False


What's interesting is that is and in are also considered operators, and so can also be chained, which can lead to unexpected results:

a = None
a is None # True, as expected
a is None is True # False 🤔
a is None == True # False 🤔
a is None is None # True 🤯
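
The surprise disappears once you expand each chain manually, following the summary above:

```python
a = None

# `a is None is True` chains into `(a is None) and (None is True)`
assert (a is None is True) == ((a is None) and (None is True))  # both False

# `a is None is None` chains into `(a is None) and (None is None)`
assert (a is None is None) == ((a is None) and (None is None))  # both True
```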


The best practice is to use operator chaining only to check if a value is in a range, using < and <=:

teenager = 13 < age < 19
This post is provided by @PavelDurmanov:

As you may know, generators in Python are executed step-by-step. This means there should be a way to "see" their state between the steps.

All generator's local variables are stored in frame locals, and we can access the frame through the gi_frame attribute on a generator:

def gen():
    x = 5
    yield x
    yield x
    yield x

g = gen()
next(g) # 5
g.gi_frame.f_locals # {'x': 5}


So if we can see it, we should be able to modify it, right?

g.gi_frame.f_locals["x"] = 10
next(g) # still gives us 5


The frame locals are returned as a dict that is freshly built from the actual frame variables, meaning that the returned dict doesn't reference the actual variables in the frame.

But there's a way to bypass that with the C API:

import ctypes

# After we've changed the frame locals, we need to "freeze" them,
# which basically tells the interpreter to update the underlying
# frame based on the newly set values.
ctypes.pythonapi.PyFrame_LocalsToFast(
    ctypes.py_object(g.gi_frame),
    ctypes.c_int(0),
)


So now we can verify that the generator's locals have actually changed:

next(g)  # 10


You might wonder what ctypes.c_int(0) is. There are two "modes" for updating the underlying frame: 0 and 1. With c_int(0), the call adds and/or updates frame local variables but never deletes them. So if we removed x from the locals dict and called the update with c_int(0), it would do nothing, as it cannot delete variables.

If you want to actually delete a variable from the frame, call the update with c_int(1). That will replace the underlying frame locals with the new ones we've defined in the .f_locals dict.

And as you may know, coroutines in Python are implemented on top of generators, so the same logic applies there as well; the attribute is just called cr_frame instead of gi_frame.
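For example, we can peek into a paused coroutine's locals through cr_frame. Pause here is a hypothetical minimal awaitable, used only to suspend the coroutine without an event loop:

```python
class Pause:
    def __await__(self):
        yield  # suspend the coroutine once

async def coro():
    x = 5
    await Pause()

c = coro()
c.send(None)                # run the coroutine up to the await
print(c.cr_frame.f_locals)  # {'x': 5}
c.close()
```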
The os.curdir is a trap!

import os
os.curdir
# '.'

It's a constant indicating how the current directory is denoted in the current OS. And for all OSes that CPython supports (Windows and POSIX), it's always a dot. It might be different, though, if you run your code with MicroPython on some niche OS.

Anyway, to actually get the path to the current directory, you need os.getcwd:

os.getcwd()
# '/home/gram'

Or use pathlib:

from pathlib import Path
Path().absolute()
# PosixPath('/home/gram')
Python 3.11 is released! The most interesting features:

+ Fine-grained error location in tracebacks.
+ ExceptionGroup and the new except* syntax to handle it.
+ A new module to parse TOML.
+ Atomic grouping and possessive quantifiers for regexes.
+ Significant performance improvements.
+ New Self type.
+ Variadic generics.
+ Data class transforms.

That's a lot of smart words! Don't worry, we'll tell you in detail about each of these features in the upcoming posts. Stay tuned!
PEP 657 (landed in Python 3.11) enhanced tracebacks so that they now include quite a precise location of where the error occurred:

Traceback (most recent call last):
File "query.py", line 24, in add_counts
return 25 + query_user(user1) + query_user(user2)
^^^^^^^^^^^^^^^^^
File "query.py", line 32, in query_user
return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable

It shows not only where the error occurred for each frame, but also which code was executed. Beautiful!
PEP 678 (landed in Python 3.11) introduced a new method add_note on the BaseException class. You can call it on any exception to provide additional context, which will be shown at the end of the traceback for the exception:

try:
    1/0
except Exception as e:
    e.add_note('oh no!')
    raise
# Traceback (most recent call last):
# File "<stdin>", line 2, in <module>
# ZeroDivisionError: division by zero
# oh no!

The PEP gives a good example of how it can be useful. The hypothesis library includes in the traceback the arguments that caused the tested code to fail.
PEP 654 (landed in Python 3.11) introduced ExceptionGroup. It's an exception that nicely wraps and shows multiple exceptions:

try:
    1/0
except Exception as e:
    raise ExceptionGroup('wow!', [e, ValueError('oh no')])

# Traceback (most recent call last):
# File "<stdin>", line 2, in <module>
# ZeroDivisionError: division by zero

# During handling of the above exception, another exception occurred:

# + Exception Group Traceback (most recent call last):
# | File "<stdin>", line 4, in <module>
# | ExceptionGroup: wow! (2 sub-exceptions)
# +-+---------------- 1 ----------------
# | Traceback (most recent call last):
# | File "<stdin>", line 2, in <module>
# | ZeroDivisionError: division by zero
# +---------------- 2 ----------------
# | ValueError: oh no
# +------------------------------------

It's very helpful in many cases when multiple unrelated exceptions have occurred and you want to show all of them: when retrying an operation or when calling multiple callbacks.
PEP 654 introduced not only ExceptionGroup itself but also a new syntax to handle it. Let's start right with an example:

try:
    raise ExceptionGroup('', [
        ValueError(),
        KeyError('hello'),
        KeyError('world'),
        OSError(),
    ])
except* KeyError as e:
    print('caught1:', repr(e))
except* ValueError as e:
    print('caught2:', repr(e))
except* KeyError as e:
    1/0

The output:

caught1: ExceptionGroup('', [KeyError('hello'), KeyError('world')])
caught2: ExceptionGroup('', [ValueError()])
+ Exception Group Traceback (most recent call last):
| File "<stdin>", line 2, in <module>
| ExceptionGroup: (1 sub-exception)
+-+---------------- 1 ----------------
| OSError
+------------------------------------

This is what happened:

1. When ExceptionGroup is raised, it's checked against each except* block.

2. except* KeyError block catches ExceptionGroup that contains KeyError.

3. The matched except* block receives not the whole ExceptionGroup but a copy of it containing only the matched sub-exceptions. In the case of except* KeyError, it includes both KeyError('hello') and KeyError('world').

4. For each sub-exception, only the first match is executed (1/0 in the example wasn't reached).

5. Sub-exceptions that remain unmatched are tried against the remaining except* blocks.

6. If there are still sub-exceptions left after all of that, the ExceptionGroup with them is raised. So, ExceptionGroup('', [OSError()]) was raised (and beautifully formatted).
There is one more thing you should know about except*. It can match not only sub-exceptions from ExceptionGroup but regular exceptions too. And for simplicity of handling, regular exceptions will be wrapped into ExceptionGroup:

try:
    raise KeyError
except* KeyError as e:
    print('caught:', repr(e))
# caught: ExceptionGroup('', (KeyError(),))
I often find myself writing a context manager to temporarily change the current working directory:

import os
from contextlib import contextmanager

@contextmanager
def enter_dir(path):
    old_path = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_path)


Since Python 3.11, a context manager with the same behavior is available as contextlib.chdir:

import os
from contextlib import chdir

print('before:', os.getcwd())
# before: /home/gram
with chdir('/'):
    print('inside:', os.getcwd())
# inside: /
print('after:', os.getcwd())
# after: /home/gram
The typing.assert_type function (added in Python 3.11) does nothing at runtime, like most of the things in the typing module. However, if the type of the first argument doesn't match the type provided as the second argument, the type checker will report an error. It can be useful for writing simple "tests" for your library to ensure it is well annotated.

For example, you have a library that defines a lot of decorators, like this:

from typing import Callable, TypeVar

C = TypeVar('C', bound=Callable)

def good_dec(f: C) -> C:
    return f

def bad_dec(f) -> Callable:
    return f


We want to be 100% sure that all decorators preserve the original type of decorated function. So, let's write a test for it:

from typing import Callable, assert_type

@good_dec
def f1(a: int) -> str: ...

@bad_dec
def f2(a: int) -> str: ...

assert_type(f1, Callable[[int], str]) # ok
assert_type(f2, Callable[[int], str]) # not ok
PEP 681 (landed in Python 3.11) introduced the typing.dataclass_transform decorator. It can be used to mark a class that behaves like a dataclass. The type checker will assume that it has an __init__ that accepts the annotated attributes as arguments, plus __eq__, __ne__, and __str__. For example, it can be used to annotate SQLAlchemy or Django models, attrs classes, pydantic validators, and so on. It's useful not only for libraries that don't provide a mypy plugin but also if you use a non-mypy type checker. For instance, pyright, which is used by the VS Code Python extension to show types, highlight syntax, provide autocomplete, and so on.
As we covered 3 years back (gosh, the channel is old), if the result of a method of a base class is an instance of the current class, a TypeVar should be used as the annotation:

from typing import TypeVar

U = TypeVar('U', bound='BaseUser')

class BaseUser:
    @classmethod
    def new(cls: type[U]) -> U:
        ...

    def copy(self: U) -> U:
        ...

That's quite verbose, but it's how it should be done for the return type to be correct for inherited classes.

PEP 673 (landed in Python 3.11) introduced a new type Self that can be used as a shortcut for exactly such cases:

from typing import Self

class BaseUser:
    @classmethod
    def new(cls) -> Self:
        ...

    def copy(self) -> Self:
        ...