Python etc
Regular tips about Python and programming in general

Owner — @pushtaev

© CC BY-SA 4.0 — mention if repost
By using __del__ and global variables, it is possible to keep a reference to an object after it has been "destroyed":

runner = None

class Lazarus:
    def __del__(self):
        print('destroying')
        global runner
        runner = self

lazarus = Lazarus()
print(lazarus)
# <__main__.Lazarus object at 0x7f853df0a790>
del lazarus
# destroying
print(runner)
# <__main__.Lazarus object at 0x7f853df0a790>


In the example above, runner points to the same object that lazarus did, so the object is not destroyed. However, CPython calls __del__ at most once per object, so if you remove this reference too, the object is destroyed silently, without running the finalizer again:

del runner  # it will NOT produce "destroying" message


This can lead to a strange situation: an object that was already "destroyed" is still alive, but its destructor will never be called again.

In Python 3.9, the gc.is_finalized function was introduced; it tells you whether the given object has already been finalized, i.e. whether it is such a resurrected runner:

import gc
lazarus = Lazarus()
gc.is_finalized(lazarus) # False
del lazarus
gc.is_finalized(runner) # True


It's hard to imagine a situation where you'll need it, though. The main conclusion to draw from all this is that you can break things with a destructor, so don't overuse it.
The warnings module allows you to print, you've guessed it, warnings. Most often, it is used to warn users of a library that the module, function, or argument they use is deprecated.

import warnings

def g():
    return 2

def f():
    warnings.warn(
        "f is deprecated, use g instead",
        DeprecationWarning,
    )
    return g()

f()

The output:

example.py:7: DeprecationWarning: f is deprecated, use g instead
  warnings.warn(

Note that DeprecationWarning, as well as other warning categories, is built-in and doesn't need to be imported from anywhere.

When running tests, pytest collects all warnings and reports them at the end. If you want to get the full traceback for a warning or drop into it with a debugger, the easiest way is to turn all warnings into exceptions:

warnings.filterwarnings("error")
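
For instance, with the "error" filter in place, a warning is raised as an exception that you can catch or debug (a standalone sketch):

import warnings

warnings.filterwarnings('error')

try:
    warnings.warn('f is deprecated, use g instead', DeprecationWarning)
except DeprecationWarning as e:
    print('caught:', e)
# caught: f is deprecated, use g instead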

In production, you can suppress warnings. Or, better, turn them into proper log records, so they will be collected wherever you collect logs:

import logging
logging.captureWarnings(True)
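
A quick sketch of what that gives you; the warnings go into the py.warnings logger, and the exact output format depends on your logging configuration:

import logging
import warnings

logging.basicConfig()          # attach a default handler
logging.captureWarnings(True)  # route warnings to the 'py.warnings' logger

warnings.warn('f is deprecated, use g instead', DeprecationWarning)
# WARNING:py.warnings:example.py:8: DeprecationWarning: f is deprecated, use g instead
#   warnings.warn('f is deprecated, use g instead', DeprecationWarning)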
The string.Template class allows you to do $-style substitutions:

from string import Template
t = Template('Hello, $channel!')

t.substitute(dict(channel='@pythonetc'))
# 'Hello, @pythonetc!'

t.safe_substitute(dict())
# 'Hello, $channel!'

Initially, it was introduced to simplify the translation of strings. However, PO files now natively support the python-format flag, which indicates to translators that the string contains str.format-style substitutions. And on top of that, str.format is much more powerful and flexible.

Nowadays, the main purpose of Template is to confuse newbies with one more way to format a string. Jokes aside, there are a few more cases when it can come in handy:

+ Template.safe_substitute can be used when the template might have variables that aren't defined and should be ignored.
+ The substitution format is similar to the string substitution in bash (and other shells), which is useful in some cases. For instance, if you want to write your own dotenv.
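
Here is a minimal sketch of such dotenv-style expansion against the current environment (the expand helper is hypothetical):

import os
from string import Template

def expand(line: str) -> str:
    # substitute $VAR references from the environment,
    # leaving unknown names as-is instead of raising
    return Template(line).safe_substitute(os.environ)

print(expand('DATA_DIR=$HOME/data'))
# DATA_DIR=/home/gram/data  (assuming HOME=/home/gram)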
A long time ago, we already covered the chaining of comparison operations:
https://tttttt.me/pythonetc/411

A quick summary is that a chain like `a < b < c` is evaluated as `a < b and b < c`, with the middle value evaluated only once:

13 > 2 > 1  # same as `13 > 2 and 2 > 1`
# True

13 > 2 > 3 # same as `13 > 2 and 2 > 3`
# False


What's interesting is that is and in are also comparison operators, so they can be chained too, which can lead to unexpected results:

a = None
a is None # True, as expected
a is None is True # False 🤔
a is None == True # False 🤔
a is None is None # True 🤯


The best practice is to use operator chaining only to check whether a value is in a range, using < and <=:

teenager = 13 < age < 19
This post is provided by @PavelDurmanov:

As you may know, generators in Python are executed step by step. This means it should be possible to "see" their state between the steps.

All of a generator's local variables are stored in its frame, and we can access that frame through the gi_frame attribute of the generator:

def gen():
    x = 5
    yield x
    yield x
    yield x

g = gen()
next(g)  # 5
g.gi_frame.f_locals  # {'x': 5}


So if we can see it, we should be able to modify it, right?

g.gi_frame.f_locals["x"] = 10
next(g) # still gives us 5


The dict returned by f_locals is a fresh snapshot built from the actual frame variables, so modifying it doesn't affect the frame.

But there's a way to bypass that with the C API:

import ctypes

# after we've changed the frame's locals dict, we need to "freeze" it,
# i.e. tell the interpreter to write the dict back into the actual frame
ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(g.gi_frame), ctypes.c_int(0))


So now we can verify that the generator's locals have actually changed:

next(g)  # 10


You might wonder what ctypes.c_int(0) means here. The second argument of PyFrame_LocalsToFast controls how the underlying frame is updated. With c_int(0), variables present in the f_locals dict are written back into the frame, but removing a key from the dict does nothing: variables cannot be deleted this way.

If you want to actually delete a variable from the frame, call the update with c_int(1). In that mode, the frame locals are fully replaced with the contents of the f_locals dict, so variables missing from the dict are cleared from the frame.
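
Here is a sketch of deleting a variable this way. It's CPython-specific, and PEP 667 changed how frame locals work in Python 3.13, so this applies to earlier versions:

import ctypes

def gen():
    x = 5
    yield x
    yield x

g = gen()
next(g)  # 5

frame = g.gi_frame
del frame.f_locals['x']  # remove the variable from the locals dict
# c_int(1): variables missing from f_locals are cleared from the frame
ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(1))

next(g)  # UnboundLocalError: 'x' is no longer bound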

And as you may know, coroutines in Python are implemented using generators, so the same logic applies to them as well; just use cr_frame instead of gi_frame.
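
A quick sketch to check that, using a minimal custom awaitable so no event loop is needed:

class Suspend:
    # a minimal awaitable that suspends the coroutine once
    def __await__(self):
        yield

async def coro():
    x = 5
    await Suspend()

c = coro()
c.send(None)                # advance the coroutine to the first suspension
print(c.cr_frame.f_locals)  # {'x': 5}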
os.curdir is a trap!

import os
os.curdir
# '.'

It's a constant indicating how the current directory is denoted in the current OS. And for all OSes that CPython supports (Windows and POSIX), it's always a dot. It might be different, though, if you run your code with MicroPython on some niche OS.

Anyway, to actually get the path to the current directory, you need os.getcwd:

os.getcwd()
# '/home/gram'

Or use pathlib:

from pathlib import Path
Path().absolute()
# PosixPath('/home/gram')
Python 3.11 is released! The most interesting features:

+ Fine-grained error location in tracebacks.
+ ExceptionGroup and the new except* syntax to handle it.
+ A new module to parse TOML.
+ Atomic grouping and possessive quantifiers for regexes.
+ Significant performance improvements.
+ New Self type.
+ Variadic generics.
+ Data class transforms.

That's a lot of smart words! Don't worry, we'll tell you about each of these features in detail in the upcoming posts. Stay tuned!
PEP 657 (landed into Python 3.11) enhanced tracebacks so that they now include quite a precise location of where the error occurred:

Traceback (most recent call last):
  File "query.py", line 24, in add_counts
    return 25 + query_user(user1) + query_user(user2)
                ^^^^^^^^^^^^^^^^^
  File "query.py", line 32, in query_user
    return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
                               ~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable

It shows not only where the error occurred for each frame, but also which code was executed. Beautiful!
PEP 678 (landed in Python 3.11) introduced a new method, add_note, on the BaseException class. You can call it on any exception to provide additional context, which will be shown at the end of the traceback for that exception:

try:
    1/0
except Exception as e:
    e.add_note('oh no!')
    raise
# Traceback (most recent call last):
# File "<stdin>", line 2, in <module>
# ZeroDivisionError: division by zero
# oh no!

The PEP gives a good example of how it can be useful: the hypothesis library uses notes to include in the traceback the arguments that caused the tested code to fail.
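
To give a sense of the pattern (a hypothetical sketch, not hypothesis' actual code), you can attach the failing input to an exception before re-raising it:

def validate_all(values, validate):
    # hypothetical helper: annotate the exception with the failing input
    for value in values:
        try:
            validate(value)
        except Exception as e:
            e.add_note(f'while validating {value!r}')
            raise

validate_all([1, 2, 'oops'], int)
# ValueError: invalid literal for int() with base 10: 'oops'
# while validating 'oops'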
PEP 654 (landed in Python 3.11) introduced ExceptionGroup. It's an exception that nicely wraps and shows multiple exceptions:

try:
    1/0
except Exception as e:
    raise ExceptionGroup('wow!', [e, ValueError('oh no')])

# Traceback (most recent call last):
# File "<stdin>", line 2, in <module>
# ZeroDivisionError: division by zero

# During handling of the above exception, another exception occurred:

# + Exception Group Traceback (most recent call last):
# | File "<stdin>", line 4, in <module>
# | ExceptionGroup: wow! (2 sub-exceptions)
# +-+---------------- 1 ----------------
# | Traceback (most recent call last):
# | File "<stdin>", line 2, in <module>
# | ZeroDivisionError: division by zero
# +---------------- 2 ----------------
# | ValueError: oh no
# +------------------------------------

It's very helpful in many cases when multiple unrelated exceptions have occurred and you want to show all of them: when retrying an operation or when calling multiple callbacks.
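
For example, here is a sketch of a hypothetical helper that runs every callback and reports all failures at once:

def fire_callbacks(callbacks):
    errors = []
    for callback in callbacks:
        try:
            callback()
        except Exception as e:
            errors.append(e)
    if errors:
        raise ExceptionGroup('some callbacks failed', errors)

fire_callbacks([
    lambda: None,   # fine
    lambda: 1/0,    # ZeroDivisionError
    lambda: [][0],  # IndexError
])
# ExceptionGroup: some callbacks failed (2 sub-exceptions)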
PEP 654 introduced not only ExceptionGroup itself but also a new syntax to handle it. Let's start right with an example:

try:
    raise ExceptionGroup('', [
        ValueError(),
        KeyError('hello'),
        KeyError('world'),
        OSError(),
    ])
except* KeyError as e:
    print('caught1:', repr(e))
except* ValueError as e:
    print('caught2:', repr(e))
except* KeyError as e:
    1/0

The output:

caught1: ExceptionGroup('', [KeyError('hello'), KeyError('world')])
caught2: ExceptionGroup('', [ValueError()])
+ Exception Group Traceback (most recent call last):
| File "<stdin>", line 2, in <module>
| ExceptionGroup: (1 sub-exception)
+-+---------------- 1 ----------------
| OSError
+------------------------------------

This is what happened:

1. When ExceptionGroup is raised, it's checked against each except* block.

2. The except* KeyError block catches an ExceptionGroup that contains a KeyError.

3. The matched except* block receives not the whole ExceptionGroup but a copy of it containing only the matched sub-exceptions. In the case of except* KeyError, that's both KeyError('hello') and KeyError('world').

4. For each sub-exception, only the first match is executed (1/0 in the example wasn't reached).

5. Sub-exceptions that are still unmatched are tried against the remaining except* blocks.

6. If some sub-exceptions are left after all of that, an ExceptionGroup containing them is raised. That's why ExceptionGroup('', [OSError()]) was raised (and beautifully formatted).
There is one more thing you should know about except*. It can match not only sub-exceptions from ExceptionGroup but regular exceptions too. And for simplicity of handling, regular exceptions will be wrapped into ExceptionGroup:

try:
    raise KeyError
except* KeyError as e:
    print('caught:', repr(e))
# caught: ExceptionGroup('', (KeyError(),))
I often find myself writing a context manager to temporarily change the current working directory:

import os
from contextlib import contextmanager

@contextmanager
def enter_dir(path):
    old_path = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_path)


Since Python 3.11, a context manager with the same behavior is available as contextlib.chdir:

import os
from contextlib import chdir

print('before:', os.getcwd())
# before: /home/gram
with chdir('/'):
    print('inside:', os.getcwd())
    # inside: /
print('after:', os.getcwd())
# after: /home/gram
The typing.assert_type function (added in Python 3.11) does nothing at runtime, like most things in the typing module. However, if the type of the first argument doesn't match the type provided as the second argument, the type checker will report an error. It can be useful for writing simple "tests" for your library to ensure it is well annotated.

For example, say you have a library that defines a lot of decorators, like this:

from typing import Callable, TypeVar

C = TypeVar('C', bound=Callable)

def good_dec(f: C) -> C:
    return f

def bad_dec(f) -> Callable:
    return f


We want to be 100% sure that all decorators preserve the original type of the decorated function. So, let's write a test for it:

from typing import Callable, assert_type

@good_dec
def f1(a: int) -> str: ...

@bad_dec
def f2(a: int) -> str: ...

assert_type(f1, Callable[[int], str]) # ok
assert_type(f2, Callable[[int], str]) # not ok
PEP 681 (landed in Python 3.11) introduced the typing.dataclass_transform decorator. It can be used to mark a decorator, base class, or metaclass that makes classes behave like dataclasses. The type checker will then assume the marked class has an __init__ that accepts the annotated attributes as arguments, as well as __eq__ and __ne__. For example, it can be used to annotate SQLAlchemy or Django models, attrs classes, pydantic validators, and so on. It's useful not only for libraries that don't provide a mypy plugin but also if you use a non-mypy type checker. For instance, pyright, which is used by the VS Code Python extension to show types, highlight syntax, provide autocomplete, and so on.
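
Here is a minimal sketch of how this could look for a hypothetical model decorator; at runtime it simply reuses dataclasses to synthesize the methods, and the @dataclass_transform() mark tells type checkers to assume the same:

import dataclasses
from typing import TypeVar, dataclass_transform

T = TypeVar('T')

@dataclass_transform()
def model(cls: type[T]) -> type[T]:
    # at runtime, delegate the actual work to dataclasses
    return dataclasses.dataclass(cls)

@model
class User:
    name: str
    age: int = 0

# the type checker now knows the synthesized signature:
User(name='gram', age=42)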
As we covered 3 years ago (gosh, the channel is old), when a method of a base class returns an instance of the current class, a TypeVar should be used as the annotation:

from typing import TypeVar

U = TypeVar('U', bound='BaseUser')

class BaseUser:
    @classmethod
    def new(cls: type[U]) -> U:
        ...

    def copy(self: U) -> U:
        ...

That's quite verbose, but it's how it should be done for the return type to be correct for inherited classes.

PEP 673 (landed in Python 3.11) introduced a new type Self that can be used as a shortcut for exactly such cases:

from typing import Self

class BaseUser:
    @classmethod
    def new(cls) -> Self:
        ...

    def copy(self) -> Self:
        ...
The reveal_type function doesn't exist. However, if you call it and then run a type-checker (like mypy or pyright) on the file, it will show the type of the passed object:

a = 1
reveal_type(a)
reveal_type(len)

Now, let's run mypy:

$ mypy tmp.py
tmp.py:2: note: Revealed type is "builtins.int"
tmp.py:3: note: Revealed type is "def (typing.Sized) -> builtins.int"

It's quite helpful to see what type mypy inferred for the variable in some tricky cases.

For convenience, the reveal_type function was also added to the typing module in Python 3.11:

from typing import reveal_type
a = 1
reveal_type(a)
# prints: Runtime type is 'int'
reveal_type(len)
# prints: Runtime type is 'builtin_function_or_method'

And for the curious, here is the definition:

def reveal_type(__obj: T) -> T:
    print(
        f"Runtime type is {type(__obj).__name__!r}",
        file=sys.stderr,
    )
    return __obj
PEP 675 (landed in Python 3.11) introduced a new type, typing.LiteralString. It matches only literal strings: explicit string literals in the code, constants, and combinations of them. The PEP shows a very good example of how it can be used to implement a SQL driver with protection against SQL injections at the type-checker level:

from typing import LiteralString, Final

def run_query(sql: LiteralString): ...

run_query('SELECT * FROM students') # ok

ALL_STUDENTS: Final = 'SELECT * FROM students'
run_query(ALL_STUDENTS) # ok

arbitrary_query = input()
run_query(arbitrary_query) # type error, don't do that
The isinstance function checks whether an object is an instance of a class or of a subclass thereof:

class A: pass
class B(A): pass
b = B()
isinstance(b, B) # True
isinstance(b, A) # True
isinstance(b, object) # True
isinstance(b, str) # False
isinstance(str, type) # True


Type-checkers understand isinstance checks and use them to refine the type:

a: object
reveal_type(a)
# ^ Revealed type is "builtins.object"
if isinstance(a, str):
    reveal_type(a)
    # ^ Revealed type is "builtins.str"


One more cool thing about isinstance is that you can pass it a tuple of types to check if the object is an instance of any of them:

isinstance(1, (str, int)) # True
PEP 427 introduced (and PEP 491 improved) a new format for Python distributions called "wheel".

Before the PEP, Python distributions were just tar.gz archives containing the source code of the distributed library, some additional files (README.rst, LICENSE, sometimes tests), and a setup.py file. To install a library from such a distribution, pip had to download the archive, extract it into a temporary directory, and execute python setup.py install to install the package.

Did it work? Well, kind of. It worked well enough for pure Python packages, but if a package had C code, it had to be built on the target machine every time the package was installed, because the built binary highly depends on the target OS, architecture, and Python version.

The new wheel format speeds up the process significantly. It changed two things:

1. The file name for wheel packages is standardized. It contains the name and version of the package, the Python tag (the interpreter implementation and minimal version, like py3 or cp310), the ABI tag, and the platform tag (OS and architecture). For example, flask-1.0.2-py2.py3-none-any.whl says "it is the flask package, version 1.0.2, for both Python 2 and 3, any ABI, and any OS". That means Flask is a pure Python package, so it can be installed anywhere. And psycopg2-2.8.6-cp310-cp310-linux_x86_64.whl says "it is psycopg2, version 2.8.6, for CPython 3.10 on 64-bit Linux". That means psycopg2 ships prebuilt C libraries for a very specific environment. A package can have multiple wheel distributions per version, and pip will pick and download the one that matches your environment (see the sketch after this list).

2. Instead of setup.py, the archive (which is now zip instead of tar.gz) contains already parsed metadata. So, to install the package, it's enough to just extract it into the site-packages directory; there is no need to execute anything.
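
To make the naming scheme concrete, here is a naive sketch of taking a simple wheel filename apart (real filenames may also contain an optional build tag, so this split is for illustration only; the packaging library has a parse_wheel_filename helper for the general case):

# {name}-{version}-{python tag}-{abi tag}-{platform tag}.whl
filename = 'psycopg2-2.8.6-cp310-cp310-linux_x86_64.whl'
name, version, python_tag, abi_tag, platform_tag = \
    filename.removesuffix('.whl').split('-')
print(name, version, python_tag, abi_tag, platform_tag)
# psycopg2 2.8.6 cp310 cp310 linux_x86_64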

Currently, the wheel distribution format is well-adopted and available for almost all modern packages.

When you create a new virtual environment, make sure you have the latest version of setuptools (for tarballs) and of the wheel package (for wheels). No, really, do it. The wheel package is not installed by default in new venvs, and without it, installation of some packages will be slow and painful.

python3 -m venv .venv
.venv/bin/pip install -U pip setuptools wheel