Starting with Python 3.8, the interpreter warns about is comparison of literals.
Python 3.7:
>>> 0 is 0
True
Python 3.8:
>>> 0 is 0
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True
The reason is that it is an infamous Python gotcha. While == compares values (in a nutshell, by calling the __eq__ magic method), is compares the memory addresses of objects. In CPython, it happens to be true for ints from -5 to 256 because they are cached, but it won't work for ints out of this range or for objects of other types:
a = -5
a is -5 # True
a = -6
a is -6 # False
a = 256
a is 256 # True
a = 257
a is 257 # False
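The difference can be sketched without relying on integer caching at all. The Point class below is hypothetical, made up for this illustration; it defines __eq__ so that == compares values, while is still compares identity:

```python
class Point:
    """Hypothetical value class, used only for this illustration."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        # `==` ends up here and compares values
        return isinstance(other, Point) and self.x == other.x


p = Point(1)
q = Point(1)

equal = p == q        # True: same value
same_object = p is q  # False: two separate objects
alias = p is p        # True: the very same object
```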
Floating point numbers in Python and most of the modern languages are implemented according to IEEE 754. The most interesting and hardcore part is "arithmetic formats" which defines a few special values:
+ inf and -inf representing infinity.
+ nan representing a special "Not a Number" value.
+ -0.0 representing "negative zero".
Negative zero is the easiest case: in comparisons, it is considered to be the same as positive zero:
-.0 == .0 # True
-.0 < .0 # False
nan returns False for all comparison operations (except !=), including comparison with inf:
import math
math.nan < 10 # False
math.nan > 10 # False
math.nan < math.inf # False
math.nan > math.inf # False
math.nan == math.nan # False
math.nan != 10 # True
And almost all arithmetic operations on nan return nan:
math.nan + 10 # nan
1 / math.nan # nan
You can read more about nan in previous posts:
+ https://tttttt.me/pythonetc/561
+ https://tttttt.me/pythonetc/597
Infinity is bigger than anything else (except nan). However, unlike in pure math, infinity is equal to infinity:
10 < math.inf # True
math.inf == math.inf # True
The sum of positive and negative infinity is nan:
-math.inf + math.inf # nan
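Since comparison operators can't detect nan or tell the two zeros apart, the math module has dedicated helpers for that. A small sketch using only standard library functions:

```python
import math

nan_detected = math.isnan(math.nan)   # True, while math.nan == math.nan is False
inf_detected = math.isinf(-math.inf)  # True for both inf and -inf
finite = math.isfinite(10.0)          # True: neither inf nor nan

# -0.0 == 0.0, but the sign is still stored and can be extracted:
sign_of_negative_zero = math.copysign(1.0, -0.0)  # -1.0
sign_of_positive_zero = math.copysign(1.0, 0.0)   # 1.0
```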
Infinity behaves in interesting ways in division operations. Some of the results are expected, some are surprising. Without further ado, here is a table:
truediv (/)
         -8 |    8 | -inf |  inf
  -8 |  1.0 | -1.0 |  0.0 | -0.0
   8 | -1.0 |  1.0 | -0.0 |  0.0
-inf |  inf | -inf |  nan |  nan
 inf | -inf |  inf |  nan |  nan
floordiv (//)
         -8 |    8 | -inf |  inf
  -8 |    1 |   -1 |  0.0 | -1.0
   8 |   -1 |    1 | -1.0 |  0.0
-inf |  nan |  nan |  nan |  nan
 inf |  nan |  nan |  nan |  nan
mod (%)
         -8 |    8 | -inf |  inf
  -8 |    0 |    0 | -8.0 |  inf
   8 |    0 |    0 | -inf |  8.0
-inf |  nan |  nan |  nan |  nan
 inf |  nan |  nan |  nan |  nan
The code used to generate the table:
import operator
cases = (-8, 8, float('-inf'), float('inf'))
ops = (operator.truediv, operator.floordiv, operator.mod)
for op in ops:
    print(op.__name__)
    row = ['{:4}'.format(x) for x in cases]
    print(' ' * 6, ' | '.join(row))
    for x in cases:
        row = ['{:4}'.format(x)]
        for y in cases:
            row.append('{:4}'.format(op(x, y)))
        print(' | '.join(row))
PEP-589 (landed in Python 3.8) introduced typing.TypedDict as a way to annotate dicts:
from typing import TypedDict
class Movie(TypedDict):
    name: str
    year: int
movie: Movie = {
    'name': 'Blade Runner',
    'year': 1982,
}
It cannot have keys that aren't explicitly specified in the type:
movie: Movie = {
    'name': 'Blade Runner',
    'year': 1982,
    'director': 'Ridley Scott', # fails type checking
}
Also, all specified keys are required by default, but this can be changed by passing total=False:
movie: Movie = {} # fails type checking
class Movie2(TypedDict, total=False):
    name: str
    year: int
movie2: Movie2 = {} # ok
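Note that these checks are performed only by static type checkers. A quick sketch of what happens at runtime (the Movie class repeats the one from the example above): a TypedDict value is a plain dict, and nothing validates its keys:

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


# A type checker rejects the extra key, but the interpreter doesn't care.
movie: Movie = {
    'name': 'Blade Runner',
    'year': 1982,
    'director': 'Ridley Scott',
}

runtime_type = type(movie)  # dict
is_total = Movie.__total__  # True, because total=False wasn't passed
```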
PEP-526, which introduced the syntax for variable annotations (landed in Python 3.6), allows annotating any valid assignment target:
c.x: int = 0
c.y: int
d = {}
d['a']: int = 0
d['b']: int
The last line is the most interesting one. Adding annotations to an expression suppresses its execution:
d = {}
# fails
d[1]
# KeyError: 1
# nothing happens
d[1]: 1
Despite being a part of the PEP, it's not supported by mypy:
$ cat tmp.py
d = {}
d['a']: int
d['b']: str
reveal_type(d['a'])
reveal_type(d['b'])
$ mypy tmp.py
tmp.py:2: error: Unexpected type declaration
tmp.py:3: error: Unexpected type declaration
tmp.py:4: note: Revealed type is 'Any'
tmp.py:5: note: Revealed type is 'Any'
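The runtime behavior can be checked with a short sketch: an annotation without a value on a subscript target neither reads nor writes the key, so the dict stays empty:

```python
d = {}

# A bare annotation on a subscript target: no lookup, no assignment.
d['missing']: int
after_annotation = dict(d)  # {}

# A regular lookup of the same key fails as usual.
try:
    d['missing']
except KeyError:
    lookup_failed = True
else:
    lookup_failed = False
```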
In most programming languages (like C, PHP, Go, Rust), values can be passed into a function either by value or by reference (pointer):
+ Call by value means that the value of the variable is copied, so modifications of the argument inside the function don't affect the original value. This is an example of how it works in Go:
package main
func f(v2 int) {
    v2 = 2
    println("f v2:", v2)
    // Output: f v2: 2
}
func main() {
    v1 := 1
    f(v1)
    println("main v1:", v1)
    // Output: main v1: 1
}
+ Call by reference means that all modifications made by the function, including reassignment, modify the original value:
package main
func f(v2 *int) {
    *v2 = 2
    println("f v2:", *v2)
    // Output: f v2: 2
}
func main() {
    v1 := 1
    f(&v1)
    println("main v1:", v1)
    // Output: main v1: 2
}
So, which one is used in Python? Well, neither.
In Python, the caller and the function share the same value:
def f(v2: list):
    v2.append(2)
    print('f v2:', v2)
    # f v2: [1, 2]
v1 = [1]
f(v1)
print('v1:', v1)
# v1: [1, 2]
However, the function can't replace the value (reassign the variable):
def f(v2: int):
    v2 = 2
    print('f v2:', v2)
    # f v2: 2
v1 = 1
f(v1)
print('v1:', v1)
# v1: 1
This approach is called call by sharing. It means the argument is always passed into a function as a copy of the pointer. So, both variables point to the same boxed object in memory, but if the pointer itself is reassigned inside the function, it doesn't affect the caller's code.
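One way to observe sharing is to compare object identities. In this sketch (the function names are made up for the example), mutation is visible to the caller because both names refer to the same list, while rebinding only changes the local name:

```python
def mutate(items):
    items.append(2)  # modifies the shared object
    return id(items)


def rebind(items):
    items = [0]      # the local name now refers to a new list
    return id(items)


v1 = [1]
same_object = mutate(v1) == id(v1)   # True
other_object = rebind(v1) != id(v1)  # True: v1 itself is untouched
# v1 is now [1, 2]
```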
What if we want to modify a collection inside a function but don't want these modifications to affect the caller code? Then we should explicitly copy the value.
For this purpose, all mutable built-in collections provide the .copy method:
def f(v2):
    v2 = v2.copy()
    v2.append(2)
    print(f'{v2=}')
    # v2=[1, 2]
v1 = [1]
f(v1)
print(f'{v1=}')
# v1=[1]
Custom objects (and built-in collections too) can be copied using copy.copy:
import copy
class C:
    pass
def f(v2: C):
    v2 = copy.copy(v2)
    v2.p = 2
    print(f'{v2.p=}')
    # v2.p=2
v1 = C()
v1.p = 1
f(v1)
print(f'{v1.p=}')
# v1.p=1
However, copy.copy copies only the object itself but not underlying objects:
v1 = [[1]]
v2 = copy.copy(v1)
v2.append(2)
v2[0].append(3)
print(f'{v1=}, {v2=}')
# v1=[[1, 3]], v2=[[1, 3], 2]
So, if you need to copy all subobjects recursively, use copy.deepcopy:
v1 = [[1]]
v2 = copy.deepcopy(v1)
v2[0].append(2)
print(f'{v1=}, {v2=}')
# v1=[[1]], v2=[[1, 2]]
Python uses eager evaluation. When a function is called, all its arguments are evaluated from left to right and only then their results are passed into the function:
print(print(1) or 2, print(3) or 4)
# 1
# 3
# 2 4
Operators and and or are lazy: the right value is evaluated only if needed (for or if the left value is falsy, and for and if the left value is truthy):
print(1) or print(2) and print(3)
# 1
# 2
For mathematical operators, the precedence is how it is in math:
1 + 2 * 3
# 7
The most interesting case is the ** (power) operator, which, unlike the other arithmetic operators, is grouped from right to left:
2 ** 3 ** 4 == 2 ** (3 ** 4)
# True
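Evaluation order and laziness can be made visible with a tiny tracing helper. The trace function below is hypothetical; it records its name and returns the given value:

```python
calls = []


def trace(name, value):
    """Record that the operand was evaluated and return its value."""
    calls.append(name)
    return value


# `and` binds tighter than `or`: a or (b and c)
result = trace('a', 0) or trace('b', 5) and trace('c', 7)
first_calls = list(calls)   # ['a', 'b', 'c']: every operand was needed

calls.clear()
lazy_result = trace('a', 1) or trace('b', 2)
second_calls = list(calls)  # ['a']: the right operand was skipped
```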
Most of the exceptions raised from the standard library or built-ins have a quite descriptive self-contained message:
try:
    [][0]
except IndexError as e:
    exc = e
exc.args
# ('list index out of range',)
However, KeyError is different: instead of a user-friendly error message, it contains the missing key:
try:
    {}[0]
except KeyError as e:
    exc = e
exc.args
# (0,)
So, if you log an exception as a string, make sure you save the class name (and the traceback) as well, or at least use repr instead of str:
repr(exc)
# 'KeyError(0)'
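The difference between str and repr is easy to compare side by side. A short sketch (the describe helper is hypothetical) that also shows a common way to keep the class name when formatting manually:

```python
def describe(exc):
    """Format an exception together with its class name."""
    return f'{type(exc).__name__}: {exc}'


index_error = IndexError('list index out of range')
key_error = KeyError(0)

index_text = str(index_error)    # 'list index out of range'
key_text = str(key_error)        # '0', which is ambiguous on its own
key_repr = repr(key_error)       # 'KeyError(0)'
described = describe(key_error)  # 'KeyError: 0'
```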
When something fails, usually you want to log it. Let's have a look at a small toy example:
from logging import getLogger
logger = getLogger(__name__)
channels = {}
def update_channel(slug, name):
    try:
        old_name = channels[slug]
    except KeyError as exc:
        logger.error(repr(exc))
    ...
update_channel('pythonetc', 'Python etc')
# Logged: KeyError('pythonetc')
This example has a few issues:
+ There is no explicit log message. So, when it fails, you can't search the project for where this log record comes from.
+ There is no traceback. When the try block is more complicated, we want to be able to track where exactly in the call stack the exception occurred. To achieve this, logger methods provide the exc_info argument. When it is set to True, the current exception with its traceback will be added to the log message.
So, this is how we can do it better:
def update_channel(slug, name):
    try:
        old_name = channels[slug]
    except KeyError as exc:
        logger.error('channel not found', exc_info=True)
    ...
update_channel('pythonetc', 'Python etc')
# channel not found
# Traceback (most recent call last):
#   File "...", line 3, in update_channel
#     old_name = channels[slug]
# KeyError: 'pythonetc'
Also, the logger provides a convenient method exception, which is the same as error with exc_info=True:
logger.exception('channel not found')
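To check what actually ends up in the log, the record can be captured in memory. This is a sketch that assumes a dedicated logger named 'demo' writing to a StringIO buffer; both are choices made for the example:

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger('demo')  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False            # keep the output in our buffer only

channels = {}
try:
    channels['pythonetc']
except KeyError:
    # same as logger.error('channel not found', exc_info=True)
    logger.exception('channel not found')

output = stream.getvalue()
# output starts with the message, followed by the traceback
# and the "KeyError: 'pythonetc'" line
```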
Let's have a look at the following log message:
import logging
logger = logging.getLogger(__name__)
logger.warning('user not found')
# user not found
When this message is logged, it can be hard, based on it alone, to reproduce the situation and understand what went wrong. So, it's good to provide some additional context. For example:
user_id = 13
logger.warning(f'user #{user_id} not found')
That's better: now we know which user it was. However, such messages are hard to work with. For example, we want to get a notification when the same type of error message occurs too many times in a minute. Before, it was one error message, "user not found". Now, for every user, we get a different message. Another example: we want to get all messages related to the same user. If we just search for "13", we will get many false positives where "13" means something else, not user_id.
The solution is to use structured logging. The idea of structured logging is to store all additional values as separate fields instead of mixing everything into one text message. In Python, this can be achieved by passing the variables as the extra argument. Most logging libraries will recognize and store everything passed into extra. For example, this is how it looks in python-json-logger:
from pythonjsonlogger import jsonlogger
logger = logging.getLogger()
handler = logging.StreamHandler()
formatter = jsonlogger.JsonFormatter()
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.warning('user not found', extra=dict(user_id=13))
# {"message": "user not found", "user_id": 13}
However, the default formatter doesn't show extra:
logger = logging.getLogger()
logger.warning('user not found', extra=dict(user_id=13))
# user not found
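If no third-party formatter is involved, extra can still be shown by a small custom formatter. Here is a minimal sketch, not a definitive implementation: the ExtraFormatter class and the 'extra_demo' logger name are made up for the example, and extra fields are detected as record attributes that a bare LogRecord doesn't have:

```python
import io
import logging

# Attribute names of a record created without `extra`,
# plus the ones the base Formatter may add while formatting.
STANDARD_ATTRS = set(vars(logging.LogRecord('', 0, '', 0, '', (), None)))
STANDARD_ATTRS |= {'message', 'asctime'}


class ExtraFormatter(logging.Formatter):
    def format(self, record):
        message = super().format(record)
        extras = {
            key: value
            for key, value in vars(record).items()
            if key not in STANDARD_ATTRS
        }
        if not extras:
            return message
        fields = ' '.join(f'{k}={v!r}' for k, v in extras.items())
        return f'{message} {fields}'


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ExtraFormatter())
logger = logging.getLogger('extra_demo')
logger.addHandler(handler)
logger.propagate = False

logger.warning('user not found', extra=dict(user_id=13))
output = stream.getvalue()
# 'user not found user_id=13\n'
```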
So, if you use extra, stick to the third-party formatter you use or write your own.
A multiline string literal preserves every symbol between the opening and closing quotes, including indentation:
def f():
    return """
        hello
          world
    """
f()
# '\n        hello\n          world\n    '
A possible solution is to remove the indentation; Python will still parse the code correctly:
def f():
    return """
hello
  world
"""
f()
# '\nhello\n  world\n'
However, it's difficult to read because it looks like the literal is outside of the function body but it's not. So, a much better solution is not to break the indentation but instead remove it from the string content using textwrap.dedent:
from textwrap import dedent
def f():
    return dedent("""
        hello
          world
    """)
f()
# '\nhello\n  world\n'
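The result still starts with a newline because the literal begins right after the opening quotes. A common way to avoid it, sketched below, is to escape the first line break with a backslash:

```python
from textwrap import dedent


def f():
    # The backslash after the opening quotes drops the leading newline.
    return dedent("""\
        hello
          world
        """)


result = f()
# 'hello\n  world\n'
```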
If any function can modify any passed argument, how do we prevent a value from being modified? Make it immutable! That means the object doesn't have methods to modify it in place, only methods returning a new value. This is how numbers and str are immutable. While list has the append method that modifies the object in place, str just doesn't have anything like this; all modifications return a new str:
a = b = 'ab'
a is b # True
b += 'cd'
a is b # False
This is why almost every built-in collection has an immutable version:
+ Immutable list is tuple.
+ Immutable set is frozenset.
+ Immutable bytearray is bytes.
+ dict doesn't have an immutable version, but since Python 3.3 it has the types.MappingProxyType wrapper that makes it immutable:
from types import MappingProxyType
orig = {1: 2}
immut = MappingProxyType(orig)
immut[3] = 4
# TypeError: 'mappingproxy' object does not support item assignment
And since it is just a proxy, not a new type, it reflects all the changes in the original mapping:
orig[3] = 4
immut[3]
# 4
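A typical use is exposing internal state as a read-only view. A minimal sketch, where the Registry class is hypothetical:

```python
from types import MappingProxyType


class Registry:
    """Hypothetical container that owns a dict but exposes it read-only."""

    def __init__(self):
        self._items = {}

    def add(self, key, value):
        self._items[key] = value

    @property
    def items(self):
        # No copy is made: the proxy is a live view of the dict.
        return MappingProxyType(self._items)


registry = Registry()
view = registry.items
registry.add('a', 1)
seen = view['a']  # 1: the view reflects later changes

try:
    view['b'] = 2
except TypeError:
    write_rejected = True  # the proxy doesn't support item assignment
else:
    write_rejected = False
```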
Python has a built-in module sqlite3 to work with SQLite databases:
import sqlite3
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('SELECT UPPER(?)', ('hello, @pythonetc!',))
cur.fetchone()
# ('HELLO, @PYTHONETC!',)
Fun fact: to explain what SQL injection is, the documentation links to the xkcd comic about Bobby Tables instead of some smart article or Wikipedia page.
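The ? placeholders are exactly what protects from such injections: the value is passed separately from the SQL text and is never parsed as SQL. A sketch with a made-up students table:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE students (name TEXT)')

# With a placeholder, the malicious-looking value is stored as plain text.
name = "Robert'); DROP TABLE students;--"
conn.execute('INSERT INTO students VALUES (?)', (name,))

rows = conn.execute('SELECT name FROM students').fetchall()
# the table still exists and contains the value as is
conn.close()
```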
Since Python doesn't have a char type, an element of str is always str:
'@pythonetc'[0][0][0][0][0]
# '@'
This is an infinite type, and you can't construct it in a strictly typed language (and why would you?) because it's unclear how to construct the first instance (thing-in-itself?). For example, in Haskell:
Prelude> str = str str
<interactive>:1:7: error:
• Occurs check: cannot construct the infinite type: t1 ~ t -> t1
Some operators in Python have special names.
Many Pythonistas know about the notorious "walrus" operator (:=), but there are less famous ones, like the diamond operator (<>). It's similar to the "not equals" operator but written in SQL style.
The diamond operator was suggested in PEP 401 as one of the first actions of the new leader of the language, Barry Warsaw, after Guido went climbing Mount Everest.
Luckily, it was just an April Fools' joke, and the operator was never really a part of the language.
Yet, it's still available but hidden behind the "import from future" flag.
Usually you compare for non-equality using !=:
>>> "bdfl" != "flufl"
True
But if you enable the "Barry as FLUFL" feature the behavior changes:
>>> from __future__ import barry_as_FLUFL
>>> "bdfl" != "flufl"
File "<stdin>", line 1
"bdfl" != "flufl"
^
SyntaxError: with Barry as BDFL, use '<>' instead of '!='
>>> "bdfl" <> "flufl"
True
Unfortunately, this easter egg works only in interactive mode (REPL), not in usual *.py scripts.
By the way, it's interesting that this feature is marked as becoming mandatory in Python 4.0.
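The release information can be read from the __future__ module itself, where every feature is described by a _Feature object:

```python
import __future__

feature = __future__.barry_as_FLUFL
optional = feature.getOptionalRelease()    # the release where it became available
mandatory = feature.getMandatoryRelease()  # a 5-tuple like sys.version_info
mandatory_major = mandatory[0]             # 4
```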
Have you ever wondered how relative imports work?
I'm pretty sure that you've done something like this at some point:
from . import bar
from .bar import foo
It uses a special magic attribute on the module called __package__.
Let's say you have the following structure:
foo/
    __init__.py
    bar/
        __init__.py
main.py
The value of __package__ for foo/__init__.py is set to "foo", and for foo/bar/__init__.py it's "foo.bar".
Note that for main.py __package__ isn't set; that's because main.py is not in a package.
So when you do from .bar import buz within foo/__init__.py, it simply appends "bar" to foo/__init__.py's __package__ attribute; essentially, it gets translated to from foo.bar import buz.
You can actually hack __package__, e.g.:
>>> __package__ = "re"
>>> from . import compile
>>> compile
<function compile at 0x10e0ee550>
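The same name resolution is exposed by the standard library as importlib.util.resolve_name, which takes a relative name and a package name:

```python
from importlib.util import resolve_name

# One leading dot: relative to the package itself.
resolved = resolve_name('.bar', 'foo')      # 'foo.bar'
# Two leading dots: go one package up first.
sibling = resolve_name('..baz', 'foo.bar')  # 'foo.baz'
```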
Packages have a magic attribute called __path__.
Whenever you do subpackage imports, __path__ is searched for that submodule.
__path__ looks like a list of path strings, e.g. ["foo/bar", "/path/to/location"].
So if you do from foo import bar or import foo.bar, foo's __path__ is searched for bar and, if found, bar is loaded.
You can play around with __path__ to test it out.
Create a simple Python module anywhere on your system:
$ tree
.
└── foo.py
$ cat foo.py
def hello():
    return "hello world"
Then, run the interpreter there and do the following:
>>> import os
>>> os.__path__ = ["."]
>>> from os.foo import hello
>>> hello()
'hello world'
As you can see, foo is now available under os:
>>> os.foo
<module 'os.foo' from './foo.py'>
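Before the hack, os is a plain module without __path__, which is why it normally can't have submodules. A quick check for a fresh interpreter, using json as an example of a regular package:

```python
import json
import os

os_is_package = hasattr(os, '__path__')  # False: os is a single-file module
json_path = json.__path__                # a list with the package directory
json_is_package = isinstance(json_path, list)
```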