Sometimes you want to check the syntax of a Python file without running it. Such a naive check may be useful as a commit hook or a fast continuous integration check.
There is no direct way to do this. You can import the module (python -c "import module"), which prevents the traditional if __name__ == '__main__' block from running. Still, all module-level code and imports will be executed, and this may fail if you want to check syntax in an environment where the module can't and shouldn't be run.

However, the Python standard library contains the py_compile module that generates byte-code from a Python source file without running it. That's exactly what we need:

$ python -m py_compile test.c
File "test.c", line 1
int main() {
^
SyntaxError: invalid syntax
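The same check can be done from Python itself: py_compile.compile with doraise=True raises PyCompileError instead of printing to stderr. A minimal sketch (the helper name and the sample file are illustrative):

```python
import py_compile
import tempfile

def is_valid_syntax(path):
    """Byte-compile the file and report whether it parsed."""
    try:
        # doraise=True turns compilation problems into PyCompileError
        # instead of printing them to stderr.
        py_compile.compile(path, doraise=True)
        return True
    except py_compile.PyCompileError:
        return False

# Write a deliberately broken file and check it.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
    f.write('def broken(:\n    pass\n')
    bad = f.name

print(is_valid_syntax(bad))  # False
```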
CPython supports two levels of optimization. You can enable them with
-O and -OO flags. -O sets __debug__ to False and removes all assert statements from the program. -OO does the same and also discards docstrings.

A regular version of a script is cached to a .pyc file while an optimized one is cached to .pyo. However, since Python 3.5 .pyo is no longer a thing; .opt-1.pyc and .opt-2.pyc were introduced by PEP 488 instead.

To watch multiple file descriptors,
asyncio uses the selectors module. It provides high-level access to the APIs your kernel supports, such as epoll (Linux), kqueue (BSD) and so on, via corresponding classes (EpollSelector, KqueueSelector etc.). asyncio uses selectors.DefaultSelector, which is an alias for the most efficient implementation on the current platform (epoll|kqueue|devpoll > poll > select). If you ever need to use selectors manually, you should prefer DefaultSelector too.

selectors uses the low-level select module, written in C. It contains all the implementations supported by your system, which is decided at compile time.

“Reduce” is a higher-order function that processes an iterable recursively, applying some operation to the next element of the iterable and the already calculated value. You may also know it as “fold,” “inject,” “accumulate” or by some other name.
Reduce with result = result + element brings you the sum of all elements, result = min(result, element) gives you the minimum, and result = element works for getting the last element of a sequence.

Python provides the reduce function (which was moved to functools.reduce in Python 3):

In : reduce(lambda s, i: s + i, range(10))
Out: 45
In : reduce(lambda s, i: min(s, i), range(10))
Out: 0
In : reduce(lambda s, i: i, range(10))
Out: 9

Also, if you ever need such a simple lambda as lambda a, b: a + b, Python has you covered with the operator module:

In : from operator import add
In : reduce(add, range(10))
Out: 45
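reduce also accepts an optional third argument, an initial value for the accumulator, which makes it safe on empty iterables. A short sketch:

```python
from functools import reduce

# The third argument is the starting accumulator; with it,
# reduce works even on an empty iterable.
total = reduce(lambda s, i: s + i, [], 0)
print(total)  # 0

# Flattening a list of lists with an empty-list initializer:
nested = [[1, 2], [3], [4, 5]]
flat = reduce(lambda acc, part: acc + part, nested, [])
print(flat)  # [1, 2, 3, 4, 5]
```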
SVG is a vector image format that stores image info by specifying, in XML, all the shapes and figures that need to be drawn. An orange circle can be represented as simply as this:
<svg xmlns="http://www.w3.org/2000/svg">
<circle cx="125" cy="125" r="75" fill="orange"/>
</svg>
Since SVG is XML-based, it's pretty easy to create SVG files in any language. You can do it in Python with lxml, for example. However, there is the svgwrite module that is intended precisely for creating SVGs.

Here is an example of how to express a Recamán's sequence with a diagram.
To sort a sequence in Python, you use sorted:

In : sorted([1, -1, 2, -3, 3])
Out: [-3, -1, 1, 2, 3]
With the key argument you can provide a function that will be used to get a comparison key for each value. Let's sort the same sequence by absolute values:

In : sorted([1, -1, 2, -3, 3], key=abs)
Out: [1, -1, 2, -3, 3]
Let's suppose we also want to put the numbers with the same absolute value in ascending order. In that case, we can provide a tuple as a comparison key:
In : sorted([1, -1, 2, -3, 3], key=lambda x: (abs(x), x))
Out: [-1, 1, 2, -3, 3]
This is not some sorted magic, this is how tuples are compared in general:

In : (1, 2) == (1, 2)
Out: True
In : (1, 2) > (1, 1)
Out: True
In : (1, 2) < (2, 1)
Out: True
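Tuple keys also let you mix sort directions: negate a numeric component to reverse it while the rest stays ascending. A small sketch:

```python
# Sort by absolute value descending, breaking ties ascending,
# by negating the numeric part of the key.
data = [1, -1, 2, -3, 3]
result = sorted(data, key=lambda x: (-abs(x), x))
print(result)  # [-3, 3, 2, -1, 1]
```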
Creating a new variable is essentially creating a new name for an already existing object. That's why it's called name binding in Python.
There are numerous ways to bind names; here are examples of how x can be bound:

x = y
import x
class x: pass
def x(): pass
def y(x): pass
for x in y: pass
with y as x: pass
except y as x: pass
You can also bind an arbitrary name by manipulating the global namespace:
In : x
NameError: name 'x' is not defined
In : globals()['x'] = 42
In : x
Out: 42
Note, however, that you cannot do the same with locals() since updates to the locals dictionary are ignored.

When you use a variable in Python, it's first looked up in the current scope. If no such variable is found, the next enclosing scope is searched. That is repeated until the global namespace is reached.
x = 1
def scope():
x = 2
def inner_scope():
print(x) # prints 2
inner_scope()
scope()
However, variable assignment doesn't work the same way. A new variable is always created in the current scope unless global or nonlocal is specified:

x = 1
def scope():
x = 2
def inner_scope():
x = 3
print(x) # prints 3
inner_scope()
print(x) # prints 2
scope()
print(x) # prints 1
global allows using variables of the global namespace while nonlocal searches for the variable in the nearest enclosing scope. Compare:

x = 1
def scope():
x = 2
def inner_scope():
global x
x = 3
print(x) # prints 3
inner_scope()
print(x) # prints 2
scope()
print(x) # prints 3
x = 1
def scope():
x = 2
def inner_scope():
nonlocal x
x = 3
print(x) # prints 3
inner_scope()
print(x) # prints 3
scope()
print(x) # prints 1
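A classic practical use of nonlocal is a closure that keeps its own mutable state, such as this counter sketch:

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind the enclosing count, not a new local
        count += 1
        return count
    return increment

counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```

Without the nonlocal statement, count += 1 inside increment would try to create a new local variable and fail with UnboundLocalError.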
Imagine you are moving your web-API from HTTP to HTTPS. How do you handle all requests from clients who are not aware they should use HTTPS? You set up redirection rules.
What HTTP status code should you use? The choice is usually between 301 Moved Permanently and 302 Found. The first one is permanent (as the status name states) and the second one is one-off and never cached. Moving to HTTPS is usually permanent, so the choice is obvious, it's 301 Moved Permanently.
The problem with both 301 and 302 is that they work correctly only for HEAD and GET requests. Though all other methods (such as POST) should work as well according to the RFC, in practice they don't. A lot of modern HTTP clients (your favorite browser probably included) make a GET request after the redirection regardless of the original request method. That became so common that the RFC now explicitly says you can't rely on the client preserving the method.

To fight that problem, two other codes were introduced: 303 See Other and 307 Temporary Redirect. 303 implies “use GET for the new request” and 307 means “use the same method for the new request”. So basically most clients act like they got 303 when they get 302, while they should act like they got 307.

Sadly, both 303 and 307 are temporary. To make a redirect that is both method-preserving and permanent, one can use 308 Permanent Redirect, but that code is still experimental.

So the correct solution for our HTTP to HTTPS migration is to use 307 Temporary Redirect. 308 is even better but can't be relied on. Mind that human users usually start an interaction by sending a GET request, so the problem with 301 only applies to robots.
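All the status codes discussed above are available in Python's standard library via http.HTTPStatus, which is handy when writing redirect handlers:

```python
from http import HTTPStatus

# The redirect codes discussed above, as stdlib constants:
for status in (HTTPStatus.MOVED_PERMANENTLY,
               HTTPStatus.FOUND,
               HTTPStatus.SEE_OTHER,
               HTTPStatus.TEMPORARY_REDIRECT,
               HTTPStatus.PERMANENT_REDIRECT):
    print(status.value, status.phrase)
```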
list allows you to store an array of any objects. This is quite helpful but may be inefficient. The array module can be used to represent arrays of basic values compactly. The supported values include various C types such as char, int, long, double and so on. The actual representation is determined by the C implementation.

In : import array
In : a = array.array('B')
In : a.append(240)
In : a.append(159)
In : a.append(144)
In : a.append(180)
In : a.tobytes().decode('utf8')
Out: '🐴'
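The compactness is easy to see: each item occupies exactly the size of its C type, with no per-element Python object overhead. A small sketch with doubles:

```python
import array

# 'd' stores C doubles: each item takes itemsize bytes
# instead of a full Python float object per element.
a = array.array('d', [1.0, 2.0, 3.0])
print(a.itemsize)        # 8
print(len(a.tobytes()))  # 24
```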
The map function produces a new list by applying a function to each element of the original list:

>>> map(int, ["1", "2", "3"])
[1, 2, 3]
The thing is, that's only right for Python 2. In Python 3 map returns a generator instead, meaning you can apply it to other generators (including infinite ones). The same thing happened to filter and range.

In : def gen():
...: l = []
...: while True:
...: l.append(1)
...: yield l
...:
In : map(len, gen())
Out: <map at 0x7f85c4a11978>
In : next(map(len, gen()))
Out: 1
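Because map is lazy in Python 3, you can pair it with itertools to take just a few results from an infinite stream; a quick sketch:

```python
from itertools import count, islice

# map over an infinite iterator; islice takes the first few results.
squares = map(lambda x: x * x, count())
first_five = list(islice(squares, 5))
print(first_five)  # [0, 1, 4, 9, 16]
```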
If you want to iterate over several iterables at once, the zip function may be a good choice. It returns a generator that yields tuples containing one element from each of the original iterables:

In : eng = ['one', 'two', 'three']
In : ger = ['eins', 'zwei', 'drei']
In : for e, g in zip(eng, ger):
...: print('{e} = {g}'.format(e=e, g=g))
...:
one = eins
two = zwei
three = drei
Notice that zip accepts iterables as separate arguments, not a list of arguments. To unzip values, you can use the * operator:

In : list(zip(*zip(eng, ger)))
Out: [('one', 'two', 'three'), ('eins', 'zwei', 'drei')]
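zip is also handy for building a dict from two parallel sequences, since dict accepts an iterable of key-value pairs:

```python
# Build a mapping from parallel sequences of keys and values.
eng = ['one', 'two', 'three']
ger = ['eins', 'zwei', 'drei']
translations = dict(zip(eng, ger))
print(translations)  # {'one': 'eins', 'two': 'zwei', 'three': 'drei'}
```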
If you want to distribute a package across different paths, it can be done with namespace packages, a special kind of package that doesn't contain __init__.py files:

$ find dir1 dir2
dir1
dir1/package
dir1/package/a.py
dir2
dir2/package
dir2/package/b.py
$ PYTHONPATH='dir1:dir2' python -c 'import package.a; import package.b'
However, namespace packages weren't a thing until Python 3.3. Before that, Python provided pkgutil.extend_path to solve this problem. You make several packages with the same name and put extend_path in every __init__.py; import loads one of those packages, and extend_path makes sure the others will be loaded too:

$ cat dir1/package/__init__.py
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
$ cat dir2/package/__init__.py
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
$ find dir1 dir2
dir1
dir1/package
dir1/package/a.py
dir1/package/__init__.py
dir1/package/__init__.pyc
dir2
dir2/package
dir2/package/__init__.py
dir2/package/b.py
$ PYTHONPATH='dir1:dir2' python2 -c 'import package.a; import package.b'
You can learn more in PEP 420.
The io module provides two types of in-memory file-like objects. Such objects may be helpful for interacting with interfaces that only support files, without the need to create a real one. The obvious example is unit testing.

These two types are BytesIO and StringIO, which work with bytes and strings respectively.

In : from io import StringIO
In : f = StringIO()
In : f.write('first\n')
Out: 6
In : f.write('second\n')
Out: 7
In : f.seek(0)
Out: 0
In : f.readline()
Out: 'first\n'
In : f.readline()
Out: 'second\n'
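A typical use is feeding an API that insists on a file object; for example, csv.writer writes happily into a StringIO instead of a file on disk:

```python
import csv
from io import StringIO

# csv.writer expects a file object; StringIO stands in for a real one.
buf = StringIO()
writer = csv.writer(buf)
writer.writerow(['one', 'eins'])
writer.writerow(['two', 'zwei'])
print(repr(buf.getvalue()))
```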
python supports several forms of starting a script. The usual one is python foo.py; in that case, foo.py is simply executed.

However, you can also do python -m foo. If foo is not a package, then foo.py is found in sys.path and executed. If it is, then Python executes foo/__init__.py, and foo/__main__.py after that. Note that __name__ is equal to foo during __init__.py execution, but it's __main__ during __main__.py execution.

You can also do python dir/ or even python dir.zip. In that case, dir/__main__.py is looked for and executed if found.

$ ls foo
__init__.py __main__.py
$ cat foo/__init__.py
print(__name__)
$ cat foo/__main__.py
print(__name__)
$ python -m foo
foo
__main__
$ python foo/
__main__
$ python foo/__init__.py
__main__
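The machinery behind python -m is exposed in the stdlib runpy module; runpy.run_path mimics python dir/ for a plain script path. A small sketch (the script contents here are illustrative):

```python
import os
import runpy
import tempfile

# run_path executes a script (or a directory/zip with __main__.py)
# and returns the resulting module globals as a dict.
with tempfile.TemporaryDirectory() as d:
    script = os.path.join(d, 'demo.py')
    with open(script, 'w') as f:
        f.write('answer = 21 * 2\n')
    ns = runpy.run_path(script)
    print(ns['answer'])  # 42
```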
In Linux, a crontab file must end with a newline. This is a highly unusual requirement and may lead to unexpected behavior.
$ crontab file
new crontab file is missing newline before EOF, can't install.
To avoid dealing with that, you may introduce some tests to your project that will check all committed crontab files. Installing crontabs via a custom script that automatically fixes them is also possible. Finally, configuring your favorite editor to always add an empty line at the end of a file may be a good idea.
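Such a test can be tiny; this sketch (helper name is illustrative) just checks that a file ends with a newline, which is what crontab complains about:

```python
import tempfile

def ends_with_newline(path):
    """Return True if the file ends with a newline, as crontab requires."""
    with open(path, 'rb') as f:
        data = f.read()
    return data.endswith(b'\n')

# A crontab missing the trailing newline would be rejected:
with tempfile.NamedTemporaryFile('wb', delete=False) as f:
    f.write(b'0 0 * * * /usr/bin/backup')
    path = f.name

print(ends_with_newline(path))  # False
```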
Converting a datetime object to the number of seconds since the start of the epoch was not a simple task until Python 3.3.

The most natural solution for the problem seems to be the strftime method that can format a datetime. Using %s as a format, you can get a timestamp. Look at the example:

naive_time = datetime(2018, 3, 31, 12, 0, 0)
utc_time = pytz.utc.localize(naive_time)
ny_time = utc_time.astimezone(
    pytz.timezone('US/Eastern'))
ny_time is exactly the same moment as utc_time, but written as New Yorkers see it:

# utc_time
datetime.datetime(2018, 3, 31, 12, 0,
    tzinfo=<UTC>)
# ny_time
datetime.datetime(2018, 3, 31, 8, 0,
    tzinfo=<DstTzInfo 'US/Eastern' ...>)
Since they are the same moments, their timestamps should be equal:
In : int(utc_time.strftime('%s')),
int(ny_time.strftime('%s'))
Out: (1522486800, 1522468800)
Wait, what? They are not the same at all. In fact, you can't use strftime as a solution for this problem. Python's strftime doesn't even support %s as an argument; it merely works because internally the platform C library's strftime() is called. But, as you can see, the timezone of the datetime object is wholly ignored.

The proper result can be achieved with straightforward subtraction:

In : epoch_start = pytz.utc.localize(
         datetime(1970, 1, 1))
In : (utc_time - epoch_start).total_seconds()
Out: 1522497600.0
Again, if you use Python 3.3+, you can solve the problem with the timestamp() method of datetime: utc_time.timestamp().

A lot of system calls can be interrupted by an incoming signal. If a programmer wants the call to be completed anyway, they have to issue it again.
The notable example is the sleep(x) function that is expected to freeze the program for x seconds, but in reality it can return earlier if a signal arrives.

However, since Python 3.5, thanks to PEP 475, Python takes care of all such calls for you. The following program ends on the first SIGINT it receives in any Python before 3.5, but it sleeps for exactly five seconds regardless of signals in Python 3.5+.

import signal
import time
def signal_handler(signal, frame):
    print('Caught')
signal.signal(signal.SIGINT, signal_handler)
time.sleep(5)
List comprehensions may contain more than one for and if clause:

In : [(x, y) for x in range(3) for y in range(3)]
Out: [
(0, 0), (0, 1), (0, 2),
(1, 0), (1, 1), (1, 2),
(2, 0), (2, 1), (2, 2)
]
In : [
(x, y)
for x in range(3)
for y in range(3)
if x != 0
if y != 0
]
Out: [(1, 1), (1, 2), (2, 1), (2, 2)]
Also, any expression with for and if may use all the variables that are defined before it:

In : [
(x, y)
for x in range(3)
for y in range(x + 2)
if x != y
]
Out: [
(0, 1),
(1, 0), (1, 2),
(2, 0), (2, 1), (2, 3)
]
You can mix ifs and fors however you want:

In : [
(x, y)
for x in range(5)
if x % 2
for y in range(x + 2)
if x != y
]
Out: [
(1, 0), (1, 2),
(3, 0), (3, 1), (3, 2), (3, 4)
]
Python supports the @ operator since Python 3.5. It's intended for matrix multiplication. However, none of the standard objects support it; it was introduced specifically for the numpy module.

To make your objects support this operator, you should define one of the following methods: __matmul__, __rmatmul__ or __imatmul__.

You can learn more from PEP 465.
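A minimal sketch of a custom type supporting @ (the class and attribute names here are illustrative):

```python
class Matrix2x2:
    """A tiny 2x2 matrix storing elements a b / c d row by row."""
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = a, b, c, d

    def __matmul__(self, other):
        # Standard 2x2 matrix multiplication.
        return Matrix2x2(
            self.a * other.a + self.b * other.c,
            self.a * other.b + self.b * other.d,
            self.c * other.a + self.d * other.c,
            self.c * other.b + self.d * other.d,
        )

identity = Matrix2x2(1, 0, 0, 1)
m = Matrix2x2(1, 2, 3, 4)
product = m @ identity  # calls m.__matmul__(identity)
print(product.a, product.b, product.c, product.d)  # 1 2 3 4
```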