Python bytecode

Just like Java or C#, CPython compiles source code into bytecode, which is then interpreted by a virtual machine. The dis module from the standard library lets you disassemble Python code and see how things are compiled under the hood. Consider the following code:

>>> def test():
...     for i in range(10):
...         print(i)
...

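You can call dis.dis(test) to display the compiled bytecode, and dis.show_code(test) to list the symbols referenced by that bytecode. The output below comes from an older CPython 3 release; instruction names, arguments and offsets differ on more recent versions.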

>>> import dis
>>> dis.dis(test)
  2           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (10)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 GET_ITER
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  3          19 LOAD_GLOBAL              1 (print)
             22 LOAD_FAST                0 (i)
             25 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             28 POP_TOP
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE

>>> dis.show_code(test)
Name:              test
Filename:          <stdin>
Argument count:    0
Kw-only arguments: 0
Number of locals:  1
Stack size:        3
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
   1: 10
Names:
   0: range
   1: print
Variable names:
   0: i
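If you want the same information in a script rather than at the interactive prompt, the sketch below may help. It uses dis.code_info(), which returns the text that dis.show_code() prints, and reads the symbol tables directly from the code object. The function test is simply the example from above, redefined so that the snippet is self-contained.

```python
import dis


def test():
    for i in range(10):
        print(i)


# code_info() returns the report as a string instead of printing it.
info = dis.code_info(test)
print(info)

# The same symbol tables are available as attributes of the code object.
code = test.__code__
print("Names:         ", code.co_names)     # global names used by the bytecode
print("Variable names:", code.co_varnames)  # local variables
```

The exact layout of the report, and the contents of the constants table, can vary between CPython versions.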

Understanding the bytecode

The output of dis.dis() is split into four columns:

The first column is the source line number (line 2 is “for i in range(10):”, line 3 is “print(i)”).
The second column is the offset of the instruction in the bytecode, in bytes (each instruction takes a certain number of bytes). A “>>” in front of the offset marks an instruction that is the target of a jump.
The third column is the name of the bytecode instruction (see the dis documentation for the instruction reference).
The fourth column is the instruction argument, if any, followed in parentheses by a more human-readable translation of this argument.
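The same columns can be rebuilt programmatically. The following is a minimal sketch that uses dis.get_instructions() and dis.findlinestarts(); the variable names (rows, line_starts) are only illustrative, and the instructions you get depend on the CPython version you run it on.

```python
import dis


def test():
    for i in range(10):
        print(i)


# Map "offset of the first instruction of a line" -> source line number.
line_starts = dict(dis.findlinestarts(test.__code__))

rows = []
for ins in dis.get_instructions(test):
    line = line_starts.get(ins.offset)  # None unless a new line starts here
    rows.append((line, ins.offset, ins.opname, ins.arg, ins.argrepr))

for line, offset, opname, arg, argrepr in rows:
    line_text = "" if line is None else str(line)
    arg_text = "" if arg is None else str(arg)
    human = f"({argrepr})" if argrepr else ""
    print(f"{line_text:>4} {offset:>6} {opname:<22} {arg_text:>4} {human}")
```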

The CPython bytecode works in a way very similar to assembly languages such as x86 or ARM assembly. It relies heavily on a stack: some instructions push values on top of it (LOAD_GLOBAL, LOAD_CONST, LOAD_FAST), others pop values off it (POP_TOP), and some instructions pop their operands and push a result back on the stack.
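To make the stack discipline concrete, here is a toy stack machine. It is emphatically not how CPython is implemented (the real interpreter loop is written in C); it only borrows a few instruction names to show values being pushed and popped. The run() function and the tuple-based program format are made up for this illustration.

```python
def run(program, constants, variables):
    """Execute a list of (opname, arg) pairs on a simple value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_CONST":
            stack.append(constants[arg])   # push a constant
        elif opname == "LOAD_FAST":
            stack.append(variables[arg])   # push a local variable
        elif opname == "BINARY_ADD":
            right = stack.pop()            # pop both operands...
            left = stack.pop()
            stack.append(left + right)     # ...and push the result
        elif opname == "POP_TOP":
            stack.pop()                    # discard the top of the stack
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")


# Toy equivalent of "return i + 10" with i = 5.
program = [
    ("LOAD_FAST", 0),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
result = run(program, constants=[None, 10], variables=[5])
print(result)
```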

Finally, as in regular assembly languages, control flow in CPython bytecode is driven by absolute jumps (JUMP_ABSOLUTE, which jumps to a given offset) and relative jumps (which jump a given number of bytes past the next instruction, e.g. 16 bytes for the FOR_ITER above).
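The jumps in a function can be listed with a few lines of code. This sketch relies on dis.hasjrel and dis.hasjabs (the lists of opcodes that take a relative or absolute jump target) and on the fact that, for jump instructions, argval holds the resolved target offset. Which jump instructions appear, and at which offsets, depends on the CPython version.

```python
import dis


def test():
    for i in range(10):
        print(i)


jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)
instructions = list(dis.get_instructions(test))

# (offset of the jump, instruction name, offset it jumps to)
jumps = [
    (ins.offset, ins.opname, ins.argval)
    for ins in instructions
    if ins.opcode in jump_opcodes
]

for offset, opname, target in jumps:
    print(f"{offset:>4} {opname:<20} -> {target}")
```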

However, the CPython bytecode also contains high-level instructions such as GET_ITER or MAKE_FUNCTION that have no equivalent in a traditional assembly language.

If you want more information about what the bytecode instructions do, you can look at the C source code in Python/ceval.c.

Loops

The dis.dis() output above contains the bytecode instructions that implement a loop:

Instruction	Description
SETUP_LOOP 30	Pushes a loop block on the block stack. The argument 30 means that the loop ends 30 bytes after the next instruction (so at offset 3 + 30 = 33)
GET_ITER	Replaces the object on top of the stack with an iterator over it. This object is the result of range(10), computed just before (see the next section for details about how function calls are implemented)
FOR_ITER 16	Asks the iterator on top of the stack for its next value and pushes that value on the stack. When the iterator is exhausted, it is popped off the stack and execution jumps 16 bytes past the next instruction (so to offset 16 + 16 = 32)
STORE_FAST 0	Pops the value on top of the stack and stores it in local variable #0 (dis.show_code() indicates this is variable i)
JUMP_ABSOLUTE 13	Jumps to offset 13 (so back to FOR_ITER)
POP_BLOCK	The loop has ended; removes the loop block from the block stack
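In plain Python, the work done by GET_ITER and FOR_ITER corresponds roughly to calling iter() and next() by hand. The sketch below spells that out; loop_by_hand is a hypothetical helper written for this illustration, and it collects the values instead of printing them so that the result is easy to check.

```python
def loop_by_hand(iterable):
    """Rough Python equivalent of the loop bytecode described above."""
    seen = []
    iterator = iter(iterable)       # GET_ITER
    while True:
        try:
            i = next(iterator)      # FOR_ITER, then STORE_FAST into i
        except StopIteration:
            break                   # FOR_ITER jumps past the loop body
        seen.append(i)              # loop body
        # falling through to the top of "while" plays the role of JUMP_ABSOLUTE
    return seen


print(loop_by_hand(range(10)))
```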

Function calls

Python function calls are implemented by pushing the function and then its arguments on the stack, using bytecode instructions such as LOAD_GLOBAL (push a global variable), LOAD_CONST (push a constant), LOAD_FAST (push a local variable), etc., and then executing CALL_FUNCTION. Let’s look at how print(i) is implemented:

  3          19 LOAD_GLOBAL              1 (print)
             22 LOAD_FAST                0 (i)
             25 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             28 POP_TOP

Instruction	Description	Stack (top first)
LOAD_GLOBAL 1	Pushes the global name #1 on the stack. dis.show_code() indicates that name #1 is “print”, so we are effectively pushing a reference to the print function on top of the stack	print
LOAD_FAST 0	Pushes the value of local variable #0 on the stack. dis.show_code() indicates that variable #0 is the variable i	i print
CALL_FUNCTION 1	Calls a function with one positional argument (add 256 for each keyword argument, e.g. foo(x=10)). This tells Python to call print with the variable i as an argument. The call removes both elements from the stack and pushes the result in their place (even if it is None)	None
POP_TOP	Because we are not using the value returned by print(), we just remove it from the stack	(empty)
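The "add 256 for each keyword argument" rule means that, in the bytecode format shown in this article, the CALL_FUNCTION argument packs two counts into one number: the low byte is the number of positional arguments and the high byte is the number of keyword arguments. The helper below is a hypothetical illustration of that packing; newer CPython versions encode calls differently, so it only applies to this older format.

```python
def decode_call_function_arg(arg):
    """Split an old-style CALL_FUNCTION argument into its two counts.

    Assumes the encoding described above: positional count in the low
    byte, keyword-argument count in the high byte.
    """
    positional = arg & 0xFF
    keyword = (arg >> 8) & 0xFF
    return positional, keyword


print(decode_call_function_arg(1))        # print(i): one positional argument
print(decode_call_function_arg(256))      # foo(x=10): one keyword argument
print(decode_call_function_arg(256 + 2))  # two positional, one keyword
```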

Actual binary code

The dis library relies on the function’s __code__ attribute but makes it much easier to read. For example, instead of the raw bytes returned by __code__.co_code, dis.dis() displays the bytecode instructions by name.

>>> test.__code__
<code object test at 0x0000000002258270, file "<stdin>", line 1>
>>> test.__code__.co_code
b'x\x1e\x00t\x00\x00d\x01\x00\x83\x01\x00D]\x10\x00}\x00\x00t\x01\x00|\x00\x00\x83\x01\x00\x01q\r\x00Wd\x00\x00S'
>>> list(test.__code__.co_code)
[120, 30, 0, 116, 0, 0, 100, 1, 0, 131, 1, 0, 68, 93, 16, 0, 125, 0, 0, 116, 1, 0, 124, 0, 0, 131, 1, 0, 1, 113, 13, 0, 87, 100, 0, 0, 83]
>>> test.__code__.co_varnames
('i',)
>>> test.__code__.co_nlocals
1
>>> test.__code__.co_names
('range', 'print')

The list(test.__code__.co_code) output above shows the actual compiled binary, byte by byte. The first number (120) is the opcode for SETUP_LOOP (look for 120 in Include/opcode.h); the next two numbers (30 and 0) are its argument, stored low byte first, which gives 30 + 0 * 256 = 30. Likewise, the 13 towards the end of the list is the argument of the JUMP_ABSOLUTE 13 instruction. But using dis.dis() is definitely easier!
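To see how little magic is involved, here is a minimal hand-written decoder for the byte list shown above. It is a sketch under a few assumptions: the opcode numbers in OPNAMES were read off by lining up the bytes with the dis.dis() listing in this article, and opcodes greater than or equal to 90 (HAVE_ARGUMENT in that era's opcode.h) are followed by a two-byte little-endian argument. It does not apply to the bytecode format of recent CPython versions, and the decode function is purely illustrative.

```python
# Byte values copied from the list(test.__code__.co_code) output above.
CO_CODE = [120, 30, 0, 116, 0, 0, 100, 1, 0, 131, 1, 0, 68, 93, 16, 0,
           125, 0, 0, 116, 1, 0, 124, 0, 0, 131, 1, 0, 1, 113, 13, 0,
           87, 100, 0, 0, 83]

HAVE_ARGUMENT = 90  # opcodes >= 90 take a 2-byte argument in this format

# Only the opcodes that occur in this example.
OPNAMES = {
    1: "POP_TOP", 68: "GET_ITER", 83: "RETURN_VALUE", 87: "POP_BLOCK",
    93: "FOR_ITER", 100: "LOAD_CONST", 113: "JUMP_ABSOLUTE",
    116: "LOAD_GLOBAL", 120: "SETUP_LOOP", 124: "LOAD_FAST",
    125: "STORE_FAST", 131: "CALL_FUNCTION",
}


def decode(code):
    """Return a list of (offset, instruction name, argument or None)."""
    decoded = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        if opcode >= HAVE_ARGUMENT:
            # Low byte first, then high byte.
            arg = code[offset + 1] + code[offset + 2] * 256
            decoded.append((offset, OPNAMES[opcode], arg))
            offset += 3
        else:
            decoded.append((offset, OPNAMES[opcode], None))
            offset += 1
    return decoded


decoded = decode(CO_CODE)
for offset, name, arg in decoded:
    print(f"{offset:>4} {name:<16} {'' if arg is None else arg}")
```

The offsets and arguments it prints line up with the second and fourth columns of the dis.dis() listing.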

When dis.dis() is useful

We will see more uses for dis.dis() in future posts, but for now one of them is investigating a piece of code that does not behave as expected when you don’t know why. Consider the following code:

>>> a / b / c / d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

We have a division by zero, but which variable caused it? If the expression were buried inside more complex code, finding out would require stepping through it in the debugger. However, dis.dis() can help shed some light on what happened: called without an argument, it disassembles the code of the last traceback.

>>> dis.dis()
  1           0 LOAD_NAME                0 (a)
              3 LOAD_NAME                1 (b)
    -->       6 BINARY_TRUE_DIVIDE
              7 LOAD_NAME                2 (c)
             10 BINARY_TRUE_DIVIDE
             11 LOAD_NAME                3 (d)
             14 BINARY_TRUE_DIVIDE
             15 PRINT_EXPR
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE

The line marked with the arrow (-->) indicates where the crash occurred: when trying to divide a by b (both have already been pushed on the stack). So we know that b is equal to zero.
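Outside the interactive prompt, the same lookup can be done from an exception handler. The sketch below is one possible approach rather than the only one: it walks to the innermost traceback entry and searches for the instruction whose offset equals tb_lasti, which is the offset that the arrow in the dis output is based on. The divide function and its arguments are made up for the example.

```python
import dis


def divide(a, b, c, d):
    return a / b / c / d


failing = None
try:
    divide(1, 0, 2, 3)
except ZeroDivisionError as exc:
    tb = exc.__traceback__
    while tb.tb_next is not None:   # walk to the frame that actually failed
        tb = tb.tb_next
    # tb_lasti is the offset of the last instruction attempted in that frame.
    failing = next(
        (ins for ins in dis.get_instructions(tb.tb_frame.f_code)
         if ins.offset == tb.tb_lasti),
        None,
    )

if failing is not None:
    print("failed at offset", failing.offset, "in", failing.opname)
```

Because the failing instruction is the first division, the divisor it popped was b. Calling dis.distb(tb) inside the handler prints the full listing with the arrow instead.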

Yet Another Python Internals Blog
