Python under the hood — tips and tricks from a C++-programmers’ perspective 01

SmartLab AI
Sep 5, 2018

Author: Patrik Reizinger

Have you ever felt the urge to quickly try out some novel ideas and turn them into code? And have you then felt that allocating memory on the heap and black pointer magic are not exactly helping you to be creative?

Then you stumbled upon Python, which seemed to offer an almost too easy solution. Surely it cannot be true that something so handy also comes with a vast number of packages for almost every scientific domain and everyday task?

Yeah, Python rocks in many ways, but for a C/C++ programmer it may require a small change of perspective, or shall we call it a change of programming paradigm. Having been in those shoes a year ago, I decided to write some posts about how Python works and to share the tips & tricks that can spare you a significant amount of puzzled staring at seemingly well-written but (also seemingly) erroneous code.

So this series will give you a general introduction and tips & tricks for Python from a C/C++ programmer’s point of view; that background is the only thing I assume you have.

I intend to accompany my posts with code examples, either as separate ones (e.g. as Gists on GitHub) or as a few lines inline. I might prefer the former because of the better syntax highlighting on this blog. I may also suggest some articles worth reading if you want to dig further into a topic.

The schedule is as follows: first I will show you some significant and interesting nuances of the language; after that, if you are still with me, we will dig deeper. I highly recommend that, because if you read this blog you probably work with deep learning (although my content will also be useful if you don’t). This post clarifies what exactly Python is, and then we investigate Python’s object management system through examples.

Shall we begin?

The language of Python

Just for clarification, let us start with the categorization of Python: is it compiled or interpreted? The language is named after the British comedy group Monty Python, so what should we have expected? The answer, honestly, is both.

Yes, Python source code is compiled, and we can look at the result, the Python counterpart of assembly, with the dis module from the standard library. But this bytecode is not machine code: it is executed by a Python virtual machine.

Furthermore, Python itself is only a language specification, and its implementations are diverse: you most probably use CPython (the most popular one), but Jython and IronPython exist as well.

Still with me? I hope this small piece of theoretical monologue has not frightened you away from getting to know Python a bit more thoroughly.
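
To make the bytecode tangible, here is a minimal sketch using the standard dis module. The function add_one is a made-up example, and the exact instruction names differ between CPython versions, so treat the printed listing as illustrative only.

```python
import dis

def add_one(value):
    # A trivial, hypothetical function whose bytecode we want to inspect.
    return value + 1

# dis.dis prints the bytecode instructions the virtual machine will execute.
dis.dis(add_one)

# The same instructions can also be collected programmatically.
opnames = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(opnames)
```

The listing plays roughly the role an assembly dump plays for a C++ program, except that its target is the Python virtual machine rather than your CPU.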

Object management

Since everything in Python is an object, their management is crucial for efficient code. By everything I really mean everything: Python has no primitive number types, so integers and floats are objects too. This results in a significant overhead, which is one major reason why the most efficient numerical libraries (e.g. NumPy) are not written in pure Python; they just provide an interface to this superb language.

integer = 5
floating_point = 5.0
string = "Five"
l = list()
d = dict()
print("integer is an object: ", isinstance(integer, object)) # True
print("floating_point is an object: ", isinstance(floating_point, object)) # True
print("string is an object: ", isinstance(string, object)) # True
print("l is an object: ", isinstance(l, object)) # True
print("d is an object: ", isinstance(d, object)) # True
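
The overhead of number objects can be made visible with sys.getsizeof. This is only a rough sketch: the exact byte counts depend on the interpreter and platform, so no specific values are claimed here, but on CPython they are noticeably larger than a 4- or 8-byte C++ int or double.

```python
import sys

# sys.getsizeof reports the size of the object itself in bytes,
# including the bookkeeping every Python object carries (e.g. reference count, type pointer).
int_size = sys.getsizeof(5)
float_size = sys.getsizeof(5.0)

print("size of an int object:  ", int_size)
print("size of a float object: ", float_size)
```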

Since everything is an object, it is plausible that Python is dynamically typed: a type belongs to the object, not to the variable name. You can assign a class instance to a variable, then change your mind and let the same variable hold an integer. Alright, then?

If you are familiar with smart pointers from standard C++, or have ever used something similar from the also very fascinating Boost libraries, you surely have a grasp of what a reference counter is. If not, imagine a simple counter that keeps track of how many references exist to a given object.
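
A short sketch of what this looks like in practice; the Point class is hypothetical and exists only for this illustration.

```python
class Point:
    """Hypothetical class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

value = Point(1, 2)
first_type = type(value).__name__   # the name currently refers to a Point instance

value = 42                          # the very same name now refers to an int object
second_type = type(value).__name__

print(first_type, "->", second_type)  # Point -> int
```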

x = 5
y = x
print("ID of x: ", id(x))
print("ID of y: ", id(y)) # id(x) = id(y)
# two separate assignments of the same value
a = 5
b = 5
print("ID of a: ", id(a))
print("ID of b: ", id(b)) # id(a) = id(b) in CPython, which caches small ints; sharing is safe because an int is immutable, see further explanation below

To further clarify my former, rather self-referential definition, we should add something about the mechanics behind Python: if you assign a value to a variable, your variable is de facto a reference to that object (that is, the name refers to the object). If you bind 2 variables to one object, its internal reference counter will have a value of 2; if one of them goes out of scope, the counter is automatically decremented to 1. When the counter hits 0, the object gets destroyed automagically.

And here comes the black magic part of Python: when you “copy” a variable with a plain assignment, you can generally forget what copying means in C++ (if you would like a C++ term, copy construction is exactly what does not happen here). By default Python does not copy the object at all: it only creates another reference and increments the internal counter, which is quite similar to copying a C++ shared_ptr. Be sure you understand this paragraph, because assuming that your objects are independent after such a “copy” leads to very hard-to-debug code. For truly independent objects you need the deepcopy function from the copy module; then you get what you expected, only at the price of a more resource-demanding operation.

Unfortunately, Python gets more complicated here, because it behaves differently when “copying” mutable and immutable data types.

The KISS version: mutable means that the object a name (a variable) refers to can be modified in place, thus every name referring to that specific object will see the change. An immutable object cannot be modified at all: an apparent modification creates a new object and rebinds the name to it, thus the change is visible only through the variable you modified.

If you are a bit confused, don’t worry, I will show you some examples that make this easy to understand. Before that, here are some examples of mutable (list, dict, set) and immutable (int, float, str, tuple) data types.
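
In CPython the counter can be observed with sys.getrefcount. The sketch below is CPython-specific, and the absolute numbers are an implementation detail (the call itself usually adds a temporary reference), so only the differences between the readings matter.

```python
import sys

data = [1, 2, 3]                     # one name refers to the list
baseline = sys.getrefcount(data)     # absolute value is an implementation detail

alias = data                         # a second name for the same object
with_alias = sys.getrefcount(data)   # one higher than the baseline

del alias                            # removing the name drops that reference again
after_del = sys.getrefcount(data)    # back to the baseline

print(baseline, with_alias, after_del)
```

A list is used on purpose: small integers and other cached objects are shared across the whole interpreter, so their counts are not meaningful for such an experiment.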
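
The difference between a plain assignment, a shallow copy and a deep copy can be sketched with the standard copy module. The variable names are illustrative only.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                 # no copy at all: a second name for the same list
shallow = copy.copy(original)    # new outer list, but the inner lists are shared
deep = copy.deepcopy(original)   # fully independent copy, inner lists included

original[0].append(99)           # mutate an inner list
original.append([5, 6])          # mutate the outer list

print("alias:  ", alias)         # [[1, 2, 99], [3, 4], [5, 6]]
print("shallow:", shallow)       # [[1, 2, 99], [3, 4]]
print("deep:   ", deep)          # [[1, 2], [3, 4]]
```

Only the deep copy is unaffected by both modifications, which is the behaviour a C++ programmer usually expects from a copy.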

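# immutable references
x = 5
y = x
x = 6 # x is rebound to a new object, y still refers to 5
print("y = ",y) # y = 5
# mutable references
l = [1, 2, 3]
ll = l # no copy: ll is a second reference to the same list
l.append(4)
print("ll = ", ll) # ll = [1, 2, 3, 4]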
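
The same distinction can be checked with id(), which identifies the object a name currently refers to. This is a small sketch with illustrative names: an “in-place” operation on an int produces a different object, while a list stays the same object after being modified.

```python
number = 5
id_before = id(number)
number += 1                              # ints are immutable: the name is rebound to another object
int_rebound = id(number) != id_before

items = [1, 2, 3]
id_before = id(items)
items.append(4)                          # lists are mutable: the same object is changed in place
list_same = id(items) == id_before

print(int_rebound, list_same)            # True True
```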

Passing parameters

Well, Python does not have pointers like good old C/C++, but the aforementioned reference management system enables you to modify an object within a function and use the modified version in the outer scope.

But…, yes, there is a but again! Here, too, we must differentiate between the mutable and immutable cases.

def immutable_func(x):
    x += 1 # rebinds the local name x to a new int object
    print("x in inner scope = ", x)

def mutable_func(x):
    x.append(5) # modifies the caller's list in place
    print("x in inner scope = ", x)

x = 4
print("x in outer scope before function call = ", x) # x = 4
immutable_func(x) # x = 5
print("x in outer scope after function call = ", x) # x = 4

x = [1,2,3,4]
print("x in outer scope before function call = ", x) # x = [1, 2, 3, 4]
mutable_func(x) # x = [1, 2, 3, 4, 5]
print("x in outer scope after function call = ", x) # x = [1, 2, 3, 4, 5]

For mutable types, and only for those, a modification made within a function body is visible in the outer scope. Immutable arguments appear to be “reset” to their original values as soon as the function returns, because the function only rebound its own local name. What is the solution? There are many; for example, you can exploit Python’s superb support for multiple return values.

Just a last remark: although Python supports default arguments, you should use them very carefully with mutable types, because a default value is created only once, when the function is defined, and therefore retains its contents between function calls. Just see the example below and stay tuned for the next post!
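
One way to get a changed immutable value back to the caller is simply to return it. The sketch below uses two hypothetical helper functions, increment and min_and_max; the latter shows that several results can be returned at once as a tuple and unpacked by the caller.

```python
def increment(value):
    # Return the new value instead of relying on a side effect.
    return value + 1

def min_and_max(values):
    # Several results are returned together as a tuple.
    return min(values), max(values)

counter = 4
counter = increment(counter)                # rebind the outer name to the result
lowest, highest = min_and_max([3, 1, 2])    # tuple unpacking at the call site

print(counter, lowest, highest)             # 5 1 3
```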

def mutable_default_arg(x=[]):
    x.append(5)
    print(x)

mutable_default_arg() # [5]
mutable_default_arg() # [5, 5]
mutable_default_arg() # [5, 5, 5]
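
A common way around this pitfall is to use None as the default and create the list inside the function. This is a minimal sketch; the function name append_five is made up for the example.

```python
def append_five(items=None):
    # A fresh list is created on every call when no argument is given.
    if items is None:
        items = []
    items.append(5)
    return items

first = append_five()
second = append_five()

print(first, second)   # [5] [5]
```

Each call now starts from its own empty list, so nothing leaks from one call into the next.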

Complete code can be found here:

SmartLab AI

Deep Learning and AI solutions from Budapest University of Technology and Economics. http://smartlab.tmit.bme.hu/