If you’re a company or individual that has made a significant technical investment in Python over the past years, chances are that you’re experiencing your own little Y2K with the shift from Python 2 to 3.
This is a multipart article that describes the current situation and introduces you to some strategies for porting your projects from 2 to 3.
This first part offers an introduction to the current state of affairs and the challenges that await.
Introduction
The state of Python
Python was initially created in the early 90s and has since become one of the most popular languages in the world. It has spread from its initial purpose as a training language to become entrenched in web applications, machine learning, automation, gaming and several other fields.
The early versions of Python were easy to change since the number of users was limited and the language had not solidified yet. As the years went by, it became harder and harder to make fundamental changes and, at the same time, deep architectural issues surfaced that required backward-incompatible changes. Like a badly set bone, Python was crippled, and a drastic decision was taken to make a non-backward-compatible release with a new major version number. This was Python 3. Python 3.0 was first released in 2008.
There was a general push, which intensified over the years, asking everyone to port their projects to Python 3 and to use Python 3 for future development. However, for a variety of reasons, the move didn’t go as smoothly as one could have wished. Carrots turned into sticks as time went on and finally, the decision was taken to sunset Python 2 on January 1, 2020. This means that bug fixes, enhancements and other improvements would no longer be made to Python 2. In other words, if you decide to stick with 2, you’re on your own and the community as a whole will leave you behind. This went on for a while, but finally the announcement about the last 2.x release was made and the sun has set on Python 2.
What do you do from here? The only way forward is to port your program to Python 3. This can be a challenging task and a significant chunk of work for your team.
Summary of differences between 2 and 3
Forewarned is forearmed
To deal with this problem, we first need to arm ourselves with knowledge of the differences between the two versions. They are, in a very real sense, two different languages that happen to be syntactically very similar. We can broadly break the changes down into the following categories.
- Syntactic differences
- These are very easy to catch, and there are tools that can apply these transforms to your code automatically. A simple example: in Python 2, print is a statement and so can be written as print 'hello', whereas in Python 3, print is a function and you need to use parentheses, like so: print('hello').
- API differences
- There are subtle differences in the return types of several methods, which can lead to gotchas. For example, given a dictionary d, d.keys()[0] will crash in Python 3 but will return something in Python 2. These are harder to catch, but if you run the code, the incompatibilities will cause it to crash and you can fix it.
- Library differences
- These are really a subclass of API differences, but they’re worth mentioning as a special case. There are significant differences between the standard library of Python 2 and that of Python 3. These can be caught in the same way as the API differences, and we can fix them using shims that are provided by several libraries.
- Data type differences
- These are more subtle and tricky to catch. One classic example is the difference between strings and bytes in Python 2 and Python 3. Unless we read the code carefully and write test cases to expose the fault lines that will cause cracks during a port, this will break silently and fatally.
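To make the strings versus bytes point concrete, here is a minimal sketch that runs on Python 3. The variable names and the sample payload are illustrative assumptions, not taken from any real project; the comments note how Python 2 would have behaved.

```python
# Runs on Python 3. All names below are illustrative only.
payload = b"caf\xc3\xa9"          # raw bytes, e.g. as read from a socket or a binary file
text = payload.decode("utf-8")    # an explicit decode turns bytes into a str

# In Python 2, 'abc' == b'abc' is True because str *is* bytes there.
# In Python 3 the comparison is simply False and no exception is raised,
# so lookups or comparisons that mix the two types fail silently.
same = ("abc" == b"abc")

# Mixing the two types in an operation does fail loudly in Python 3.
try:
    "header: " + payload
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Indexing differs too: bytes yield integers in Python 3,
# but one-character strings in Python 2.
first = payload[0]
```

The silent case (the comparison) is the dangerous one: nothing crashes, the program just takes a different branch. That is why test cases around every boundary where data enters or leaves the program (files, sockets, databases) tend to pay off during a port.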
Being aware of these differences allows us to be prepared for the hornets’ nest that we’re going to attack.
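The first three categories can be seen in a few lines. The sketch below is written to run on Python 3 and, as far as the standard library locations go, on Python 2 as well. The dictionary and the URL are made-up examples, and the try/except import is just one simple form of shim; compatibility libraries offer more complete ones.

```python
# Library difference: urlparse lives in a different module in each version.
try:
    from urllib.parse import urlparse   # Python 3 location
except ImportError:
    from urlparse import urlparse       # Python 2 location

d = {"a": 1}

# API difference: d.keys() is a list in Python 2 but a view in Python 3,
# and a view cannot be indexed.
try:
    d.keys()[0]
    keys_indexable = True    # Python 2 behaviour
except TypeError:
    keys_indexable = False   # Python 3 behaviour

# Wrapping the view in list() works on both versions.
first_key = list(d.keys())[0]

# Syntactic difference: print 'hello' is a syntax error in Python 3.
# The call form with a single argument is accepted by both versions.
print("hello")

host = urlparse("https://www.python.org/doc/").netloc
```

Note that the API difference only shows up when the offending line actually runs, which is why good test coverage matters more here than for purely syntactic changes.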
Why is this not like a library upgrade?
With a library upgrade, the interpreter stays the same, so it’s possible to upgrade a code base piece by piece. Feature flags, compatibility layers and several other time-tested techniques are available to make this process more or less mundane.
With a Python version upgrade, it’s not possible to port one part of the program and run that piece with 3 while keeping the rest running with 2. The only way this is possible is if the two pieces are separate processes, and rearchitecting the program to work like that is a lot of work.
If your code is ported, it will run. If it’s not and you’re lucky, it will crash immediately. If, and this is more probable, you’re not, hidden incompatibilities will cause fatal crashes in production.
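One well-known example of a hidden incompatibility is integer division, which changes its result without raising any error. The helper functions below are hypothetical and exist only to illustrate the point.

```python
def average(values):
    # Hypothetical helper written with Python 2 in mind,
    # where dividing an int by an int truncated the result.
    return sum(values) / len(values)

# Python 2 gives 1 here; Python 3 gives 1.5. Nothing crashes either way.
result = average([1, 2])

def average_floor(values):
    # // is floor division on both versions, so this keeps
    # the old Python 2 result for non-negative integers.
    return sum(values) // len(values)

floored = average_floor([1, 2])
```

Code like this runs without complaint after a port and only misbehaves later, for instance when the result is used as a list index or written to a report.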
In the next part, we will discuss the why of porting and the groundwork you need to do before taking the leap.