Should We Return to Python 2?
Formal Metadata
Number of Parts | 115
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/58809 (DOI)
EuroPython 2021 | 34 / 115
Transcript: English (auto-generated)
00:06
So, now we have a talk titled "Should We Return to Python 2?" In this interesting talk, we're going to discuss the migration to Python
00:22
3 and Python 2 relics that are still there in the code. Our speaker is Miroslav Šedivý. He is a software engineer at Trayport, based in Karlsruhe in Germany.
00:47
And I think it's going to be a very interesting talk. So, hello there. Hi. Hello. Happy to be here at home with you, all participants of EuroPython. Looking forward to meeting you in person next time.
01:01
But now, let's have a look at some Python 2 and Python 3 code together. So, the question is, should we return to Python 2? And the answer is no. Well, maybe. Why shouldn't we return, and why am I asking this question? At the beginning of last year, I put a sort of joke on Twitter that
01:26
people should finally abandon Python 2.7 and move to Python 3, because Python 2 doesn't work in 2020 anymore. Then, later last year, I had some time to look at and contribute to some online Python projects on GitHub and see whether they had
01:44
already moved to Python 3. And most of them had. But I had the impression that they kept some sort of backdoor in their code: a lot of Python 2 code within their Python 3 code, a lot of code that works for Python
02:00
2 and doesn't really make sense in Python 3. It works in Python 3, but you probably don't need it in Python 3. And I wanted to look at that. So, my name is Miroslav Šedivý. I'm now actually based in Vienna, with Trayport Austria, a company that uses Python for online trading of energy and other commodities, and every day I'm using Python
02:29
to make the sun shine, the wind blow, and the gas flow. You can join our booth at the conference, in the conference chat, and I will be happy,
02:40
and my colleagues, as well, to welcome you and speak to you about what we do exactly. But what is my talk about now? I assume that you have already migrated your project to some version of Python 3. So, it works in Python 3, but maybe there is still something interesting that we can have a look at, and that may be a possibility to improve your code and make it future-proof.
03:06
The question is, which Python version should your code support? This is the calendar of all the Python versions, with the bold parts showing the months in which they are fully supported and new features come in, and the other parts showing when they get only security
03:25
fixes. So, we are now in summer 2021, and you see Python 3.9 is now the main version. 3.10 is somewhere in the near future. Everybody knows about it, but it has not been finalized yet. And 3.8, 3.7, and 3.6 are still provided with security fixes, but they
03:44
are not developed any more. And 3.5, 3.4, and 2.7 are dead. Simply dead. So, forget about them. So, which version should you support? There are two possibilities. One is that you have your app, your programme, that runs on your computer and that should
04:05
really be using the latest features. Then go with Python 3.9, because this is what is currently developed, and in a few months you can switch to 3.10, and so on, because nobody depends on you. But if you have a library, and you expect other people to work with it, then don't
04:24
stick to a single version, because if you want to go to the future, your code has to be maintainable, and your code has to be upgradable also from the point of view of the other application authors or programmers.
04:41
This means that your code has to support at least two, or preferably more, versions, so that if I, as an application developer, want to switch from one version to another, both of them are supported at the same time. So if you have a library right now, I would recommend that you support 3.6 to 3.9, later add 3.10, then drop 3.6, and
05:04
so on, and then just move this cascade further and further. In general, if your code runs on Python 3, of course, your prints are functions, you are iterating over dicts without iterkeys, itervalues, and iteritems, you don't use
05:25
xrange, and you probably still have some pieces of code that use six, which was a library to support both Python 2 and Python 3. Now you can probably get rid of them. Then check if somewhere in your code you have an if branch where you do something different
05:43
for Python 3 and Python 2. These are all places that you can actually get rid of. Or if you try to import some library that is in Python 3 but not in Python 2, then there is probably some try/except with ImportError. All of this you can already get rid of, so just check your code.
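The slides are not part of this transcript, so the following is only a sketch of what such relics and their Python 3 replacements might look like. The `prices` dictionary and the URL are made up for illustration; the Python 2 compatible spellings are shown as comments.

```python
# Typical Python 2/3 compatibility relics (kept as comments only):
#   from __future__ import print_function
#   import six
#   for key, value in six.iteritems(prices): ...
#   for i in xrange(3): ...
#   try:
#       from urllib.parse import urlparse   # Python 3
#   except ImportError:
#       from urlparse import urlparse       # Python 2

from urllib.parse import urlparse  # plain import, no try/except needed

prices = {"power": 42, "gas": 17}  # hypothetical data

# dict.items() already returns a lazy view in Python 3
for key, value in prices.items():
    print(key, value)

# range() is already lazy in Python 3, so xrange is gone
squares = [i * i for i in range(3)]

host = urlparse("https://example.org/path").netloc
print(squares, host)
```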
06:01
And before we touch the code, we must actually make sure that it runs. So if it is a GitHub project, you will probably already have some nice running structure with GitHub workflows, and you are probably using tox. I hope so. Your tox file, very simple, lists all the Python versions that your code should support,
06:25
and then you just run pytest. Do you run pytest? You should run pytest. And all this is run within GitHub workflows, so you have some simple ci.yaml file that supports exactly these versions. Now it is 3.6 to 3.9, but later you will update it to add 3.10 and remove 3.6, and so on.
06:46
And in winter, when I contributed to some projects, I just checked: oh, they are still supporting 3.4 and 3.5, so just drop those and move to the later versions. This is something that should be touched every year, every time a new version of Python
07:00
comes out. Now, the code quality, or the content of the code, also evolves, because you also have to change your code: code written for Python 3.5 is not like code written for 3.9. And there is this nice overview of all the main features of Python versions from
07:21
my friend Jurgen. And here you see, for example, that if your code supports 3.6, you can already use f-strings, and you can use underscores in numeric literals. But if your code supports several versions, then you cannot use, for example,
07:41
the walrus operator, some generic types, or the latest pattern matching, because that is only in 3.10. So you actually have to take care of what you can support, but from 3.6 on, you already have plenty of beautiful features. Then there are some places in your code, for example, at the beginning, "coding: utf-8",
08:01
remove that line. You don't need it because in Python 3, that's the default. If you construct a class that inherits from object, don't mention this object. You can just remove it, because, by default, it inherits from object. Your super function call also has defaults for the arguments that you're probably passing, so
08:22
remove them. Make your code simpler, more legible. There are, for example, unicode literals. You can remove them because everything is Unicode by default in Python 3. And so there are plenty of such small things that you can do more or less automatically, and fortunately, there are tools for that.
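A small sketch of these clean-ups, assuming two made-up classes `Instrument` and `Future`; the Python 2 compatible spelling is kept only as a comment for comparison.

```python
# Python 2 compatible spelling (comments only):
#   # -*- coding: utf-8 -*-
#   class Instrument(object):
#       ...
#   class Future(Instrument):
#       def __init__(self, name, month):
#           super(Future, self).__init__(name)
#           self.label = u"délai"


class Instrument:                # implicitly inherits from object
    def __init__(self, name):
        self.name = name


class Future(Instrument):
    def __init__(self, name, month):
        super().__init__(name)   # no arguments needed in Python 3
        self.month = month
        self.label = "délai"     # plain literals are already Unicode str


contract = Future("power", "2021-08")
print(contract.name, contract.month, Instrument.__mro__)
```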
08:41
For example, there is a great tool called pyupgrade that you just apply to your whole code base. You say, I want at least Python 3, or Python 3.6, or 3.7, and it tries to update your code as much as possible without changing too much, and then your code will automatically be more modern.
09:01
You just run pyupgrade with a list of files, and it is quick, and you can diff the result and put it in Git as a new commit, and then you see what changes have actually been made. Also, make sure that the code looks good, that your diffs are minimal, and that every diff
09:21
is really needed. If you want to add some new parameter to a function, and you have to add commas before or after it, then the diff is not clear. But if you use, for example, something like black, blue, or yapf, then the code will be more legible, and the diffs will be more legible and more logical.
09:46
Also, make sure that flake8 or pylint are happy: just run flake8, see if there are any issues with your code, and fix them. Then your diffs and your modernization will be much easier and much more transparent.
10:04
Pyupgrade will also touch your string formatting. Of course, you should use f-strings, because they are cool in most cases, but it will not convert everything, because sometimes the f-string is longer than the older version, so it won't touch it. But check it, see the difference, and with your human intelligence, your human eyes,
10:24
just say, I can update it, it looks better after that. There is one place where you should keep the old formatting, and that's in log messages, but this is something that pyupgrade won't touch. And there are also some issues. For example, here, this was in PyTables:
10:43
there was one line, "coding: latin-1", and I wanted to remove it, to make everything UTF-8, and the logic behind it was much more complicated, and I actually had to do something that was not so obvious at first sight,
11:00
but it worked in the end: from the three characters, I created a longer string, and then it works. But this means that I was able to touch the code and change it so that it still works. Because if you are the sole maintainer of some code,
11:23
and you are afraid of touching it because it works, then it's bad code. You should actually be able to touch any line of code that you are a maintainer of, and then the code is probably much better and much easier to maintain.
11:42
The Unicode characters: of course, if you want to type my last name, you can type it directly as Unicode, or you can use the code points, but if you want to make it readable and stick to ASCII only, then you can use the last variant, which enters the characters by their Unicode names.
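The three spellings might look like this; the slide itself is not in the transcript, so the exact strings are an assumption based on the speaker's last name.

```python
# Three ways to write the same string in Python 3 source code
direct = "Šedivý"                     # typed directly as Unicode
by_code_point = "\u0160ediv\u00fd"    # ASCII-only, but hard to read
by_name = (
    "\N{LATIN CAPITAL LETTER S WITH CARON}ediv"
    "\N{LATIN SMALL LETTER Y WITH ACUTE}"
)                                     # ASCII-only and self-describing

print(direct, by_code_point, by_name)
```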
12:04
Numbers. There is something that has changed in Python 3 compared to Python 2, and that is dangerous if you don't take care of it. For example, math.floor, math.ceil, and round: before, these functions returned floats. Now, they return integers. So, in old code that was
12:22
supporting Python 2 and Python 3, you have something like that. But imagine there are plenty of young developers who have never seen Python 2 code, and when they look at this, they ask: math.floor, math.ceil, and round return an integer, so why convert it to another integer? So, please remove all the int calls.
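A short sketch of the redundant wrapper the speaker seems to mean, using an arbitrary value of 7.4:

```python
import math

x = 7.4  # arbitrary example value

# Relic from code that had to run on Python 2, where floor() returned a float
old_style = int(math.floor(x))

# Python 3: floor(), ceil(), and one-argument round() already return int
floored = math.floor(x)
ceiled = math.ceil(x)
rounded = round(x)

print(old_style, floored, ceiled, rounded, type(floored))
```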
12:41
The reason why this has changed is that earlier, integers were not big enough: the range of floats was larger than the range of integers, so they kept floats as the result of these three functions. Now, integers have an unlimited, or, like,
13:00
infinite range, and that means that now it's easier to work with integers, and actually, you should always work with integers. They are much more sane to work with than floats. So, remove the int calls. This is exactly the reason: in Python 2, the maximal integer was
13:21
about 9 quintillion, and the maximal float was about 10 to the power of 300. So, now, integers are not limited. Another piece of code that works both in Python 2 and Python 3, but doesn't really make sense in Python 3,
13:41
is, for example, something like this. You want to calculate the average, or the arithmetic mean, of these four numbers. If you do something like this in Python 2, then you will get an integer, which is just wrong. So, usually, you would convert the first part or the second part to float, or you would multiply everything by 1.0.
14:03
But in Python 3, an integer divided by an integer already returns a float. So, why would you convert one of them to float or multiply it by 1.0? It doesn't make sense. So, just remove it and do it the standard way. If you are still in the migration, just add "from __future__ import division",
14:23
and your Python 2 code will behave the same way as Python 3 anyway. So, now, this is actually correct, so do it simply: integer divided by integer, and keep the integers as long as possible. Because as soon as you convert to floats, you lose some precision, and every time you calculate anything with floats, you can lose more precision.
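As an illustration, with four made-up numbers (the ones on the slide are not in the transcript):

```python
values = [1, 2, 3, 5]  # hypothetical sample

# Relics needed only because Python 2 truncated int / int
old_mean_a = float(sum(values)) / len(values)
old_mean_b = sum(values) * 1.0 / len(values)

# Python 3: "/" is true division, so plain integers are enough
mean = sum(values) / len(values)

print(old_mean_a, old_mean_b, mean)
```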
14:43
Because God made the integers, all else is the work of man. Use integers. For example, there are some floats that don't even exist. If I take this number in the first line, it doesn't exist as a float. Because in this high range of numbers,
15:02
the difference between two consecutive floats is already two. It means that there is a float ending in 992, and then there is one ending in 994. There is no float with 993 at the end. And you think, that is a huge number. No, the number is not so huge.
15:21
It is the number of nanoseconds in about 100 days. So, this is a number that you may well be going to work with. And if you then have an offset of one, it's probably wrong. So, don't use floats for things that you can do with integers.
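The number on the slide is not in the transcript. A plausible stand-in, assumed here, is 2**53 + 1, which ends in 993 and corresponds to a little over 100 days in nanoseconds:

```python
n = 2**53 + 1  # 9_007_199_254_740_993, an assumed stand-in for the slide's number

as_float = float(n)          # no float exists for n; it lands on a neighbour
back_to_int = int(as_float)

ns_per_day = 24 * 60 * 60 * 10**9
days = n / ns_per_day        # how many days this many nanoseconds would be

print(n, back_to_int, days)
```

The conversion silently changes the last digit, which is exactly the off-by-one the speaker warns about.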
15:42
Another thing with floats is that a float is an approximate number. If I take a log and cut one meter from it, then it is never exactly one meter. It's just around one meter. And if I take the number 0.1 and add it three times, then I will get a number that is not 0.3, because there is no number 0.3 in the floating-point standard that Python uses.
16:06
And if you do even more calculations, for example, like here, where I add the number 0.3 one million times, plus, plus, plus, plus, plus, then you see that I get a number that is not 300,000. I get some number that is a little bit next to it.
16:21
And it's even wrong: it's not the nearest float, it's a float somewhere nearby. If I want to work exactly, I can use Decimal. So, Decimal 0.3, that is exactly 0.3 in the logic of our decimal number system.
16:40
And then I will get the right number. Or I can use fractions. For example, three tenths as a Fraction: if I add it one million times, I will also get a number that is correct. But on the right side you see the time, in seconds or milliseconds, that this calculation needs. So, if you need something really exact, like when you are working with finances,
17:05
then use, of course, Decimal, even if it takes a little bit longer. But if you work with something physical, like a temperature or the length of something, then use float as it is, because it's not exact anyway. So, that's okay.
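A sketch of the comparison, under the assumption that the slide used a simple accumulation loop. To keep it quick, it adds 100,000 times rather than the million from the talk, and the helper `add_repeatedly` is made up for this example.

```python
from decimal import Decimal
from fractions import Fraction

N = 100_000  # the talk used one million; smaller here to keep the run short


def add_repeatedly(step, zero, n=N):
    """Add `step` to `zero` n times with plain += accumulation."""
    total = zero
    for _ in range(n):
        total += step
    return total


float_total = add_repeatedly(0.3, 0.0)                       # approximate
decimal_total = add_repeatedly(Decimal("0.3"), Decimal(0))   # exact decimal
fraction_total = add_repeatedly(Fraction(3, 10), Fraction(0))  # exact rational

print(0.1 + 0.1 + 0.1 == 0.3)   # False
print(float_total, decimal_total, fraction_total)
```

Note that `Decimal` is built from the string `"0.3"`; building it from the float `0.3` would carry the float's approximation along.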
17:21
Another place where you shouldn't use floats is, for example, numbers that are exact. So, for example, one hour is exactly 60 minutes, not 60.0 minutes. So, use 60 minutes. And only when you divide, if your number of minutes is, for example, 90, then 90 divided by 60 is already 1.5, approximately, although it is exact.
17:41
From the point of view of the bytes, it is an approximation. But it gives you the right result, even if you add or subtract it. The same goes for averages. You don't have to divide by 2.0, because you don't have 2.0 items. You have two items, so divide by an integer. So, check your code.
18:01
This is something that you cannot do automatically. You have to check manually, or search for all occurrences of .0 in your code. And you will see that there are plenty of spots like this that use numbers that are too exact, but they are too exact only because Python 2 did these things differently.
18:23
Wrong, as you see. But it did it differently. And now, if you write Python 3 code, write it correctly. Divide by 2. Also, if you want to have the result rounded down, you can use math.floor or int, or the double slash, //. That also gives you an integer division.
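A brief sketch of the options just mentioned. One detail worth knowing, added here as background rather than taken from the talk: `int()` truncates toward zero, while `//` and `math.floor` round down, so they differ for negative numbers.

```python
import math

n = 7

half = n / 2                     # true division, returns a float
floored = n // 2                 # integer (floor) division
via_floor = math.floor(n / 2)
via_int = int(n / 2)

# For negative numbers the approaches diverge
neg_floor_div = -7 // 2          # rounds down
neg_truncated = int(-7 / 2)      # truncates toward zero

print(half, floored, via_floor, via_int, neg_floor_div, neg_truncated)
```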
18:42
Round. That's interesting. Why should you actually use round? I never use round. Round now gives you an integer. That's fine. That's maybe a usage where I could imagine using it. But in Python 2, round returned a float. And there is no float that is exactly some number. There is a float that is approximately some number.
19:02
And there is even something like this: round to two decimal places. That's plain wrong. Why would you want to round to two decimal places? There is, in general, no float that is exact to two decimal places. There are only floats that are nearby. And if you want two decimal places, that's probably for printing. Because, of course, as a human, I don't want to see a floating-point number like this.
19:23
I want to see something that I can read, with two decimal places. Of course, this is something that I could use. But then I will do it when converting to a string. My string can show two decimal places, although the number behind it is arbitrary.
19:41
It's a floating-point number that has any precision. Or what I can do is this: with the equals sign, the debugging feature of recent Python versions, which allows you to print it even more nicely. Another thing is percentages. Why would you work with percentages?
20:00
It just means multiplying by 100 when you want to print something. Because usually you don't calculate with percentages. You work with one divided by seven, or one seventh. But you don't work with 14.3. And in Python you have the possibility to format your string with a percent sign.
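A sketch of these formatting options, using the one-seventh value from the talk; the variable name `share` is made up. The `=` specifier needs Python 3.8 or later.

```python
share = 1 / 7  # keep the exact-as-possible value; format only for display

two_places = f"{share:.2f}"    # two decimal places, only in the string
debug = f"{share=:.3f}"        # "=" debugging specifier, Python 3.8+
percent = f"{share:.1%}"       # multiplies by 100 and appends "%"

print(two_places, debug, percent)
```

The number itself stays untouched; only its printed form is rounded.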
20:21
Then it will multiply by 100, add the percent sign, and do the math. Numeric literals. If you have a big number and you write something like this, 1e9, it is a float. If you have one second as an integer multiplied by 1e9,
20:42
then you will get a float, which is not exact. You probably want to have integers. What you can do, of course, is write it like this, with all the zeros. But that's not legible. Now you can use underscores. It looks a little bit better. Or 10 to the power of 9. That's also an integer. That's maybe better than 1e9. Those are numbers.
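The variants might be compared like this, assuming the goal is to convert one second to nanoseconds:

```python
seconds = 1

as_float = seconds * 1e9                     # e-notation is always a float
with_zeros = seconds * 1000000000            # int, but hard to read
with_underscores = seconds * 1_000_000_000   # int, Python 3.6+
with_power = seconds * 10**9                 # int

print(type(as_float).__name__, with_zeros, with_underscores, with_power)
```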
21:01
Now let's have a look at pathlib. That's something that you absolutely should use. Take a piece of code that searches for some CSV files within a directory, reads them, and processes them: you can write it like this, and it looks much, much, much better. You don't work with paths as strings. You work with them as objects. It is object-oriented. With attributes, you can access everything.
21:22
Parents, name, stem, suffix, everything. You can create directories. You can check everything. So do it like this. And what is also better: if you work with paths as strings, then you just have strings.
21:41
You see a string, but you don't know whether it is a path or not. But if you have a Path object and you pass it around your functions, then it is clear you are working with Path objects. And it's great. So have a look at this. Check it. You can really rewrite your code and make it much, much, much more readable. And of course, if your library supports 3.6 to 3.9, then, for example, unlink of an existing
22:09
or missing file is not possible before. But from 3.8, you can already use it. If you have a library that supports several versions of Python, then you have to write it the first way.
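A self-contained sketch of this style, working inside a temporary directory. The file names are made up, and "the first way" and "the second way" are assumed here to be an explicit existence check versus `unlink(missing_ok=True)`, which was added in Python 3.8.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"                 # hypothetical layout
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / "prices.csv").write_text("hour,price\n1,42\n")
    (data_dir / "notes.txt").write_text("ignore me\n")

    # Find and read the CSV files; paths are objects, not strings
    csv_files = sorted(data_dir.glob("*.csv"))
    first = csv_files[0]
    header = first.read_text().splitlines()[0]
    parts = (first.parent.name, first.name, first.stem, first.suffix)

    stale = data_dir / "stale.csv"                # does not exist

    # First way: works on every supported version
    if stale.exists():
        stale.unlink()

    # Second way: Python 3.8+ only
    stale.unlink(missing_ok=True)

print(parts, header)
```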
22:20
But then later, rewrite it the second way, and it will be even nicer. We have to write nice code now. With paths, you can also use pytest. As you told me, you are using pytest already, so that's great. For example, here you want to test something and you want to have a temporary file. The disadvantage of this is that it creates a file somewhere in your temp directory that
22:42
doesn't have a really nice name, and it deletes it afterwards. So you don't really get it when your test is done, and you cannot really look at it. But in pytest with Python 3, you can use tmp_path.
23:00
And that's a path somewhere in /tmp. I will show you where. Then you can work with files that have normal names, names that mean something to you. For example, in this case, it will create /tmp/pytest-of-miro/pytest-1, and below it the name of the test function and the name you gave the file. And pytest-1 is the first run of my pytest.
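A minimal sketch of a test using pytest's built-in `tmp_path` fixture, which provides a `pathlib.Path` to a fresh directory per test. The helper `write_report` and the file name are made up for illustration.

```python
from pathlib import Path


def write_report(directory: Path) -> Path:
    """Hypothetical function under test: writes a small CSV file."""
    target = directory / "report.csv"
    target.write_text("hour,price\n1,42\n")
    return target


def test_write_report(tmp_path):
    # pytest injects tmp_path; the file keeps a meaningful name on disk
    report = write_report(tmp_path)
    assert report.name == "report.csv"
    assert report.parent == tmp_path
    assert report.read_text().splitlines()[0] == "hour,price"
```

When run under pytest, the file can be inspected afterwards in the run's directory, since recent runs are kept.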
23:23
And then if you run it again, it will create a directory pytest-2 with all the test functions below it, and then pytest-3, and so on. And the default is that it keeps only the last three pytest runs and deletes everything before. So it means that at any time, in /tmp on your system, you have the last three runs
23:41
and you can directly see the files, and they have meaningful names. So that's beautiful to use. Datetime is the last thing we have. For example, in Python 2, you could write something like this if you wanted to have the Unix timestamp, so the number of seconds since 1970, of a datetime.
24:01
But you had to import calendar, so another module from the standard library, and then do something like that. That's not beautiful. But it works in Python 2 and Python 3. Now you can do something like this: dt.timestamp(), and you get the number. Also, now you don't have an excuse not to use timezone-aware datetimes, because
24:25
in the standard library, there is datetime.timezone.utc for the standard UTC stuff. And now, from 3.9, there is even zoneinfo in the standard library, which uses the time zone database from your system.
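A sketch of the two timestamp spellings, assuming the Python 2 compatible version on the slide was the common `calendar.timegm` idiom. The date is arbitrary.

```python
import calendar
from datetime import datetime, timezone

# A timezone-aware datetime, using only the standard library
dt = datetime(2021, 7, 28, 12, 0, tzinfo=timezone.utc)

# Python 2 compatible idiom: needs the extra calendar module
old_timestamp = calendar.timegm(dt.utctimetuple())

# Python 3: ask the datetime object directly
new_timestamp = dt.timestamp()

print(old_timestamp, new_timestamp, dt.isoformat())
```

On Python 3.9 and later, `zoneinfo.ZoneInfo("Europe/Vienna")` could be passed as `tzinfo` in the same way, provided time zone data is available on the system.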
24:42
So you don't have to import pytz or something like that. And now the dates just work beautifully, and you can really work with them correctly and explicitly with the time zones. I know time zones are difficult. But another thing is time.time().
25:01
So if you have some expensive operation, you want to measure it. time.time() gives you the number of seconds since the epoch, before and then again after. But the problem is that if, while you're running the expensive operation, your system clock is set to the correct time using NTP, then end minus start can be a negative number. So in that case, use perf_counter, which gives you a number of seconds each time
25:25
you call it. This is the number of seconds since some point in the past, but it is always increasing. Or process_time. This is for CPU time rather than real time. All of this and even more: wait for a few weeks.
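A sketch of measuring with the clocks just mentioned; `expensive_operation` is a made-up placeholder for the real work.

```python
import time


def expensive_operation():
    """Hypothetical stand-in for the work being measured."""
    return sum(i * i for i in range(100_000))


# Wall-clock duration: perf_counter never goes backwards,
# unlike time.time(), which follows system clock adjustments
start = time.perf_counter()
result = expensive_operation()
elapsed = time.perf_counter() - start

# CPU time used by this process (time spent sleeping is not counted)
cpu_start = time.process_time()
expensive_operation()
cpu_used = time.process_time() - cpu_start

print(f"{elapsed:.6f} s wall, {cpu_used:.6f} s CPU")
```

Only differences between two calls are meaningful; the absolute values have an undefined reference point.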
25:42
And the second edition of the Fluent Python book by Luciano Ramalho is going to be published. It's a book of 1,000 pages that contains plenty of information and plenty of ideas on how to do such beautiful stuff with Python.
26:01
It is actually the best second book on Python that is available. Full disclosure: I am one of the technical reviewers. So the question is, should we return to Python 2? Do you want to keep the backdoors in your code? Here's the plan. When someone uses a feature you don't understand, simply shoot them. This is easier than learning something new. And before too long, the only living coders will be writing in an easily understood
26:23
tiny subset of Python 0.9.6. That's all. Thank you very much. Thank you very much for your interesting talk. I think there was a discussion about the blue versus black packages.
26:50
So if we could maybe know what you think about this issue.
27:01
It's your choice. Stick with one of them, black or blue. The problem is that sometimes they even change their mind, because they are updated from time to time. So I would really stick with one version of black and then update it once a year, when you also update the version of Python and some of the syntax of your code.
27:24
But what do you think about blue and black? What is your opinion? I think the blue one is better because it's newer. It has a different opinion on some stuff that has been criticized in black. Okay.
27:40
Thanks a lot for your talk today. We're very happy that we had this amazing opportunity to listen to you. Thank you very much. Thank you. I am available at the booth of Trayport. Happy to chat with everyone and later meet you in person. Thank you very much. Thank you. Thank you.