Introducing Incompatible Changes in Python
Formal Metadata
Number of Parts | 141
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/68689 (DOI)
EuroPython 2023, 84 / 141
Transcript: English (auto-generated)
00:04
Hi everybody, I'm here to talk about how to introduce incompatible changes in Python and, if possible, how to mitigate the risk of incompatible changes. My name is Victor Stinner. I contribute to Python upstream and downstream for Red Hat. Downstream means that I maintain Python in the Fedora and RHEL operating systems, and upstream means that I fix
00:27
issues in Python upstream. I have been a core developer for 13 years. I'm a happy Fedora and Vim user and, sadly, I went through many incompatible changes over those 13 years.
00:42
First, I would like to come back a little bit to the past and to how we did the migration using the D-Day approach. A long time ago, in a galaxy far, far away, there was Python 2. Let's travel 15 years into the past, before Python 3.
01:03
For those who don't know, we had a language called Python 2 before, and we had some issues in that language. The first one is that, 15 years ago, Django became more and more popular and became a good competitor to PHP frameworks,
01:21
but they wanted to use Unicode for anything related to text, and the problem is that as soon as you use Unicode in Python 2, you get into trouble. It means that if you deploy your application in production, everything is fine until a single user enters a single non-ASCII character.
01:41
You get an error, but you don't know exactly where, because you don't exactly control how Python decides between bytes and Unicode. In general, in Python 2 it was a very frequently asked question. In Python 2, the string "ABC" is a byte string. It's not characters, but bytes, and
02:05
if you concatenate a byte string and a Unicode string which contain non-ASCII characters, you get a UnicodeError. In short, getting Unicode right in Python 2 was very troublesome.
02:20
So what we decided is that in Python 3 we move to the correct solution by default, which is that most people actually want to process text in Python. So we use Unicode for everything, and it becomes a first-class citizen in Python 3: if you declare a simple string like "ABC", this one is now a string of characters.
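As an illustration of the Python 3 behaviour described here, the following minimal sketch (Python 3 only; the sample word is an arbitrary choice, not from the talk) shows that text and bytes are separate types and that mixing them fails immediately instead of failing later on the first non-ASCII input:

```python
text = "caf\u00e9"            # str: a sequence of Unicode characters
data = text.encode("utf-8")   # bytes: the UTF-8 encoding of that text

print(type(text).__name__, len(text))   # length counted in characters
print(type(data).__name__, len(data))   # length counted in bytes

# Mixing the two types is rejected up front ...
try:
    mixed = data + text
except TypeError as exc:
    mixed = None
    print("TypeError:", exc)

# ... so the conversion has to be written explicitly at the boundary.
decoded = data.decode("utf-8")
```

The explicit encode and decode calls are where a ported application has to decide, input by input, whether it is handling bytes or text.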
02:45
But the trouble is that if you have a large application like Django or Zope or Mercurial, or anything which is based on Python 2 and so written with the assumption that all strings are bytes, moving to Unicode all at once is very complicated,
03:02
because you have to rethink, for each function, each input and each output, what exactly you want: is it more appropriate to use bytes or to use Unicode? The second issue is that you cannot decide that in an incremental way. When you migrate from Python 2 to Python 3,
03:20
you have to fix all your technical issues at once. Python is also a very old language, and we slowly added features one by one. For example, in Python 2.2 we introduced the cool concept of iterators, PEP 234, and also generators, PEP 255.
03:47
The problem is that in Python 2.0 those things did not exist, so existing functions like the built-in map and zip functions actually return a list. The problem with a list is that, if you have a large data set, it consumes a lot of memory just to create a temporary
04:06
list to hold the output that you are likely to just iterate over. So what we came up with is a new module called itertools, which contains many recipes, many functions to process everything as iterators and generators. For example, you had to replace map
04:27
and zip with imap and izip. But the trouble is that you have to migrate your existing code piece by piece; it was a long process to do that, and the advantage was not always obvious.
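The difference between the list-returning style and the iterator style can be sketched in Python 3 syntax. This is only an illustration: in Python 3 the built-in map is already lazy, so wrapping it in list() stands in for the old Python 2 behaviour, and the data set size is an arbitrary assumption.

```python
import sys

numbers = range(100_000)

# Eager style (what Python 2's built-in map returned): the whole
# result is materialized as a list before anything is consumed.
eager = list(map(lambda n: n * n, numbers))

# Lazy style (itertools.imap in Python 2, the built-in map in Python 3):
# values are produced one at a time, on demand.
lazy = map(lambda n: n * n, numbers)

# Shallow sizes only: the list size covers its array of references,
# while the map object stays small whatever the input length is.
eager_size = sys.getsizeof(eager)
lazy_size = sys.getsizeof(lazy)

first_three = [next(lazy) for _ in range(3)]
print(first_three, eager_size, lazy_size)
```

The lazy object stays small regardless of the input size, which is the memory argument made above for large data sets.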
04:43
It's the same for dictionaries: when you have a dictionary and you want to iterate over the key-value pairs, you have the items() method, but this one also returns a list, so we added a second method called iteritems(). So we decided to start from a new language
05:05
which has better defaults. In Python 3, we decided to move to generators by default for the map and zip built-in functions, and the items() method of the dictionary also returns a generator by default. If
05:22
you really want a list instead of the generator, you just have to cast the output to a list, and that's it. So the idea of Python 3 is that we collected everything that we didn't like, the bad patterns, and we tried to address them all at once. The idea was to have a good,
05:43
sane default behavior. For example, we use Unicode by default and create generators by default, and I would say that the language became consistent again, because we had added so many new features in Python 2 that the language had become a little bit inconsistent. And now
06:01
we have good defaults and it just works. There is just a minor issue at the end: oh, it's backward incompatible. Oops. So why do we have incompatible changes in Python? We have a PEP in Python, the Zen of Python.
06:23
It's PEP 20, which says that there should be one, and preferably only one, obvious way to do it. This principle is very strong in Python, which means that we have a consistent coding style in Python. It's easier to teach and easier to review Python because most people write the same
06:44
code, so you can compare code between colleagues and even between projects. So what we said for Python 3 is that, to make everything consistent again, we have a very simple plan: everybody has to run a tool which automatically converts everything from Python 2 to Python 3.
07:04
You do it once and you're good. Almost. We had some troubles, which were dependencies; this is something that we didn't plan for. Actually, when you have a large application, not everything is in a single code base. You have things called dependencies.
07:24
Today we are more used to it with PyPI, but back then there was already something like that. The problem is: how do you migrate your application if the dependencies are not prepared for Python 3? Because if you have a single dependency which is not compatible with Python 3, you have to port all
07:46
dependencies, which also have dependencies; it can be a very deep tree. The second issue is that when you run the 2to3 tool, it's really a one-way path. You go to Python 3, okay, but you cannot come back. This is a one-way
08:05
option, and when we proposed that to dependency maintainers, they said: no, the majority of our users are on Python 2. I don't see the advantage of migrating to Python 3, because all Linux distributions are using Python 2 and we are fine with Python 2. So we will wait until some other people migrate.
08:30
For all these reasons, the migration didn't take one day as planned, but ten years. And then comes a second thing in Python: it's called the C API.
08:43
The C API is used by many third-party extensions to extend the Python language, and I think that the C API is a key to Python's success, because thanks to it, if you are limited by the Python language, you can easily plug in your existing, very old,
09:01
50-year-old Fortran code for NumPy, or you can plug in your favorite graphical toolkit quite easily. It's very easy to write bindings. It's very easy to call existing functions. And if you have no C API, there is no SciPy, there is no NumPy, there is no
09:23
scientific stack, there is no psycopg, the driver for PostgreSQL. Another problem is that in Python 3.11 we did a lot of optimization work in Python, but to be able to optimize Python we had to make some subtle changes in the C API,
09:44
especially in objects which are related to code execution: the code object, the frame object, and what we call the thread state, which contains the state of all Python internals. And the problem is that, as usual, people actually use them, they use them for many different things, and they use
10:06
these structures directly; there is no abstraction between the internals and how people use them. So these changes broke a few C extensions. For example, instead of directly accessing the f_code member of a frame object,
10:24
you now have to call a function, which is PyFrame_GetCode(). To get the previous frame, you have to call PyFrame_GetBack(), and to get a frame from a thread state, you have to call PyThreadState_GetFrame(). Another problem is that these functions are new in Python 3.11, but you don't have them in 3.10.
10:48
Okay, that was how we did things in the past. Now I would like to look at the present solutions and at how we managed to have somewhat smoother API updates.
11:02
First of all, about Python 3. For the migration from Python 2 to Python 3, there was a new module called six, and this one is very helpful because you can have a single code base to port your application. You use the six module, and it works on Python 2 and it works on Python 3. This is a
11:23
very practical and very helpful solution, because previously people started to fork projects and to have two different names. It was very annoying, because you had different dependencies depending on the Python version, or some bugs were fixed in one version and not in the other.
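The single code base idea can be sketched with a small hand-written compatibility shim. This is not the six module itself: the names below mirror ones that six provides (PY2, text_type, binary_type, ensure_text), but the implementation is an illustrative stand-in written for this example.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    # Python 2 names; this branch is never executed on Python 3.
    text_type = unicode  # noqa: F821
    binary_type = str
else:
    text_type = str
    binary_type = bytes


def ensure_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


print(ensure_text(b"abc"), ensure_text("abc"))
```

Application code then only refers to text_type, binary_type and ensure_text(), so the same file runs unchanged on both major versions.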
11:42
So having everything in a single code base is key for a successful migration. The other idea of the six module is that, instead of having to migrate everything at once to Python 3, you can migrate your files one by one
12:00
using the six module, and it's more incremental. We also decided that the D-Day approach should be abandoned, because it didn't work, because of all the issues that I mentioned. We learnt from our mistakes. About
12:21
incompatible changes, there is also a new practice: we are trying to make incompatible changes as early as possible in the development cycle, and when we see that we break too many things, we open a discussion to say: okay, maybe this change can be reverted, and we can wait maybe one or two years until more people get
12:42
used to the new API. The problem with Python 3.9 is that when it was released, support for Python 2.7 had just ended, and some projects still had to support Python 2 and Python 3, and they didn't want to drop
13:00
Python 2 support right away. So we made a few reverts to give more time to these projects to migrate. For example, we had removed the "U" mode of the open() function, because this one has no effect on Python 3 and it had been deprecated for 10 years, and also the
13:22
aliases in the collections module, which had also been deprecated for many, many years. Keeping the code wasn't a big maintenance burden, so we decided to keep it for one release and to remove it again in the next release. The main idea of this process of making incompatible changes early and making reverts
13:45
during development is to give people more time to adapt their code, because we know that we have users and we try to be respectful to our users. To give some examples from Python 3.11, we also reverted the
14:02
removal of unittest aliases because, again, there were many aliases deprecated for many years, but people didn't pay attention to the deprecation warnings. So we reverted that change for one more release. Also, there were aliases in configparser and some functions,
14:21
and the asyncore module. I expected that nobody would still use it, but in practice it's still used for different reasons, and moving to asyncio or another option is not that easy. These three changes got reverted, but we made them again in the upcoming Python 3.12 release.
14:45
The problem with these changes is that they affected too many packages and it takes too much time to fix them. To decide about a change in Python we have a process for that: it's PEP 387, the deprecation process.
15:07
We had some conflicts between this existing PEP and the new release process in Python, because in Python 3.9 we decided that we were going to release a little bit faster. Instead of having a release every year and a half, we are going to have a release every October,
15:26
so once per year, and the old deprecation process was fitted to the old release cycle, because we had about one year and a half to remove a function, and with the new release process
15:40
it was only one year to remove a function. We noticed that one year was too short, because people don't read the documentation, they don't pay attention to the warnings, or they just have other duties in life. So what we did is update this document to require deprecating something for at least two years.
16:00
This is a bare minimum, but obviously you can deprecate a function for longer. For example, if you deprecate a function in 3.11, it has to stay deprecated in 3.12, and we are only allowed to remove it in 3.13, so it takes three years in total. About deprecation warnings:
16:21
it was decided to hide them by default in Python 2.7, because we noticed that most of our users are actually users and not developers, and they don't know how to deal with these warnings. Only people who have access to the code and know how to modify it really care about them.
16:43
So the idea is to make it more pleasant for users and to give access to these warnings to developers, who can enable them. We made a tiny change in Python 3.7: when you write a script, these warnings are shown by default, but only in the main script, in the __main__ module.
17:06
To display each warning once, what you have to do is use -W default. To treat every kind of warning as an error, you can use -W error instead.
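The same two filter actions are also available from inside a program through the standard warnings module. The sketch below is a rough in-process counterpart of -W default and -W error; old_api() is a hypothetical function defined only for this demonstration.

```python
import warnings


def old_api():
    """Hypothetical deprecated function, used only for this demo."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# Roughly what "-W default" enables: the warning is reported, not hidden.
# record=True collects it in a list instead of printing it to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default")
    result = old_api()
shown = [str(w.message) for w in caught]

# Roughly what "-W error" does: the warning is raised as an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True

print(result, shown, raised)
```

Turning warnings into errors inside a test suite is one way for a project to notice a deprecation before the old API is actually removed.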
17:21
Or you may want to try the development mode of Python, which is -X dev. This one not only shows warnings, but also enables more features, more checks at runtime, which are very helpful for developers. And if you get too many warnings,
17:41
you can dig into the warnings documentation to see how to filter some warnings and only see the ones that you care about. What we are trying to do in Python now is to have what I would call a smooth deprecation. The first step is to add the new way, the new API.
18:01
We deprecate the old way, at first only in the documentation; then we start to emit a warning at runtime. And something which is very important for me is that we try to explain how to port existing code which is using the old way without losing support for the old version.
18:23
This is something new, because in the past we just removed code and you were on your own. Now we are trying to help users by actually proposing a solution that works on the old Python and the new one with a single code base. Doing this exercise also helps us to see that,
18:42
oh, actually, it's not that easy to have exactly the same behavior with the old and the new way. So sometimes we have to rethink the change to have an even smoother migration. And once you're done with all the steps, okay, now you can actually remove the old way.
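The steps of a smooth deprecation can be sketched as follows. The function names fetch_rows() and get_rows() are hypothetical and chosen only for illustration; the sketch shows the stage where the new API exists, the old one is documented as deprecated, and a runtime warning points to the replacement.

```python
import warnings


def fetch_rows(table, limit=None):
    """New API (hypothetical name): return the rows of `table` as a list."""
    rows = list(table)
    return rows if limit is None else rows[:limit]


def get_rows(table):
    """Old API (hypothetical name), kept as a thin deprecated wrapper.

    Deprecated: use fetch_rows(table) instead, which behaves the same
    way when it is called without `limit`.
    """
    warnings.warn(
        "get_rows() is deprecated; use fetch_rows() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return fetch_rows(table)
```

Because the old function delegates to the new one, both behave identically during the deprecation period, and removing the old way later only means deleting the wrapper.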
19:02
What we also started to do is to run a code search. There is a script to download the source code of the 5,000 most popular projects on the PyPI repository, and once you have all the source code
19:20
offline, you can use a script to search it with a regular expression to see if an API is used in that code. This work helps us to see how the API is used and how many projects are using it. Once we identify the most popular projects which use it, we try to either report the issue upstream,
19:45
propose a fix, or come up with a solution for them. You can find the script on my GitHub repository. So for me, the ideal migration would be first to add a new API,
20:03
document the change, and provide a tool to help the migration, then to identify and update all affected projects, or the majority of affected projects, and, if possible, and that would be the ideal case, to wait for a release.
20:21
Because if the release of the fix comes after the Python release, there is a delay between the new Python version and the tool which is compatible with the new Python. Once all affected projects get a release, you can deprecate the old API and then remove the old API.
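The "identify affected projects" step relies on the kind of code search mentioned earlier. The sketch below is not the speaker's script: it is a minimal stand-in that assumes the source trees are already unpacked locally, one directory per project, and it only scans .py files. The project names and contents are made up for the demo.

```python
import re
import tempfile
from pathlib import Path


def find_api_usage(root, pattern):
    """Return {project_name: match_count} for the projects under `root`.

    Each direct subdirectory of `root` is treated as one project.
    """
    regex = re.compile(pattern)
    hits = {}
    for project in sorted(Path(root).iterdir()):
        if not project.is_dir():
            continue
        count = 0
        for path in project.rglob("*.py"):
            text = path.read_text(encoding="utf-8", errors="replace")
            count += len(regex.findall(text))
        if count:
            hits[project.name] = count
    return hits


# Tiny self-contained demo with two made-up projects.
with tempfile.TemporaryDirectory() as tmp:
    sources = {
        "proj_a": "import asyncore\nasyncore.loop()\n",
        "proj_b": "import asyncio\n",
    }
    for name, source in sources.items():
        pkg = Path(tmp) / name
        pkg.mkdir()
        (pkg / "main.py").write_text(source, encoding="utf-8")
    usage = find_api_usage(tmp, r"\basyncore\b")

print(usage)
```

A regular expression search gives false positives and misses dynamic uses, so the counts are a starting point for looking at the affected projects, not a precise measurement.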
20:42
The issue with that process is that it's quite slow. It takes between three and five years, and sometimes we want to move faster. So we are trying to fit into that migration path, but sometimes it's too slow. In a very recent change, and recent means two weeks ago,
21:01
I defined what I call soft deprecation. The idea of soft deprecation is that it doesn't imply removing a function. It's a way to mark a function to say: oh, you should no longer use it in new projects, but it's perfectly fine to continue using it in old projects, because it's still tested, it's still supported,
21:24
and we still have the documentation. Not only is there no removal scheduled, but there is also no warning at runtime. This is also something very important, because more and more projects are checking for warnings in their test suite.
21:43
So the idea of soft deprecation is to mark something as deprecated without affecting any project, because the deprecation is only in the documentation. I talked about the code search in the most popular PyPI projects,
22:02
but sadly we don't have access to every project in the world, and some dependencies also have a single, unavailable maintainer. So even if we find a project which is affected and we propose a fix, sadly it sometimes takes a few months or a few years
22:25
to get a fix. There can be many reasons: the maintainer can be busy with work or with other life duties, get bored with the project, be sick, or it can be a friend or someone in his or her family.
22:42
So it's not only about the bus factor, with people getting hit by a bus; there can be many, many reasons, like burnout. So how can we update these projects if the maintainer doesn't reply? I have no solution for that. There is also the problem of funding open source projects. Maybe some of you are aware of that.
23:04
Big companies are relying on key dependencies, but there is no funding for them, and maintaining these projects is thankless work. And then there are projects which are developed behind closed doors: we don't have access to the source code,
23:22
so we don't know if they are affected or not. They can be short scripts or very large applications, and sometimes they are very old projects which are no longer maintained. There is also turnover in teams, so the people who knew how the code worked are no longer in the team.
23:43
For those projects, there is a tool for Python called pyupgrade: you can run it and get some automated changes for the new version of Python, to make your code compatible with the new Python. And for the C API, I wrote a script called upgrade_pythoncapi.py,
24:01
which adds support for the new Python version without losing support for the old Python version. Or at least you have one solution which is not great, but works: you just keep an old version of Python, but be aware of the security risks. About the C API, what I did is to write a new tool to provide the new functions of the
24:24
new Python version to old versions of Python. The idea is that you only use the new names, but you have these new functions on your old Python version. I created this project three years ago, and I created the script to automatically update
24:41
your C extensions to the new names. I had to add support for Python 2, so Python 2 is still supported, because I needed it for the Mercurial project, which hadn't finished its migration to Python 3. Last year I added many functions for Python 3.11, and in the meantime I added functions for Python 3.12 and even now for 3.13.
25:05
At least 10 projects are using it, and you can find the documentation online. The idea is that you update your C extensions to use the new functions, you copy the header file into your project, and
25:22
once you have made this change, you don't have to update the header file anymore: unless you use new functions, there is no need to update that file. And as I said, it still supports Python 2.7. What we also did for the C API is to define new guidelines to avoid issues
25:43
that we had in the past and issues that we want to get right, or at least to avoid these issues when we add new functions. For example, functions must not return borrowed references, we should avoid stealing references, and we should define ownership rules for the lifetime of
26:03
arguments and structure members. The idea is that if we follow these new guidelines, it should be easier to support the C API on Python implementations other than CPython. We also tried to reorganize the header files into three different categories: what we call the limited API, which is related to the stable ABI; the public C API; and the internal API. The internal one is the one that you should not use, and now it's well separated.
26:24
So public C API and the internal API the internal is the one that you should not use and now it's well separated I'm a little bit out of time. So I'm going a little bit quicker so the future so future would be to
26:41
Spend more time to think about a stable ABI and this one already exists since Python 3.2 And the idea is that you build your C extensions once and you don't have to change your binary anymore Because the ABI is stable you can distribute a single binary and it's just fine
27:00
We made some changes to check the ABI and to better document it, and there are two well-known projects which are using it, cryptography and PySide. There is also a new C API called HPy, and this one is designed to be efficient on PyPy, and it also works on CPython. You can decide
27:24
if you want the best performance, so to get access directly to the C API of Python, or to have something called the universal ABI, which provides a single binary working on all Python versions and all Python implementations. The idea is that you have a single API,
27:41
so it's quite convenient, and there is a work-in-progress port of the NumPy project to use it. And this is something for you: please test our alpha releases, please test our beta releases, and at least try to test the release candidates and provide feedback as soon as possible,
28:03
because we need to know about your issues. If we know about your issues earlier, we have more time to fix them and to help you to migrate. So it's very important for us to have feedback as soon as possible. Thank you.
28:27
Thank you for your insights on compatibility, on multiple levels actually. We have time for maybe one question. There are two microphones in the room, or you can ask your questions on Discord, which is also possible.
28:40
Thank you very much, Victor. One question: do you know if anybody has experimented with deprecation by slowdown? So you still provide the API or the function, but slow it down by a factor of whatever. I know that the Linux kernel is doing that for old APIs to motivate people to migrate.
29:05
So far we didn't decide to decrease the quality of the old API on purpose. There is no need for that. We just help people to move to the new one and, if possible, we try to support the old and the new API. But yeah, sometimes we think about this idea,
29:23
introducing memory leaks or crashes or random bugs. Thank you very much. Please give another round of applause for Victor.