Moving big projects to Python 3 - TIB AV-Portal

Moving big projects to Python 3

Regebro, Lennart

Formal Metadata

Title

Moving big projects to Python 3

Subtitle

Did you think the language differences were the difficult bit?

Title of Series

EuroPython 2019

Number of Parts

118

Author

Regebro, Lennart

License

CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this

Identifiers

10.5446/44841 (DOI)

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Next year Python 2 is no longer maintained. But you have a monster code base with clever tricks and libraries that don't support Python 3, and your data may be stored in a format that is hard to move to Python 3. And that's the easy bit. This talk focuses on the process of moving, not the code changes, because it's the process that is the hard part. How do you get your code into a state where it's ready to move? How do you get the whole team on the boat to Python 3? All Python 3 talks I have seen, including those I have given, and all the texts on how to port, including the book I wrote, focus on the code changes. With increasing backwards compatibility in Python 3 and forward compatibility in Python 2, this has actually become a lesser problem for big code bases.

The extra issues of large, old code bases
  Can you stop adding features? (1 min)
  Separate team vs getting everyone on it (2 min)
  Python 2 compatibility: You need it (1 min)
The steps
  Fix your development process (2 min)
  Replace old libraries, or take over maintenance and port them (2 min)
  Make sure your tests are solid (1 min)
  Run 2to3 but only backwards compatible fixers (2 min)
  Run tests on Python 3 to stop backsliding (4 min)
    Run all tests: Expensive or slow
    Store passed tests
    Detect tests that change
    Turning it off adds a lot of extra work
  Port all your little utilities and tool scripts (1 min)
  Fix fix fix fix (1 min)
  Add tests with Python 2 data, to test migration (2 min)
    You might need migration scripts
  Extra careful staging tests (1 min)
  Production: Try, fail, repeat (1 min)
  Clean the code up (3 min)

Keywords

Deployment/Continuous Integration and Delivery

EuroPython 2019, part 30 / 118

Transcript: English (auto-generated)

00:04

So, my name is Lennart, don't bother about my last name. If somebody asks me how to pronounce it, I get self-conscious and then I mispronounce it.

00:21

I was born in Sweden, but I live in Poland with my wife, daughter, two cats and way too many fruit trees. I've been using Python since Python 1.5.2 and I've been working with Python and the web since 2001. I wrote the book on how to move from Python 2 to Python 3.

00:43

You can find it in HTML and PDF on python3porting.com. It's open source on GitHub. And I work for BriteCore. At BriteCore, we're doing the type of software that insurance companies use to deal with insurance policies and claims, so it's not very interesting unless you are in the insurance

01:04

industry. We work completely remotely and, yes, we are hiring, so if you want a job and you are looking for remote work, you can talk to me. I'm new to the whole recruiting bit and I've never done this before, but talk to me anyway.

01:21

We are not on Python 3 yet. We have just started. It's an ongoing effort. But at my last job, at Shoobx, which is also a very nice company and which most likely is also looking for people, although I should warn you that the system there is insanely complex, we successfully moved this large and insanely complex system to Python

01:47

3 last year. So let's step back in time, back to the Stone Age, when you or somebody at your current job made some sort of application in Python.

02:04

And this is you, back in the Stone Age, with your web framework. And whoever did this, you or the other person at your job, did such a good job that this application is still running. It's probably a web app.

02:21

It's probably some old version of TurboGears, web2py or maybe even Zope. And you have for years now been bravely running away from Python 3. But you can't run any longer because Python 2 is committing suicide.

02:41

But don't be afraid of Python 3. A lot of people are afraid of it and think it's horrible and bad and everything, but it's not the Killer Rabbit of Caerbannog. It's just a regular old Python. Now, the hard part of porting to Python 3 is getting your system into a state where

03:02

it's easy to port, and fixing up your system is something that benefits you anyway. The porting itself is quite easy. It's what comes first that is hard. And the first step of that is to stop being a fire department, because many large

03:22

organizations are constantly just putting out fires in their applications. And that's not a good situation for porting to Python 3, because if the changes that you are making to your system as a part of normal development keep breaking it and

03:42

turning into problems that you have to fix in a panic, then moving to Python 3 is going to create several of these fires, and that's going to be a big problem. Also, if you are in constant firefighting mode, you don't have time to move to Python 3. So you have to first get development to be normal and calm and regular.

04:09

So you have to get out of firefighting mode. Now, how to do that is in itself a whole talk or maybe a whole conference, and I, at least, would not be the person to do that anyway, because I'm not a DevOps guy.

04:24

I'll just mention some things that I've seen being done to fix this situation. And this slide here assumes that your software is a service of some sort, a web app or some other service, and that you have like a production environment that you need to

04:41

keep up. Because that's the firefighting that I've seen and that I know, and I don't even know if you can have firefighting, if you have some other sort of application, but if you do have firefighting in another sort of situation, then come talk to me, because I'm interested in hearing why you have firefights in that situation, why you're

05:03

firefighting. So to port to Python 3, you need to have tests, because otherwise you don't know if it's going to work on Python 3. But tests also help with stability. So if you are firefighting and there is a problem, make sure you have a test to make sure that never happens again.

05:22

Always add tests. And you have to run those tests, and that means that for any sizable project you need to have continuous integration. If you have a production situation, if you have a production server, you also need staging servers to test things on.

05:40

You should have automatic deployment. Deploying the latest release or just making the latest release from master or from a production branch should just be pushing a button. You shouldn't need to do anything more, and everything else should be automatic.

06:01

Extra points if this is done automatically every night to a staging server, so you know that your release procedures actually work. And monitoring. Of course, as the previous speaker said, you should know if there is a problem before

06:23

your users know it. And there are some Python specific things you can do, too, like you should run in an isolated production environment, and that means a buildout or a virtualenv and maybe some sort of containers, containers are very in now and have been for several years, and

06:43

that helps so you don't get weird interactions with the operating system. Take Docker, for example. For most of you, what I'm going to say now is probably obvious, but I only realised it in the last few months, working at BriteCore, so I'm going to mention it because it's new to me.

07:02

If you use Docker on production, you quite often have to rebuild the Docker images. For example, every time you have new requirements, you rebuild the Docker images because a part of the images is the virtualenv that you install and you install all the packages. And if you do that, and some new requirement creates a conflict when installing, you get

07:26

that error when building the Docker images, not when pushing to production, and that's a really good thing, because your deployment doesn't mess up production: it breaks when you're building the images. In addition, you can then use those images on continuous integration and maybe even develop

07:45

on them so that your developers have exactly the same environment as production. So that's really nice, and it's like, oh, wow, now I understand why everybody is talking about Docker, it took me years. So with all these things in place, your firefighters can take it easy, and you can

08:04

go on to preparing, or you can go on to planning, which I'm going to talk about later, or you can do both at the same time. So there's two stages here, preparing and planning, and they are independent, you can do both at any time.

08:21

And the first bit of preparing is that you should pin the versions of all your packages, every requirement that you have. And if you don't know what pin means, it means that in your requirements file you specify exactly which version: not at least this version or less than this version, but exactly which

08:41

version should be pinned. And unfortunately I have not found a way to require this in pip, to tell pip that everything has to have a pinned version. So one way you can do this is to verify in the install script, or if you have Docker

09:03

when you build the images, what got installed: pip freeze gives you a list of exactly what you installed, and you compare that to your requirements file, so that you haven't installed something that's not in the requirements file, for example. Another way to do it is to add hashes to the requirements file.

09:23

Then you're specifying not just which version, but which exact package to install, so you install a specific wheel or a specific egg or something, you can have several hashes for each version, so you can say all these are okay.
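As a rough sketch of what such a hashed entry can look like, the helper below builds one pinned requirements line with one `--hash=sha256:...` option per artifact file. The function name `requirement_line` and the idea of hashing local wheel files yourself are assumptions made for illustration; the `--hash=sha256:` option and backslash line continuation are pip's documented requirements-file syntax.

```python
import hashlib
from pathlib import Path


def requirement_line(name, version, artifact_paths):
    """Build one pinned requirements entry with a --hash option per artifact.

    `artifact_paths` are local files (wheels, sdists) that you accept for
    this exact version. This helper is hypothetical, not part of pip.
    """
    parts = ["{}=={}".format(name, version)]
    for path in artifact_paths:
        # pip compares the sha256 of the downloaded file against this value.
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        parts.append("--hash=sha256:{}".format(digest))
    # A trailing backslash continues the requirement on the next line.
    return " \\\n    ".join(parts)
```

Listing several artifact paths gives several acceptable hashes for the same version, for example one wheel per platform.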

09:40

This has the benefit that as soon as you specify one hash, pip will refuse to install anything that doesn't have a hash. So that way you know that you are getting exactly what you want when you're installing it. It's extra maintenance, extra work to get all these hashes in, but it also means that

10:01

if somebody uploads a malicious package to the Cheese Shop, you won't download that by mistake. You know exactly what you're installing. So, one way or the other, make sure that you know exactly what you have when you're installing.
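The pip freeze comparison mentioned a moment ago could be sketched roughly like this. It is a minimal illustration under stated assumptions: the function names `parse_pins` and `compare` are made up here, both inputs are plain text (the requirements file and the output of pip freeze), and only simple `name==version` lines are understood. Editable installs and direct URL references are not handled.

```python
import re

# name, optional [extras], then an exact == pin
PINNED = re.compile(r"^([A-Za-z0-9._-]+)(?:\[[^\]]*\])?==([^\s;\\]+)")


def normalize(name):
    """Treat Foo_Bar, foo.bar and foo-bar as the same project name."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text):
    """Return {name: version}; raise ValueError for any unpinned requirement."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # blank lines, comments, and options such as --hash
        match = PINNED.match(line)
        if match is None:
            raise ValueError("requirement is not pinned with ==: " + line)
        pins[normalize(match.group(1))] = match.group(2)
    return pins


def compare(requirements_text, freeze_text):
    """List everything installed that the requirements file does not pin."""
    wanted = parse_pins(requirements_text)
    installed = parse_pins(freeze_text)
    problems = []
    for name, version in sorted(installed.items()):
        if name not in wanted:
            problems.append("{} {} is installed but not in requirements".format(name, version))
        elif wanted[name] != version:
            problems.append("{} is {} but requirements pin {}".format(name, version, wanted[name]))
    return problems
```

In an install script or image build, the freeze text would typically come from running `python -m pip freeze`, and a non-empty result from `compare` would fail the build.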

10:21

You might also, as preparation, want to increase the test coverage even more, because it's very good to have high line coverage when porting to Python 3, so there's no hidden Python 2 statement somewhere that you missed in the porting. What percentage of test coverage you want is really a matter of opinion.

10:43

100% is obviously awesome, but for a big project that's generally unobtainable. 90-95% maybe seems reasonable. And you can bridge the gap by reading all the lines that are uncovered: before every big release, or at least before you're trying to do the last big pushes,

11:09

you actually check all that code and you just read it manually, because at some point that gets easier than writing tests for it. When testing, there's one big thing that you might encounter, and there's this philosophy

11:26

when it comes to unit testing that you should test each function separately, you should have one test for one function, and every call from that function out to other functions

11:41

should be mocked. But if you do that, you only test that the function is doing what you tell it to do. You don't actually test that it works. And if the API it calls then changes, the test will still pass, and this is a huge problem

12:01

with Python 3, obviously, because the standard library changes. So this type of testing is practically useless when porting to Python 3. So if you are doing this in your unit tests, if you have this principle and follow that to mock out all the calls from a function when you test it, then you need to have 95%

12:25

coverage from your integration tests, your unit tests you can basically ignore. After this you need to upgrade your dependencies. You have to make sure that the latest Python 2 compatible version is what you're using of all your dependencies.

12:45

And after you have done that, you have to make sure that all of the dependencies you have are also Python 3 compatible. And you may have to replace or in worst case port those dependencies. But since those are separate packages, that's

13:01

generally relatively easy to do unless the package is highly magical, but by now most highly magical popular packages already support Python 3. And if they don't, like for example Python MySQL, there are forks of them that people are moving over to that do support Python 3.
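One rough way to get a first list of dependencies to look at is to read each package's declared trove classifiers. The sketch below assumes you have the text of a package's metadata file (PKG-INFO or METADATA) at hand; the function name `claims_python3` is made up for illustration. Classifiers are self-declared by maintainers, so a missing classifier is only a hint to investigate, not proof that the package fails on Python 3.

```python
from email.parser import Parser

PY3_CLASSIFIER = "Programming Language :: Python :: 3"


def claims_python3(metadata_text):
    """Return True if the package metadata declares any Python 3 classifier."""
    # Package metadata uses RFC 822 style headers, so the email parser reads it.
    headers = Parser().parsestr(metadata_text, headersonly=True)
    classifiers = headers.get_all("Classifier") or []
    return any(c.strip().startswith(PY3_CLASSIFIER) for c in classifiers)
```

Only the standard library `email` package is used, so the same check can run inside the existing Python 2 environment as well.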

13:22

So this stage can take a significant time, especially if you have not been keeping your dependencies up to date. I have met people and talked to people here that are still running on like Python 2.6 because they actually can't upgrade to Python 2.7 and stuff like this. So if you're in that situation, expect this

13:42

to take time. And then you come to stage 3, planning, or you already did it. And planning here is a lot about how many people in your team should do the porting, should all of them be involved, and should you move to Python 3 directly

14:05

or should you have Python 2 and Python 3 compatible code for a while. And there's three questions there I have for you. The first is can you stop adding features and stop firefighting and for how long can you do that? Because porting will in best

14:22

case take two weeks and in worst case even if you do everything at one go it can still take months. Can you stop adding features and stop firefighting that long? Do you have some deep magic that only a few of your developers understand? Because that

14:44

deep magic has a big risk that it's difficult to port to Python 3 and that bit will then block everything else. And how big is your team? If you have 50 people you can't put all of them on porting to Python 3, that's just a logistical nightmare; the Mythical

15:05

Man-Month remains mythical even with Python 3. So you can put 10 people on doing this, maybe more; 20 I think is stretching it. Unless you are very good in your organisation

15:23

at putting a lot of people on doing one thing. And if your system is already split up to multiple separate services, then you can put one team on each of these services so then you can easily put 5 or 10 people on each service so then you are way ahead

15:44

of the game. But most of these old systems are monoliths. So some different strategies here, then: one is to do it all in one go. If you don't have deep magic and you can stop adding features for a month, maybe, why not do it all in one go? Well, it takes less time to do it, it's less work in total, a little bit, but

16:08

not a lot less work, but a little bit. And you can aim directly for Python 3 code which is a benefit and speeds things up. But there is a high risk of doing this. If you start doing this, you put all your 7 developers on porting to Python 3 for two

16:23

weeks and then you discover that there is some huge issue that means you kind of have to stop right now. Well, then you go back to adding features and fixing bugs, and your two branches are going to start to diverge. And there is a risk that when

16:41

you start half a year later with Python 3 again that you basically have to throw away all the work that you did during those two weeks. So it's a very high risk strategy of doing it. And of course all other work has to stop. So slow and steady is a safer strategy. And this means that you aim to write code that

17:03

will run under Python 2 and Python 3 at the same time. Although you run it on Python 2 in production, until everything works under Python 3 and then you can switch. This is the low risk version. It doesn't disrupt normal operations. It's a little bit more work

17:24

and more importantly it takes longer time because you're going to still do all your other work at the same time so Python 3 gets pushed a little bit to the side and it can take half a year to get through all of this because not everybody is working on it.

17:41

And of course you need dual version support which means it takes a little bit more work. What you can do if you have a development team that is small enough to fit into one big house, you can start with a Python 3 sprint for all the developers but not aim

18:02

for Python 3 but aim for Python 2 and Python 3 compatible code so it runs on both. That way, when you come back half done, you can switch to having a dedicated team to do the last bit, or just do it as a background task when you don't have anything that is really, really critical. And this is what we did at Shoobx. We rented a house in southern Spain

18:24

during the winter when there's low season so it was cheap. Got all the guys, almost all the developers in there and we tried to move to Python 3 for a week. And we got almost the whole way there. It was, we got a fair bit done. So of course we weren't done but

18:46

you know we had solved most of the critical issues and it's a lot of fun to get everybody into one room and just hack away on something. So this is low risk because you're aiming for Python 2 and Python 3 compatible code. It only disrupts your normal operation briefly for a week

19:04

or two or however much you want to take. And everybody gets on board and feels involved, which is good. It's not just one or two guys sitting in a corner, porting to Python 3, while everybody else just sits and goes "oh, Python 3, Python 2 was good, we

19:20

shouldn't have" and stuff like that. So everybody gets involved, so it's good. The drawback is that you do still need dual version support. It's still fairly slow, although not as slow as the really slow version. Then you come to the actual porting stage and there's

19:43

several things you need to do here. You will now start to run your tests under Python 3. This will obviously fail and that's okay, but your continuous integration system still needs to

20:02

run it under Python 3 and make sure that as much as possible runs under Python 3 because otherwise people will add back incompatible code. And if you have some people trying to port to Python 3 while other people are adding Python 2 code you're going to backslide and

20:20

you're never ever going to get done. The trick to stopping this is continuous integration. But of course you cannot just let your continuous integration say "no, this failed because it doesn't work on Python 3", because in the beginning basically all tests will fail or

20:40

in fact it probably won't even be able to find the tests in the beginning. So what you need to do is get your CI gurus, the people who know your continuous integration system well, to set it up to keep track of which tests have once passed

21:01

under Python 3. And if they passed under Python 3, then the CI run should fail if that test no longer passes under Python 3. And that way, every time you change something and some tests stop working under Python 3, you're going to have to fix that. And sometimes that means that you make a small little change. It's like, it's an itsy-bitsy little change,

21:24

you just fix a little bug, and suddenly lots of things that used to work under Python 3 no longer work under Python 3, and then you have to spend a whole day fixing all this. Sometimes it's the tests that need fixing. It's really boring work, but these things happen and you have to do that to stop this backsliding. We turned it off briefly

21:45

for a firefighting thingy at Shoobx and forgot to turn it back on for a month or two, and there was loads of incompatible code added during this time, even though everybody knew they shouldn't do it. It just happens by mistake, and you basically have to go on

22:04

and fix a lot of issues that you already fixed once, again. So that's really annoying. So have this. Stop the backsliding. I'm sure you know what 2to3 is already. It's this tool that will convert Python 2 code to Python 3 code. What's really helpful

22:25

here is to use modernize. Modernize is a set of extensions to 2to3 that will convert from Python 2 code to code that is compatible with both Python 2 and Python 3, and it does this by using the six compatibility layer. There's another compatibility layer called

22:45

python-future. It also has its own 2to3 extensions, but python-future inserts a lot of magic, primarily into Python 2, to make it look more like Python 3, and this magic has bitten me several times, so my recommendation is to not use python-future but to rely on

23:05

python-modernize. And as I mentioned, the first errors you will get are errors that actually prevent you from even finding the tests to run. The test runner won't find anything because you will just get import errors everywhere, and behind those import errors

23:25

there's usually either other import errors or syntax errors. So you're going to have to fix that. And the way to fix that, especially in the beginning, is to figure out what is wrong, find one of these python-modernize or 2to3 fixers that will fix

23:45

that specific wrongness and then run it. Maybe even just on that file where you had a problem, because if you start with just going "oh, I'm just going to run python-modernize on everything and then go on from there", then when you find errors, those errors may

24:03

be in lines that have already been changed, and then you don't know if that error was really there from the start or if it's an error that was introduced when running the fixers. So in the beginning you need to do this slowly and carefully, one fixer at a time, maybe even on one file at a time, just to fix that file, and then you run

24:28

the tests again, and the import error you get is in some other location. So, good, then you fix that, and then you go on to the next import error. You could of course,

24:42

once you find the error, just go, oh, I know what to do here, this is an easy thing to fix. And it's tempting to just change the code, save it and run the thing again, but the problem is that the next error you get, three lines down, is the same thing

25:00

again, and doing that quickly gets very boring. So run these fixers on files so it doesn't get so boring, because they will fix several places at one time. But do one fixer at a time, and then you just fix, fix, fix, fix. And this is where the book is finally useful,

25:24

because the book is about how to fix these errors. And as you get more confident you can start running those fixers on maybe a whole directory at a time, and things like this, because you're starting to get a better feel for what is happening. But if you run it

25:47

on a lot of files at once and there are several of you doing this, you're going to get merge conflicts, so this is why it's good to do it one file at a time if you're many. Don't forget that you have scripts in your development environment. Usually you have some

26:04

sort of helper scripts to create test data, to copy databases from production so you can test on real data locally, these kinds of things. Loads of these little helper scripts, they're going to have to be ported too. If they run in a separate virtual environment you can actually do that first, as practice, as a good way to get up and running on porting.

26:26

If they run in the same virtual environment as your main application, that's usually because they import that application to do things, and then you're going to have to port them last. But don't forget that these also have to be ported,

26:43

and the sooner the better, basically. You also need to write data migration tests. You have to take the data that you have that was generated under Python 2 and make sure that you can still load it under Python 3 and that you get the right thing: that you get Unicode when you expect

27:02

Unicode, that the encodings are still correct. Basically, any time you're loading data from a database or disk you need to have a test there, and if it doesn't work you need to write migration scripts. And if you're using pickles, well, I'm sorry, you're in deep shit.
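
A minimal sketch of what such data migration tests could look like, using only the standard library. The sample byte strings stand in for data written by Python 2; the names, the sample values and the choice of UTF-8 and Latin-1 are assumptions for illustration, not details from the talk.

```python
import pickle

# Hypothetical sample data, as a Python 2 process might have stored it.
LEGACY_UTF8_NAME = b"Malm\xc3\xb6"  # the text "Malmö" encoded as UTF-8
# Protocol 0 pickle of the Python 2 byte string 'caf\xe9' ("café" in Latin-1).
LEGACY_PICKLE = b"S'caf\\xe9'\np0\n."


def load_name(raw):
    """Decode a stored name; raises if the UTF-8 assumption is wrong."""
    return raw.decode("utf-8")


def load_legacy_pickle(raw):
    """Load a Python 2 pickle, decoding its 8-bit strings as Latin-1."""
    return pickle.loads(raw, encoding="latin1")


def test_name_is_text():
    name = load_name(LEGACY_UTF8_NAME)
    assert isinstance(name, str)
    assert name == "Malm\u00f6"


def test_pickle_needs_explicit_encoding():
    # pickle.loads decodes Python 2 strings as ASCII by default,
    # so the non-ASCII byte is rejected.
    try:
        pickle.loads(LEGACY_PICKLE)
    except UnicodeDecodeError:
        pass
    else:
        raise AssertionError("expected UnicodeDecodeError")
    assert load_legacy_pickle(LEGACY_PICKLE) == "caf\u00e9"
```

In a real project the sample bytes would come from files or database rows actually produced under Python 2, and a failing test like the second one is the signal that a migration script is needed.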

27:28

So once all tests pass, or maybe even before, you try to push Python 3 to staging. Try to run this on staging under Python 3. This is going to fail the first few

27:41

times, and that's okay. And then, once everything seems to work, test it properly on staging with production data, check that everything works fine, click through everything, be thorough. And once that also works, you push it to production, or, if you don't have production,

28:02

then you make a release. If you have production and you can actually move, like, one customer at a time to Python 3, do that. Take it slow and careful if possible. If you need to migrate the database to get onto Python 3, try starting everything read-only, so you know that it at least works

28:28

in that situation first, before you enable editing. If you can fall back to Python 2, be prepared to fall back to Python 2 if there's an error. And then, when you have it on production

28:47

and have had it there for a few weeks or so, you party. You're done. Yeah, I got it on Python 3! And after a party you have to clean up, and that's not so fun. But once you've cleaned up after the party, you have to clean up the code, and that is a lot of fun. Now you can get rid of all

29:06

those Python 2 backwards-compatibility things, and that feels very satisfying. This is a really nice part of the project, getting rid of all the old cruft. See this as an opportunity to just

29:23

prettify your code in general. Just go through it, fix it up, remove anything old and ugly, PEP 8 it, maybe run it through Black to get everything formatted exactly as it should be, and things like this. Make your code feel new and shiny again. It doesn't take very long to do this,

29:44

actually, because you have to go through the code to remove the old Python 2 backwards-compatibility things anyway. Prettifying and cleaning up the code in general is basically something you get for free, and in general, even with a big system, this just takes a few days.
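
One possible way to find the leftovers when cleaning up is to scan each module for imports of the compatibility layers. The sketch below is an illustration, not a tool from the talk: the function name and the set of module names treated as Python 2 leftovers are assumptions and should be adjusted to the code base.

```python
import ast

# Module names treated as Python 2 compatibility leftovers (an assumption;
# extend or trim this set to match your own code base).
COMPAT_MODULES = {"six", "__future__", "future", "past"}


def find_compat_imports(source):
    """Return sorted (line_number, module_name) pairs for compatibility imports."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare only the top-level package, so "six.moves" matches "six".
            if name.split(".")[0] in COMPAT_MODULES:
                found.append((node.lineno, name))
    return sorted(found)
```

Running this over every file gives a checklist of places to visit; the source is only parsed, never imported, so the compatibility libraries do not need to be installed.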

30:01

So do it, it feels really nice. And then, done: you're up on Python 3, the code doesn't even run on Python 2 anymore, everything is fine and finished, and you have all the new features of Python 3 and you can start using them. So, in summary: stop firefighting, prepare and plan

30:26

in whatever order you want, fix the tests on Python 3, push to staging and production, and then clean up. That's the general plan. Any questions? Uh, and going to, uh, 2 plus 3 compatibility

31:13

and support, also for the long term, because we are talking about people who will still be using Python 2 in years to come. And then we have another big project, related to the first one,

31:26

and after the experience with this we were thinking of going straight away to Python 3 and dropping support for Python 2. Now, I got from your talk that maybe this could have been

31:43

the result of using future, because what we got is exactly that: we spent so much time fixing the Python 2 part after migrating to 2 plus 3. The 3 was working well; the 2 was suddenly broken everywhere. Yeah. So would you recommend, if someone still has to support 2 and 3,

32:06

to really, I mean, is it futurize versus modernize and six? Yeah, I recommend modernize and six if you need to run on both Python 2 and Python 3, yes. I mean, if you're already using futurize, and we have, both on bright core and on Shoebox, in the list of our requirements,

32:28

futurize is there, because it's being used by other packages that we are using. So people are using it successfully. So if you are using it successfully and it's working, then

32:40

that's fine. But if you're not already using futurize I would recommend against it, because I think it's more trouble than it's worth. Anything else? All right. Yeah, come and talk to me about your experience in trying

33:08

to move to Python 3. That's interesting too. So, thank you.