
Static typing: beyond the basics of def foo(x: int) -> str:


Formal Metadata

Title
Static typing: beyond the basics of def foo(x: int) -> str:
Subtitle
Exploring the practicalities of explaining complex code to mypy
Title of Series
Number of Parts
118
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The Python community has been warming up to static typing for a few years now. You may have seen talks that did a great job of introducing the basic concepts, mypy, and high-level strategies to cover existing code bases. We need to go deeper. Let’s talk about the challenges you inevitably encounter when you try to type-check a large code base. One full of many moving parts, complex architectures, metaprogramming tricks, and interfaces with a dozen other packages. Static type checking is very powerful – when you use it to maximum advantage and explain your code to the typechecker accurately. We will cover a few tools at your disposal: generics, signature overloads, protocols, custom mypy plug-ins, and more. There is more than just tools, though. Behind them all are universal concepts valid in any language. I hope to convince you that thinking in terms of the type system helps you write better code…
Transcript: English (auto-generated)
Many of you already use mypy to check your production code? Wow, that's quite a lot. For those who aren't, this might be a little bit steep in the beginning, but I hope I won't scare you away from mypy.
My name is Vita. I'm a software engineer and now a co-founder of a company called Quantlane. We started five years ago, and what we do is trade stocks, mostly in Europe, and we do that automatically and semi-automatically.
We are based in Prague, and everything we do on the back end is in Python 3.7 at this point. We also happen to use a lot of asyncio, and I like to think we were very early to start using type annotations and mypy.
Static typing is still quite a new thing to the Python ecosystem and the community. We're still learning how to use it, and the tooling is still being actively developed.
For those reasons it is sometimes a bit difficult, maybe not to get started with static typing, but to actually cover complex code bases with static types.
But despite these challenges, I believe it is really worth it, because when it is done properly it can help you avoid a lot of mistakes and bugs before you even run your program, before you even run your unit tests.
I'm going to talk in two chapters. The first one is the high-level approach you might want to take when you have a big code base and you want to cover it with mypy.
Then we'll talk about a few examples of code that is not the usual hello-world function and how you might go about typing that. In the end, I'll remind you that it really is worth it, even though it will look a bit complicated sometimes.
So, I mentioned before that we started using static typing quite early, at a point where we already had a couple hundred thousand lines of code. Mypy was at a very early stage back then and it was crashing on the code. I don't mean spitting out typing errors, but actually crashing.
So we had to start gradually and only cover our code step by step. A big lesson we learned, unfortunately not in the beginning but over time, was that the default mypy configuration is quite lenient.
If you don't make it slightly stricter than the default, you might learn a few bad habits that will come back to bite you later, and you will still have to fix your code and your annotations.
So whatever the code is that you're going to run mypy on, I would recommend you have full coverage, meaning there are no functions which have no type annotations or only partial annotations. So these are the config options you might use for that.
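The slide with the options is not part of this transcript. As a rough illustration of what "full coverage" rules out, here is a minimal sketch that assumes the options meant are mypy's `disallow_untyped_defs` and `disallow_incomplete_defs`; the function names are hypothetical.

```python
# Hypothetical functions, for illustration only.

def untyped(price, quantity):
    # No annotations at all: reported when disallow_untyped_defs is enabled.
    return price * quantity


def partial(price: float, quantity):
    # Only partly annotated: reported when disallow_incomplete_defs is enabled
    # (disallow_untyped_defs covers this case as well).
    return price * quantity


def full(price: float, quantity: float) -> float:
    # Every parameter and the return value are annotated: accepted.
    return price * quantity


print(full(2.0, 3.0))
```

All three functions behave identically at runtime; the difference only shows up when the type checker runs.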
Second, these are optional, but you might want to consider them. These restrict some forms of dynamic typing in your code. Some of these options are difficult to enable, but if you can do it, or if you're starting with a new code base, I would definitely use them.
And this you really want to do with mypy and static typing. It's sometimes easy to fall into a trap where you think you know what you're doing, or you think you know what mypy is doing, and it might not immediately tell you that your understanding is not quite correct.
So enabling some warnings will help you with that. Since covering an existing large code base is a huge amount of work, you want to go step by step.
So you begin by opting in, meaning you run mypy only on the modules you've already covered. You might even start with a single module, with a very small step, and then you just keep adding on.
You want to defend your progress by adding this check into your CI pipeline, so it runs before your tests do. And then of course you never want to make that list smaller; you only ever want to expand it.
What worked well for us was doing an internal hackathon where a couple of developers stayed at the office overnight and worked hard to increase our coverage. We're still not completely there, so we might have to do a few more sleepovers.
When you go the opt-in route, you need to deal with imports, because you are covering maybe just a few modules, but those modules might be importing other code which you maybe are not ready to check in the beginning. So this is how you tell mypy to not complain too much about other modules.
A word of warning: that follow_imports directive has another option called skip. The documentation warns you not to use that. We did, and it was a terrible idea. Don't do it.
At some point, when your opt-in list is sufficiently long, you definitely want to switch to opt-out, meaning you run mypy on everything by default except for some modules that you exclude in your config.
Of course there might be dozens of these ignored modules in the beginning, when your exclude list is huge, but over time you work to make that list smaller and smaller until it disappears.
The benefit of getting to opt-out is that any new modules you add to your project will be checked by default.
Covering unit tests is a tricky matter. Despite me recommending strict configuration for mypy, I will backtrack on that for tests and just make mypy a little bit more lenient, for reasons explained in this code sample. By the way, I'm going to put the slides up online, so you don't have to take photos of everything; there will be a lot of configuration and code.
When you use mocks and monkey patching in your tests, which you often do, there is no way to explain that to mypy as of yet. It's a very complicated problem, so you just need to ignore those places where you monkey patch.
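The code sample from the slide is not in the transcript. A minimal sketch of the idea, with a hypothetical `Clock` class and test, might silence only the patched lines with a `# type: ignore` comment and leave the rest of the test checked:

```python
class Clock:
    """Hypothetical production class."""

    def now(self) -> float:
        return 1000.0


def seconds_since(clock: Clock, start: float) -> float:
    return clock.now() - start


def test_seconds_since() -> None:
    original = Clock.now
    # mypy objects to assigning to a method, so only this line is ignored.
    Clock.now = lambda self: 42.0  # type: ignore
    try:
        # This call is still type-checked against the annotations above.
        assert seconds_since(Clock(), 40.0) == 2.0
    finally:
        Clock.now = original  # type: ignore


test_seconds_since()
```

Keeping the ignore on the single patched line, rather than on the whole file, is what lets mypy keep verifying that the test calls the tested code with the right types.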
But despite these challenges, I would urge you not to ignore all your test files completely, because even when you partially cover them with mypy you will get some benefits: mypy will be able to check that your tests are using your tested code as intended, meaning the annotations are being respected.
If you build your own Python packages, you should know that even when they do have type annotations in their code and mypy passes on them, if you use that package somewhere else, mypy will not follow those annotations by default. So you need to tell it.
It's very simple: you just need a marker file added to the package, and that's it. But unless you do that, you don't benefit from annotations in packages.
When you use third-party packages which might not have type hints, there are a few options you have. You might write stubs; this is something we don't really do, so I won't go into detail. You might want to ignore all third-party packages, or better, again very explicitly ignore just those that don't have annotations.
So now that you know generally how to approach a code base, we can talk about a few examples of what you might find in your code base.
The first example is a very useful, frequently used tool: generics and type variables. Who here has heard of these or maybe even used them?
Wow, that's really good. I think this is one of the most useful and needed features. Let's take the example of a weighted average, which is a very simple formula, a simple computation where you add up values and you average them using weights.
Critically, we will want to implement this average as incrementally updatable, meaning you can keep on adding values to the average and getting back the result.
So you might start by writing a very simple class, for example, that starts with some internal precomputed values, and you will be able to add a weighted value to the average. Then you just add a simple method that can calculate the average at any time. You will notice that we are using floats as the data type in there.
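The class on the slide is not reproduced in the transcript. A minimal sketch of such a float-only class, with assumed attribute and method names, could look like this:

```python
class WeightedAverage:
    """Incrementally updatable weighted average of floats (names are assumptions)."""

    def __init__(self) -> None:
        self._weighted_sum: float = 0.0  # running sum of value * weight
        self._weight_sum: float = 0.0    # running sum of weights

    def add(self, value: float, weight: float) -> None:
        self._weighted_sum += value * weight
        self._weight_sum += weight

    def average(self) -> float:
        # Raises ZeroDivisionError if nothing has been added yet.
        return self._weighted_sum / self._weight_sum


avg = WeightedAverage()
avg.add(10.0, 1.0)
avg.add(20.0, 3.0)
print(avg.average())  # (10*1 + 20*3) / (1 + 3) = 17.5
```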
So we are explicitly saying that we can only calculate averages for floats. But imagine you not only want to do that, but you might also want to use decimals, which is an arbitrary-precision data type in Python.
So of course, as annotated, that class will work for floats as expected, and of course it will not work for decimals, because you said your values were going to be floats.
By the way, the reveal_type function is extremely useful. It is provided by mypy, so it's undefined at runtime, but for debugging what mypy thinks your variables are, it is a very useful function.
So if you want to allow floats or decimals, a good way to do that is to parameterize your weighted average. You make it a so-called generic class and you say it is parameterized by this type variable, which we called AlgebraType in this example, and we restricted that type variable to be either a float or a Decimal.
Then your original class will be very similar to the previous version, but you will suddenly have a small trouble with the number zero. By the way, this code block contains a very small lie. Maybe some of you can see it, but it's not very important at this point. It works.
Then in the rest of your class, instead of saying float you will be saying AlgebraType. So you've parameterized the type. Now when you want to use that class, when you instantiate it, you need to add the value for that type parameter when you create the instance, like this.
So the first few lines are a weighted average of floats, and you can see that it also returns a float. And of course the second part is a weighted average of decimals.
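A sketch of the generic version described above, under some assumptions: `AlgebraType` is the name from the talk, while the attribute names and the `cast` used to get around the zero problem are this sketch's own choices and not necessarily what the slide showed.

```python
from decimal import Decimal
from typing import Generic, TypeVar, cast

# Type variable restricted to exactly float or Decimal.
AlgebraType = TypeVar("AlgebraType", float, Decimal)


class WeightedAverage(Generic[AlgebraType]):
    def __init__(self) -> None:
        # A literal 0 is neither a float nor a Decimal, so the annotation
        # needs help; the cast is this sketch's workaround (an assumption).
        self._weighted_sum: AlgebraType = cast(AlgebraType, 0)
        self._weight_sum: AlgebraType = cast(AlgebraType, 0)

    def add(self, value: AlgebraType, weight: AlgebraType) -> None:
        self._weighted_sum += value * weight
        self._weight_sum += weight

    def average(self) -> AlgebraType:
        return self._weighted_sum / self._weight_sum


float_avg = WeightedAverage[float]()
float_avg.add(10.0, 1.0)
float_avg.add(20.0, 3.0)
float_result = float_avg.average()      # a float for the type checker

decimal_avg = WeightedAverage[Decimal]()
decimal_avg.add(Decimal("10"), Decimal("1"))
decimal_avg.add(Decimal("20"), Decimal("3"))
decimal_result = decimal_avg.average()  # a Decimal for the type checker
```

At runtime the subscript (`WeightedAverage[Decimal]`) changes nothing; it only tells the type checker which of the two allowed types this instance is bound to.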
What is nice is that once you create a weighted average of a certain type, you cannot change your mind and start mixing the types. That is desired. There is an even nicer and cleaner solution, possibly, which is to say:
actually, what I need to do with my numbers is to add them, multiply and divide them, so I don't care if those are floats or decimals or something else. Ideally, I would just say they are real numbers. In theory that sounds great, but in practice the abstract number types in Python don't really work.
And this is a good example of typing in Python being quite pragmatic. It isn't an ideal world, it is a pragmatic world, and you need to be pragmatic, too.
So your type annotations often won't be perfect. They won't perfectly describe what you had in mind, but they will approximate it. Another very important example is understanding the difference between nominal and structural typing.
These are fancy-sounding words, but it's nothing complicated. Nominal typing you already know: that deals with class inheritance. Animal examples seem to be popular in computer science for some reason, so I went with one.
There's a base class of an animal, and then we have a duck that, apart from whatever behavior an animal has, can also quack. Then suppose we want to make a function that accepts something that can quack and makes it quack.
As annotated here, this works because it's very trivial. You're just telling mypy that your function needs a duck, because only ducks can quack, and that passes.
However, imagine you wanted to have another animal that can quack, and you want that function to work for that animal, too. So we could create a penguin, which possibly makes sounds close to quacking. But it would be wrong to inherit that from a duck. That would be very wrong.
So you just inherit from animal, but then of course your make_it_quack function doesn't work, because it was told to expect ducks. So nominal typing means you use classes and class hierarchy when specifying the types you need.
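The nominal version can be sketched as follows. The class and function names follow the talk; the `str` return values are an assumption made so the example has something to show.

```python
class Animal:
    """Base class; whatever behaviour all animals share would live here."""


class Duck(Animal):
    def quack(self) -> str:
        return "Quack!"


class Penguin(Animal):
    # Not a Duck, but it happens to have a method of the same shape.
    def quack(self) -> str:
        return "Quack-ish!"


def make_it_quack(duck: Duck) -> str:
    return duck.quack()


print(make_it_quack(Duck()))  # accepted: the argument is a Duck
# make_it_quack(Penguin())    # runs fine, but a type checker rejects it:
#                             # Penguin is not a subclass of Duck
```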
In contrast to that, we can talk about structural typing, where you describe your types in terms of the capabilities they have. So here you're creating a thing called a protocol. In other languages you might have heard the term interface, or maybe trait.
This is actual code. Those three dots are valid Python syntax, in case you didn't know. This really just tells mypy that there is an interface or protocol called CanQuack, which exposes a public method called quack.
When we declare this and change our function slightly, so it now accepts anything that can quack, then this will work for both animals.
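A sketch of the structural version, assuming Python 3.8 or newer, where `Protocol` lives in `typing` (older versions would import it from `typing_extensions`). `CanQuack` and `make_it_quack` are the names from the talk; the return types are assumed.

```python
from typing import Protocol


class CanQuack(Protocol):
    def quack(self) -> str:
        ...


class Animal:
    """Base class shared by both animals."""


class Duck(Animal):
    def quack(self) -> str:
        return "Quack!"


class Penguin(Animal):
    def quack(self) -> str:
        return "Quack-ish!"


def make_it_quack(animal: CanQuack) -> str:
    # Anything with a matching quack() method satisfies the annotation.
    return animal.quack()


print(make_it_quack(Duck()))
print(make_it_quack(Penguin()))
```

Neither `Duck` nor `Penguin` mentions `CanQuack`; the match is made purely on the shape of the method.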
The interesting thing is that we didn't have to inherit from that protocol. That protocol is a class, but that is more of a syntactic convenience. Now any object that has a quack method will meet the requirements of this function. So this is very useful for duck typing. Pun intended.
Another example we encountered is when you want to somewhat define your own type without creating an entirely new type.
A very simple example of that is when you have a function called place_order, maybe, that accepts a price and a quantity of some goods that you're buying. It doesn't matter what it does with that; maybe it will save it in a database or whatever.
Wouldn't it be nice if we could somehow differentiate between a price decimal and a quantity decimal? They really are just decimals, but there is a very clear semantic difference between prices and quantities.
If we could do that, it would make our code more readable, because when you read those annotations you will clearly see this is of type price and this is of type quantity. It would also make it hard to mix them up, so you wouldn't be able to accidentally pass a quantity in place of a price.
I come from a company that trades on the financial markets, and confusing prices and quantities, or buy and sell, is not a mistake you want to make. So the first option that you might think of is to alias a type.
You say there is something called Price, and it really is equal to Decimal, and then you can use that Price as a constructor. That works, but unfortunately it doesn't create a new type. This is just a convenience for you, so you don't have to type so much.
It does make the code easier to read when you suddenly start writing Price instead of Decimal, but as for type safety you get absolutely no benefit.
Another option that exists in the typing module is a function called NewType, which kind of aliases an existing type, but it is a true alias, in the sense that mypy now understands that it is a Price and not a Decimal.
What this does is you can still create decimals, but then you need to wrap them in your Price type, and from that point onwards mypy knows it's a Price, not a Decimal. If we define a function that takes a Price, you will see you can't pass a bare Decimal to it, and you can't even pass a Quantity to it, even though it really is another Decimal.
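A minimal sketch of the `NewType` approach. `Price`, `Quantity` and `place_order` are the names used in the talk; the function body and its return type are assumptions made for illustration.

```python
from decimal import Decimal
from typing import NewType

Price = NewType("Price", Decimal)
Quantity = NewType("Quantity", Decimal)


def place_order(price: Price, quantity: Quantity) -> Decimal:
    # Hypothetical body: the product of a Price and a Quantity is a plain Decimal.
    return price * quantity


price = Price(Decimal("9.99"))
quantity = Quantity(Decimal("3"))
print(place_order(price, quantity))  # accepted by the type checker

# Both of these still run, but a type checker rejects them:
# place_order(Decimal("9.99"), quantity)  # bare Decimal is not a Price
# place_order(quantity, price)            # arguments swapped
```

At runtime `Price(...)` simply returns its argument, so the wrapped values are ordinary `Decimal` objects and the distinction exists only for the type checker.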
So this is what we wanted, and we are now able to differentiate between the types. This all works to a point. Once you start modifying the values, you're back to the original type, because really it is just a decimal under the hood.
Once you start performing operations, meaning calling methods on that type, it will return the original type. So there is a limitation you should be aware of.
The only perfectly correct solution for defining a new type is to actually define a new class and implement your type and the behavior you want. That is nice and clean, but of course you will pay a runtime price, because your Price implementation will probably not be faster than the Decimal type that's already in Python, implemented in C.
So that way lie a lot of interesting dilemmas about preferring static typing purity or preferring pragmatic runtime performance and simplicity.
Now, in Python you can do a lot of metaprogramming, a lot of magical tricks, and mypy cannot always understand them. A good example of that is the dataclasses module that's new in Python 3.7. Or if you're familiar with Django, then Django models are an example of metaprogramming that mypy wouldn't be able to figure out by itself.
So you actually are able to write plugins for mypy that help it understand magical code. This is even newer than mypy itself. There isn't much in the way of documentation yet, and there are just a few working plugins out there.
We also had to write a plugin, and if you need to go that way, too, then you might find our plugin useful, because it's got twice as many comments as it has code.
So this is a bit hard at this moment. You probably won't need it, but if you do, I'm sure this will all get easier over time.
The final example I'd like to share is overloading function signatures. That means properly typing the case where your function might take different sets of types of parameters and return different types of results based on the parameters.
A simple example is having something that's indexable, like a list is, but let's say it's your own type. So here we could have a series of numbers, and we want to be able to index individual numbers but also slices.
We want mypy to understand that when we use a single index, a single value is returned, and when we use a slice, a sequence of values is returned. So we begin by creating a generic class. This should be familiar to you by now; like I said, it's a very common tool.
What we need to do next is to explain the two versions of the square-brackets operator, which is called __getitem__ in Python.
So these two look like method definitions. They actually are, but they don't have any body; again, those three dots are exactly what you want to put in there. You annotate them with an overload decorator coming from the typing module, and all they do is explain to mypy what the possible ways of calling those methods are.
Then you just add the actual implementation, which in this case is very trivial. That real implementation has to be typed so that it includes all the versions of the signatures that you mentioned previously.
So this is a bit wordy, but it actually makes a lot of sense. Once you get used to it, the syntax is easy to understand.
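A sketch of the overloaded `__getitem__` described above. The class name `Series`, the type variable `T` and the internal list are assumptions; the slide itself is not in the transcript.

```python
from typing import Generic, List, Sequence, TypeVar, Union, overload

T = TypeVar("T")


class Series(Generic[T]):
    def __init__(self, values: Sequence[T]) -> None:
        self._values: List[T] = list(values)

    # The two overloads have no body; they only describe the allowed calls.
    @overload
    def __getitem__(self, index: int) -> T:
        ...

    @overload
    def __getitem__(self, index: slice) -> Sequence[T]:
        ...

    # The real implementation is typed to cover both signatures at once.
    def __getitem__(self, index: Union[int, slice]) -> Union[T, Sequence[T]]:
        return self._values[index]


series = Series([1.5, 2.5, 3.5])
first = series[0]   # a single float for the type checker
head = series[0:2]  # a Sequence[float] for the type checker
print(first, head)
```

Without the overloads, every caller would get back the union and would have to narrow it by hand before using the result.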
I had a lot more in store for this talk, but I had to cut a lot to fit into the time limit. So I only have seven takeaways for you. One is to try and make your life harder by using a stricter configuration than the default.
Second is to go bit by bit. Don't take too much on in the beginning, go module by module, and get to opt-out when you can. Definitely learn to work with generics and type variables, because those are your friends; you will be meeting them quite a lot.
Learn to use protocols, because they are very much in line with the dynamic spirit of Python, so you don't have to create classes for everything just for static typing.
Be aware of NewType, because it can add more semantics to your types. Not everything has to be a decimal or an integer; you can call it user ID or price or quantity.
Writing plugins is hard, but it is so important for mypy to spread and become more popular that I'm sure it will get easier eventually. And the last example, overloading, looks like boilerplate, but it's not really that complicated and it is useful.
So, who here thinks that typing is complicated after this talk? I certainly do. But there are good reasons for that. One is that we have to learn and understand new concepts as developers, and that is great, because they force us to think about our code more and in ways we perhaps didn't before.
Another reason is that the tooling is still quite young. It's developing very fast, very actively, but there are still a lot of issues to cover.
If I may, just one final sentence: once this becomes more popular and more prevalent, once we learn how to use this, then all our code will be much less error-prone and development will be more fun. I can already see that in smaller projects.
Thank you.
Short question? Somebody's close to the microphone.
Thank you for the talk. I have a question. Beyond primitive type annotations, do you think that optimizing for a human, since Python says readability counts and that kind of PEP 20 thing, and optimizing for a machine, mypy in this case, is a zero-sum game, or is it something else?
That is a very good question. At this point we are making concessions to mypy, definitely. We are adding code and structures to the code that we maybe would not add otherwise. But I don't think the gap is too big, and if it is done correctly, you might be making the code easier to read for humans as well.
When you alias your complicated type annotations and use them cautiously, then I don't think we lose all that much, very little, and we gain a lot.
All right, it's a very hot topic, but we are running out of time, so thanks again, Vita, for this nice talk.