We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

`typing.Protocol`: type hints as Guido intended

00:00

Formale Metadaten

Titel
`typing.Protocol`: type hints as Guido intended
Serientitel
Anzahl der Teile
112
Autor
Lizenz
CC-Namensnennung - keine kommerzielle Nutzung - Weitergabe unter gleichen Bedingungen 4.0 International:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen und nicht-kommerziellen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen und das Werk bzw. diesen Inhalt auch in veränderter Form nur unter den Bedingungen dieser Lizenz weitergeben.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
EuroPython 2022 - `typing.Protocol`: type hints as Guido intended - presented by Luciano Ramalho [The Auditorium on 2022-07-15] Duck typing and static typing are not opposites. Go is a successful statically checked language with support for duck typing through interfaces that work like `typing.Protocol` does. A `Protocol` subclass defines an interface that past and future classes can implement without any coupling to the interface: they simply provide the required methods. That's statically checked duck typing: a powerful combination! In this talk we'll get back to basics looking at how duck typing is used in Python since the beginning, how `__dunder__` methods leverage that idea to support what we recognize as *Pythonic* code. Then we'll see how `typing.Protocol` fills the gap in the original PEP 484—Type Hints, and finally lets us properly annotate code that leverages the flexibility and loose coupling of duck typing. Finally, we'll look at the experience of the Go community to learn what makes a good Protocol. Spoiler alert: your favorite Python ABC may not be the basis of a useful Protocol! This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
29
Formation <Mathematik>TypentheorieTaskComputeranimation
StichprobeÜberlagerung <Mathematik>MusterspracheWeb-SeiteTypentheorieProtokoll <Datenverarbeitungssystem>ATMComputeranimation
TypentheorieRichtungPauli-PrinzipGanze ZahlFunktionalKomplexe EbeneMathematikXMLComputeranimation
TypentheorieDämpfungFormale SpracheSelbstrepräsentationÄußere Algebra eines ModulsZeichenketteMultiplikationsoperatorNichtlinearer OperatorInstantiierungGanze ZahlComputeranimation
ZeichenketteImplementierungInformationsspeicherungPhysikalisches SystemCodeArithmetischer AusdruckTypentheorieAusnahmebehandlungSelbstrepräsentationInterface <Schaltung>Innerer PunktNichtlinearer OperatorEinsSchaltnetzDämpfungAdditionVerschiebungsoperatorGanze FunktionCliquenweiteZeichenketteVirtuelle MaschineZeiger <Informatik>Kartesische KoordinatenOrdnung <Mathematik>TeilmengeComputeranimation
ImplementierungTypentheorieInterface <Schaltung>ProgrammbibliothekResultanteObjekt <Kategorie>Physikalisches SystemKlasse <Mathematik>InstantiierungGrundraumHydrostatikDifferenteCodeNatürliche ZahlComputeranimation
SichtenkonzeptFunktionalPunktPhysikalisches SystemObjekt <Kategorie>ProgrammiergerätKlasse <Mathematik>MetadatenTypentheorieCoxeter-GruppeProtokoll <Datenverarbeitungssystem>BrowserTelekommunikationCluster <Rechnernetz>Computeranimation
MeterMereologieProtokoll <Datenverarbeitungssystem>Klasse <Mathematik>CodeMultiplikationsoperatorTypentheorieE-MailFormale GrammatikDifferenteLesen <Datenverarbeitung>Objekt <Kategorie>Modifikation <Mathematik>ProgrammbibliothekMailing-ListeArithmetisches MittelMomentenproblemTeilmengeArithmetischer AusdruckSchreiben <Datenverarbeitung>FunktionalInstantiierungBitComputeranimationBesprechung/Interview
W3C-StandardKlasse <Mathematik>Delisches ProblemMaschinencodeGanze ZahlFunktionalSystemaufrufObjekt <Kategorie>ParametersystemMultiplikationsoperatorFehlermeldungZeichenketteFormale SpracheDatenfeldWellenpaketAppletCASE <Informatik>IterationDickeAttributierte GrammatikAutomatische IndexierungDatenstrukturt-TestCodeInterface <Schaltung>MultiplikationExtreme programmingObjektorientierte ProgrammierspracheÜberlagerung <Mathematik>MaßerweiterungNichtlinearer OperatorComputeranimation
Lesen <Datenverarbeitung>QuellcodeHydrostatikIterationProgrammierumgebungClientAnalysisProtokoll <Datenverarbeitungssystem>ParametersystemSoftwaretestVererbungshierarchieCodeParametersystemProtokoll <Datenverarbeitungssystem>Interface <Schaltung>Klasse <Mathematik>ClientObjekt <Kategorie>HydrostatikProzess <Informatik>SoftwaretestWeb SiteSystemaufrufFunktionalQuellcodeLesen <Datenverarbeitung>Hierarchische StrukturMereologieOrdnung <Mathematik>Formale SpracheGeradeProgrammierumgebungElektronische PublikationComputeranimation
GasströmungGammafunktionATMDualitätstheorieHydrostatikLaufzeitfehlerDynamisches SystemSampler <Musikinstrument>TypentheorieAutorisierungComputeranimation
Pauli-PrinzipElektronischer ProgrammführerIndexberechnungProtokoll <Datenverarbeitungssystem>WarpingTypentheorieObjekt <Kategorie>Interface <Schaltung>HydrostatikLaufzeitfehlerPauli-PrinzipTermMultiplikationsoperatorMechanismus-Design-TheorieFormale SpracheKartesische KoordinatenDiagrammComputeranimation
Folge <Mathematik>Protokoll <Datenverarbeitungssystem>VariableFunktionalNichtlinearer OperatorTypentheorieFolge <Mathematik>Charakteristisches PolynomResultanteObjekt <Kategorie>NebenbedingungPhysikalisches SystemCodeZeichenketteDämpfungParametersystemProtokoll <Datenverarbeitungssystem>BitVarianzCASE <Informatik>SystemaufrufElement <Gruppentheorie>ZahlenbereichVariableXMLComputeranimation
MedianwertCodeProjektive EbeneTypentheorieMultiplikationsoperatorProtokoll <Datenverarbeitungssystem>CodeParametersystemFunktionalStatistikProgrammfehlerVarianzMailing-ListeDatensatzZahlenbereichAusdruck <Logik>Computeranimation
StandardabweichungProgrammbibliothekElektronische PublikationParametersystemStabilitätstheorie <Logik>Service providerQuick-SortART-NetzFunktion <Mathematik>MathematikLokales MinimumInformationsüberlastungDefaultVektorpotenzialCodeProgrammbibliothekTypentheorieLokales MinimumDivergente ReiheInformationsüberlastungProgrammfehlerSoftwarewartungMultiplikationsoperatort-TestParametersystemFunktionalPhysikalisches SystemMultiplikationComputeranimationXML
Divergente ReiheInformationsüberlastungProgrammbibliothekStandardabweichungPaarvergleichObjekt <Kategorie>ImplementierungSoftwaretestTypentheorieDeklarative ProgrammierspracheOrdnung <Mathematik>ImplementierungMailing-ListeFunktionalProgramm/QuellcodeXMLComputeranimation
VektorpotenzialLokales MinimumInformationsüberlastungCodeElektronische PublikationProgrammbibliothekStandardabweichungDivergente ReihePaarvergleichObjekt <Kategorie>ImplementierungBimodulKovarianzfunktionAbstraktionsebeneProtokoll <Datenverarbeitungssystem>Klon <Mathematik>VersionsverwaltungIndexberechnungNichtlinearer OperatorMultitaskingSchlüsselverwaltungVektorraumKomponente <Software>Objekt <Kategorie>Protokoll <Datenverarbeitungssystem>VererbungshierarchieProgrammbibliothekAutomatische IndexierungKlasse <Mathematik>KardinalzahlIntegralComputeranimation
DatenmodellProgrammbibliothekStandardabweichungClientCodeEinfache GenauigkeitProtokoll <Datenverarbeitungssystem>CodeDatenmodellProtokoll <Datenverarbeitungssystem>TypentheorieInterface <Schaltung>ProgrammbibliothekStandardabweichungMereologieIntegralCodecClientAppletTermGebäude <Mathematik>Computeranimation
RechnernetzHydrostatikCodeTypentheorieInverser LimesATMLeistung <Physik>ProgrammfehlerMultiplikationsoperatorComputeranimation
Protokoll <Datenverarbeitungssystem>VariableMultiplikationsoperatorTypentheorieFunktionalVarianzRechter WinkelFunktion <Mathematik>ZählenEin-AusgabeVorlesung/KonferenzComputeranimation
UnrundheitKontrollstrukturVorlesung/Konferenz
Transkript: Englisch(automatisch erzeugt)
Thank you very much. I think this is the fanciest auditorium that I've ever spoken to. I'm hoping to be up to the task. So, typing protocol, type hints as Guido intended.
The title of the talk is a reference to this T-shirt which I wish I had, but I don't, right? Programming the way Guido indented it. So, yeah, that's my book.
Almost everything that I'm gonna show you today came out of a lot of research that I did while writing the book. The second edition has more than 100 pages about type hints and other new stuff. So, we're gonna talk about four things.
What is a type? The four modes of typing, some typing protocol examples, and a conclusion, okay? So, what is a type? So this is a direct quote from PEP 483, the theory of type hints, which was written
by Guido van Hossen and Ivan Levitsky, Levkivsky. So there are many definitions of the concept of type in the literature. Here we assume that type is a set of values and a set of functions that one can apply to those values, right?
For me, my first understanding of types was very much focused on the set of values because you think about, oh, okay, so you have this set of the complex numbers and then you have the real numbers and then you have the integers and so on.
But that's in the beautiful, pure world of mathematics. In the real world, there are some small issues, right? Reality is complicated. Reality is more complicated than math.
So, I have an integer and then I convert it to a float. Looks all right, except that one is not equal to the other, right? Actually, in the set of floats, there is no float that represents precisely that number.
And if you import the decimal module, it can, when you create an instance of decimal, it shows you then a string representation with full precision, and as you can see, they are not really the same number, although they are pretty close, right?
So, yeah, a little side note. It's interesting that the float type has been around for a long time. It's basically embedded in all modern CPUs, right? The other basic operations with floats
are embedded in the CPUs, but it's a type that is more useful to scientists and engineers than for people, for instance, who deal with money, right? Which is, I guess, most of us, or many of us, I don't know. So, I always thought it was kind of weird
that most mainstream languages don't have a really good way of representing money, and people try to use float for that, but it's not great, so decimal is an alternative in Python to do that.
But the set of values definition is not useful for other reasons also, because Python does not provide practical ways to specify types as sets of values, except for a null. In practice, in the standard library, we have very small sets, like the set of the none type,
which has only one instance, none, bool, which has two instances, and then, on the other hand, you have extremely large ones with billions of instances, right? The int and the string types, for example, the possible values are only limited
by the width of the pointer in the machine. And also, we have no way to say, for instance, that a quantity type that I would like to use, for instance, in an e-commerce application to validate that orders are not placed
with zero of an item, or more than a thousand. This could be an interesting safety measure, right? Usually, it's rare that you order a thousand or more of something in a retail store. So you can't say that. There's no way in Python to say that
in the type system to express that, right? For instance, also, there's no way to say that our airport code is the set of all the 17,576 three-letter combinations of uppercase ASCII letters. Neither of those sets can be expressed in the type systems.
At least not in the type systems of mainstream languages, and certainly not in the type system of Python. So, in practice, it's more useful to think that int is a subtype of float because it implements the same interface.
Basically, the interface of float has to do with arithmetic, and all the operations also apply to integers, right? But ints also have a few other methods, like, for instance, those that do bitwise manipulation,
shifts right, shifts left, things that wouldn't make sense in a float. So that's the main reason why we can think of int as a subset of, not a subset, a subtype of float, because it implements all the types, the entire interface of float, and an additional few methods.
I put a couple of asterisks here because just yesterday I discovered there are a few float methods that int does not implement. Okay, but they're kind of bizarre. One of them is one that generates an hexadecimal representation of the float.
But anyway, as a good approximation, we can think of subtypes, and in general, that's true for subtypes, that they implement the entire interface of the supertype in addition to more methods, right?
So for instance, here I continue that example by showing that the pipe operator, which is the bitwise or, you can use it with an integer, but you cannot use it with a float, right?
And another example of how it's more useful to think about the interface is that in Python's static type system, we have now the any type, and basically,
because any value can be assigned to an object variable, and any value can be assigned to any variable, or you can say that they are like the universe types, right? But the super important difference between them is that the object type implements
a narrow interface, right? Pretty much every, most of the classes that we create from object add methods, right? They don't exist in objects. And but any, on the other hand, is this magical type
that is required because of the nature of the gradual type system that we have, which means that sometimes you may have a piece of code that is not annotated, or you're using a library, a third-party library that is not annotated, so you get a result from there, and all that the type checker can infer
is that the return is any, right? And any satisfies every interface. It's like this magical interface that contains all possible methods that exist today or that will ever exist, right? So this is a crucial difference between them.
So in general, if you think of subtypes as sets, you get the biggest set is in the top, right? Object is the set of everything. And then you go down, down, down, down to more specific types, more specialized type
that usually have fewer instances. But on the other hand, the interface grows, right? At the top of the class hierarchy, the interface is narrower, and as you go down, it gets wider, more specialized, and it gets more methods.
So from that definition of PEP483, you're going to adopt that point of view that a set is a set of functions that one can apply to those values. A set is defined by the set of functions that you can apply to an object, right?
And then here's another quote from Smalltalk, and I'm not talking about Smalltalk just because I have a gray beard. I'm talking about Smalltalk because it's really very,
the type system of Smalltalk is very similar to the type system of Python, just like the type system of Ruby and the type system of JavaScript. And in the Smalltalk community, they use this term, protocol, right?
So this quote is from one of the designers of Smalltalk. Every object in Smalltalk, even a lowly integer, has a set of messages, a protocol, that defines the explicit communication to which that object can respond. In Smalltalk, there was no explicit way
to declare a protocol, but some browsers that you use to code in Smalltalk did present clusters of methods as a protocol. And of course, a class could implement several protocols.
So that was the way they used it, more as metadata for the programmer than something that was checked, right? So the main takeaways from this first part is that types are defined by interfaces,
and protocol is a synonym for interface, right? So that's where the name of the typing protocol class that we're gonna talk about comes from. Now let's talk about duck typing.
So last time I looked, Alex Martelli, who is my friend and also one of the reviewers of the first edition of my book, is credited, was credited by Wikipedia for being the first person who used
in a written message, that you can find online, this duck metaphor. I don't know if he invented it, but he was the first one who used it in a Python mailing list to talk about, to respond to a question about polymorphism
that had to do with type checking in Python many years ago when there was no static typing, right? So the idea is that you don't check whether something is a duck. You check whether it quacks like a duck
or walks like a duck, et cetera, depending on exactly what subset of duck-like behavior you need. I started using Python in 98, and at the time there was no ABCs, there were no ABCs in the standard library. So in the beginning I was a little bit mystified
by the appearance of phrases like, oh, this function takes a file-like object. Okay, what is that? There was no definition anywhere, right? But it was a common practice to have these kinds of expressions.
I think there's still phrases like that in the documentation. A buffer-like object is another thing that is super important for data science. There was some formalization of what that means, but the interesting thing is the precise meaning
of a file-like object depended on the context, right? Sometimes all you needed was, I need something that has a read method that returns bytes, for instance. That's file-like enough for me in this moment.
Does that make sense? In another situation, maybe you need something that has a close method and a write method, things like that, right? But it was always very informal, similar to how it was informal in Smalltalk as well. So one definition that you can give if somebody asks you what's the difference
between a protocol and an interface, if we are not talking about static typing, you can say that protocols traditionally were informal interfaces that were not defined explicitly in code, but where the code depended
on one of those interfaces or on a few methods to do something, right? So for instance, I've talked Python to lots of people and most of my students over time
came from statically typed languages like Java mostly and C++, C-sharp. And so I always like to show them this example of a very simple function, double, to illustrate duck typing, right? So I create this single function double
and I can use it with an int, with a float, with a complex, with fractions, and those are all numbers, but I can also use it with sequences, right?
So what is the idea here? What is the protocol that double expects X to fulfill? Double expects that the argument X is able to multiply itself by an integer.
That's what is implicit in that code and this is why all of those objects support that because under the covers they implement a method called dundermoo, right? So dunder, double underscore in front, behind,
and some special name. Dundermoo is actually the method that any class that you write can implement to support the multiplication sign, this operator.
Now here's another example, a more dramatic example. So I have this simple class that represents a train, okay? And its only attribute is a length that is established when it's constructed.
And so then we have the dunderlen method that returns the length and we have the dundergetitem method. And basically checks whether the I is within the range,
right, from zero to length minus one. And if it is, then it returns a string saying car number, such and such, right? Otherwise it raises index error. So for instance, if I build a train with three cars
and I ask the len of the train, I get three. So this is duck typing, right? Notice this class inherits implicitly from objects. There's no other special class that I'm using here.
So what makes something work with len in Python is basically for built-ins, Python has a shortcut, okay? Built-in and extensions, like extensions written in C or Rust or other languages, can implement in those lower level,
using the Python C API, what is called a slot, that you respond to the len. But even if you don't do that, there is also, it's funny because the C code of Python is kind of object-oriented C, okay?
Not C++, object-oriented C. There's this kind of nested structures in which everything that has multiple items in it has a field that Python can inspect immediately to know how many items there are, okay?
So that's very fast because it doesn't involve any method call. But if that field doesn't exist, then Python will try and find whether there is a dunder len implementation, either in the low-level language or in Python if it's a Python object, right?
So that's how len works. And it's an example of duck typing, right? You don't need to inherit from anything to implement, to support len. You just implement that method. The second example is what triggers the dunder get item method, right?
And again, another example of duck typing. But the third example is really dramatic because the target of a for loop,
in this case the T, is supposed to implement an interface of iterable, right? And how do you implement the interface of iterable? Formally, you implement a dunder iter method
that returns an iterator, which is another interface that implements a dunder next method. However, I didn't implement dunder iter here, okay? Well, anyway, in a regular duck-typed language, you would, any class that would implement dunder iter
would work for iteration. But Python goes even further than that because that's so fundamental. Iteration is so fundamental to the language. And also to support legacy codes. It just so happens that the Python machinery,
this is actually built in the iter built-in function. The iter built-in function looks to see if the object implements dunder iter. But if it doesn't, then there's a fallback. And the fallback is, does it implement dunder get item?
And then it tries to send a zero. It tries to call dunder get item with a zero argument. And if that works, then it starts iterating by providing more indexes until the dunder get item returns index error, right?
So this is kind of an extreme example of duck typing, right, of the machinery of the language trying to use the object, even when the object doesn't implement one particular interface, it falls back to another interface and does what you expect, right?
Okay, so now I'm gonna show the first example of a protocol. And this, I think, is the best example of a protocol in the whole talk. It's the most similar to the kinds of protocols that you are likely to use in your job.
So what I'm saying here is I have a function called run file, and it takes one argument called source file, which is a text reader. And then immediately above, there's a definition, right, of a class that inherits from protocol, right?
And that definition is exactly like I wrote it here. The ellipsis in line 203 is part of the syntax, okay? The ellipsis is telling you that the body is omitted.
Instead of pass, which explicitly say, says this doesn't do anything, what this is saying is the body will be defined somewhere else. It doesn't matter what the body is. So in order to satisfy that protocol,
an object, any object in the world, can satisfy that protocol by implementing a read method that returns a string. That's it, okay? So, and often, you will see code like that where the interface is defined
right above the place where it's used. This is common. We see that a lot in Go code. And why am I talking about Go code? Because Go is the first mainstream language that is statically typed but also supports this idea of declaring protocols.
Notice that whatever class I use to pass, to produce a text reader, doesn't need to inherit. And in fact, they won't inherit from text reader. All they have to do is to implement the interface, okay?
So the benefits of using typing protocol are that you can preserve the flexibility of duck typing. So you can let your clients know what is the minimal interface expected regardless of class hierarchies. You can support static analysis, right?
IDEs and linters can verify that an actual argument that appears in a call site for the run file function does satisfy the protocol, right? And you reduce coupling because the client classes don't need to subclass anything,
just implement the protocol. And this also makes testing easier, right? Because if you need to create some kind of mock to test, all you have to do is implement that one method. You don't need to inherit from anything, right? So let's talk about the four modes of typing.
We all know there's this duality between static typing and dynamic typing. And this has to do with when the types are checked, right? Static typing is designed to support static checking, which is checking them by tools and compilers and so on. And runtime checking is, I mean,
dynamic typing was invented to support runtime checking, right? The authors of the type of the type hint SPEP promise that Python will remain dynamically typed, okay? But actually the situation is more interesting than this.
The situation is that we have these two axes where besides the static checking and runtime checking, we also have the issue of how the types are defined or identified.
So in static typing of languages like Java, you have nominal types. Things belong to a type because it's explicitly declared. You inherit from something or you implement an interface, right? But with duck typing, what we have is structural types,
which means objects belong to a type because they implement a certain interface. But they don't need to explicitly declare anywhere that they do implement. They just have to actually implement it, right? So these are like the two opposites. But this diagram actually has other areas.
This area, Alex Martelli, when he was reviewing the first edition of my book, he invented this term goose typing. I don't have time to talk about that, but basically that's the use of ABCs to do explicit type checks at runtime.
And then Lukas Langa and others, Lukas is speaking right now at another room here, they proposed PEP 544, which implements structural duck typing or the idea of, I mean, structural subtyping
or the idea of static duck typing, right? Duck typing, static duck typing. So, and that's supported by typing protocol, okay? There are other languages that support things in those quadrants.
See, go appears in the static duck typing quadrants, but also in the goose typing quadrant because it has mechanisms for type assertions at runtime, and so on, okay? So let's see some more examples. We were talking about the double function. How could we annotate that with a protocol?
So can we do that? Yeah, okay, this has a serious problem. The problem is the type checker will complain that the object type doesn't implement the other move. Does that make sense?
So this code, without looking at anywhere else, the type checker knows that this code is invalid, okay? Can I use any? Yes, but it's useless because it's the same thing as no annotation. The any annotation pretty much defeats type checking.
You have to use it very carefully and like an emergency exit. I could do this and say that x is a sequence of t. And here I have to introduce this idea of the type var,
because sequences is a generic abstract base class, so I'm saying this. If I get a sequence of floats, I'm going to return a sequence of floats. If I get a sequence of strings, I'm going to return a sequence of strings. This is what the type variable is doing, right?
When the type checker analyzes this, it gets a concrete example of a call site, it looks at the type of the elements in the sequence, and then it binds t to that type, and then it will infer that the return is a sequence of that same type, okay?
So this only works with sequences, it doesn't work with numbers. So the solution is to create a repeatable protocol. So the repeatable protocol is saying that it's anything that implemented under Moo,
and that takes a repeat count, which is an integer. And now I can say that x is a repeatable, and this returns a repeatable. Are we done yet? No. Because now, when I call double, the result that I get is inferred to be something
that only implements under Moo, and if you need to do other kinds of operations with the object, you won't be able to, the type checker won't let you. Of course, at runtime, everything happens, right? Because that's the characteristic of the gradual type system of Python.
At runtime, only the dynamic types exist, there's no constraint. But the type checker will point out that if you try to do something else with this result, no, this result only supports under Moo, because that's what it says. So the way to solve this is a little bit more involved.
You need to create another type var. So the protocol is the same. The new thing here is this type var here. And here, I'm saying that rt can be bound to any type,
but it has to be limited by, I don't like the keyword argument bound, I think they should have used another word for this, maybe limited or upper bound or something. But because there's this, the binding of the variable and the bounds, which is kind of a boundary, right? What this is saying is that
we, rt, can only accept something that is repeatable. But because of the way that this variable is declared, the actual type of it will be preserved as well. So if I pass a float, this will, the type checker will be able to infer,
first it will accept the float, because the float implements under Moo, and it will understand that this returns a float, okay? So that's one thing that we need to do in this case.
So when I discovered these things, I was doing research for the book, and that was the time of Python 3.8 when typing protocol was introduced. There's this project called Type Sheds, right, which has the type annotations for the standard library and other projects. And I contributed several fixes
by implementing protocols there. So for instance, there was this bug that I found that there's the medium-low, the medium-low function in the Statistics module actually works with strings, with a list of strings, okay?
And there's an example just like that in the documentation. So, but it wasn't, but at the time, we had a false positive, which was, if I submitted this code to a type check, it would complain, and why is that?
Because the annotation at the time was saying that the argument had to be an iterable of number. And number was a type that they invented there, which was the union of int, float, and complex, I think, or int, float, and fraction, I think.
Anyway, it didn't work with strings, the annotation, although the function did work with strings, right? So the solution was to create this sortable protocol. And again, you see the same formula that we saw before. You have a protocol, and you have a type var
which is bound to that protocol, and then I can use that type var to say that median row accepts an iterable of sortable t, which is the type var bound to sortable, and it returns a sortable t, right? When I contribute that, I looked around,
and I found many other places in the Python 3.9 standard library where we could improve the annotations by using protocols. Okay? And then there was the biggest challenge for me at the time was the max function. The max function is super powerful, right?
Like I said, I've taught Python many times. I never saw a student say that, oh, max is super complicated to understand. No, it's pretty simple. You can pass it an iterable or a series of arguments, and then if it's an iterable, then it will return the biggest item in the iterable.
If it's a series of arguments, it will return the biggest of the arguments, right? But then there is some optional arguments and so on. So it's very flexible, easy to use. I like this API, okay? But then somebody posted a bug about that, and I tried to fix it.
The old type hints were like that. So overload is something that this decorator comes from the typing module, and its mission is to allow you to do multiple type annotations to different ways
of invoking a single function, right? So it was already pretty complicated. My time is running up, so I'm not gonna delve into the details here, but this wasn't working. This was producing that bug that this person reported. The bug was that if you passed five and top
where top was none, it exploded. The solution was this. It took me several hours, actually a couple days, and with support of people more experienced than I in the type system to get to this.
My first attempt had eight overloads, and then one of the maintainers of type shed showed me how to simplify and reduce it to only six. In order to do that, because Max is written in C, I wrote it in Python to test and so on,
and what caught my attention was that my implementation was shorter than the declarations to satisfy the type checker. So there was a lot of stuff that happened after that. If you go look at the code, it's not like that anymore. I won't go into those details here,
but it's just for you to know that with time, supports less than became supports greater than, and then they invented another name, and then they created. Now what they are using is supports rich comparison, which is a union type of supports under LT
and supports under GT, which means that in most of those functions that appear in this list here that have to do with ordering stuff, what they do is they accept, the current annotations accept something that supports either less than or greater than.
Either way, it will work. There are some protocols in the standard library. Most of them have to do with numbers because unfortunately, the numeric tower, remember the numeric tower, it's completely broken for use with typing, and why?
Because the base class of the numeric tower, which is called number, has no methods, which means that objects of that class are useless. You can't do anything with them. But anyway, so they invented those protocols
that have to do with whether something supports a conversion, right? And one of them, for instance, I use in one example in the book, which is supports index. Supports index is a particular protocol that means that the object implements dunder index,
which as opposed to dunder int, which floats also support to convert to int, dunder index is supposed to be implemented only by integral types. So for instance, if you use NumPy, all the integral types of NumPy supports, implements the dunder index method,
so they can be used as indexes, okay? So to summarize, I recommend that you use typing protocol to build Pythonic APIs. In fact, you can see that the standard library is very Pythonic, although there are parts of it that I don't consider very Pythonic,
but most of it is very Pythonic, and the standard library really cannot be annotated without the use of typing protocols, right? So you should use typing protocol because then you can support duck typing with type hints, and duck typing is the essence of Python's data model and standard library.
This allows you to follow the integrate segregation principle, the I in the solid principles, which means that client code should not be forced to depend on methods that it does not use. I remember when I used to code in Java, sometimes you had to implement methods just to satisfy the compiler, right?
And this is a recommendation that I learned from studying Go, okay? Narrow protocols are the best, okay? Most protocols that come in the Go standard library
have only one method, and that's how your protocol should do as well. That's why trying to think of an interface in terms of an ABC or a Java interface with the many methods is not a good way of getting started with designing a protocol, okay? Okay, closing words, I have to say these things
every time that I talk about Python typing, okay? Being optional is not a bug or a limitation of Python type hints. It's the feature that gives us the power to cope with the inherent complexities, annoyances, flaws, and limitations of static types. So please do not decide in your team
that everything has to have type hints always, okay? That's the only way to use a type checking. This is in the strictest possible mode. This is really bad. It's like deciding that every code has to be 100% test-covered.
No, 95% is good enough a lot of times, okay? So that was my talk. Thank you so much. I don't know if we have time for questions. Maybe one question. Yes. Thank you very much.
We have time for one question. Just remember to go to the microphone so that everybody can hear it. Yes, please. Hey, thank you for your talk, Lucian. Asking as a beginner in typing, I could not follow why when we define the repeatable protocol, why do we say that mall will return something that's also repeatable?
Why don't I say that I'm just returning anything? Let me see. So the repeatable protocol, this? Yeah, yeah, exactly. So, yeah. So what's the specific question?
Why am I saying that dundermool will return T? Is it something that I decide or is it typically the thing that makes sense? Yes, it's something that you decide, but this is how I see the,
because the idea of this double function is that it repeats something. If it's a number, that means multiplying by two, but if it's a sequence, it's repeating it. And that's actually why I called it repeatable and repeat counting.
In both of those cases, the type will be the same, the output type will be the same as the input type. Does that make sense? Yeah, yeah. Actually, this is the right solution, right? Sorry, that one has the problem, and this one is the one that is correct because it uses the bound type var.
Okay? Okay, so you can catch up with Luciano well, during coffee break or during lunch. Thank you very much for your presentation, and let's give him a round of applause. All right. Thank you.