Dynamically generated methods with a non-generic signature
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 141 | |
Author | ||
Contributors | ||
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/68663 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
EuroPython 2023, part 109 / 141
Transcript: English (auto-generated)
00:05
So this talk is motivated by a use case we had in scikit-learn. We needed to do something, and we were like, how do we do that? And it turned out to be a very cool thing we did, and then I thought, why not talk about it? You can find the talk on the GitHub repo, so the slides are there already.
00:25
A bit about me: I did my PhD in cancer diagnostics and worked on algorithms and machine learning there. Then I did some machine learning consulting, and these days I mostly work on open source projects like scikit-learn, skops, and Fairlearn. There are some prerequisites to understanding the talk.
00:44
It would make it easier, like knowing that functions can take a variable number of positional args and keyword arguments, setattr, getattr, those kinds of things. The fact that in Python, if you have very specific dunder methods in certain places under certain conditions, they're called, and with their output, something is done or
01:05
they have a side effect. Inheritance. Type annotations, not necessarily, but we use them. And then when you call help on an object, it tries to read certain information about that object and show it to you. So what is the motivation for us to do this thing that we're going to talk about?
01:23
In scikit-learn, we have estimators, and then we have meta-estimators. You can put estimators inside meta-estimators, and these meta-estimators will route certain things to your estimator, like you put a transformer and a classifier in a pipeline, and then you fit it, and then it passes X and Y to your transformer, then it passes it
01:41
to your predictor. But you can also have certain metadata, like sample weight would be a metadata, or in this case my custom transformer would have a sample weight and other metadata, and what I want to do is to be able to say, hey, my fit is requesting sample weight, it's not requesting
02:01
this other metadata, don't use that, and then put that in a pipeline, and then call fit on a pipeline, and pipeline should know how to route things. Right now you can't do that, you almost can if you use the nightly build, but the idea is for us to have that, and we want it for this method, so there are certain
02:20
requirements on this method. One is that we don't want to change our estimators. We want that method to be dynamically generated, specifically because we also wanted this thing to work on third-party estimators. If you're doing your own estimator, you inherit from the base estimator, these methods should just exist.
02:40
And we didn't want it to have a generic signature, like "we accept everything". No. If your fit accepts only sample weight and this other metadata, then you should only be able to pass those here, not any other random metadata. And we wanted a good docstring. Seems very basic, turned out quite interesting.
03:01
In short, we're going to use different pieces to build this puzzle. We're going to use inspect to introspect functions. We're going to use the signature object to read and create a signature, and then we're going to use a descriptor returning a function to add these methods to the estimators,
03:21
and then we use __init_subclass__ to create the right descriptors, attaching them to the estimators when needed. Now we talk about these steps. So inspect and signature. Let's say I have a function f, it takes a, it's type hinted, and then it takes a bunch
03:42
of positional args, a keyword argument, and variable keyword arguments. Then I have a class A, it has a method g with a very similar kind of signature. Now I can check if f is a function, yes, it's a function.
04:00
Is g, accessed on an instance of A, a function? No, it's not. It's a method. We'll see immediately what the difference really is. But is A.g, accessed not on an instance but on the class, a function? Yes, that one is a function. We can do other things. When you get the signature, it returns an object which has information about whatever
04:26
input your method takes. I can go through them, it has this nice dictionary, I can go through that dictionary, it gives me the name, and different things, like what is the kind of this argument, what's the default value, and are there any type annotations on it.
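The checks described here can be sketched roughly as follows. The names `f`, `A` and `g` follow the talk, while the helper `describe` and the exact parameter lists are assumptions made for illustration.

```python
import inspect


def f(a: int, *args, b: float = 1.0, **kwargs):
    """A plain function with several kinds of parameters."""


class A:
    def g(self, a: int, *args, b: float = 1.0, **kwargs):
        """A method with a similar signature."""


print(inspect.isfunction(f))      # accessed directly: a function
print(inspect.isfunction(A().g))  # accessed on an instance: a bound method
print(inspect.ismethod(A().g))
print(inspect.isfunction(A.g))    # accessed on the class: a plain function


def describe(func):
    """Hypothetical helper: (name, kind, default, annotation) per parameter."""
    return [
        (name, p.kind.name, p.default, p.annotation)
        for name, p in inspect.signature(func).parameters.items()
    ]


for row in describe(f):
    print(row)

# The class attribute still lists self; the bound method does not.
print(list(inspect.signature(A.g).parameters))
print(list(inspect.signature(A().g).parameters))
```

Missing defaults and annotations show up as `inspect.Parameter.empty`, which is how the signature object distinguishes "no default" from a default of `None`.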
04:41
If I run that on f, I get what you expect: a, args, b, kwargs, and you see they have different kinds, like positional or keyword, or this one is a keyword only one, is there any type hint here, or is there a default. If I do that on A.g, I get everything including self, and all the other arguments, but if
05:06
I do that on g of an instance of A, I get the same thing except self, and that's pretty much the difference between, the naming has changed, but think of them as bound methods and unbound methods. There's a lot more to inspect, you can read about it in the docs, but this is what we're
05:24
going to need for the rest of the talk. Then, next thing, let's say I want to have a function that adds five to my input, easy, this is how you do it. Now imagine I want to create a bunch of these functions that add different numbers
05:43
to my input, I can create a function that creates a function inside it, and then returns that function, and then I can use that. So here, I have a create adder, it creates a function, and then here I return f, I don't return f of something, I return f, and then here the output would be that function, which
06:02
I can actually call. I could do that same thing with a lambda expression, we're not going to use that because our functions are complicated, but you could do it. This is a lot of you, but any major questions up until this one? I don't want to lose too many people.
06:22
If you do have questions, every now and then I'm going to ask if you have questions, just move to the microphone, we can make it a little bit more interactive. So now we have a function, we want to change its outfit. Let's say I have a very simple function, it accepts a bunch of positional args and
06:43
it calculates the sum of the inputs. And here I get help on f. When I run that, it says, well, it's f, it takes a bunch of arguments, and this text is what you have here as your doc string.
07:00
Where are these things stored? Well, first of all, I could call it, it does what it's supposed to do. The __doc__ attribute holds the docstring, the __name__ attribute holds the function's name. And then when I do inspect.signature, it gives me the signature of the function.
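As a small illustration of where `help` gets its information, assuming a summing function like the one in the talk (the docstring text is made up):

```python
import inspect


def f(*args):
    """Return the sum of all inputs."""
    return sum(args)


print(f(1, 2, 3))            # the function does what it is supposed to do
print(f.__name__)            # the function's name
print(f.__doc__)             # the docstring
print(inspect.signature(f))  # the signature, as help() would display it
```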
07:20
Now what I want to do is to change all of these. I'm going to say, I'm going to change your name, f.__name__ is adder. I'm going to create a signature object and assign it to __signature__. Basically if a function or a method does not have this __signature__, then inspect.signature
07:43
tries to read the actual signature, but if this exists, then inspect.signature would return this, which is also used by most of your IDEs and whatever tool you use to develop, which would mean that the hints that you get would be more meaningful now. What I'm doing here is I'm saying I have one parameter called a, it's positional or
08:03
keyword, the default is zero, and it's an optional float. I have another parameter b, the same thing. And then here in my signature, I say I have these parameters and I'm returning a float. I also changed the doc string of f, just whatever string you want to put here.
08:25
So I do that, and then I get help on f. Now I get a much nicer thing. You might ask, why bother, I could do these things on the method itself, but I'm going to be generating my methods dynamically, so I would need to set these dynamically as well.
08:45
And now if I try to get the signature of f, it says that I can only pass a and b. But is that really true? What happens if I do f(1, 2, 3)? It works.
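The "outfit change" can be sketched like this. The parameter names `a` and `b`, the zero defaults and the optional float annotation follow the talk; the replacement docstring text is an assumption.

```python
import inspect
from inspect import Parameter, Signature
from typing import Optional


def f(*args):
    """Return the sum of all inputs."""
    return sum(args)


# Change what the function reports about itself.
f.__name__ = "adder"
f.__doc__ = "Add two numbers, a and b."
f.__signature__ = Signature(
    parameters=[
        Parameter("a", Parameter.POSITIONAL_OR_KEYWORD,
                  default=0, annotation=Optional[float]),
        Parameter("b", Parameter.POSITIONAL_OR_KEYWORD,
                  default=0, annotation=Optional[float]),
    ],
    return_annotation=float,
)

# inspect.signature (and therefore help) now reports only a and b ...
print(inspect.signature(f))

# ... but the underlying function is unchanged and still takes *args.
print(f(1, 2, 3))
```

Only the reported metadata changes here. Enforcing the advertised signature would have to be done inside the function body.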
09:01
You're not changing the actual function, you're just changing things about the function. That's why my way of saying that is that I'm changing the outfit of a function, not the function itself. You could try inside your function, you could try to simulate the same thing that Python would do if this was actually the signature of the function, and we'll show somewhat how
09:23
to do that later. Questions so far? So the Python signature object, the way you're using it, it was introduced in this PEP, I think in 3.6, you can read more about it there.
09:44
At this point, for me, the best documentation in Python are these PEPs, they're very extensive, really nice. So now descriptors. What's a descriptor? A descriptor by definition is a class that implements __get__, and it's a descriptor
10:07
when it's assigned as a class attribute. Not as an instance attribute, but a class attribute. Wait, I feel like something's not working.
10:58
Gods of demo.
11:09
I'm going to disconnect and connect again. Let's see what happens. I hope I don't have to reboot.
12:02
The best thing that can happen during a talk, yes, because I had to, I had to, everything collapsed. I'm trying again now.
12:30
Let's see what happens now.
12:43
Yep, it should work. We did it. Okay. Let's just to be sure, I'm going to run from top all of these.
13:14
So descriptors. It is this thing, it's a class that implements __get__, and it behaves as
13:22
a descriptor when it's assigned as a class attribute. Not an instance attribute, but a class attribute. What happens is that if you do that, and then you try to access it on an instance, then Python will call this __get__, and it will return whatever this __get__ is
13:41
returning. I've put a few print statements for us to see the sequence of things happening. When we run this, first we pass here, then we go to the descriptor's __init__ here,
14:04
and then we try to access pet here, and when we access it, we go inside __get__. So you define that first, this thing runs, and then when you define the class, well, you're creating an instance of Pet here, and then you go down here.
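The sequence just walked through can be reproduced with a sketch like the following. The attribute name `pet` comes from the talk; the class names `Pet` and `Person` and the printed messages are assumptions.

```python
class Pet:
    """A descriptor: it implements __get__."""

    def __init__(self):
        print("Pet.__init__")

    def __get__(self, instance, owner):
        print("Pet.__get__")
        return "I am a pet"


print("before the class definition")


class Person:
    # Pet.__init__ runs here, while the class body is being executed.
    pet = Pet()


print("after the class definition")

person = Person()
# Accessing the attribute on an instance goes through Pet.__get__.
print(person.pet)
```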
14:25
Is this kind of clear? And then what's important here is that your __get__ also has access to the instance that owns you, the instance of the object that owns you, and the type of the object that owns
14:41
you. So we could also try to use that. Here I'm changing a bit this string, I'm saying, well, I'm myself, and my owner is this. And then here in my instance, I would add a name. And then I set the instance's name here, so this would be the constructor, and then
15:05
it comes to the __get__ of the descriptor, and then I would have access to instance.name. And now it means that I have access to the instance, I can not only read it, I could also change it, and also the type itself.
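A sketch of a descriptor that uses the instance and its type, under the same assumed names as before (`Pet`, `Person`), with a made-up message format:

```python
class Pet:
    def __get__(self, instance, owner):
        # instance is the object the attribute was accessed on,
        # owner is its class. On class access, instance is None.
        if instance is None:
            return self
        return f"I am a pet and my owner is {instance.name} ({owner.__name__})"


class Person:
    pet = Pet()

    def __init__(self, name):
        self.name = name


print(Person("Alice").pet)
```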
15:21
Now we're getting closer to the point: what do we do for the request methods that we had? So this is much closer to the actual code we are using now in scikit-learn. I'm going to get the keys, these are the list of arguments that my method, that set_fit_request method, accepts.
15:43
This is a descriptor, and as we talked about, this descriptor can also return a function, among other things. So we're going to create this function and return that. In this function, I check if the given keyword arguments have anything that I do not accept. And then if that happens, I raise a TypeError.
16:00
A TypeError also happens to be the same thing that Python raises if you try to call a function with arguments that the function doesn't accept. So we're kind of trying to simulate the same behavior here. And then for everything else, we have a side effect on the instance. We just set some attributes. So for example, if you say sample_weight=True, we are going to set request
16:23
sample weight equal true. That's what this set attribute does. And then I'm going to return the instance which would result in me being able to chain my methods on the instance. And then here I return f. And then now I can assign this descriptor to an estimator class.
16:41
So if I have an estimator, I say set_fit_request is this request method and it only accepts sample weight. And then when I do that, I can have an instance of this estimator, then I try to access set_fit_request, which returns that function that I defined, and then I can call that function and say sample_weight=True. And when I do that, as a side effect, this estimator would have this request
17:04
sample weight set by this line. And if I change that to something like "blah", it would also be "blah". If I do that with a parameter that is not expected, I get a TypeError.
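A simplified sketch of such a descriptor, not scikit-learn's actual code: the class name `RequestMethod`, the `request_<key>` attribute naming and the error message are assumptions.

```python
class RequestMethod:
    """Descriptor that returns a function accepting only the given keys."""

    def __init__(self, keys):
        self.keys = keys

    def __get__(self, instance, owner):
        def func(**kw):
            # Mimic what Python does for unexpected keyword arguments.
            unknown = set(kw) - set(self.keys)
            if unknown:
                raise TypeError(f"Unexpected arguments: {sorted(unknown)}")
            # Side effect: record each request on the instance.
            for key, value in kw.items():
                setattr(instance, f"request_{key}", value)
            # Returning the instance allows method chaining.
            return instance

        return func


class Estimator:
    set_fit_request = RequestMethod(keys=["sample_weight"])


est = Estimator().set_fit_request(sample_weight=True)
print(est.request_sample_weight)

try:
    est.set_fit_request(groups=True)
except TypeError as exc:
    print(exc)
```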
17:26
If I get the help of that method, not very helpful. So let's try to fix this one now. I'm going to have a helper function to return a signature which would be the desired signature of this thing. It requires the owner, the object owning it, and the type of the object owning it,
17:45
and the keys. What does it accept? It's an instance method, so the first argument it accepts is self. It's a positional or keyword argument, and the type of it is the owner. Then I extend that: for every key in these keys, I say create a parameter, it's a keyword
18:04
only parameter, the default is None, and it's an optional bool. And then I create the signature object with these parameters, and the return type is also owner. That's because I was returning the instance itself. And now I'm going to change my descriptor slightly.
18:22
Everything stays the same. I'm going to add a method name here, which would be the name of the method, the name of the thing that I'm using in the owner. So if it's set_fit_request, this method name would be set_fit_request. And here I'm going to set the name of the function, some custom docstring, in reality we're going to do a much nicer one, and then I set the signature
18:45
with that helper function that I have. And now when I create it, it's the same except that I also say, hey, I'm actually the set_fit_request. And when I get the help, I get this nice thing. It is set_fit_request, it takes sample weight as an optional bool, it returns the
19:05
estimator. We're almost there. There is a really nice how-to on like descriptors, you can do a lot more, there's a lot more to descriptors, and you can also have things like what happens if the user tries to set a value or they try to delete something, they're all there.
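The signature helper and the extended descriptor could look roughly like this. The names `build_signature` and `RequestMethod`, the docstring text and the `request_<key>` attributes are assumptions; the parameter kinds, defaults and annotations follow the description in the talk.

```python
import inspect
from inspect import Parameter, Signature
from typing import Optional


def build_signature(owner, keys):
    """Hypothetical helper: the signature the generated method should show."""
    params = [Parameter("self", Parameter.POSITIONAL_OR_KEYWORD, annotation=owner)]
    params.extend(
        Parameter(key, Parameter.KEYWORD_ONLY, default=None,
                  annotation=Optional[bool])
        for key in keys
    )
    # The method returns the instance, so the return type is the owner.
    return Signature(params, return_annotation=owner)


class RequestMethod:
    def __init__(self, name, keys):
        self.name = name
        self.keys = keys

    def __get__(self, instance, owner):
        def func(**kw):
            unknown = set(kw) - set(self.keys)
            if unknown:
                raise TypeError(f"Unexpected arguments: {sorted(unknown)}")
            for key, value in kw.items():
                setattr(instance, f"request_{key}", value)
            return instance

        # Dress the function up before handing it out.
        func.__name__ = self.name
        func.__doc__ = f"Request metadata for this method: {', '.join(self.keys)}."
        func.__signature__ = build_signature(owner, self.keys)
        return func


class Estimator:
    set_fit_request = RequestMethod("set_fit_request", keys=["sample_weight"])


method = Estimator().set_fit_request
print(method.__name__)
print(inspect.signature(method))
```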
19:25
The next thing is __init_subclass__. We're not finished yet, we need to generate these descriptors. This is now hard-coded, we need to make that dynamic. We need to make that dynamic in a parent class. Again, as a reminder, we had an estimator, it has fit, and we want to have this set_fit_request, but we want this set_fit_request to be generated automatically.
19:45
We create a parent class with this __init_subclass__ method in it. What this does is that for every class that inherits from you, Python calls this thing. And then this thing would have access to the class, not the instance, and then you can do things to it.
20:01
So here, for example, I set the attribute to five, and then I call super().__init_subclass__(). And then I create a child class with some __init__, that's all. Let's see the sequence of things here. This one is also very interesting. Here I'm printing the class that I get, here I have a print statement, in __init__ here
20:24
I have a print statement. And then I access this attribute. So the way it works is that first, yeah, we defined this parent class, but then before we get here, we get this print statement. So as you define your class, this __init_subclass__ runs.
20:45
And then once this is defined, then it's done, then you can create your instance and access your attribute, which in this case would be five. Now let's try to get closer to what we actually want to do. Now here in my __init_subclass__, I check if my class, this is the child class, has a fit,
21:05
and if this fit is a function. If yes, then I'm going to print the signature of that function, that's all. And then I have an estimator inheriting from Parent, and it has a fit method. When I run that, I get the printed version of the signature.
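Both `__init_subclass__` steps described so far can be combined into one sketch. The class names `Parent` and `Estimator` and the attribute name `attribute` are assumptions for illustration.

```python
import inspect


class Parent:
    def __init_subclass__(cls, **kwargs):
        # Runs once for every class that inherits from Parent,
        # at the moment that class is defined.
        print(f"__init_subclass__ called for {cls.__name__}")
        cls.attribute = 5

        fit = getattr(cls, "fit", None)
        if inspect.isfunction(fit):
            print(f"fit signature: {inspect.signature(fit)}")

        super().__init_subclass__(**kwargs)


print("before defining Estimator")


class Estimator(Parent):
    def __init__(self):
        print("Estimator.__init__")

    def fit(self, X, y, sample_weight=None):
        return self


print("after defining Estimator")
print(Estimator().attribute)
```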
21:22
So it's kind of working. Now building on that, we have that, yes, we have fit, it's a function, now I'm going to get the signature of that fit. I'm only interested in parameters whose name is not self, X, or y, because in the scikit-learn API, self is self, and X and y are not metadata, and we say everything
21:45
else is going to be some metadata that I need to route. X and Y, I have a very explicit routing for them. And then for that, I'm going to say, well, set an attribute on class, the name of that
22:00
attribute is this set_fit_request, and the value of that is this request method, which is the descriptor that we just defined. And I say, the parameters that you should accept are these parameters, and this is your name. And then my estimator would inherit from that parent. Now estimator.set_fit_request just exists, and if I get the help, it's what I wanted.
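Putting the pieces together, a condensed sketch of the whole mechanism might look like this. It is a simplified illustration rather than scikit-learn's implementation; `RequestMethod`, `Parent`, the docstring text and the `request_<key>` attributes are assumed names.

```python
import inspect
from inspect import Parameter, Signature
from typing import Optional


class RequestMethod:
    def __init__(self, name, keys):
        self.name = name
        self.keys = keys

    def __get__(self, instance, owner):
        def func(**kw):
            unknown = set(kw) - set(self.keys)
            if unknown:
                raise TypeError(f"Unexpected arguments: {sorted(unknown)}")
            for key, value in kw.items():
                setattr(instance, f"request_{key}", value)
            return instance

        params = [Parameter("self", Parameter.POSITIONAL_OR_KEYWORD,
                            annotation=owner)]
        params += [
            Parameter(key, Parameter.KEYWORD_ONLY, default=None,
                      annotation=Optional[bool])
            for key in self.keys
        ]
        func.__name__ = self.name
        func.__doc__ = f"Request metadata: {', '.join(self.keys)}."
        func.__signature__ = Signature(params, return_annotation=owner)
        return func


class Parent:
    def __init_subclass__(cls, **kwargs):
        fit = getattr(cls, "fit", None)
        if inspect.isfunction(fit):
            # Everything except self, X and y is treated as metadata.
            keys = [
                name for name in inspect.signature(fit).parameters
                if name not in {"self", "X", "y"}
            ]
            cls.set_fit_request = RequestMethod("set_fit_request", keys)
        super().__init_subclass__(**kwargs)


class Estimator(Parent):
    def fit(self, X, y, sample_weight=None):
        return self


est = Estimator().set_fit_request(sample_weight=True)
print(est.request_sample_weight)
print(inspect.signature(est.set_fit_request))
```

One caveat of this sketch: a `fit` that declares `*args` or `**kwargs` would have those names collected as keys too, so a fuller version would probably filter on parameter kind as well.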
22:27
In scikit-learn we do the same thing, except that we also loop over all the methods that we care about, like fit, transform, predict, all of those. And then this __init_subclass__ is a much easier way to do these things that you could also
22:43
do with metaclasses, and there's this PEP for it you can read, it's really nice. So in summary, we used inspect and signature, and then we wrote a descriptor returning a method with a custom signature and docstring, and then we used __init_subclass__ to customize all the child classes.
23:02
Thank you.
23:27
Last time I gave this talk, it took an hour and a half, because people were confused. So thank you for your talk. I did have one question, though I'm afraid I already know the answer. I really like static autocompletion, especially for when you start using type hints in Python
23:46
as you are. Using either Jedi or Pyright or any other sort of language server protocol, do I have any options to translate all this to static autocompletion? I don't know about Jedi, but one of the reasons we wanted to have a non-generic signature
24:03
was exactly that. If I try to get my hint here, I get what I set. So my IDE knows exactly what the signature of the method is. So if your autocomplete is doing the same thing that, like, here, IPython or VS Code
24:23
or any other one does, it should work. Okay. Thank you. Is there a way to actually create functions that dynamically not predefine the name in
24:49
the JEDI method? Yes, you can make things a lot more dynamic if you... So when you do, like, creating functions dynamically, I think then you mean attaching
25:05
them to a class. Yes. Right. Okay. You have this thing in Python that you can override two things. One is, I think, __getattribute__.
25:21
This one is called in certain ways. And then the other one, def get adder. This one, if you customize this one or both of them, then you can return anything you want when a user tries to access anything in your class. Which means that if you have implemented those, and then I do, like, the estimator dot blah,
25:47
this blah comes to your __getattr__, and then you can be, like, okay, now I want to return this thing, and that thing could be a function. And then that would be actual runtime. Because what I'm doing here happens when the class is defined, and then your class is set.
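The runtime alternative mentioned in this answer could be sketched with `__getattr__` as follows. The class name `Dynamic`, the `set_<method>_request` naming rule and the `request_<key>` attributes are assumptions made for the example.

```python
class Dynamic:
    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so the
        # function is created at access time, not at class definition.
        if name.startswith("set_") and name.endswith("_request"):
            def func(**kw):
                for key, value in kw.items():
                    setattr(self, f"request_{key}", value)
                return self

            func.__name__ = name
            return func
        raise AttributeError(name)


obj = Dynamic().set_blah_request(sample_weight=True)
print(obj.request_sample_weight)
```

Because nothing exists on the class until the attribute is accessed, tools that inspect the class ahead of time have less to show than with the descriptor approach.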
26:02
But if you want it to be done at runtime, then you can do that through __getattr__. Hi. Thanks. Yes. I have a question about the descriptor protocol. Is there a reason why you're not making use of __set_name__?
26:24
Or is that because you're doing a setattr on the class, so that __set_name__ is not being called? So I love this question. Somebody asked it. So as I said, descriptors also expose other methods. One of them is __set_name__.
26:41
__set_name__ wouldn't work in this case. Because what happens is that first you're going to have the class body. Then __new__ is called, and __set_name__ is called in __new__. After that, __init_subclass__ is called.
27:02
And then because in this case I am assigning the descriptors to my class in __init_subclass__, which happens after __set_name__ has already run, __set_name__ wouldn't be called. Right. Okay. Thanks.
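The ordering described in this answer can be checked with a small sketch. The names `Named`, `Parent`, `Child`, `in_body` and `added_later` are made up for the demonstration.

```python
class Named:
    """Descriptor that remembers the name it was given, if any."""

    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        return self.name


class Parent:
    def __init_subclass__(cls, **kwargs):
        # Assigned after class creation: __set_name__ is not triggered.
        cls.added_later = Named()
        super().__init_subclass__(**kwargs)


class Child(Parent):
    # Assigned in the class body: __set_name__ is called automatically.
    in_body = Named()


print(Child().in_body)
print(Child().added_later)
```

A descriptor attached this way would need its name passed in explicitly, or `__set_name__` called by hand.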
27:23
Hi. So you're using a descriptor. Am I correct in assuming that you could do everything that you've done without a descriptor and just creating directly the function there? Or is there anything that does require a descriptor?
27:45
And are there any other pros and cons in using a descriptor instead of going for it directly? Yeah. So I could have just implemented these. It would have been tricky to change their signature because the signature of these
28:00
functions depends on the method that you have in the estimator. And I don't know what my child estimator would have as the signature of fit. So I need some place to introspect the signature of that fit method and then change the signature of this one. So there is something that I would need to change. Also inside the function I would also have to inspect the signature of the other method
28:23
to make sure that the parameters that I'm receiving are exactly the ones that fit also accepts. So in effect it would be the same as doing the descriptor, because you need to do some modification on the method anyway based on the child class, and it's easier to do that in a descriptor.
28:41
But there are at least three different ways to do the same thing. This is only one of them. To me it was the easiest one. Is there a way to access the parameters of a function in a way that can't be faked like you could with dunder signature?
29:04
Oh, not to my knowledge. All right, thanks. So when I was reading the PEP on signature my understanding was that there is a way but
29:22
we shouldn't do that. If we want people to perceive a function with a different signature than the actual signature we should be setting the dunder signature. But there is an ugly other way. I think my time is over. Thank you very much.