The Magic of Attribute Access
Formal Metadata
Part Number | 27
Number of Parts | 119
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/20015 (DOI)
Production Place | Berlin
EuroPython 2014, part 27 / 119
Transcript: English (auto-generated)
00:15
All right, can you hear me? Okay, so I'm sure you're all familiar with the operators in Python.
00:23
Not all of them are as interesting as some of the others, but the most interesting one of all is the smallest. So hello, I'm Peter and I'll be diving into the depths of the dot operator. Now as you know Python has a giant standard library where the dot is sort of the path separator.
00:45
So you have some parent module and you get some child out of it. Now this is a different use of the dot than I'll be speaking about, but it's a case where the dot does some kind of namespacing.
01:03
So what I will be talking about is attribute access. The three things you can do with the dot are: set an attribute on an object, get an attribute out of an object, and delete the attribute. The syntax I hope you all know. Now these are some of the most optimized operations in Python, so it's very good to use them.
01:27
But you always have to know the name of the attribute you're getting. If you don't know it, you can use the built-in functions setattr, getattr, and delattr, which actually do the exact same thing, just a bit slower.
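As a minimal sketch of the two spellings (the class Thing and the attribute name colour are made up for illustration), the dot needs the name written in the source, while the built-ins take it as a string:

```python
class Thing:
    """Empty placeholder class, used only for this illustration."""


obj = Thing()

# Dot syntax: the attribute name is fixed in the source code.
obj.colour = "red"
value_via_dot = obj.colour

# Built-in functions: the attribute name can be computed at run time.
name = "colour"
setattr(obj, name, "blue")
value_via_getattr = getattr(obj, name)
delattr(obj, name)

# getattr accepts an optional default for attributes that are missing.
missing = getattr(obj, name, None)
```

After the delattr call the attribute is gone, so the last getattr falls back to the default instead of raising AttributeError.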
01:41
Now to understand what the dot does, we'll introduce this very simple approximation of what an object is in Python. So if you have an object, it has a type, which doesn't change very often and defines the behavior of the object. And then you have the __dict__, which contains all the data specific to that one instance,
02:06
and that's expected to change quite a lot. Of course, there are always exceptions, but we'll go with this simple model. Now as an example, I have a class Square. I define a method on it, and that gets put in the type,
02:22
and then when I define an attribute on an instance of that class, the attribute goes in the __dict__. And the __dict__ is, under the hood, just a simple dictionary. In Python 3, it's more than a simple dictionary, but it acts like one. And then when I want to get the attribute out, I just use the dot again, and Python looks in the __dict__.
02:47
And if it doesn't find the attribute in the __dict__, then it looks on the type. So I can also get the get_area method, which is not in the instance __dict__. So here are the simple rules. When you set an attribute, it goes directly to the __dict__.
03:03
When you get the attribute, you try the __dict__, then you try the type, and if it's not there, the lookup fails. And most of this talk will be about how to make this work somehow differently.
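The simple model above can be checked directly. This is a sketch, assuming a Square class with a get_area method as described in the talk (the slide code itself is not in the transcript):

```python
class Square:
    def get_area(self):
        return self.side ** 2


sq = Square()
sq.side = 5  # setting goes straight into the instance __dict__

instance_dict = dict(vars(sq))                  # per-instance data
method_in_instance = "get_area" in vars(sq)     # methods are not stored here
method_in_type = "get_area" in vars(type(sq))   # they live on the type
area = sq.get_area()                            # found via the type
```

Here vars(sq) is just another way to reach sq.__dict__.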
03:20
So the first thing you can do to override this behavior is to put a special __getattr__ method on the type. What this does is it hooks into step 3 here, and instead of failing right away, this function gets called and whatever it returns gets returned as the value of the attribute.
03:45
So this simple class just proxies all attribute access to some other object. This works, but it has some limitations. For example, it won't be used for attributes that are already in the __dict__.
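A proxy of this kind might look like the sketch below. The class name Proxy is an assumption; the attribute name _object follows the talk.

```python
class Proxy:
    def __init__(self, obj):
        self._object = obj  # stored in the instance __dict__

    def __getattr__(self, name):
        # Called only when the normal lookup (instance __dict__, then type)
        # has failed, so it never sees requests for _object itself.
        return getattr(self._object, name)


p = Proxy([3, 1, 2])
p.sort()               # forwarded to the wrapped list
count = p.count(2)     # also forwarded
wrapped = p._object    # found in the __dict__, __getattr__ is not called
```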
04:03
So if you ask for _object here, which is already set there, __getattr__ won't be called. Now there's another method you can define, which is __getattribute__. It has a longer name, and it's more powerful.
04:20
This one actually takes over the whole attribute getting process. So it's a bit more difficult to use because if there's any attribute you already have on the object, you have to make a special case for it. Otherwise, you can do anything you want in this function, and it'll work.
04:47
Now that's getting attributes. There's one more thing you want to do, and that's setting them. And for that, we can have a __setattr__ method. So what this class will do is it keeps a dictionary.
05:02
When you try to get an attribute from the object, it looks in the dictionary and returns whatever it finds. And if you want to set an attribute, it stores the value in the dictionary as well.
05:21
As you can see, I'm special-casing the dict, because I'm setting the dict here and I don't want to use a dict that's not set yet. We also have __delattr__, which handles deleting attributes. So it's the same.
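The class described here could be sketched as follows. The names AttrDict and _data are assumptions, since the slide is not part of the transcript; the special case in __setattr__ mirrors the one mentioned in the talk.

```python
class AttrDict:
    def __init__(self):
        self._data = {}  # goes through __setattr__, hence the special case

    def __getattr__(self, name):
        # Only reached when normal lookup fails; _data itself is found
        # in the instance __dict__ and never gets here.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name == "_data":
            # Store the backing dictionary the normal way.
            object.__setattr__(self, name, value)
        else:
            self._data[name] = value

    def __delattr__(self, name):
        try:
            del self._data[name]
        except KeyError:
            raise AttributeError(name) from None


d = AttrDict()
d.x = 1
stored = dict(d._data)
value = d.x
del d.x
gone = not hasattr(d, "x")
```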
05:43
Did you have time to read it? I guess most of you are looking at me. So, yeah. The question is, do any of these hooks run during __init__? They run every time you set an attribute in Python.
06:02
So if the __init__ sets attributes, then yeah. __init__ is just a function that gets called at the beginning; there's nothing too special about it. So yeah, I have to special-case this for the setting here in the __init__.
06:20
Now, if you ever find yourself writing something like this, think twice, because the attribute namespace is not entirely under your control. You have attributes like double-underscore anything. You'll inevitably want to add some methods to your class. You'll want to enable subclasses to add new attributes.
06:44
So usually when you have something like a dictionary, stick to a dictionary interface and don't mess around with attributes. Otherwise you'll run into trouble pretty fast. Yeah, so I haven't seen this many times, actually,
07:01
because this sort of blanket overriding of getting and setting attributes is not that useful. Usually what you want is that you have one attribute that needs some kind of special treatment, or you have several, but each one is special in its own way. So if you did that with __getattr__, you would have a nasty tree of ifs, and it's not very nice.
07:26
So for this, Python has a very special feature called descriptors. Now what descriptors do is you put a special object in the type, which will control access to the specified attribute.
07:41
So if I have some kind of square and I want it to have an area, I put some kind of magic special object into the class, and when I set the side, and then I look at the area attribute, this descriptor will take the side, this five here,
08:02
square it, and give that back. This is pretty easy to implement. The descriptor object only needs one method, which is __get__, so double-underscore get. What this method gets is the instance, so that would be the square here,
08:21
and if the instance is set, it can return the value of the attribute. If the instance is not set, that means we're getting the attribute from the class itself, so that's the usage on the bottom here. What most well-behaved descriptors do is return the descriptor itself,
08:41
so you can use it for some other reasons. Is that clear? You just have a special object to control access to an attribute. Now what this object can also do is control setting, so if you define a method called __set__,
09:03
it gets the instance and the value the user's trying to set, and it's free to do anything it wants. In my case, we want to update the side, because the user set the area to something, so we can update the side to match.
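A minimal sketch of such a descriptor, assuming the names Area and Square and a side attribute as in the talk (the exact slide code is not in the transcript):

```python
class Area:
    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        return instance.side ** 2

    def __set__(self, instance, value):
        # Keep the side consistent with the area being assigned.
        instance.side = value ** 0.5


class Square:
    area = Area()

    def __init__(self, side):
        self.side = side


sq = Square(5)
area_before = sq.area          # computed from side
sq.area = 16                   # routed through Area.__set__
side_after = sq.side
from_class = Square.area       # the descriptor object itself
```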
09:26
Is anybody having trouble reading that? No? Okay. And the last method there is, is __delete__. The shorter __del__ was already taken, so it's longer. This one isn't that useful because you don't find yourself
09:43
deleting attributes all that often, but for completeness, it's there. Now, a bit of terminology. When a descriptor has this __set__ method, it's called a data descriptor. If it does not, it's called a non-data descriptor.
10:00
Having __set__ pretty much means you want to control all access to the attribute. If you only have __get__, you're just getting a value out. With __set__, there's some data you presumably want to store in that attribute, so that's why it's called a data descriptor. Now, how many of you know the property decorator?
10:24
Almost everyone? Yeah, so as you can see, I've pretty much gone the long way to do something like the property decorator, and in fact, the property decorator, all it does is create a descriptor. You can actually implement property in pure Python as a descriptor.
10:43
You just give it three functions and call them in the appropriate special methods. I have an example here, so I have set_area, get_area, and del_area, I give these three functions to the property, and we have an area.
11:03
You can actually call the built-in property like this, and it'll do the right thing. If you add some more sugar to this class, then you can really re-implement all of the mechanics of the property. And again, this has __set__, so it's a data descriptor.
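A pure-Python property along these lines could be sketched as below. MyProperty is a hypothetical name, and this version leaves out the extra sugar of the built-in (decorator-style setter and deleter, docstrings).

```python
class MyProperty:
    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)


class Square:
    def __init__(self, side):
        self.side = side

    def get_area(self):
        return self.side ** 2

    def set_area(self, value):
        self.side = value ** 0.5

    def del_area(self):
        del self.side

    area = MyProperty(get_area, set_area, del_area)


sq = Square(3)
area = sq.area
sq.area = 4
side = sq.side
del sq.area
side_removed = not hasattr(sq, "side")
```

Replacing MyProperty with the built-in property in the class body should give the same behaviour for this example.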
11:27
Yeah, so Python actually likes descriptors very much, and anytime there's something special to do on attribute access, you have a descriptor. For example, if you look at a simple function,
11:45
if you look at the attributes it has, one of them is __get__, because functions themselves are descriptors. When you have a function on a class, and you have an instance of the class and get the function from it, you don't get the function back, you get a method.
12:02
You get an object that has the function itself and the self argument baked in. Right, so here's a very simple class with a very simple function, and when you get the attribute from an instance of that class, you get a bound method.
12:25
If you call that, it automatically provides the self argument. And if you get it from the class, it gives you the original descriptor, that is, the original function.
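This behaviour can be observed with any plain class. The following sketch uses a made-up Greeter class and also calls the function's __get__ by hand to show where the bound method comes from:

```python
class Greeter:
    def greet(self):
        return "hello from " + type(self).__name__


g = Greeter()

has_get = hasattr(Greeter.greet, "__get__")  # functions are descriptors
bound = g.greet                              # bound method: function + self
plain = Greeter.greet                        # Python 3: the plain function

# What the dot does behind the scenes for a function found on the class:
manual = Greeter.__dict__["greet"].__get__(g, Greeter)
```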
12:41
As I said, most well-behaved descriptors return themselves when you get them from a class. In Python 2, you would get something called an unbound method, which doesn't really do anything that useful, but it's there. Now they sort of fixed it, so it just gets the function back.
13:03
Now if you look closely at the first descriptor we had, yeah, it works pretty much the same way. It does something special when you get the thing from the instance. When you get it from the class, it returns the descriptor itself.
13:21
Right, another thing I want to talk about is this little trick for saving memory. If I had a Point class and I had millions of these objects around, I wouldn't want each of them to have this __dict__ attribute,
13:43
which as I said is a normal dict, so it takes up memory, and I know that in a point I'll only ever have an x number and a y number and nothing else. So what this special magic incantation, __slots__, will do is it will actually make instances of the type not
14:00
have a __dict__ attribute. Each instance will have its type, and x and y will be stored directly in the C object itself, so there will be no dictionary, and that saves the memory. You can of course set and get the x and y attributes, but you cannot set anything else because there's no space.
14:21
There's no room in the object for anything extra. Right, and if you try to get x from the class, you get this descriptor. So every time there's some special attribute, Python implements it with a descriptor. Right, and I think now is the time to give you the whole magic formula.
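The __slots__ trick described above can be sketched like this, assuming a Point class with x and y as in the talk:

```python
class Point:
    __slots__ = ("x", "y")  # no per-instance __dict__ is created

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
has_dict = hasattr(p, "__dict__")

try:
    p.z = 3  # there is no slot called z and no __dict__ to fall back on
except AttributeError:
    extra_allowed = False
else:
    extra_allowed = True

# Looked up on the class, each slot is a descriptor with __get__ and __set__.
slot = Point.x
x_via_descriptor = slot.__get__(p, Point)
```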
14:47
So this is the way an attribute is gotten from an instance. First, you try __getattribute__. If the class defines __getattribute__, you just call it and return what it gives back. If it throws an error, the error is raised.
15:06
If there's no __getattribute__, you look up the attribute on the class, and if it is a data descriptor, then you call its __get__ method and return that. Only after that, only after looking for the data descriptor,
15:23
the __dict__ is checked, so the value is gotten directly from the instance attributes. This doesn't call any descriptors, you just get the value straight back. After that, you check the non-data descriptor,
15:40
or rather, you check if what you found on the class is a non-data descriptor, and if that's the case, then you call it. If it's not a descriptor at all, if it doesn't have the dunder get method, __get__, then you just return it directly. So if on the class I have some value like, you know, a class attribute, a constant for example, it's just returned directly.
16:03
Okay. After that, you fall back to __getattr__, and if that is also not there, then AttributeError is raised. Now, there is this weird thing about the data descriptor and non-data descriptor being in two different places. What this allows is if you have a data
16:25
descriptor, it pretty much controls all the access to the attribute. So what the Python designers thought is that if you define both __get__ and __set__, then you probably want to control the access to that attribute yourself.
16:44
If you don't define __set__, then you're free to override that attribute in the instance. So you can put something in the __dict__, and then since it's a non-data descriptor, you will get it back from the __dict__ before the descriptor is checked.
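The ordering in this formula can be demonstrated with a small experiment. All names below (Data, NonData, Demo and its attributes) are made up for illustration; values are written into the instance __dict__ directly so that the data descriptor's __set__ is bypassed.

```python
class NonData:
    def __get__(self, instance, owner=None):
        return "non-data descriptor"


class Data:
    def __get__(self, instance, owner=None):
        return "data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Demo:
    attr_data = Data()
    attr_nondata = NonData()
    constant = 42

    def __getattr__(self, name):
        # Last resort, only reached when everything else has failed.
        return "fallback for " + name


d = Demo()
d.__dict__["attr_data"] = "instance value"
d.__dict__["attr_nondata"] = "instance value"

results = {
    "attr_data": d.attr_data,        # data descriptor beats the __dict__
    "attr_nondata": d.attr_nondata,  # __dict__ beats non-data descriptor
    "constant": d.constant,          # plain class attribute
    "missing": d.missing,            # falls back to __getattr__
}
```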
17:05
Right? There's one nice use of it. In the Pyramid framework, it's called reify. Some other frameworks call it cached property. I've heard it called lazy property. So what this does is you give it a function, and then when you get the corresponding attribute,
17:28
the function is called, and then the attribute is set to the value you've computed. So it calls the function, puts the result in the __dict__, and whenever you get the attribute again,
17:46
it doesn't call the function again. It doesn't go to the descriptor, it just returns the cached value from the __dict__. So this is one way to implement a lazy property.
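A minimal sketch of this caching trick follows. It is not Pyramid's actual implementation; the names lazy_property and Dataset are assumptions for illustration. Since Python 3.8 the standard library offers functools.cached_property, which works along similar lines.

```python
class lazy_property:
    """Non-data descriptor that caches its result on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # No __set__ is defined, so this lands in the instance __dict__
        # and shadows the descriptor on later lookups.
        setattr(instance, self.name, value)
        return value


class Dataset:
    computations = 0

    def __init__(self, values):
        self.values = values

    @lazy_property
    def total(self):
        Dataset.computations += 1
        return sum(self.values)


ds = Dataset([1, 2, 3])
first = ds.total                 # computed by the descriptor
second = ds.total                # served from the instance __dict__
count_after_two_reads = Dataset.computations

del ds.total                     # invalidate the cached value
third = ds.total                 # computed again
count_after_invalidation = Dataset.computations
```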
18:01
There's some discussion about adding this to the standard library, so maybe we'll see it under some name. Is there anyone who doesn't understand? Excuse me? If you want to invalidate this, you just remove it from the __dict__.
18:29
Yeah, you just delete the attribute. Yeah. The magic is that you can just set the attribute normally. If you knew the name in advance, this would literally be instance.name
18:45
equals value. Now the setting is not affected at all by this descriptor, so if you want to change the value in the __dict__, you just do it normally. The only thing that's different is getting, and that's only when it's not already in the __dict__. Other questions on this?
19:18
I'm not finished. Yeah, if you have a general question, then
19:25
just wait. Okay, so another thing about this magic formula is this "on the class", which I put in italics, because it's not as easy as it looks: looking up something on the class of an instance is
19:42
not an actual attribute access. It's a bit different, and it has to do with something called the method resolution order. So if you have a class and you have a subclass of it, you can check for the method resolution order, and it gives you the child, the parent,
20:04
and object. Now when you look something up in the class, it goes through the classes in this order. So if the attribute is defined on the child, it's returned from the child. If it's not on the child, Python looks in the parent. If it's not there,
20:21
Python looks in the universal superclass, object. Now if you have some kind of weird hierarchy of classes with multiple inheritance and stuff like that, there's an algorithm called C3, which you can look up, which converts this hierarchy to just a list that's checked linearly.
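The method resolution order can be inspected through the __mro__ attribute of a class. The class names below are placeholders; the second hierarchy is a small diamond to show how multiple inheritance is flattened into one list.

```python
class Parent:
    greeting = "parent"
    only_on_parent = True


class Child(Parent):
    greeting = "child"


simple_mro = Child.__mro__  # Child, then Parent, then object

c = Child()
found_on_child = c.greeting         # first class in the MRO that has it
found_on_parent = c.only_on_parent  # not on Child, so Parent is checked


# A diamond: D inherits from B and C, which both inherit from A.
class A:
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


diamond_mro = [cls.__name__ for cls in D.__mro__]
```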
20:46
One more thing about this MRO. It's actually an attribute defined on the metaclass. I don't know if you're familiar with that, but it's defined on the type metaclass,
21:07
and if you have an instance of the class, it doesn't show up, because if you have an instance, you only check the instance itself and its class, not its metaclass. So this is maybe a useful way
21:21
to hide things from instances if you need them on the type. You just put it in the metaclass. And if you don't know what metaclasses are, I'm sorry, but I don't really have time to explain it. Okay, that's it. Time for questions. Okay, thanks to Peter, and we have
21:51
almost eight minutes for questions. So please raise your hand, and I will come with a microphone. Hi, thank you very much for the insights. It's really, really interesting. And one thing I
22:04
actually saw, which was quite interesting, was the __slots__ attribute. How would you compare this to namedtuple? And what are the, I understand what the advantages are compared to namedtuple, but
22:26
then why do we need namedtuple? A namedtuple is actually used a bit differently. A namedtuple is immutable. That's the first thing. And the second thing, a namedtuple has order in its attributes, so you can actually use it as a tuple. Here you don't have order, so use whatever makes
22:46
sense in your case. If I wanted to add, for example, an iterator on this, or make this iterable in my own way, I couldn't really do that with namedtuple, since that's already iterable.
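The differences mentioned in this answer can be seen side by side. This is only a sketch; the names PointTuple and PointSlots are made up.

```python
from collections import namedtuple

PointTuple = namedtuple("PointTuple", ["x", "y"])


class PointSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


pt = PointTuple(1, 2)
ps = PointSlots(1, 2)

x, y = pt        # a namedtuple is ordered and unpacks like a tuple
first = pt[0]    # and it supports indexing

ps.x = 10        # the __slots__ class stays mutable

try:
    pt.x = 10    # namedtuple fields cannot be reassigned
except AttributeError:
    tuple_is_mutable = False
else:
    tuple_is_mutable = True
```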
23:14
Can I return a data descriptor, for example, from the __getattribute__ method?
23:22
Excuse me? Can I return a data descriptor from the __getattribute__ method? Well, you can return it, but its __get__ method won't be called. You can actually call the __get__ method yourself. That's not a problem, but it won't be called if you just return it.
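A small sketch of the point made in this answer, with hypothetical names Ten and Holder: an object returned from __getattribute__ is handed back as it is, even if it happens to be a descriptor.

```python
class Ten:
    def __get__(self, instance, owner=None):
        return 10

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Holder:
    def __getattribute__(self, name):
        if name == "raw":
            # Returned as is: nobody calls __get__ on it.
            return Ten()
        if name == "invoked":
            # Calling __get__ by hand works fine.
            return Ten().__get__(self, type(self))
        return object.__getattribute__(self, name)


h = Holder()
raw = h.raw
invoked = h.invoked
```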
23:47
Okay, any more questions? There's one.
24:00
Sorry, the side that you implemented there with the area descriptor could perhaps just as easily, or possibly even more cleanly, be implemented using straight properties, in that it's then inline in one spot and it's easy to find. What real-world examples do you have of where descriptors are actually useful in code, where they add more than they take? I've just seen them used before
24:31
where effectively they could have been implemented in other ways, but now the descriptor class, because it was separate, it involved a lot more jumping around the code to try and follow the logic of what was going on. Yeah, if you just have a simple case like I have here,
24:46
it is better to use property if you can. One thing I cut from the talk for time reasons is actually examples of more complex descriptors. If you have an ORM, like say SQLAlchemy,
25:03
that uses descriptors a lot, and it's because the descriptor is a class in itself, so it can have other behavior than just getting and setting or just controlling attribute access. For example, in SQLAlchemy, you can do operations with the descriptor,
25:24
which is a column in the database, and it will generate SQL statements. Well, if you use it on the class, it generates the SQL statements. If you use it on an instance, it gets the data of that column. In simple cases, it's better just to use property,
25:47
which is also a descriptor. It's just a simple one. In complex cases, when you need some more state or functionality built into the descriptor, then use a class, and when you have several
26:01
related attributes like that, you can just create a descriptor class and reuse it. But, yes, I agree that the code is not as readable as it could be when you use descriptors, because there's one more place you have to check. But it's magic. Use it wisely.
26:28
And I think this was the perfect conclusion for this talk. Thanks again, Peter.