
The Magic of Attribute Access

Formal Metadata

Title
The Magic of Attribute Access
Title of Series
Part Number
27
Number of Parts
119
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Place
Berlin

Content Metadata

Subject Area
Genre
Abstract
Petr Viktorin - The Magic of Attribute Access

Have you ever wondered how the "self" argument appears when you call a method? Did you know there is a general mechanism behind it? Come learn all about attributes and descriptors.

-----

The first part of this talk will describe what exactly happens when you read or write an attribute in Python. While this behavior is, of course, explained in the Python docs, more precisely in the [Data model] section and [related] [writeups], the documentation gives one a "bag of tools" and leaves combining them to the reader. This talk, on the other hand, will present one chunk of functionality, the attribute lookup, and show how its mechanisms and customization options work together to provide the flexibility (and gotchas) Python provides.

The topics covered will be:

* method resolution order, with a nod to the C3 algorithm
* instance-, class-, and metaclass-level variables
* `__dict__` and `__slots__`
* data/non-data descriptors
* special methods (`__getattr__`, `__getattribute__`, `__setattr__`, `__dir__`)

In the second part of the talk, I will show how to use the customization primitives explained before on several interesting and/or useful examples:

* A proxy object using `__getattr__`
* Generic descriptor - an ORM column sketch
* the rudimentary `@property`, method, `staticmethod` reimplemented in pure Python (explained [here][2] and elsewhere), which leads to
* SQLAlchemy's [`@hybrid_property`][4]
* Pyramid's deceptively simple memoizing decorator, [`@reify`][5]
* An ["Unpacked" tuple properties][6] example to drive home the idea that descriptors can do more than provide attribute access (and mention weak dicts as a way to non-intrusively store data on an object)

(These are subject to change as I compose the talk. Also some examples may end up interleaved with the theory.)

Hopefully I'll have time to conclude with a remark about how Python manages to be a "simple language" despite having these relatively complex mechanisms.
Transcript: English (auto-generated)
All right, can you hear me? Okay, so I'm sure you're all familiar with the operators in Python.
Not all of them are as interesting as some of the others, but the most interesting one of all is the smallest. So hello, I'm Petr, and I'll be diving into the depths of the dot operator. Now, as you know, Python has a giant standard library where the dot is sort of the path separator.
So you have some parent module and you get some child out of it. Now this is a different use of the dot than I'll be speaking about, but it's a case where the dot does some kind of namespacing.
So what I will be talking about is attribute access. The three things you can do with the dot are to set an attribute on an object, get an attribute out of an object, and delete the attribute. The syntax I hope you all know. Now these are some of the most optimized operations in Python, so it's very good to use them.
But you always have to know the name of the attribute you're getting. If you don't know it, you can use the built-in functions `setattr`, `getattr`, and `delattr`, which actually do the exact same thing, just a bit slower.
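A minimal sketch of the equivalence just described. The class `Box` and the attribute name `size` are made up for illustration:

```python
class Box:
    """Empty class used only to hold attributes (hypothetical name)."""


box = Box()
name = "size"  # attribute name that is only known at run time

setattr(box, name, 3)       # same effect as: box.size = 3
print(getattr(box, name))   # same effect as: box.size   -> 3
delattr(box, name)          # same effect as: del box.size
print(hasattr(box, name))   # False
```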
Now to understand what the dot does, we'll introduce this very simple approximation of what an object is in Python. If you have an object, it has a type, which doesn't change very often and defines the behavior of the object. And then you have the `__dict__`, which contains all the data specific to that one instance,
and that's expected to change quite a lot. Of course, there are always exceptions, but we'll go with this simple model. Now as an example, I have some class `Square`. I define a method on it, and that gets put in the type,
and then when I set an attribute on an instance of that class, the attribute goes in the dict. And the dict is, under the hood, just a simple dictionary. In Python 3, it's more than a simple dictionary, but it acts like one. And then when I want to get the attribute out, I just use the dot again, and Python looks in the dict.
And if it doesn't find the attribute in the dict, then it looks on the type. So I can also get the `get_area` method, which is not in the instance dict. So here are the simple rules. When you set an attribute, it goes directly to the dict.
When you get the attribute, you try in the dict, then you try the type, and if it's not there, then fail. And most of this talk will be about how to make this work somehow differently.
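The simple rules above can be checked directly. This is a small sketch; the slide's exact code is not in the transcript, so the class body here is an assumption:

```python
class Square:
    def get_area(self):
        return self.side ** 2


sq = Square()
sq.side = 5                            # setting goes straight to the instance dict

print(sq.__dict__)                     # {'side': 5}
print("get_area" in sq.__dict__)       # False: the method is not on the instance
print("get_area" in Square.__dict__)   # True: it lives on the type
print(sq.get_area())                   # 25, found on the type after the dict misses
```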
So the first thing you can do to override this behavior is to put a special `__getattr__` method on the type. What this does is it hooks into step 3 here, and instead of failing right away, this function gets called and whatever it returns gets returned as the value of the attribute.
So this simple class just proxies all attribute access to some other object. This works. It has some limitations. For example, it won't work on attributes that are already in the dict.
So if you ask for `_object` here, which is already set there, `__getattr__` won't be called. Now there's another method you can use, which is `__getattribute__`. It has a longer name, and it's more powerful.
This one actually takes over the whole attribute getting process. So it's a bit more difficult to use because if there's any attribute you already have on the object, you have to make a special case for it. Otherwise, you can do anything you want in this function, and it'll work.
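Both hooks can be sketched roughly as follows. The class names `Proxy` and `FullProxy` are assumptions; only the attribute name `_object` comes from the talk:

```python
class Proxy:
    """Forwards attribute reads that normal lookup could not resolve."""

    def __init__(self, obj):
        self._object = obj  # lands in the instance __dict__

    def __getattr__(self, name):
        # Only called after the instance dict and the type have been tried.
        return getattr(self._object, name)


class FullProxy:
    """Takes over every attribute read, so _object needs a special case."""

    def __init__(self, obj):
        self._object = obj

    def __getattribute__(self, name):
        if name == "_object":
            return object.__getattribute__(self, name)
        target = object.__getattribute__(self, "_object")
        return getattr(target, name)


p = Proxy([3, 1, 2])
p.sort()                          # forwarded to the wrapped list
print(p._object)                  # [1, 2, 3]  (found in the dict, hook not called)
print(FullProxy("abc").upper())   # ABC
```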
Now that's getting attributes. There's one more thing you want to do, and that's setting them. And for that, we can have a `__setattr__` method. So what this class will do is it keeps a dictionary.
When you try to get an attribute from the object, it looks in the dictionary and returns whatever it finds. And if you want to set an attribute, it also looks in the dictionary and... well, it sets the value in the dictionary.
As you can see, I'm special-casing the dict attribute, because I'm setting it here and I don't want to use a dict that's not set yet. We also have `__delattr__`, which handles deleting attributes. It works the same way.
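A sketch of a class along these lines, under the assumption that the backing dictionary is stored as `_data` (the slide's actual names are not in the transcript):

```python
class Namespace:
    def __init__(self):
        self._data = {}  # goes through __setattr__, hence the special case below

    def __getattr__(self, name):
        # Reached only for names that normal lookup did not find.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name == "_data":
            # Store the backing dict itself the normal way.
            super().__setattr__(name, value)
        else:
            self._data[name] = value

    def __delattr__(self, name):
        try:
            del self._data[name]
        except KeyError:
            raise AttributeError(name) from None


ns = Namespace()
ns.x = 1
print(ns.x)            # 1
print(ns._data)        # {'x': 1}
del ns.x
print(hasattr(ns, "x"))  # False
```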
Did you have time to read it? I guess most of you are looking at me. So, yeah. The question is, do any of these hooks run during `__init__`? They run every time you set an attribute in Python.
So if `__init__` sets attributes, then yeah. It's just a function that gets called at the beginning; there's nothing too special about `__init__`. So yeah, I have to special-case this for the setting here in `__init__`.
Now, if you ever find yourself writing something like this, think twice, because the attribute namespace is not entirely under your control. You have attributes like dunder-anything. You'll inevitably want to add some methods to your class. You'll want to enable subclasses to add new attributes.
So usually when you have something like a dictionary, stick to a dictionary interface and don't mess around with attributes. Otherwise you'll run into trouble pretty fast. Yeah, so I haven't seen this many times, actually,
because this sort of blanket overriding of getting and setting attributes is not that useful. Usually what you want to do is you have one attribute that needs some kind of special treatment, or you have several, but each one is special in its own way. So if you did it with `__getattr__`, you would have a nasty tree of ifs and it's not very nice.
So for this, Python has a very special feature called descriptors. Now what descriptors do is you put a special object in the type, which will control access to the specified attribute.
So if I have some kind of square and I want it to have an area, I put some kind of magic special object into the class, and when I set the side, and then I look at the area attribute, this descriptor will take the side, this five here,
square it, and give that back. This is pretty easy to implement. The descriptor object only needs one method, which is `__get__`, so double-underscore get. What this method gets is the instance, so that would be the square here,
and if the instance is set, it can return the value of the attribute. If the instance is not set, that means we're getting the attribute from the class itself, so that's the usage on the bottom here. What most well-behaved descriptors do is return the descriptor itself,
so you can use it for some other purposes. Is that clear? You just have a special object to control access to an attribute. Now what this object can also do is control setting, so if you add a method called `__set__`,
it gets the instance and the value the user's trying to set, and it's free to do anything it wants. In my case, we want to update the side, because the user set the area to something, so we can update the side to match.
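A sketch of such a descriptor, assuming a descriptor class called `Area` and a `side` attribute on the square (reconstructed from the description, not copied from the slide):

```python
import math


class Area:
    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: well-behaved descriptors return themselves.
            return self
        return instance.side ** 2

    def __set__(self, instance, value):
        # Keep the side consistent with the area the user asked for.
        instance.side = math.sqrt(value)


class Square:
    area = Area()


sq = Square()
sq.side = 5
print(sq.area)                       # 25
sq.area = 81
print(sq.side)                       # 9.0
print(isinstance(Square.area, Area)) # True: class access returns the descriptor
```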
Is anybody having trouble reading that? No? Okay. And the last thing there is, is `__delete__`. The short `__del__` was already taken, so it's longer. This one isn't that useful because you don't find yourself
deleting attributes all that often, but for completeness, it's there. Now, a bit of terminology. When a descriptor has this `__set__` method, it's called a data descriptor. If it does not, it's called a non-data descriptor.
Having `__set__` means that you pretty much want to control all access to the attribute; it means there's some data you presumably want to store in that attribute, so that's why it's called a data descriptor. If you only have `__get__`, you're just getting a value out. Now, how many of you know the property decorator?
Almost everyone? Yeah, so as you can see, I've pretty much gone the long way to do something like the property decorator, and in fact, the property decorator, all it does is create a descriptor. You can actually implement property in pure Python as a descriptor.
You just give it three functions and call them in the appropriate special methods. I have an example here: I have `set_area`, `get_area`, and `del_area`, I give these three functions to the property, and I get an `area` attribute.
You can actually call the built-in `property` like this, and it'll do the right thing. If you add some more sugar to this class, then you can really re-implement all of the mechanics of `property`. And again, this has `__set__`, so it's a data descriptor.
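A rough pure-Python version of the idea, without the decorator sugar. The class name `Property` and the behaviour of `del_area` are assumptions made for this sketch:

```python
import math


class Property:
    """Simplified stand-in for the built-in property (no decorator helpers)."""

    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)


class Square:
    def get_area(self):
        return self.side ** 2

    def set_area(self, value):
        self.side = math.sqrt(value)

    def del_area(self):
        del self.side  # assumed meaning of "deleting the area"

    area = Property(get_area, set_area, del_area)


sq = Square()
sq.area = 16
print(sq.side)   # 4.0
print(sq.area)   # 16.0
```

Swapping `Property` for the built-in `property` in the last line of the class body should give the same results for these calls.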
Yeah, so Python actually likes descriptors very much, and anytime there's something special to do on attribute access, you have a descriptor. For example, if you look at a simple function,
if you look at the attributes it has, one of them is `__get__`, because functions themselves are descriptors. When you have a function on a class, and then you have an instance of the class and you want to get the function, you don't get the function back, you get a method.
You get an object that has the function itself and the `self` argument baked in. Right, so here's a very simple class with a very simple function, and when you get the attribute from an instance of that class, you get a bound method.
If you call that, it automatically provides the `self` argument. And if you get it from the class, it gives you the original decorator, or rather the original function.
As I said, most well-behaved descriptors return themselves when you get them from a class. In Python 2, you would get something called an unbound method, which doesn't really do anything that useful, but it's there. Now they sort of fixed it, so it just gets the function back.
Now if you look closely at the first descriptor we had, yeah, it works pretty much the same way. It does something special when you get the thing from the instance. When you get it from the class, it returns the descriptor itself.
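The function-as-descriptor behaviour can be observed directly in Python 3. `Greeter` and `greet` are illustrative names:

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()
raw = Greeter.__dict__["greet"]      # the plain function stored on the class

print(hasattr(raw, "__get__"))       # True: functions are descriptors
bound = raw.__get__(g, Greeter)      # what the dot does for instance access
print(bound())                       # hello
print(g.greet.__self__ is g)         # True: self is baked into the bound method
print(g.greet.__func__ is raw)       # True: and so is the original function
print(Greeter.greet is raw)          # True in Python 3: class access gives the function
```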
Right, another thing I want to talk about is this little trick for saving memory. If I had a point class and I had millions of these objects around, I wouldn't want each of them to have this `__dict__` attribute,
which as I said is a normal dict, so it takes up memory, and I know that in a point I'll only ever have an x number and a y number and nothing else. So what this special magic incantation, `__slots__`, will do is it will actually make the instances not
have a `__dict__` attribute. Each one will have the type and it'll have the x and y directly in the C object itself, so there will be no dictionary and it'll save the memory. You can of course set and get the x and y attributes, but you cannot set anything else because there's no space
in the object for anything extra. Right, and if you try to get x from the class, you get this descriptor. So every time there's some special attribute, Python implements it with a descriptor. Right, and I think now is the time to give you the whole magic formula.
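Before the formula, a short sketch of the `__slots__` trick just mentioned. The `Point` class is reconstructed from the description, and the descriptor type name printed at the end is what CPython reports:

```python
class Point:
    __slots__ = ("x", "y")  # no per-instance __dict__ is created

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(hasattr(p, "__dict__"))      # False

try:
    p.z = 3                        # no room for anything beyond x and y
except AttributeError as exc:
    print("cannot set z:", exc)

# On the class, each slot shows up as a descriptor object.
print(type(Point.x).__name__)      # member_descriptor (on CPython)
```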
So this is the way an attribute is gotten from an instance. First, you try `__getattribute__`. If there is a `__getattribute__`, you just call it and return what it gives back. If it throws an error, the error is raised.
If there's no `__getattribute__`, you look up the attribute on the class, and if it is a data descriptor, then you call its `__get__` method and return that. Only after that, only after looking for the data descriptor,
the dict is checked, so the value is gotten directly from the instance attributes. This doesn't call any descriptors, you just get the value straight back. After that, you check the non-data descriptor,
or rather you check whether the thing found on the class is a non-data descriptor; if that's the case, then you call it. If it's not a descriptor at all, if it doesn't have the dunder get method, then you just return it directly. So if on the class I have some value like, you know, a class attribute, a constant for example, it's just returned directly.
Okay. After that, you fall back to `__getattr__`, and if that is also not there, then `AttributeError` is raised. Now, there is this weird thing about the data descriptor and non-data descriptor being in two different places. What this allows is, if you have a data
descriptor, it pretty much controls all the access to the attribute. So what the Python designers thought is that if you define both `__get__` and `__set__`, then you probably want to control the access to that attribute yourself.
If you don't define `__set__`, then you're free to override that attribute in the instance. So you can put something in the dict, and then since it's a non-data descriptor, you will get it back from the dict before the descriptor is checked.
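The formula can be approximated in pure Python. This is a simplified sketch, not CPython's implementation: it ignores user-defined `__getattribute__`, and the helper names `find_on_class` and `lookup` are made up. One detail beyond the talk: Python also treats a descriptor that defines only `__delete__` as a data descriptor.

```python
_MISSING = object()


def find_on_class(cls, name):
    """Return the raw value from the first class in the MRO that defines name."""
    for klass in cls.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING


def lookup(obj, name):
    cls = type(obj)
    found = find_on_class(cls, name)
    kind = type(found)
    has_get = found is not _MISSING and hasattr(kind, "__get__")
    is_data = has_get and (hasattr(kind, "__set__") or hasattr(kind, "__delete__"))

    if is_data:                                   # 1. data descriptor wins
        return kind.__get__(found, obj, cls)
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:                     # 2. then the instance dict
        return instance_dict[name]
    if has_get:                                   # 3. then non-data descriptors
        return kind.__get__(found, obj, cls)
    if found is not _MISSING:                     # 4. then plain class attributes
        return found
    fallback = find_on_class(cls, "__getattr__")
    if fallback is not _MISSING:                  # 5. then the __getattr__ hook
        return fallback(obj, name)
    raise AttributeError(name)                    # 6. otherwise fail


class NonData:
    def __get__(self, instance, owner=None):
        return "from descriptor"


class Demo:
    constant = 42
    shadowable = NonData()

    @property
    def guarded(self):
        return "from property"

    def __getattr__(self, name):
        return "fallback:" + name


d = Demo()
d.__dict__["guarded"] = "from dict"       # ignored: property is a data descriptor
d.__dict__["shadowable"] = "from dict"    # wins: NonData has no __set__

print(lookup(d, "guarded"))      # from property
print(lookup(d, "shadowable"))   # from dict
print(lookup(d, "constant"))     # 42
print(lookup(d, "nope"))         # fallback:nope
```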
Right? There's one nice use of it. In the Pyramid framework, it's called `reify`. Some other frameworks call it cached property. I've heard it called lazy property. So what this does is you give it a function, and then when you get the corresponding attribute,
the function is called, and then the attribute is set with the value you've computed. So it calls the function, puts the result in the dict, and whenever you get the attribute again,
it doesn't call the function again. It doesn't go to the descriptor, it just returns the cached value from the dict. So this is a way to implement a lazy property.
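A sketch of the idea, not Pyramid's actual code. The decorator name `lazy` and the `Report` class are assumptions:

```python
class lazy:
    """Non-data descriptor that caches its result in the instance dict."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Because there is no __set__, this dict entry shadows the descriptor
        # on every later read.
        instance.__dict__[self.name] = value
        return value


class Report:
    calls = 0

    @lazy
    def total(self):
        Report.calls += 1
        return sum(range(1000))


r = Report()
print(r.total, r.total)   # 499500 499500
print(Report.calls)       # 1: the function ran only once
del r.total               # invalidate by deleting the cached attribute
print(r.total)            # 499500, computed again
print(Report.calls)       # 2
```

Since Python 3.8 the standard library offers a comparable decorator, `functools.cached_property`.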
There's some discussion about adding this to the standard library, and so maybe we'll see it under some name. Is there anyone who doesn't understand? Excuse me? If you want to invalidate this, you just remove it from the dict.
Yeah, you just delete the attribute. Yeah. The magic is that you can just set the attribute normally. Now if you knew the name in advance, this would literally be `instance.name = value`.
Now the setting is not affected at all by this descriptor, so if you want to change the value in the dict, you just do it normally. The only thing that's different is getting, and that's only when it's not already in the dict. Other questions on this?
I'm not finished. Yeah, if you have a general question, then
just wait. Okay, so another thing about this magic formula is this "on the class", which I put in italics, because it's not as easy as it looks: looking up something on the class of an instance is
not an actual attribute access. It's a bit different, and it has to do with something called the method resolution order. So if you have a class and you have a subclass of it, you can check the method resolution order, and it gives you the child, the parent,
and `object`. Now when you look something up on the class, it goes through the classes in this order. So if the attribute is defined on the child, it is returned from the child. If it's not on the child, Python looks in the parent. If it's not there,
Python looks in the universal superclass. Now if you have some kind of weird hierarchy of classes with multiple inheritance and stuff like that, there's an algorithm called C3, which you can look up, which converts this hierarchy to just a list that's checked linearly.
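A small illustration of the method resolution order, first for simple inheritance and then for a diamond. All class names here are made up:

```python
class Parent:
    name = "parent"


class Child(Parent):
    pass


print([c.__name__ for c in Child.__mro__])
# ['Child', 'Parent', 'object']


class Left(Parent):
    pass


class Right(Parent):
    name = "right"


class Bottom(Left, Right):
    pass


print([c.__name__ for c in Bottom.__mro__])
# ['Bottom', 'Left', 'Right', 'Parent', 'object']

# Left does not define name, so the linear scan finds it on Right before Parent.
print(Bottom().name)   # right
```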
One more thing about this MRO. It's actually an attribute defined on the metaclass. I don't know if you're familiar with that, but it's defined on the type metaclass,
and if you have an instance of the object, it doesn't show up, because if you have an instance, you only check the instance itself and its class, not its metaclass. So this is maybe a useful way
to hide things from instances if you need them on the type. You just put it in the metaclass. And if you don't know what metaclasses are, I'm sorry, but I don't really have time to explain it. Okay, that's it. Time for questions. Okay, thanks to Petr, and we have
almost eight minutes for questions. So please raise your hand, and I will come with a microphone. Hi, thank you very much for the insights. It's really, really interesting. And one thing I
actually saw, which was quite interesting, was the `__slots__` attribute. How would you compare this to `namedtuple`? And what are the, I understand what the advantages are compared to `namedtuple`, but
then why do we need `namedtuple`? `namedtuple` is actually used a bit differently. A `namedtuple` is immutable. That's the first thing. And the second thing, a `namedtuple` has order in its attributes, so you can actually use it as a tuple. Here you don't have order, so use whatever makes
sense in your case. If I wanted to add, for example, an iterator on this, or make this an iterator, I couldn't really do that with `namedtuple` since that's already iterable.
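The two differences named in the answer can be sketched like this. `PointT` and `PointS` are hypothetical names for the two variants:

```python
from collections import namedtuple

PointT = namedtuple("PointT", ["x", "y"])


class PointS:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


t = PointT(1, 2)
x, y = t                      # ordered, so it unpacks like a tuple
try:
    t.x = 10
except AttributeError:
    print("namedtuple fields cannot be reassigned")

s = PointS(1, 2)
s.x = 10                      # slotted instances stay mutable
try:
    x, y = s
except TypeError:
    print("a slotted object is not iterable unless you add __iter__")
```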
Can I return a data descriptor, for example, from the `__getattribute__` method?
Excuse me? Can I return a data descriptor from the `__getattribute__` method? Well, you can return it, but its `__get__` method won't be called. You can actually call the `__get__` method yourself. That's not a problem, but it won't be called if you just return it.
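A sketch of that answer, with the made-up classes `Ten` and `Weird`:

```python
class Ten:
    """A data descriptor that always reads as 10."""

    def __get__(self, instance, owner=None):
        return 10

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Weird:
    def __getattribute__(self, name):
        # The return value is handed back as is; no descriptor protocol runs.
        return Ten()


w = Weird()
result = w.anything
print(isinstance(result, Ten))        # True: we got the descriptor object itself
print(result.__get__(w, Weird))       # 10: calling __get__ by hand still works
```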
Okay, any more questions? There's one.
Sorry, the side that you implemented there with the area descriptor could perhaps just as easily, or possibly even more cleanly, be implemented using straight properties, in that it's then inline in one spot and it's easy to find. What real-world examples do you have of where descriptors are actually useful in code, where they add more than they take? I've just seen them used before
where effectively they could have been implemented in other ways, but now the descriptor class, because it was separate, it involved a lot more jumping around the code to try and follow the logic of what was going on. Yeah, if you just have a simple case like I have here,
it is better to use property if you can. One thing I took out of the talk for time reasons is actually examples of more complex descriptors. If you have an ORM, like say SQLAlchemy,
that uses descriptors a lot, and it's because the descriptor is a class in itself, so it can have other behavior than just getting and setting or just controlling attribute access. For example, in SQLAlchemy, you can do operations with the descriptor,
which is a column in the database, and it will generate SQL statements. Well, if you use it on the class, it generates the SQL statements. If you use it on an instance, it gets the data of that column. In simple cases, it's better just to use property,
which is also a descriptor. It's just a simple one. In complex cases, when you need some more state or functionality built into the descriptor, then use a class, and when you have several
related attributes like that, you can just create a descriptor class and reuse it. But, yes, I agree that the code is not as readable as it could be when you use descriptors, because there's one more place you have to check. But it's magic. Use it wisely.
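A toy sketch of the ORM-column idea described in the answer. This is not SQLAlchemy's API: `Column`, `User`, and the string-based SQL fragment are all hypothetical and only show how one descriptor class can behave differently on the class and on an instance:

```python
class Column:
    """Reusable descriptor: SQL-like expressions on the class, data on instances."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created (Python 3.6+).
        self.name = name
        self.table = owner.__name__.lower()

    def __get__(self, instance, owner=None):
        if instance is None:
            return self                      # class access: the column itself
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

    def __eq__(self, other):
        # On the class, comparing builds a (toy) SQL condition instead of a bool.
        return f"{self.table}.{self.name} = {other!r}"


class User:
    name = Column()
    email = Column()


u = User()
u.name = "petr"
print(u.name)                 # petr
print(User.name == "petr")    # user.name = 'petr'
```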
And I think this was the perfect conclusion for this talk. Thanks again, Petr.