A Magic Implementation of NotImplemented
Formal Metadata
Number of Parts | 141 | |
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/68634 (DOI) | |
EuroPython 2023, part 138 / 141
Transcript: English (auto-generated)
00:05
Thank you. I'm very encouraged to see this many people care about testing. I was worried not everyone would show up. Thank you for coming. Yeah, I'm Alexander. I'm a software engineer. I work at Palo Alto Networks, which is a cybersecurity company.
00:23
And this talk is about, oh, that doesn't work. Sorry, give me one sec. There we go. This talk is about my love affair
00:41
with Pydantic and ultimate rejection. I really, really like Pydantic, and I definitely feel like I had a crush on the source code when I first found out about it, and so I really wanted to contribute to try and be part of the project.
01:03
But when I had a look at the project, I realized it wasn't really the right time to contribute. They had kind of shut it down. And so I had a look at some other projects that the author had written, and I found one called Dirty Equals. So this talk is going to be about my story of contributing to Dirty Equals and things that I learned from that.
01:24
But just to kick things off, in case you don't know what Pydantic does, this is like a really simple, minimal example. And I wanted to focus a little bit on how Samuel Colvin, who wrote Pydantic, thinks about code, because that's kind
01:40
of the thing that I learned the most from contributing to his projects. And I think what he does is take things that are internal to Python and change them in a small way in a library, which has very interesting results. So in Pydantic, sorry, I don't know how to change that,
02:02
but I'll just keep going. He is very good at, well, yeah, sorry, here he's made Python check type hints at runtime. So you can see here that the B value to MyModel is, sorry, the A value, I mean, is supposed to be an integer,
02:22
but later, when it says my model A equals 100, I've actually passed it a list of integers. So I then get an error, which says, oh yeah, this works, that it should be an integer, but you've actually given it a list of integers.
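The slide itself is not part of this transcript. As a rough illustration of the idea of checking type hints at runtime, here is a toy sketch that uses only the standard library. It is not how Pydantic is implemented; the names CheckedModel and MyModel and the error wording are assumptions made for this example.

```python
from typing import get_type_hints


class CheckedModel:
    """Toy base class (hypothetical) that validates keyword arguments
    against the class annotations. Not Pydantic's implementation."""

    def __init__(self, **values):
        for name, expected in get_type_hints(type(self)).items():
            value = values[name]
            # Only plain classes such as int or str are supported in this sketch.
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
            setattr(self, name, value)


class MyModel(CheckedModel):
    a: int
    b: str


ok = MyModel(a=100, b="hello")

try:
    MyModel(a=[1, 2, 3], b="hello")  # a list where an int is annotated
except TypeError as exc:
    error_message = str(exc)
```

Without a check like this, the annotations on MyModel would be ignored at runtime and the list would be accepted silently.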
02:41
So this has changed how Python normally works. Type hints are normally just type hints. They're just hints, and they don't get checked at runtime. But if you use Pydantic, your program will actually error if the type hint is wrong. So something does happen at runtime. And this kind of paradigm of making stuff change internally
03:03
to Python in a sort of contained way is what I think is really cool. So, but just first of all, I wanted to talk a bit about what happens when you come across a new open source library and how to figure out if it's worth contributing to, since I think people don't always talk about this, but it can save you a lot of time. If you don't do this, I think you can do a lot of work
03:22
on a pull request, and then nothing happens, which is quite disappointing. So this is sort of my checklist. The first thing I would check is when was the most recent commit? Hopefully it was a merge commit, so someone has merged in a pull request from someone else, rather than it just being the maintainer of the library adding their own code.
03:42
I also like to see what the activity is like on the issues and get a sense of what's happening with the library, and then I'm not going to name any names, but on some repos I've seen really, really frightening responses from maintainers, and I think life's too short. So these are things that I think are worth checking before you make an open source code contribution.
04:09
But yeah, onto dirty equals, which is the project that I did end up contributing to after running some of those checks on Pydantic.
04:20
Dirty equals is a little bit like Pydantic for testing, so if you want to do cool different things with your tests, this library is good for you. It ultimately helps you make tests that are easier to write, I also think a bit more fun to write. I personally find writing tests can often be a bit boring and tedious. To me, this is a way to add some syntactic sugar that makes it a bit more fun.
04:42
And it fundamentally lets you misuse the equals operator in Python. So just like Pydantic lets you check type hints at runtime, this is the kind of trick pulled in dirty equals. It changes the way that equality works in Python, which is what it says here on the homepage of the library
05:02
in that box at the bottom. And yeah, it's really useful when you're testing responses back from APIs, which I'll explain in a bit more detail later, but I think probably a lot of Python developers work with APIs, and so that's kind of
05:21
one of the main use cases for this. This is just a very simple example of the library syntax. And yeah, here you can hopefully see how equality is getting misused. We can write a really simple test case and do these sort of unusual checks with new objects from dirty equals. We can check in the first line there,
05:44
does the list have length three, and then does it contain the string A? And you can do this with quite interesting syntax with like an ampersand there, a pipe operator and not equals, and you can chain and combine these. So this is a simple example now, but hopefully you'll see as we go through the talk
06:02
how you can kind of build on this and make it all a bit more complex. Then this is probably the kind of pattern you'd be most likely to use in actual production code where you are writing an API test. You here are gonna maybe mock out a client
06:22
and then test that some JSON is equal to something like this. And I think hopefully here you can see how, versus just testing that this is exactly equal to the kind of response you're expecting, which is what at least I used to do before discovering this library, you can write much more dynamic test cases in a way that's quite readable
06:40
using these dirty equals objects. So like here on that avatar file line, you can check that the string matches a particular regex. So I think that's probably a nicer pattern for testing. It's also gonna make things arguably a bit more modular if you wanna reuse some of these dirty equals objects when you're expecting these kind of responses.
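To make the API-response pattern concrete without the slide, here is a minimal standard-library sketch. It does not use the dirty equals API: EndsWith and PositiveInt are hypothetical matchers written for this example, the response fields are made up, and a suffix check stands in for the regex check mentioned in the talk. The point it illustrates is that matcher objects can sit inside an expected dictionary, because comparing dictionaries compares their values with ==.

```python
class EndsWith:
    """Hypothetical matcher: equal to strings with the given suffix."""

    def __init__(self, suffix):
        self.suffix = suffix

    def __eq__(self, other):
        return isinstance(other, str) and other.endswith(self.suffix)

    def __repr__(self):
        # A readable repr is what makes a failing assertion's diff useful.
        return f"EndsWith({self.suffix!r})"


class PositiveInt:
    """Hypothetical matcher: equal to any int greater than zero."""

    def __eq__(self, other):
        return (
            isinstance(other, int)
            and not isinstance(other, bool)
            and other > 0
        )

    def __repr__(self):
        return "PositiveInt()"


# Made-up response body standing in for what a mocked client might return.
response_json = {"id": 42, "username": "alexander", "avatar_file": "a1b2c3.png"}

expected = {
    "id": PositiveInt(),
    "username": "alexander",
    "avatar_file": EndsWith(".png"),
}

matches = response_json == expected
```

Because each matcher defines its own repr, a test runner that prints both sides of a failed comparison shows EndsWith('.png') rather than an opaque object address.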
07:00
And you get a pretty nice diff in pytest. So this is an intentionally failing test where I gave it the wrong value for the avatar file. And you can see in the diff, you get the dirty equals object back. For me, this is way easier to read. Not right now. Normally it's way easier to read than just seeing a pretty long pytest traceback
07:23
with some regex that I'm not really sure where it is. So I think the advantage of making things more declarative and explicit is that when you get around to failing tests and have to fix things, you get these quite nice readable objects. So how does this really work though under the hood? If you think about it, this feels like bad Python.
07:44
It's not bad Python. I wouldn't be giving a talk on it. But normally == is a strict equality check. So that second line there should definitely fail. "hello" definitely does not equal True for a whole bunch of reasons. But if we use this IsTrueLike thing
08:01
from dirty equals it passes because it's a truthy string. It's not an empty string. So something funky has happened in order for this to take place. And that's what I'm gonna kind of move into. For the rest of this talk, I'm gonna get a bit into how equality actually works in Python, which I did not know before contributing to this library.
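A small standard-library sketch of the behaviour just described, assuming a hypothetical TruthyMatcher class in place of the library's own object:

```python
class TruthyMatcher:
    """Hypothetical matcher: equal to any value that is truthy."""

    def __eq__(self, other):
        return bool(other)

    def __repr__(self):
        return "TruthyMatcher()"


strict = "hello" == True            # ordinary equality: a str is not the bool True
loose = "hello" == TruthyMatcher()  # a non-empty string is truthy
empty = "" == TruthyMatcher()       # an empty string is falsy
```

Here strict is False while loose is True, which is the "funky" result the rest of the talk explains.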
08:23
So it turns out that when you have a statement like x == y in Python, the first thing that happens is that x checks whether it can compare itself to y, the thing on the right. And it calls the dunder eq method, __eq__,
08:41
which is what you can see up here at the top. And the important thing from this slide is that this code never runs, which is why, if you run this, y's __eq__ doesn't get called. And the reason for that is that this returned True here. So it exits here.
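The slide is not reproduced in the transcript, so the following is a reconstruction under assumptions: two hypothetical classes X and Y that record which __eq__ method runs.

```python
calls = []


class X:
    def __eq__(self, other):
        calls.append("X.__eq__")
        return True  # X gives a definite answer, so Python stops here


class Y:
    def __eq__(self, other):
        calls.append("Y.__eq__")  # not reached in the comparison below
        return False


result = X() == Y()
```

After the comparison, calls contains only "X.__eq__", because the left-hand operand returned a real answer and the right-hand method was never consulted.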
09:02
This might make more sense if you have a look at this slide. So in Python, equality is designed such that if you return NotImplemented from the thing on the left, it then proceeds to check equality at the thing on the right. So here we have x == y. The first thing that fires is __eq__ here.
09:22
You return NotImplemented in Python if you don't know how to compare yourself to y. So x is saying, I don't know how to check if I'm equal to y. If y was an integer here, rather than some weird class I wrote, there would be equality logic here to check if it's the same kind of integer,
09:43
assuming x was also an integer. But the really important thing here is that y's __eq__ gets called. So that basically means if we write a custom object in Python, which is what all the dirty equals objects are, things that aren't in the standard library, then whatever we write in this __eq__ method here
10:03
is going to control how the equality logic works. So using this approach, we kind of get to hook in to how to do equality checks in Python. And this is from the CPython source code for how PurePath works, just to kind of show you what a sort of good blueprint by someone who knows Python
10:21
a lot better than me looks like for this. So if a PurePath object is trying to compare itself to another object, PurePath is designed so that it doesn't know how to compare itself to anything that isn't a PurePath object. So this is basically saying, if it's PurePath, then go here and do the PurePath kind of comparison
10:41
that is all good. But if the object it's getting compared to is not a PurePath, then return NotImplemented. So then we go to the object on the right and check if that knows how to compare itself to the object on the left. So just to wrap this up, hopefully this makes sense now.
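The guard described above, and the fallback to the right-hand operand, can be sketched as follows. This is not the CPython source: Version and AnyVersion are hypothetical classes, and simplified_eq is a deliberately simplified model of what == does (it ignores the rule that gives a right-hand subclass priority).

```python
class Version:
    """Hypothetical value class that only compares itself to its own type."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented  # let the other operand have a go
        return self.number == other.number


class AnyVersion:
    """Hypothetical right-hand matcher: equal to every Version."""

    def __eq__(self, other):
        return isinstance(other, Version)


def simplified_eq(x, y):
    """Rough model of `x == y`; ignores the subclass-priority rule."""
    result = type(x).__eq__(x, y)
    if result is NotImplemented:
        result = type(y).__eq__(y, x)  # ask the right-hand operand
    if result is NotImplemented:
        result = x is y  # last resort: identity
    return result


same_type = Version(1) == Version(1)      # Version handles this itself
right_hand = Version(1) == AnyVersion()   # falls through to AnyVersion.__eq__
neither = Version(1) == "1.0"             # both sides decline, so False
```

The three comparisons cover the three exits: the left side answers, the right side answers, or neither does and Python falls back to identity.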
11:00
This is a kind of flow chart of what happens. If x knows how to compare itself to y, it just does the comparison straight away. If it doesn't, then we see if y can compare itself to x. And this bit here is what I'm gonna dive into next. Why would we care about this? When I was first looking into this, it seemed pretty weird: equality works well in Python,
11:23
why would you wanna change it? But you can now use this to write your own comparison logic and make things give back that they're equal or not equal based off your own logic. So a really simple example of that is here.
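The slide for this example is not in the transcript; a sketch of the kind of class being described might look like this, with BiggerThanFive as an assumed name:

```python
class BiggerThanFive:
    """Hypothetical matcher: equal to anything greater than five."""

    def __eq__(self, other):
        # `other` is the left-hand operand, passed in once its own
        # __eq__ has returned NotImplemented.
        return other > 5


x = BiggerThanFive()

two_equal = 2 == x
six_equal = 6 == x
two_not_equal = 2 != x
six_not_equal = 6 != x
```

The class defines no __ne__ method: in Python 3 the default __ne__ inverts the result of __eq__, so the != comparisons follow the same custom logic.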
11:41
And the dirty equals library is a little bit more complicated than this, but these are just some hopefully clearer examples to follow. So this is how you would create a class in Python that you say is equal to the object on the left if the object on the left is bigger than five. So this other argument here that gets passed to the __eq__ method
12:02
is the object on the left. So that's why 2 == x here is false and 6 == x is true, because obviously six is bigger than five. And you can also do this for not equals. So the same kind of logic that I've been talking about
12:21
also works for not equals. So the idea here is Beyonce is only equal to herself. So hello is not equal to Beyonce, you too is not equal to Beyonce. But it is true that
12:46
sorry it is false that Beyonce is not equal to X because Beyonce is indeed equal to herself. So this is a more realistic example for how you would do something like the is,
13:02
I don't know how to say this out loud, the IsStr class that was in dirty equals. So here when you build the class you give it a regex argument, and then in the check I'm catching the case where it's not a string
13:20
because from the name hopefully it's gonna be obvious to people that it's only for a string. I'm gonna raise a ValueError, but if it is a string then I have my own logic here to basically say yes, it's equal if there's a regex match, but otherwise it's false. So that's the kind of core idea of how you would implement this.
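A minimal sketch of a class along the lines just described. It is an illustration, not the library's IsStr: the name RegexStr, the use of a full match, and the error message are assumptions for this example.

```python
import re


class RegexStr:
    """Hypothetical matcher: equal to strings that fully match a regex."""

    def __init__(self, regex):
        self.pattern = re.compile(regex)

    def __eq__(self, other):
        if not isinstance(other, str):
            # Following the talk's description: non-strings are an error.
            raise ValueError(
                f"RegexStr only compares to str, got {type(other).__name__}"
            )
        return self.pattern.fullmatch(other) is not None

    def __repr__(self):
        return f"RegexStr({self.pattern.pattern!r})"


alphanumeric = RegexStr(r"[A-Za-z0-9]+")

plain = "abc123" == alphanumeric
with_punctuation = "hello world!" == alphanumeric
```

Raising from __eq__ is a design choice taken from the talk's description; returning False for non-strings would be the quieter alternative.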
13:41
In the next slide, trying to work with these dramatic pauses. This is how we would use it. So this regex here is basically saying alphanumeric only which means that I get false for only this one at the end. All the other ones are alphanumeric strings
14:01
so they pass fine. So this is, yeah, we've kind of done it now. We've done our own way of writing an assert statement with equality that does really, really funky stuff. Just in case this is still feeling very confusing, this is the kind of core logic for how this all worked.
14:25
And next you might be wondering okay this kind of seems fun, this seems interesting but would I actually wanna use it at work? So I thought I would give you a concrete example of some tests that I've rewritten using this that have made my life easier.
14:40
Because I work in cyber I often have to, well, not in this job, but in my old job we had an API people were paying for and we were sending back data like IPs and hashes. And it was pretty important we'd send back the right kind of hash or make sure that an IP was valid, otherwise obviously people are paying for it
15:00
but also it would break things. And this I think is a much nicer way to check an API response. So suppose you get an API response back that has JSON that we turn into a dictionary that has a hashes key and IPs key. Where I worked often we had mixed
15:21
SHA-256, MD5 and SHA-1 hashes, but we were now sort of transitioning to only send back SHA-256s, but often there were some bad hashes in the response. So this is really easy to test in dirty equals, and by bad hash I mean one that's not the right kind of type. You can just do here fail the test
15:42
if the list contains an MD5 hash, and also check that it contains a SHA-256 hash. And then another thing that I think is pretty cool is the way this library is designed, it's really easy to create your own logic
16:01
to run these tests. So suppose you don't like IsHash or you don't like IsIP for some reason and you want to do your own stuff, you can use this FunctionCheck class here, and here you just pass it a callable, so here check IPs that I've defined above, and then you can do whatever logic that you want here.
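The idea of wrapping an arbitrary callable in an equality check can be sketched with the standard library alone. CallableCheck below is a hypothetical stand-in written for this example, not the library's class, and the function names, the response data and the length-based hash test are assumptions.

```python
import ipaddress


class CallableCheck:
    """Hypothetical matcher: equal to values for which `func` returns True."""

    def __init__(self, func):
        self.func = func

    def __eq__(self, other):
        return bool(self.func(other))

    def __repr__(self):
        return f"CallableCheck({self.func.__name__})"


def is_ip(value):
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def check_ips(values):
    # Every entry must parse as an IPv4 or IPv6 address.
    return all(is_ip(v) for v in values)


def no_md5_and_some_sha256(values):
    # Judged by hex length only: MD5 is 32 hex characters, SHA-256 is 64.
    return all(len(v) != 32 for v in values) and any(len(v) == 64 for v in values)


# Made-up response data using documentation address ranges.
response = {
    "ips": ["192.0.2.1", "2001:db8::1"],
    "hashes": ["a" * 64],
}

ips_ok = response["ips"] == CallableCheck(check_ips)
hashes_ok = response["hashes"] == CallableCheck(no_md5_and_some_sha256)
```

In a test, each of these comparisons would sit behind an assert, so the validation logic lives in one named function instead of being repeated in every test body.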
16:23
Here I've just done a generator comprehension that basically says make sure everything in the list is an IP so this will fail if I had any kind of invalid IP here. But there's a lot more sort of complex stuff that you could build up here and this is for me where the writing test bit
16:41
becomes a bit more fun. This is I think quite a lot more fun than just normal sort of I don't know I can't really be bothered to write IP validation logic in most of my tests personally maybe everyone else is more diligent but that's kind of what it would be like at work and so I'm now just gonna recap on some of the things that I took away from this
17:03
and I hope might be useful for you after that I'm gonna move on to a couple of other kind of fun things that I think you could take with some of the stuff I've discussed. So the main thing for me was if you really like someone's code style or their way of thinking about code to try and get involved try and contribute to one of their projects even if one isn't active.
17:22
For me the main thing I learned doing this was that it is kind of okay to change Python internals if you do it intelligently and I feel like I learned from this code base how to do that well and I don't think I would have learned that if I hadn't contributed to it and I think there's a lot of other stuff
17:42
you could do with that, so I was gonna show you an example next about how pathlib actually does this. I gave a dry run of this talk to some colleagues and they told me pathlib does something similar, which I had never really realized. While this loads: pathlib basically overrides the dunder truediv method, __truediv__, in Python
18:03
so it kind of, hacking is the wrong word. I'm gonna stick to overloads. Here __truediv__ is what happens when you'd say something is divided by something else. It'll be a bit clearer on the next slide, but in pathlib it basically says if you're dividing an object by another object
18:23
just join the path which is how you get, should I, gonna do this slide first, really cool syntax like this. So you get this whole path gets put together by doing this call whereas this operator normally in Python
18:42
would mean you have to divide stuff, so I think this is a really nice way to do pretty cool syntax tricks, and this is a really good little example in my opinion in pathlib, but I think this kind of approach is something you could definitely put into other libraries.
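For reference, this is what the pathlib syntax looks like in practice, followed by a sketch of borrowing the same trick in another class. UrlPath is a hypothetical class invented for this example, and the paths are made up.

```python
from pathlib import PurePosixPath


class UrlPath:
    """Hypothetical class that joins segments with `/`, in the spirit of pathlib."""

    def __init__(self, *segments):
        self.segments = segments

    def __truediv__(self, other):
        if not isinstance(other, str):
            return NotImplemented  # leave unsupported operands to Python
        return UrlPath(*self.segments, other)

    def __str__(self):
        return "/".join(self.segments)


# pathlib's own overload of the division operator.
joined = PurePosixPath("/home") / "alexander" / "talk.txt"

# The same idea applied to a custom class.
url = UrlPath("api") / "v1" / "users"
```

Unlike the equality examples, nothing here depends on operand order or on the left side declining first: the left-hand object simply defines what / means for it.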
19:00
and I also think it's quite a cool feature of Python that you have these dunder methods that are exposed and you can hook into and change things. One last thing that I took away from the workshop on pytest a bit earlier is using hypothesis for testing and I think this would play quite well with dirty equals as a combination
19:21
so here's just like a very quick example I put together of how that could work. If you're not familiar with hypothesis what it's basically doing here is saying given some hypothesis give me a random bunch of numbers within this range 100 to 102, 10 to 12 and apply them to these arguments X and Y
19:41
so this is kind of the same syntax as parameterizing fixtures in pytest, but the difference is that you don't specify the parameters, instead Hypothesis gives you a bunch of random data, so the idea is that you have a more robust test since you might be checking for more edge cases, but I think if you then add in some dirty equals logic
20:01
so like here I've said, sorry, that IsApprox is a dirty equals class that lets you check a number is within a range, so it's gonna say approximately 110 with a delta of four, so max 114, min 106.
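The combination can be sketched with the standard library only. Two things are swapped in here and should not be read as the talk's code: an exhaustive loop over the stated ranges replaces the randomly generated inputs, and Approx is a hypothetical matcher rather than the library's class. The function under test, add, is also an assumption, since the transcript does not say what was being tested.

```python
from itertools import product


class Approx:
    """Hypothetical matcher: equal to numbers within `delta` of `value`."""

    def __init__(self, value, delta):
        self.value = value
        self.delta = delta

    def __eq__(self, other):
        return abs(other - self.value) <= self.delta

    def __repr__(self):
        return f"Approx({self.value}, delta={self.delta})"


def add(x, y):
    # Hypothetical function under test.
    return x + y


# Exhaustive stand-in for generated inputs: x in 100..102, y in 10..12.
results = [add(x, y) for x, y in product(range(100, 103), range(10, 13))]

all_close = all(r == Approx(110, delta=4) for r in results)
```

The tolerance is what gives the test its breathing room: every sum from these ranges lands between 110 and 114, so one approximate expectation covers all input combinations.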
20:21
I think that's quite a nice combination I think that lets you kind of have a bit of breathing room around the random stuff that gets input and I often if I'm on a bigger code base I write a ton of fixtures and parameterize them and it gets really messy and I thought this is quite a lot cleaner so yeah that's pretty much the end of the talk
20:41
I'll share the slides afterwards here are a lot of the resources I use for it including some of the source code links if you wanna kind of dig in and really get into it and I think I will finish there thank you very much for listening.
21:10
Thank you and I admire how you were calm when the slides were not working I would freak out completely
21:21
I also have a question actually so I use Pytest a lot and if you assert that something is in the list for example and the assertion fails it will tell you nicely this item here was missing on index three or whatever but if I always just assert equalities
21:40
with custom equality methods, how do I propagate more information to the actual developer who's running the test about why the equality failed, other than returning True or False or NotImplemented? Yeah, that's a very good question. I think it's a bit of a paradigm shift with this library
22:01
is that you don't get that kind of pytest traceback, so you actually get a really unhelpful error unless your dirty equals object is very clear, so if you did like list == Contains(x) and then your list didn't have x, in the test failure message
22:21
it will say it failed because it didn't contain X so I think that's pretty clear to read but if you wrote a messier dirty equals object like it should just contain strings then it's not very useful and one thing I should say is that if you use dirty equals it doesn't stop you from using everything else in Pytest so if it makes more sense to use the pattern you describe
22:43
you can just use that for that test. Thank you. Hi I have maybe a bit of a stupid question but considering all these magic methods that you mentioned could we go even more postmodern and do it like dirty subtraction, dirty power
23:00
dirty multiplication, doesn't have to be equals, right? Yes, so you could, but you'd probably have to be smarter than me, so I mucked around with them after doing this talk and it was kind of harder than I thought, but I think the __truediv__ example is like a simpler way to do it where you don't have to get into the logic
23:20
of the order and NotImplemented, you can just overload a method, so I don't know what the dunder method is for multiplication, but I mean why would you want to do that, out of curiosity? How would you want to change multiplication? Just because it's fun, no? Yeah, it is fun. So if you want to do that I would recommend having a look at the Python like object operator docs
23:43
that's one of the resources in my slides, where it has all the dunder methods that you can overload. Also if you find Maxim Danilov, he has a very cool project where he's doing something similar to that, yeah.
24:01
Let's give a big applause to Alexander.