Iteration Inside Out
Formal metadata
Number of parts | 132
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use and modify the work or its content for any legal and non-commercial purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
Identifiers | 10.5446/44912 (DOI)
EuroPython 2018 | 93 / 132
Transcript: English (automatically generated)
00:00
Yes, this talk I call Iteration Inside Out, and it's an explanation of Python's iteration protocol. It assumes that you do not particularly know much more than how to write a for loop. That's the main assumption. So yes, I'm Naomi Ceder. I am at the moment the chair of the Python Software Foundation.
00:23
That has nothing to do with this talk, but I put it there anyway. I am the author of The Quick Python Book. The third edition just came out. I do have a couple of copies with me that I will give away to people who ask nicely. So come see me later on, and I will be happy to talk to you about that.
00:42
And I also lead a small Python team for a company called Dick Blick Art Materials. We're the largest supplier of fine art supplies, that is, paints, brushes, canvases, in the United States. That's not a huge market, but we're one of the biggies. So I enjoy that job, actually.
01:03
So I'm going to start with a quote from Dave Beazley. I don't know if you had a chance to see it. He gave a remote talk for PyCon Pakistan, which, among other things, features him doing a trombone solo. Highly recommend that.
01:21
But in this talk, which is called Iterations of Evolution, he goes way, way, way back in Python history. He describes the for loop and the iteration protocol as Python's most powerful, useful feature. And I happen to agree, it's really an interesting thing that is enormously powerful and enormously
01:43
taken for granted. So when you think about it, we have repetitive chunks of data all around us. It's one of the things, after all, that computers are good at, is repeating something over and over and over and over and over again many more times and much faster than
02:04
we can do. So just picking some random examples of things that I have run across not that long ago. You could have a series of temperature readings for the month. You could have dictionary keys, maybe for a dictionary mapping member IDs to members.
02:23
You could have a CSV of a million or two or 20 million products. You could have the text of Moby Dick, a very big book, a lot of things in sequence. You could have the result set for a database query finding your daily sales even.
02:41
So there are just a lot of different things. And these things don't, on the face of them, have a whole lot in common. They're different types. The objects involved are different. The elements contained in those objects are of different types.
03:00
Sometimes it might be just a simple type. Sometimes it might be itself a compound type of various other pieces, whatever. They're all kind of different. But the situation that we want to use to get at them is all the same. That is, they're large series of data where we want to do the same thing one after the other all the way through.
03:25
So in Python, obviously, we would use a for loop. And that seems pretty obvious, except that the way that Python does for loops was not always an obvious thing.
03:44
In fact, it used to be something that had to be called out as being different. So going back in history, this is from the Python 1.1 documentation, which is nearly 25 years ago. And here's what it says about for loops.
04:01
I kind of like the tone of this whole thing. It's like Python and for loops, like for loops and you. And the for statement in Python differs a bit from what you may be used to in C or Pascal. Those were the days, Pascal. Rather than always iterating over an arithmetic progression of numbers like in Pascal,
04:24
for those of us who suffered through Pascal, you remember that for loop just gave you numbers. That was it. Or leaving the user completely free in the iteration test and step as in C, and I'll talk about that in a second. Python's for statement iterates over the items of any sequence, e.g. a list or a string, in the order they appear in the sequence.
04:47
So in the early days of Python, this was weird. This was something that was specifically called out in the documentation. So for contrast, some of you may have done C.
05:01
I mean, you know, some of us have survived C programming. This is a C for loop. And for those of you who aren't familiar, you can see, though, that there's a lot going on in the line with the for on it. OK, in the parens there, we're declaring a variable, we're initializing a variable to be a loop control.
05:24
We have a condition for when the loop continues and we are incrementing i, the loop control variable. That's all packaged in that for line. Now, if you haven't done C before, I can tell you that what's required there is that you have two semicolons, basically.
05:44
All the rest is up for grabs in that for line. And in fact, really what's going on is that this is just a repackaging of this kind of while loop. You see, you've got the same pieces.
06:01
We declare a loop control variable. We initialize the loop control variable. Got a condition to see whether the loop continues. And then at the bottom, we increment the loop control variable. It's the same thing. In both cases, you are free if you want to not put those things in or to put in something completely different.
06:21
It doesn't matter. You can do whatever you want, but that's the way that it works. And in both of those cases, the loop control variable, the i, is just an integer variable that gets incremented. And it's used as an index into the array. There is no particular connection between the two.
06:42
In fact, you'll notice this is controlled by a variable called list_len. And of course, I realize for C, it's a little bit of an anachronism or a misnomer to even use the word list. But I put it in for you. But that has to be determined completely separately. That has nothing intrinsically to do with the array.
07:04
So that's the way that things used to be. Now, you can, and here's where I'm going to start playing around a little bit with code for your amusement, for my excitement. Get the adrenaline going. But you can. And people, when they used to start doing Python, you'd see this a lot, they would do a for loop.
07:24
And they would say, like, for i in range(len(a_list)). And then we would do something.
07:43
And that's kind of reproducing what the C sort of loop would be. And as I say, people beginning Python coming from a language like C would do this a lot. I remember, I think I did it when I started. So that gets you the loop. It works.
08:01
But in fact, even that isn't exactly what C is doing. Because to get our list of index values, our 0, 1, 2, 3, to go through these elements in the list, we're actually using the range to get ourselves a sequence. And then we're going over the sequence. So it's not even exactly the same as what C does.
08:22
And it's certainly not the way that most people would write a for loop that they would think of as being, you know, quote, Pythonic. Normally, what we would do is more something like this.
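The code typed on screen is not captured in the transcript. As a minimal sketch of the two loop styles being contrasted, assuming a hypothetical list named `a_list` with made-up contents:

```python
a_list = ["paint", "brush", "canvas"]  # hypothetical sample data

# C-style habit: loop over index values, then index into the list.
by_index = []
for i in range(len(a_list)):
    by_index.append(a_list[i])

# Idiomatic Python: loop over the items themselves.
by_item = []
for item in a_list:
    by_item.append(item)
```

Both loops visit the same items in the same order; the second simply never mentions an index.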
08:47
And that, you'll notice, throws away that whole going by index. And instead, we're going by one item in a sequence. Okay, now, this shouldn't be new to any of you. This is just kind of setting up our main premise here that Python's for
09:02
loops, Python's iteration, is a somewhat different thing than a lot of the traditional things. I will say that these days, there are many sort of newer languages that do sort of the same thing or pretty much the same thing. A lot of the languages that didn't used to have something like this now do.
09:23
Even in C, they have some macro jiggering that lets you come up with something that is kind of, sort of similar in a way. So there are lots of different options for doing this. But for a while, Python was pretty much one of the few that did things this way. And all of this kind of raises the question, how does it do that?
09:47
So when we go back and we look at that example, let's just pop back there. How does that for loop know where it is? I mean, the whole key to a for loop is you go through in order, one after another after another. So how does it know what's next?
10:02
There's nothing in the for loop that we just did that really explicitly says that. And how is it that basically you can use for loops on all of these wide varieties of container types and it all just seems to work? Or better yet, suppose I want to make something that I can stick in a for loop.
10:23
How do I do that? So that's what I want to talk about very briefly. So key thing is for Python, this whole iteration using for relies on a protocol, not on individual types.
10:40
And this has been the case since Python 2.2. And I'm old. I mean, I was around at the end of the Python 2.1 few months, whatever. It wasn't the current version for very long. But so this is about as long as I've been doing Python. This has been kind of codified as a protocol.
11:00
And particularly if you're newer to the language and you hear people talking about Python doing duck typing and you don't really have a clear example in your head what this means. That is what this means, is that Python's iteration protocol is an example of duck typing. And if you're not familiar with duck typing, this is a case where if it walks like iteration and it quacks like iteration, it's iteration.
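As a small illustration of that duck typing (a sketch with hypothetical names and data, not code from the talk), one function that only uses a for loop can accept quite unrelated types:

```python
def count_items(iterable):
    """Count items using nothing but a for loop."""
    total = 0
    for _ in iterable:
        total += 1
    return total


readings = [21.5, 22.0, 19.8]          # list
members = {101: "Ana", 102: "Bo"}      # dict: a for loop walks its keys
text = "Call me Ishmael."              # str: a for loop walks its characters
squares = (n * n for n in range(4))    # generator expression

counts = [count_items(x) for x in (readings, members, text, squares)]
```

`count_items` never checks a type; it relies only on each argument following the iteration protocol.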
11:28
Okay? So this is a protocol. And for this protocol, turns out you need two things. Okay?
11:40
I realize this is sort of looking like we're defining one thing by variations of the same thing. What's this? What you need for iteration is you need an iterable object. I'll tell you about those next. And there is also a need for an iterator object.
12:00
And I'm going to talk about that after I do the iterable, but in fact, most of the time, Python actually takes care of the iterator for you. So it's not something that you need to worry about a lot most of the time. So before I realized I was going to go way, way over and I had to sort of squeeze things down, I was going to go through the whole Python glossary thing on iterable, but instead, nope, you got the TLDR version.
12:29
So basically, an iterable is something that can return its members one at a time in order, like a list, string, tuple, anything like that. And basically, that means one of two things.
12:45
It can be a class with a dunder iter method that returns an iterator, which again, I haven't talked about yet, so put that on hold. Or it can be any class that you make that has a dunder get item method that follows sequence semantics.
13:06
And by that, we mean you start at the beginning from zero and count up using integers to get to wherever you want to be in the sequence. So it's one of those two things. Can be both, but basically, those are the two things.
13:21
And what happens is, when you do a for loop, if you hand it an iterable, it's going to create an iterator sort of anonymously behind the scenes to manage the for loop for you. Okay, so that's why you almost never need to worry about it because it gets done automatically.
13:42
So again, to restate, if you use the iter function, which is a built-in function, on an object, in order for it to be an iterable, it needs to return an iterator. So it's either got a get item that we can use square brackets and an index to, or it's got an iter.
14:04
So let's actually talk about doing that. Now, there are a couple of different ways that you can check to see if you've actually got an iterable. So you can look and see if it's got an iter method. So if we've got a list, I can check and see if my list has a property dunder iter.
14:31
And since it's a list, it does. So you can check that. But that's only one of the possibilities. The other possibility is, does it have a get item that follows sequence semantics?
14:44
Well, you can check and see if it's got a get item, but it still might not be following sequence semantics. Dictionaries, for example, have get items, but they don't take integer indexes, so that wouldn't work. So that's a little bit trickier to decide.
15:02
Basically, though, remember, we're programming Python. So we have the philosophy it's easier to ask for forgiveness than permission. So what you can do is you can just try calling iter on an object, and either it will give you a type error or it will give you an iterator. If it's an iterator, then yes, you have an iterable object.
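A minimal sketch of that ask-forgiveness check follows; the helper name `is_iterable` is an assumption for illustration, not something from the talk or the standard library:

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


list_has_dunder_iter = hasattr([1, 2, 3], "__iter__")  # the attribute check
list_ok = is_iterable([1, 2, 3])
str_ok = is_iterable("hello")
int_ok = is_iterable(42)  # an int has neither __iter__ nor __getitem__
```

One caveat worth keeping in mind: `iter()` accepts an object that defines `__getitem__` without verifying that it follows sequence semantics, so a problem there may only surface once iteration actually starts.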
15:25
So what I want to do, first step is I want to make the minimum possible iterable. Now, this is kind of modeled on an iterator that is in the itertools library. But basically, I just want to make a thing that repeats itself.
15:44
I'm trying to keep this as simple as possible. So basically, if I make a repeater object and I tell it that what I want it to repeat is hello, and I give it the number four, I can loop over it and it will give me four hellos. Simple enough, right?
16:02
So let's see what that's like. So I'm going to, and as I say, here's the exciting part, where we'll see if I can actually manage to write this class without having something horrible go wrong. So let's see here. We need an init, and I want to give it the value that it's going to be echoing to me.
16:28
And I'm going to give it the limit as in the number of times I want it to repeat. So here, this is just a little bit, see, slaf. I already did it.
16:42
So self.value equals value, self.limit equals limit. Okay, that's good enough for my init. Now, the other part I needed was get item, right?
17:01
So let's see what that might look like. Self, and I'll call that index. Oh, thank you. That's very helpful. I know I did that when I was practicing,
17:22
and it took me like two or three minutes to go back and figure out what I did. It kept saying, this is not an iterable. Yes, it is. No, it isn't. So yes, so got that. So here, we need to do something that will sort of satisfy sequence semantics.
17:41
So I'm going to do, we need to start from zero. If it's less than zero, this is really not going to work out well for us in this case. And we don't want to go over our limit. Otherwise, why would we bother to set the limit there? So if it's, if it's, ah, thank you.
18:01
That was the other thing I kept screwing up. Ah, I was just testing you. Ah, there we go. Okay. So if that's the case, I'm going to return self.value.
18:20
And what happens if we go over? It's the other question that's not quite solved. So what we do there is, if you want, out of my four element repetition,
18:41
you want item 99, I'm going to say that's not a good index, and I'm going to throw you an index error. Okay, so we got that. And, um, yeah, I don't think we screwed that up. So, so we actually have then, this is a class for an iterable.
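The live-coded class is not in the transcript. A reconstruction along the lines described, assuming the class name `Repeater` and the attribute names `value` and `limit`, might look like this:

```python
class Repeater:
    """Iterable that hands back `value` for indexes 0 .. limit - 1."""

    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __getitem__(self, index):
        # Sequence semantics: count up from zero, stop at the limit.
        if 0 <= index < self.limit:
            return self.value
        raise IndexError("Repeater index out of range")


repeat = Repeater("hello", 4)
first = repeat[0]                            # a valid index
has_dunder_iter = hasattr(repeat, "__iter__")  # no __iter__ was defined

try:
    repeat[99]
    out_of_range = None
except IndexError as exc:
    out_of_range = exc                       # index 99 is past the limit
```

Note that the class defines no `__iter__` at all; `__getitem__` plus the `IndexError` is the whole interface.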
19:02
At least I'm saying that. Let's see if it actually is. So, I'm going to make an instance of my repeater class here. I want it to say hello four times. Ah, and, ah, then I want to see if it, um,
19:22
if it actually does have a dunder dunder iter method. Okay. It doesn't because I didn't define one. Okay, fair enough.
19:40
Does it have a get item with sequence semantics? Well, we could mess around with that a little bit, but I think basically the easiest thing to do is just to see if I can get something from index zero. And I do. So, okay, so we've got something that kind of looks like a sequence object.
20:02
Let's see how that actually goes. So can I get an iterator from it? This is sort of the real test. If I can't get an iterator from it, then this whole talk is over, right? So, ah, maybe that's a hint. I don't know. But let's see.
20:23
So, yeah, we can get an iterator from it. Now, remember what I got there was really pretty simple. Ah, so if it's an iterable, it should work in a for loop, right?
20:42
So there we go. So that, that little tiny class is all that you need to have it be an iterable. Ah, and, you know, list comprehension. Oops, let me do that again. There we go. There we go. Okay, so that works too.
21:02
So basically what happens behind the scenes is, as I say, every time that repeat, my little object, gets passed to a for loop, the iter function is called on it and an anonymous iterator is created behind the scenes, and that's what manages how many hellos we get, okay?
21:23
Um, and when you get to that point where you're off the end of your object, when you hit that index that's one too far, the iterator knows to pick up the index error and stops the iteration and then every time we do that, we get a fresh iterator.
21:44
Otherwise, we would only be able to do the loop once and I'll talk about that in a second. So again, this, this was the minimum case we had. The get item was the bit that we actually needed to make an iterable and the index error was the part we needed so that our, our iteration would stop.
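To make the behind-the-scenes part concrete, here is a rough stand-in for the iterator Python builds for a `__getitem__`-only object. The generator `getitem_iterator` is a hypothetical helper written for illustration; the real mechanism lives inside the interpreter:

```python
def getitem_iterator(obj):
    """Ask for obj[0], obj[1], ... until IndexError says we are done."""
    index = 0
    while True:
        try:
            item = obj[index]
        except IndexError:
            return  # running off the end stops the iteration
        yield item
        index += 1


class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __getitem__(self, index):
        if 0 <= index < self.limit:
            return self.value
        raise IndexError(index)


repeat = Repeater("hello", 4)
builtin_result = [item for item in repeat]        # Python's own machinery
emulated_result = list(getitem_iterator(repeat))  # the sketch above
second_pass = list(repeat)                        # a fresh iterator each time
```

The iterable itself stores no position, which is why looping over it a second time works.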
22:02
You can, for amusement, and I've done this before, I don't recommend doing it very often though, you can leave off that part and just have it go forever. Ah, one hopes you have some other strategy for getting out of the loop then, ah, short of turning off your computer and walking away in disgust. At least that's what you had to do in the, in the bad old days.
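A sketch of that never-ending variant, with one possible exit strategy. The class name `ForeverRepeater` is hypothetical, and using `itertools.islice` to cap the loop is a choice made here, not something shown in the talk:

```python
from itertools import islice


class ForeverRepeater:
    """Variant with no IndexError, so iteration never ends by itself."""

    def __init__(self, value):
        self.value = value

    def __getitem__(self, index):
        return self.value


# islice stops after five items; a plain for loop would need its own break.
first_five = list(islice(ForeverRepeater("hello"), 5))
```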
22:22
Ah, so, so that's it. It's really very easy to make an iterable. But that's kind of just taken that question we raised at the beginning and just kind of kicked it on down the road. It hasn't really answered how does it know which item is next because clearly the iterator is just responding to,
22:43
or the iterable, excuse me, is just responding to being given an index and it feeds back an item, it doesn't know what's next. The loop itself doesn't seem to know what's next. Your loop doesn't even care what you're giving it as long as it follows the protocol. So it's the iterator part that is actually keeping track of what's next.
23:03
And anything that does that for us is an iterator. And the key thing for an iterator is that it has a dunder next method. Or for reasons that I've never really managed to make up a good explanation for, in Python 2.x it's just next without the dunder.
23:22
I don't know why that happened. Ah, but, um, obviously it seemed like a good idea at the time. But, ah, it's, um, it's got a next that gives you the items in turn. So that's what an iterator is. It's one of these things that will manage where you are in the sequence and keep on giving you the next.
23:41
When you get to the end, it doesn't raise an index error like our iterable did. Instead it raises a stop iteration exception. And that it knows means no more loop. Um, and if you keep trying with that same iterator, it'll keep on telling you stop iteration. It's over.
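Spelled out by hand, a for loop over a list does roughly the following. This is a sketch of the protocol using the built-in `iter()` and `next()`, with hypothetical sample data:

```python
temperatures = [18.5, 21.0, 19.5]  # hypothetical sample data

seen = []
iterator = iter(temperatures)      # the for statement does this step for you
while True:
    try:
        reading = next(iterator)   # ask the iterator for the next item
    except StopIteration:
        break                      # the iterator has nothing left
    seen.append(reading)           # this line plays the role of the loop body

# An exhausted iterator keeps raising StopIteration.
try:
    next(iterator)
    still_exhausted = False
except StopIteration:
    still_exhausted = True
```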
24:00
Basically iterators are one-use objects. You can make one that wouldn't be, but this is the way that they work. The other thing that an iterator needs is it needs a dunder iter method. And that dunder iter method returns the object itself.
24:22
Which means that all iterators can be used as iterables. So, um, it's like, huh? We'll talk a little bit about this in a second. The thing with iterators though, again, is one-use and done.
24:40
They don't magically refresh. You can't reuse the same iterator. You're always going to get a new object. So, I want to do the same thing, but I want to do it with an iterator now so that you can see how that works. So, let's back up here. We need to do two things. We need to have a next, and we need to have a dunder iter method.
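Both properties can be seen with a built-in list iterator. A short sketch with hypothetical data:

```python
keys = ["a", "b", "c"]
iterator = iter(keys)

returns_itself = iter(iterator) is iterator  # an iterator's __iter__ returns self
first_pass = list(iterator)                  # consumes every item
second_pass = list(iterator)                 # nothing left the second time
fresh_pass = list(iter(keys))                # a new iterator starts over
```

The list is the reusable iterable; each iterator obtained from it is good for one pass only.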
25:04
So, again, let's sort of do this. Help me out here. So, first thing I want to do is, again, a little bit of housekeeping.
25:22
We'll do the same sort of thing. We'll do that. So, obviously, that's just the same business.
25:40
And we want to do one more thing because we're going to actually need to keep track of where we are in this sequence. So, I'm going to call it count. And I'm going to start counting how many objects we've iterated at zero. So, I'm just kind of setting this up as part of my dunder init. So, then, what was the next thing we needed?
26:05
Audience participation. Next. Well, or either, whichever, but, yes, next is more fun, so we'll do that one. Okay.
26:21
And, again, we can kind of whatever. I mean, this is a little bit simpler. I do not mean to do that. There, stop that. There. So, we want to be sure we're not past our limit, basically.
26:49
So, if we're still in the iteration business, remember, we started count at zero, but we don't want to go past our limit. So, if we're still able to do that, we kind of need to do two things.
27:03
We need to type properly is one thing. First, we want to keep track that, oh, yeah, we've probably used up one of our iterations, so we'll do that. And, then, the other thing we would do, of course, is return self.value.
27:23
Okay. And, if we are at the limit or over, if we've gone past where we should be, I'm sorry about that. Pardon me?
27:40
Oh, I'm sorry, I'm not hearing you. Well, yeah, that's probably the wise thing to do. Oh, I see, I lost the whole thing. That would have been embarrassing.
28:01
That would have not been good. Okay. Else. So, let's see here. We want to raise. Stop iteration. There we go. Okay. There we go. Let's move ahead.
28:21
Let me advance this quickly. Okay. We're running out of time. I know I'll get in trouble if I go too far. So, basically, I need to back up here. And, I'm going to make a repeat iterator.
28:40
Does it have an iter method? Well, you can guess that, yes, it does have a dunder iter method. And, I will quickly do this. Pardon me?
29:07
Get back to repeat iterator. Pardon me? Get back to repeat iteration. Oh, I didn't actually. Oh, oh, oh, oh.
29:23
Thank you. Well spotted. All right. There we go. Okay. Now, I'm going to do this. And, we should be good. So, there. We got one repeat. All right. I'm going to move ahead here.
29:41
And, this is the last question. Because, we're not going to repeat. We're running out of time. I didn't go nearly fast enough. But, if I do a for loop here, this is a trick question. What happens now?
30:04
We should get three because I've already called next on it once. Oops. Oh, dear. Oh, that's because in my hurry, in my hurry, I forgot to add the iter method. There we go.
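The slip is easy to reproduce. In this sketch, the hypothetical class `NextOnly` has a `__next__` but no `__iter__`, so calling `next()` works while a for loop refuses the object:

```python
class NextOnly:
    """Has __next__ but no __iter__, so it is not a complete iterator."""

    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        self.count = 0

    def __next__(self):
        if self.count < self.limit:
            self.count += 1
            return self.value
        raise StopIteration


partial = NextOnly("hello", 4)
first = next(partial)        # calling next() directly works

try:
    for item in partial:     # the for statement calls iter() first
        pass
    loop_error = None
except TypeError as exc:
    loop_error = exc         # the object is reported as not iterable
```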
30:21
So, yeah, we're running out of time. So, I clearly mistimed this. There we go. Make that.
30:42
Make that. And then, if we do that now, we have four because I didn't exhaust it. Now, if I were to do this again, if I were to run this again, nothing happens.
31:00
Right? Because it's exhausted. And, in fact, if I then want to, let's fake this. I'm going to do this one last thing. I'm going to call next on it. Next six probably won't work. What happens now?
31:24
So, for those of you who know, yes, you get a stop iteration thing. It's only when you call next that you're going to see that. So, in any case, this is a little bit beyond what I was going to do. But, the slides are there, and I will post them. And, you can look at the Jupyter notebook, and you can contact me.
31:40
And, thank you very much for your time. Thank you. Yes.
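For reference, a reconstruction of the finished iterator from the second half of the talk. The live-coded version is not in the transcript, so the class name `RepeatIterator` and the attribute names are assumptions:

```python
class RepeatIterator:
    """Iterator that returns `value` until it has done so `limit` times."""

    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        self.count = 0  # how many items have been handed out so far

    def __iter__(self):
        return self  # an iterator is its own iterable

    def __next__(self):
        if self.count < self.limit:
            self.count += 1
            return self.value
        raise StopIteration


repeat = RepeatIterator("hello", 4)
first = next(repeat)   # uses up one of the four items
rest = list(repeat)    # the remaining three
again = list(repeat)   # exhausted: nothing happens

try:
    next(repeat)
    final_error = None
except StopIteration as exc:
    final_error = exc  # only a direct next() call exposes StopIteration
```

Creating a new `RepeatIterator` is the only way to get four hellos again, in contrast to the `__getitem__`-based iterable, which can be looped over repeatedly.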