Kung Fu at Dawn with Itertools
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Part Number | 18 | |
Number of Parts | 169 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/21239 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
00:00
Maxima and minimaArmElectronic meeting systemMetropolitan area networkWebsitePresentation of a groupAgreeablenessElectronic mailing listLoop (music)Level (video gaming)Point (geometry)AbstractionFormal languageFocus (optics)State of matterSocial classElectric generatorMultiplication signNumberElement (mathematics)Subject indexingFunctional (mathematics)WordRange (statistics)Term (mathematics)OrbitRight angleVideo gameLattice (order)Computer fileCuboidAreaIterationMappingKey (cryptography)Line (geometry)Data dictionaryFerry CorstenOrder (biology)Single-precision floating-point formatString (computer science)Type theorySystem callSpring (hydrology)INTEGRALException handlingSheaf (mathematics)Series (mathematics)Monster group1 (number)Limit (category theory)Data managementBuildingCASE <Informatik>WritingNormal (geometry)Mass2 (number)Arithmetic meanObject (grammar)Letterpress printingCodeComputer animationLecture/Conference
09:12
Term (mathematics)Element (mathematics)Multiplication signStack (abstract data type)Object (grammar)IterationProcess (computing)Program slicingFunctional (mathematics)NumberPrime idealString (computer science)Positional notationElectronic mailing listCondition numberDefault (computer science)ExpressionCountingLimit (category theory)Range (statistics)Loop (music)TypprüfungTrailNatural numberGenerating function2 (number)Object-oriented programmingRootPrime numberCodeSquare numberProof theoryLine (geometry)DivisorMereologyVideo gameWater vaporEquals signCartier-DivisorSemiconductor memoryState of matterSoftware frameworkElectric generatorSocial classBuffer overflowPoint (geometry)Power (physics)Field (computer science)CASE <Informatik>WordInheritance (object-oriented programming)Identity managementSimilarity (geometry)Division (mathematics)Symbol tableGroup actionMathematical optimizationRight angleSource codeFormal languageNoise (electronics)DataflowEstimatorMathematicsStability theoryChannel capacityFrame problemComputer animation
17:53
Metropolitan area networkElement (mathematics)System callElectronic mailing listIterationPrime idealState of matterFunctional (mathematics)String (computer science)Electric generatorCASE <Informatik>XMLLecture/ConferenceComputer animation
18:37
SineAreaIterationRange (statistics)Functional (mathematics)ForestGroup actionElement (mathematics)Line (geometry)String (computer science)CausalityObject (grammar)IntegerAlgorithmManifoldDifferent (Kate Ryan album)GradientKey (cryptography)Point (geometry)ResultantBit rateFrame problemCoefficient of determinationNumberFrequencyCondition numberMathematicsRootMultiplication signThresholding (image processing)Product (business)Revision controlSystem callDivisorPredicate (grammar)TrailWave packetComputer fileServer (computing)DataflowWavePrime idealCASE <Informatik>Duality (mathematics)Maxima and minimaSquare numberPattern languageLimit (category theory)Prime numberSeries (mathematics)MereologySemiconductor memoryLetterpress printingCuboidDrop (liquid)Loop (music)Lambda calculusInteger factorizationElectric generatorInheritance (object-oriented programming)WordFreewareStandard deviationElectronic mailing listUniqueness quantificationBuffer overflowComputer animation
27:13
Mach's principleGroup actionEqualiser (mathematics)Negative numberPoint (geometry)IntegerElectronic mailing listKey (cryptography)ChainMultiplication signElement (mathematics)outputFunctional (mathematics)MathematicsPolynomialLengthThresholding (image processing)2 (number)NumberDefault (computer science)Hash functionCodeWordSeries (mathematics)Data compressionQuicksortIterationLoop (music)CountingCycle (graph theory)Set (mathematics)ExpressionTrailVideo gamePower (physics)String (computer science)Ferry CorstenRevision controlRow (database)Event horizonCondition numberComputer programmingKernel (computing)Chemical equationExtension (kinesiology)Staff (military)Data dictionaryForestCASE <Informatik>Right angleTable (information)GradientWater vaporPairwise comparisonP-valueActive contour modelWeightForcing (mathematics)Line (geometry)Computer animation
35:49
Object (grammar)PixelCombinational logicSlide ruleBlock (periodic table)BuildingPoint (geometry)Type theorySocial classRecursionLoop (music)Video gameLine (geometry)DistanceFreewareRow (database)Condition numberCodeFactory (trading post)TupleConstructor (object-oriented programming)SummierbarkeitFunctional (mathematics)Formal languageMathematicsProduct (business)Functional programmingIterationoutputCartesian productResultantCommunications protocolField (computer science)Inheritance (object-oriented programming)Endliche ModelltheorieFitness functionPerfect groupHill differential equationArithmetic meanRight angleModule (mathematics)CASE <Informatik>MereologyWebsiteSurfaceDampingElement (mathematics)Electric generatorLevel (video gaming)Form (programming)Office suiteBit rateComputer animation
42:33
Line (geometry)Software frameworkSet (mathematics)Constructor (object-oriented programming)Design by contractPattern languagePoint (geometry)TelecommunicationGroup actionFormal languageParameter (computer programming)CASE <Informatik>Programming languageInheritance (object-oriented programming)MathematicsComputer animation
Transcript: English(auto-generated)
00:00
Our next presenter is Victor Teron. He's site reliability engineer at Google. And he will present us a Kung Fu add-on with either tools. Welcome. Thank you. And this is not Kung Fu, but it was a super cool image,
00:21
so I had to use it. We will make up for it later. Also, I have a lot of things to say. If any of you disagrees, please raise your hand and disagree. We all know how we loop over a list in Python, right? We are told this all the time. We have to say for thing in whatever, do thing. For example, here we're looping over the elements.
00:41
And we are always told. That's way better than doing this, which we might be tempted to do when we come from other languages. We could say, OK, I'm going to generate all the indexes, and I am going to do stuff with the indexes, accessing each one of the elements. And even worse, we know we don't want to do this. Like even instead of using range,
01:01
we could try to increase the value of index by hand. We don't do this. So we know, we have to use what is Python for loop, which in other languages is a for each. And we know that's going to give us a higher instruction level, which means the code is more readable, it's clearer. So we like it.
01:21
We don't have to think about what's going on. We only say, OK, loop over this, whatever it is, and it works, it's nice. Also, it's safer, and this is important, because we are not working with index, so we don't have to think. That's a major goal in life. We just have to do stuff. For example, if we tried to work with indexes by hand,
01:41
this is a off-way one error. We forgot, we went too far, and we tried to access the last element. It's not there. When we use a for each, a Python for loop, this is not something that can happen. And of course, it's not only going to work with lists. We can use it, for example, with strings. For each letter in the string,
02:01
well, letter is a name of the variable, we can access each one of the characters in the string. We can also use it with sets. We can loop over the elements, although we don't know the order in which we are going to see them. We can loop over dictionaries. For example, here, using items, we can get the key and the value of each of the mappings.
02:22
Even with files, we open a file, and then we use the, we loop over the file descriptor, and we can loop over each one of the lines. And that's the Python way. In the end, we can use loop with anything that's iterable, and that's in the docs. We take it for granted. Now the question, what's an iterable?
02:42
When we go to the docs, it turns out all the answers are not only in stock workflow, you can also go to the docs and read them. And it says, okay, an iterable, it's an object capable of turning one element at a time, blah, blah, blah, blah. It means our class has to implement a magic method, dunder, iter, dunder,
03:03
or also get item, but let's focus on the first. So, okay, we have to implement either this magic method. What's this magic method? And it says, okay, this method is going to be called when we need an iterator.
03:20
And it should return a new iterator. Okay, so we know we need to implement iter and make it return an iterator. So first, good, at some point, we will call either the built-in, and we will get the iterator, and it's true. If we have a list of numbers, for example, we can say, okay, print dunder, iter,
03:43
and yes, it's there. It's a method of the list object. And if we call either the built-in, we get the iterator, and yes, it's a list iterator object. So I'm not lying, but now, what's an iterator? Well, it turns out an iterator is something
04:03
that conforms to the iterator protocol, which means you have to implement two different methods. You have to first implement either again, and you have to return self, which makes sense, but at first, it doesn't look like, but you have to do that thing.
04:20
So this is, remember, we are writing our own iterator, and the docs are right. It makes sense, because in that case, you can use the iterator with a for and in statements. Anyway, we don't care. We will do this, we can follow instructions. And then we have to implement another magic method.
04:41
We have to implement next, and next should return the next element, and when there are no elements left, which it should raise stop iteration. I don't know what the second sentence means. It's something with low level, and this was next in Python 2, but nobody uses Python 2 anymore. And so now we have finally our clumsy and awkward iterator,
05:04
and we are looping over a set, a series of elements, and either is going to return self, and next is going to return the next element. Once we are no more elements, we went too far with index, we are going to raise stop iteration.
05:21
J, we did it, and we can see. Once there are no elements left, stop iteration. Now, something important. We know that when we work with iterators, next is going to return the next one, but we can only use next with iterators. So if we try to call next directly on a list, it's not going to work.
05:40
It's going to say, okay, no, it's not an iterator. You cannot do that. So we have to call first iter, and then we get the iterator, and then we can call next as many times as we want until we hit this stop iteration, which means no more elements left. And that's exactly what the for loop is doing
06:01
under the hood. It's using iter, it's getting an iterator, and then it starts calling next forever until it sees a stop iteration. So if we want to implement the for loop in pure Python for no reason at all, we will have to do something like this, and that's basically what in low-level, actual beautiful code, it's happening in C Python.
06:24
Now, something nice. We said, okay, your iterator, this monster we wrote that raises stop iteration, it's raising an exception when there are no elements left, but we are not seeing any exception when we work with Python, when we loop over something,
06:42
we are not going to see an exception. Why? Because the for loop is actually listening for this stop iteration, and that's the signal it uses to stop processing. So it's actually going to be expecting a stop iteration in one of the next calls, and when that happens, it breaks.
07:02
Now, generators. Generators are a special type of iterator. We all know how a normal function works. We start in the function, we do stuff, and then at some point, we finish and we return. And yes, that's obvious, but sometimes we don't think about it.
07:21
Whatever we have to return from our function has to be done at once. For example, here, we have a super simple function and it has a single exit point, so we do something in the first line, second line, and then we return a value. We can have multiple simple points. This is one-on-one. We can return the function either in the first return
07:43
or using the second, but once we hit one of the returns, we are going to finish and we will never be back. If our function doesn't have a return, that's a implicit no. So actually Python functions are always returning something. If we don't say return something, it's returning no.
08:01
Now, how is that a problem? Well, it's not, but for example, let's say we want to work with all the even numbers in the world. We want a getEven function that it's going to give us a lot of even numbers until a threshold. We could write it the simple, the ugly way or the Python way using rage,
08:21
which we cast to a list and it works. That could become a problem at some point. We want to work with a lot of numbers because this function, we said, okay, we have to return everything at the same time. So if I want to sum billions and billions, as Carl Sagan would say, of numbers, we can't do that because that's not going to fit
08:41
into memory. So that's when we use generators because generators are functions that are able to return values one by one, which means the state of the function is frozen and we don't use return, we use yield. And when we get to a yield, we return the value and the function freezes there
09:00
until we call the following next and over and over. That's super cool. Basically, if we have one or more yields in our function, it's not going to be a normal function, it's going to be a generator function and it's going to return a generator iterator, which for short, we refer to as generator.
09:21
So for example, here, simple generator isn't using return, it's using yield, which means we can call it up to three times and it's going to return values one by one. The last time, there are no elements left, so it's going to return us all iterator, stop iteration. So in this case, we said, okay, we want to return all natural numbers.
09:42
Okay, I'm going to do this. I have an endless loop, it's never going to work, but I don't care because I am only returning elements one by one. And in this way, I could keep calling this generator all the time. With a normal function, I dare you to try this, but if you try to run this, you will end up killing the process because it's never going to work.
10:01
We can never return all natural numbers as far as I know for at once. So PLDR, I wasn't paying attention. Iterators are object on which we call next. That's the only thing we care. And every generator is an iterator, but not vice versa. Also, and this is something great, and Alex Martelli wrote it once on Stack Overflow.
10:21
Everything you can do with a generator, you could do it with your own iterator class. But generators give you a basic framework. Basically, all the keeping state is already done for you. The framework is there. So if it's a simple, most of the time, for simple cases, we can use a simple generator
10:42
for extra points. How do we measure a generator? Can we call length? No, we can't because we don't, well, we don't generate numbers one by one. So if we call length, it's going to say type error. You can do that. We could call, we could cast it to a list, and then we could call length. What's the problem? Well, as we said before,
11:00
we are going to be building all the list into memory. Better, we can use sum. We can say, okay, I'm going to add a lot of numbers, one for each element in the generator. So we have to exhaust the generator, but in that way, we can count it. Usually for ask convention, we use a underscore as a throwaway variable
11:20
to signal we don't care about the actual value. Then wait, time for some fool. Three or five minutes left. I would need some water if that were possible. If not, it's fine. Do we have water? We do have water. Okay, that's the good life. Thank you.
11:40
Prime numbers, yeah. Prime numbers are great. How do we determine if a number is prime? There are a lot of actual ways, really mathematical ways, but I like this simple stuff. I am going to do it by hand. I am going to try all possible numbers because a prime number only have, only is only going to have two divisors, right?
12:02
Two and it's one in itself. Okay, I'm going to try all the other numbers. And if there is at least one divisor, you are not prime. Otherwise, you are prime. So I am going to do it this way. It works. Now, we don't have to do that really accurate thing. Do we need to do it by hand? We said, no, no, no, use range. And with range, we can loop elegantly
12:22
over all the candidate divisors. This is a mandatory optimization, although it's not strictly related to what we are seeing here. We only have to check divisors up to the square root. There are a lot of proofs. I didn't understand them actually, really, but Stack Overflow, again, there is a proof for humans there,
12:41
but it basically says, every non-prime number will have a divisor, the less than the square root or equal to. So, okay, that's a lot of improvement. Done for now. So now, integer, how can we use that to do something remotely useful? Interior factorization. So we want to decompose a number into factors.
13:00
So for example, eight is two times two times two. And simplest method, again, the trial division, which means we are going to try all the prime numbers and we are going to, okay, let's see, okay, can I define the number by this? So to write my solution here, I said, okay, I want to get the first n prime numbers.
13:22
How do we do it? Okay, I'm going to build an empty list and I am going to keep trying numbers. Okay, number, are they prime? Yes, then I put it into the list and finally I return it. It worked. Again, problem, we are returning all the prime numbers at the same time. So, okay, we are here. So let's use a generator. And in that way, it's exactly the same
13:41
but we don't use these memory in list. We can return prime numbers as we generate them. But now we are doing two different things here and that's how our code can break because we are keeping track of how many numbers we have generated so far and what's the next number we have to evaluate. Oops, oops.
14:00
And that's either tools count because we are in the other tools talk. Either tool counts, it's super simple but cute. It's going to make an iterator that returns numbers. And by default, we are going to count from zero and the default step is one. In other words, it's like range but with no upper limit.
14:20
We can call it, keep calling, keep calling it forever. For example, here, okay, next, next, next and we will get numbers. And how is this usable? Well, we can at least simplify our code a little bit. We can say, okay, for number in either tools count and that's never going to stop. But still, we are doing two things.
14:40
We are generating prime numbers and at the same time, we are counting. How many have we generated so far? So I'm going to do something. Okay, I'm going to split it in two different things. First, a generator of prime numbers and then a function get primes which is going to be using the other generator and it's going to say, okay, how many have we returned so far?
15:00
This many, okay. At some point, we have enough, we stop. Good, but we can make it even better because, okay, we have written a function there for primes, we can use a simple generator expression or arguably, maybe even better in this case, we can say, okay, okay, filter. I am going to filter all the numbers starting from two
15:23
and I am going to take only is prime because that's what the filter is going to do. Filter is going to lazily, one by one, it's going to check all the numbers every time we call next and it's going to return the next. That satisfies this condition. Filter, by the way, was either tools I filtered before but again, that's all history.
15:43
Get primes now becomes this thing. Okay, I have the counter and I have my list of primes because then primes is going to be, it's going to be returning primes one by one and we can count how many we need. But now, take how many? This thing is a common pattern actually. Give me the first n elements of this iterable
16:02
and it turns out, we cannot use slice notation because it's going to complain actually, it's going to say no, a generator is not subscriptable which makes sense because you are generating elements one by one, if you are doing that, how are you going to access randomly elements? You cannot do that but you can use iter tools is slice which is basically like slice notation
16:22
but doing lazily with all things iterable. So for example, if we have item tools count which means all the numbers in the world and we say okay, iter tools is slice, give it five, take five, it's only going to return the first five and then stop but we can use it not only with generators,
16:41
we can use it with, for example, with a string and we are going to get the first three letters or we can, and yes, as usual, we can use steps other than one or we can start counting not from zero but from other value. So if we want the first n prime numbers, we can say okay, iter tools is slice primes
17:01
and as many numbers as we need. So let's rewrite our function. Now we get okay, I have all the prime numbers in the world I am going to loop over this thing is slices returning and I am yielding them. Now that's okay, we are looping over a generator and the only thing we do is yielding them. That's what yield from was invented from.
17:23
Yield from has, well, it allows us to do older stuff but the important part, it allows us to delegate part of the work to a different generator which means every time we call next, yield from is going to ask the second generator,
17:41
okay, in turn, next, give me your value and it's going to return it. So it's basically the same thing as we were doing before. It's the same thing as doing for and in thing, yield that thing, we can do it within a same line with yield from. So here, spam is using yield from foo which means we are simply calling in turn foo
18:01
and returning the element. So each call to spam goes to foo and foo yields an element and spam yields it in turn and it works. Again, not only with generators, we can use it with anything that's iterable. For example, here we are yielding from a string.
18:20
So we can rewrite our function as this. We are simply, okay, we are taking a list of primes and we are taking how many did you say you need and it works. For example, in this case, five and it works. 18 minutes so far and I have not answered the question because we are actually not going to need this.
18:43
We don't want to get the first n prime numbers. We want to get a series of prime numbers up to the square root. That's what we said we have to do. So no. So we have to actually generate generators up to a value.
19:00
That's the threshold. We don't know how many. We know the maximum value. So we could say, okay, I'm going to do it the naive way. I'm going to have this primes until function and I am going to loop over the primes. At some point, I am going to hit the limit and then I will stop. If the value is less than the threshold, I will yield this prime number.
19:22
And it turns out this is also a common pattern. This thing about looping until a condition is no longer satisfied and that's take while. Itertools take while is going to make an iterator that's going to return elements and it's going to stop when the condition is no longer true. That's usually seen with Lambda functions
19:42
but it's not mandatory. So for example, here we say, okay, you have numbers which is a lot of numbers and I want to take numbers while Lambda X, X less than four. So I want to take numbers as long as they are less than four. So the first time we call next on the iterator, it's going to say, okay, I got one
20:01
because it's the first element. Is one less than four? Yes. So I yield it. I return it. I return it, sorry. The same thing is going to happen with two, with three but not with four because four is no longer satisfied the condition. So it's going to stop iteration
20:20
because there are no elements left. Again, we don't have to use Lambda functions. If we had a function here which happened to be smaller than 10, we could say, okay, take while where the predicate, the condition is this function and we are going to evaluate it for each value. So our kung fu version of the thing we needed is this.
20:43
I am yielding from take while where take while, what's the condition? Okay, that we are not yet there. We have not hit the threshold. And what about the opposite? We have it. It's drop while. Drop while, it's going to make an iterator returning again elements but it's going to ignore all the elements
21:03
that make the condition true. At some point, the condition is not going to be true anymore and then we start up to the end returning elements. So for example, here we are ignoring numbers as long as they are less than four. So the first time we call next, what's the drop while going to see?
21:20
It's going to see one but one is not less than four. So we drop it. Same thing with two, same thing with three. At some point, we reach four. Four is not less than four. So we have to return it. And then we go on and on until the range is over. So combining both things, we can write this Qt function to get prime numbers
21:41
within a range. We can use take while. That's going to return an iterator and we can feed it. We can pass it along to drop while. And in that way, we can generate all the prime numbers between start and stop. And that's, and yes, we could specify different values, of course.
22:01
By default, we will generate everything from two to infinite but we could specify other values. How is this useful? Well, because finally, to answer this thing, we have the algorithm for integer factorization. And we don't really care but we can glance over it. We have to loop over all the prime numbers
22:21
and for each prime number, we try to divide the number by the prime number. If we can divide it, the prime number is a factor. We save it and we keep doing it recursively. So very quickly, we have 30 is 30 divisible by two. It is, so two is a factor. And what's the quotient, 15.
22:42
So we start doing it recursively with 15. Then we start with the first prime number, it's two. It's not divisible. We move on to the next and so on. Eventually, we reach the base case if the, and we will be over. So here, we are using primes until our need function
23:01
using iter tools under the hood. And we could chain, like every time we find a factor, we create a new list and we concatenate it to the reverse if call. And well, and we can see that it works. Not that we are actually going to using production, but it's there.
23:21
More config. Now, the unique command. I need water. Sometimes we have used unique command. Usually Stack Overflow tells us to use it. It's a new copy and paste without knowing what you're doing. In your servers.
23:41
Unique is going to remove duplicate elements in standard input, but only if they are together, which means when you are starting and you go to IRC free node and you ask, how do I remove duplicate lines in a file? This detailed UK, you have to sort, and then you pipe it to unique or this thing sort, that's you, right?
24:01
Okay, I want to write it myself. Why not? So let's say we only work with strings. So I have to do something like this. I have to say, okay, I am going to loop over all the words of the string. I am going to keep track of the result. And every time I see new letter, I have to say, okay, are you different than the last thing I saw?
24:20
If it's different, I add it. If it's not, I ignore it. So we will have to do it. There are maybe faster ways. This is a kind of simple way to do it. And eventually we return the result. Okay, here we can use group by. Group by is super cool, but it's also hard at first. I struggled to find examples that made sense.
24:42
This is the best I could find. So you have to use, and group by is going to group elements, yeah. It's going to make an iterator that's going to return, every time we call next, it's going to return a two element tuple, the key and the group. And what's a group? Well, a new group is going to start
25:02
when the value of the key changes. The group objects themselves are an iterable thing. So here we have this string with repeated elements. We say, okay, group by. Tell me what's groups. And it tells you, okay, it's a group by object. Thank you, Python.
25:21
So now we say, okay, next groups. And it's going to give me, okay, the first group, which is the first key and the first group. And I say, okay, what's the key? And it says, it's A. And what's group? It's a grouper object. Thank you again. But well, it's an interval.
25:40
So I will say, okay, I will keep calling next until something happens. And it turns out, I can call it two times. And it says, okay, you got an A, you got an A. And then you got a stop duration. I didn't like you. So I don't know what's happening. This is one of these cases where the docs have improved a lot and still are hard to newcomers.
26:02
Thank you. It turns out the key is what the group contains. For example, A and the group are the actual elements. So let's go to B. It's going to tell us, okay, the key is B because this group has a lot of Bs.
26:23
And the group itself is B because there was only one B. And the same with C. We have the key is C because in this group, we have a lot, we like Cs. And the group is C, C, C because we have three Cs. And this is super silly, right? How could we use this to something useful?
26:41
Well, for example, here, we can write a loop form. Maybe it's simpler this way. We can say, okay, print the key and the group. And we can say, okay, first we saw a group that's As and it had two As. Same with B and with C. We have three Cs, okay? That's useful because, for example, here,
27:01
we wanted to remove duplicate elements. If they are together, we can say, okay, I am going to group by all the elements. And every time I meet, I find a new group, I am going to take the key. I don't care about the group because I want to remove duplicate elements. So basically, each key of each group is going to mark the beginning
27:21
of a new set of distinct characters. And we can do it like that or even in a simple way. Something else. This is a common interview question. You can find it online a lot. I want a compression algorithm. I am going to take a string and I am going to use the counts. So if I have three As,
27:40
I am going to replace it with a three. How do we do it? Well, if you don't know the power of eye turtles, your life is pain and misery and toils and blood. Blood, especially. Because you are supposed during an interview to come up with something like this. You have to keep track of what's the current group,
28:01
what's the number of times you've seen this last thing. Every time you start seeing a new group, you have to restart your counter. You have not to forget that eventually you're going to exit the for loop but you still have something else in your hands so you have to append it at the end. Okay, no higher.
28:20
It turns out you can use either tools and you can say, okay, I am going to group elements and every time I find a new group, I take the key. And how many times, what's the count? Okay, I will count the groups. How many elements were there in the group? And if you do that, you chain the key and the count of elements, it's already in the group and you have your compression algorithm.
28:42
Now, key A, elements in the group AAA, that's still strange. And that's because by default, group by is using the equality. So A is equal to A because A is equal to A. That's silly again, right? But it turns out the same way we've sort or sorted,
29:01
we can use the key function, we can use it with group by and we can say, okay, let's transform each element before we compare it. And that's known as the key value for each element. How is this useful? Okay, I want to group letters by length. I want to say, okay, I have a series of words and I want blue and gray to go together
29:21
because they have all four elements. So if we don't know better, although this is still good, we can say, okay, I'm going to use a hash table, a dictionary Python. It's going to be a default dict. And I am going to say, okay, for each letter, I'm going to count the number of elements and I am going to dot key in the dictionary and I am going to append the word.
29:42
Okay, but we can also do it with group by. I can say, okay, I am going to sort the list by dot key and I am going to group all the letters and each group is already going to have all the letters that have the same length. We could argue a lot, I mean, willing to do that, whether this or the previous question was better
30:03
and probably the other is faster. I will argue that this is clearer because it's basically, it's already in the name. It's grouping things by a criteria. In this case, the length, it's all there. Now my favorite, mathematics, run away. We have this function that I found online
30:23
and we can use NumPy to build a polynomial. That's all the mathematics I know. And we can evaluate it, we can call the function. And well, I am stuck here, I have to keep going. It looks like this. Okay, it's something.
30:40
I am going to say it's time versus something. Really, really important. And we are at work and we are monitoring this thing. And I have a super cool teammate trying to do this. And we spent three weeks in one single line of code. So we said, okay, how do we do this? We want to trigger an alarm.
31:02
We have to awake everybody if at some point we spent less than three seconds below zero. And here by hand, we are relaxed. We know the answer, right? It's true because between zero and three, there are at least three seconds there where we were below the threshold.
31:21
For simplicity, let's say, okay, I only care at exact integers. So we can evaluate it only there. So now the question is rephrased. Have we at some point being, let's say typo there. Are there three consecutive points with negative sign? You can do it the ugly way.
31:41
The beautiful way is like this. I am going to use not the equality thing for the grouping. I am going to say, okay, I want to group values by whether they are negative or not. And that's the key we use. The key is, okay, are you less than zero? And in this way, every time what's the key going to be?
32:02
Well, the key is going to be the key value for each element. So the key is going to be negative, true or false, which is going to say, okay, you are negative or not. And the group is going to have the actual values. So if we look over it, we can see that the first group, the key is false because all the elements in this group, according to these criteria are positive.
32:22
So the condition is the key is false and those are the four elements. But the second group, it turns out it's negative. That's the key. And we had at least three. So yes, the condition was true. We have to trigger an alarm. You can implement it without either tools. I would not do that.
32:42
Something simple. We want to alternate indefinitely between numbers, between the minus one and one. If we are absolute newcomers, how do we do it? We write a while true loop and we do that thing. If I got a minus one, I switch to a one and the other way around.
33:02
A little less beginner version would be, okay, I'm going to multiply by minus one because somebody told me that's going to give me the opposite. So it's simpler. But still, we're probably at least the first front row is going to die if we try to do this because it's never going to end. So we have to use the pro version. And what's the pro version?
33:20
Well, I was paying attention. It's going to be something I entered, right? Yes, it's a generator. And yes, we could write a generator. We could say, okay, while true. That while true is never going to end, but I don't care because I am going to be using the iterator. So every time I call next, I'm going to get the next element. But we have the confu version. The confu version is cycle.
33:42
Cycle, it's going to take an iterable and it's going to return elements one by one. And then it's going to start over. So we are going to loop endlessly over this input iterable. And this is nice. We are keeping a copy of the returned elements, which means if we were using a generator as the input,
34:04
the generator event is going to run out of numbers, elements, sorry, but cycle saved a copy so we can keep using it forever. As we can see here, g was a generator expression. At some point it ran out of stuff
34:21
because we cycled over it, getting one for nine, one for nine all the time. So if we try to use g again, it says stop iteration, you have nothing else. But the cycle with either tools is still working. Now if we want to iterate over two or more iterables, how do we do it? Because let's say I want to go from one to 10
34:42
and then I want to go down to two and keep doing that forever because I am playing Kerbal Space Program and I never get to launch my rockets. I want to do the thing. I want to cycle over two things, over the first list from one to 10 and then the second.
35:03
But Python is going to complain because cycle only takes one element. And by extension, how do we do it? If we want to loop over two different iterables, what do we do? For example here, we want to say one to three and then all the vowels. Do we do this? Like do we have to write two different for loops? Well, we have chain.
35:21
Chain is going to take a lot of things as input and it's going to make an iterator. And this iterator, this is super simple. It's going to return elements from the first iterator, then it's going to move to the second and so on. So for example here, we only have to say, okay, let's chain two different iterables,
35:40
the first list and the second list and it works. And then in turn, we can pass this along to cycle. More kung fu and we are five minutes. That's at least 50 slides. Okay. Rolling dices. If we want to roll, if we roll two six side dices,
36:02
that is two normal dices, how many unique combinations are there? Oh, I remember that from high school. That's the Cartesian product, right? So I'm going to do a nested for loop and then they tell you, okay, and if you have three and you say, okay, I will write three for loops, but at some point you have to generalize your solution.
36:23
And this is really hard for at first because, okay, how do I do it? Well, you can do it in different ways, but you can use recursion and you can say, okay, for each element, I am going to combine it with all these sub-products. You create this ugly thing, which I can prove it works.
36:41
But you have product and product is super simple. Product is going to make an iterator returning the Cartesian product of all the input iterables. If you want to compute the product of something with itself, you use repeat. So you, for example, here we want to roll,
37:01
yes, oh, oh, yes. We want to roll two dices, we use repeat two. And we can see that the first 10 things it's returning are those. So an easy question, if we roll four dices, how many outcomes, in how many outcomes are they going to add up to five?
37:20
Well, that's simple. I am going to use product. I am going to generate all the combinations, kind of ugly, I know, but I don't want mathematics. And I am going to filter all the results. And I am going to take all that satisfy a condition. What's the condition that the sum of all these sides equals five? And well, it turns out there are only four different results.
37:40
And finally, pixels. Let's say we're working with pixels, and lots of them. This is something we see a lot. We have a constructor, we have this init function, and we say, okay, row and column. And the constructor is only saying, okay, row equals row, column equals column. And then we have the repper, the magic method, because we want to be able to print.
38:02
We see this a lot, but this is a perfect example where we can use named tuple. Because the name tuple is this wonderful factory function in the collections module, which is going to create a new subclass of a tuple for us. Which means if our class can be unmutable,
38:22
it's going to make our life way easier because we only have to say, okay, I am going to create a new class called pixel, which has two fields, row and column. And that's going to do all the magic under the hood, and it's going to create the new class, pixel. And with that object, the new class, we can create new objects. And we get a lot of stuff for free.
38:43
If we need functionality, we can use inheritance. We inherit from this class, and we can add our own method. For example, here, the distance. I want to be able to compute the distance. And now, all this was an excuse, because I wanted to say, okay, I want to get all the neighbors. If I have a pixel, I want to be able to move up,
39:04
down, left, right, and to the four corners, I want to get all the eight neighbors of my pixel. I'm cool. I mean, this is intermediate level, so I know I have to use a generator. I am going to use yield, right? So I am going to do this thing.
39:20
I am going to yield eight times, and I am going to return, well, the eight neighbors. Yes, but it's a lot of lines for nothing, right? So we can do this thing. I am going to, instead of working with creating all my objects manually, I am going to use offsets. I am going to say, okay, at all times,
39:41
I am going to move either zero, minus one, or one in both axes. So using that, I can create the offsets, and I can return all the neighbors. Of course, when we get zero and zero as a product, that's the pixel itself, where we already are, so we don't have to use it. We ignore it. In all the other cases, we can create the object.
40:03
For super extra points, that's A plus, instead of hard coding the class name there, saying yield pixel, we can not do that. And we can say, okay, type self. Have you ever seen that? If I do type self, I can, CLS, for example,
40:23
you could use any other name, is going to be a new class, sorry, it's going to be the type of our class, and then we can use it to create the new objects in that way, if at some point we rename our class, we don't have to track down these things, and it works. So we have covered a lot of stuff.
40:43
I didn't actually believe, I will be able to. It's a lot of cool things that are there in Python, waiting for us. But still, we have only scratched this surface. There are a lot of things in the item tools module.
41:03
Even more important, all these things that are so interesting, well, if they are for you, all the building blocks in the item tools module are inspired in language like Haskell. So we have been saying, oh, item tools is so cool, when actually we were doing functional programming.
41:23
So maybe something worth looking into, if you haven't yet, because that's what we were doing here. And if there are any questions, I will happily take them. Thank you.
41:49
Okay, thank you very much. We have one minute from one question.
42:09
Well, great, so thank you very much. Do you say these constructs as a trade-off between readability and elegance or something like that, or for you, like this is? Sorry, the last sentence, a trade-off between what?
42:21
Between readability of the code and what the elegance of the brevity of the code, because for some people to understand this, they have to know about these high-level constructs and understand what they mean. I don't know if you understand my point. I was thinking last night about this quote about how somebody said, and it's a shame,
42:42
I don't remember the name. One person said, if a new programming language doesn't change the way you think, it's worthless. So this, at first, it's super ugly. The group by, I agree, it's super intimidating, because what's happening here? At some point, you start seeing the patterns,
43:01
and it turns out these are constructs that are happening a lot. So I would agree that for newcomers from the outside, this may make not a lot of sense, but once you change your mindset, I think there are problems that you simplify a lot. It's silly, from 100 lines to only two,
43:22
so in some cases. Thank you. Okay, thank you very much.