There's treasure everywhere
Formal Metadata
Number of Parts: 96
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/51851 (DOI)
Transcript: English(auto-generated)
00:04
Hello, Oslo. Hey, you guys. Thank you very much for having me. This is amazing. This is my third year, and I just love Norway.
00:20
It's just awesome. Let me tell you a few things I like, because it's been a wonderful couple of days so far. One thing I like is that I did some research, and it turns out that I'm an average Norwegian. 180 centimeters, unspecified weight, and I'm still working on getting the IQ up to the average.
00:48
So if I applied for a residency in Norway, I might be just your average guy in Norway, which would be awesome. And one thing I really admire about Norwegians is they speak English to perfection, like awesomely.
01:05
Like idioms, everything. They got everything. Even they got the not so nice parts of American English. For example, sometimes some Norwegian folks, I won't name names, they end a sentence like a question. You know, like, hello, I'm Jonas. I work at Cisco.
01:26
And I'm like, well, I don't know your name, but Jonas sounds plausible, and I can't fix your job at Cisco. I can't help there, but I'm glad you asked. Well, it could also be the opposite, which was new to me.
01:40
There's folks who say a question as a sentence. Instead of saying the rub, they go to the rub. And the other night, I ate at the hotel, and then I went to have a beer. So I go have a beer, and I sit at a table, and this guy comes, like a viking, like two by two.
02:03
Two meters by two meters. And he comes and says, are you having dinner? It was final, almost as if I had heard: you're having dinner. And I was like, well, I didn't plan to eat again, but there's nothing I can say, so I ordered a salad. What can you do?
02:25
So I presume that guy was actually a bouncer at the restaurant, because he was too solid and too muscular to be just a server. So maybe they were shorthanded, and they said, you know, Trygve, why don't you come help here, because we're shorthanded with the waiters.
02:43
And he was like, oh, sure, tips are good. I don't know why people always give me these great tips. For some reason, I don't understand. So anyhow, I love Norway, and I also like one more thing. I'm going to take 30 seconds for this, because I think it's wonderful. I like the, if I can pronounce it properly, egalitarianism in Norway.
03:05
Was it understandable? Alright, I like it. The other night I was watching TV, just flipping channels, not understanding anything. So I found a documentary from the prisons in Norway.
03:22
So they showed the interior of a prison cell, which looked awfully good. And it was eerie; it was like, oh, this is déjà vu all over again. I've seen this somewhere, so I'm looking at the prison cell, and it's like, this is familiar, where did I see this?
03:40
And then I look around me and it's like, oh, it's the same furniture as in this hotel room here. Same thing, I was amazed. But anyhow, well, you guys are great, but I gotta say, you destroyed Troy Hunt this morning. Like, the keynote. I thought he was hilarious. And everyone was like, yeah, well, I got that, next one.
04:03
Next joke. He had like one after another, I was chuckling a couple of times, I couldn't contain myself. And everyone was looking at me like, what's with this ass hat here? What's wrong with this guy? It was as if I were in a library and supposed to keep my mouth shut. So in the end it's like, okay, so Troy has this great talk, hilarious talk,
04:24
and I was like, why isn't everybody laughing? And there was, like, some golf clap. Okay, yeah, we appreciate your attempts. And then I talked to the guy, and I say, oh, this was brutal. And he says, oh, this is as expected. I've been in Norway before.
04:40
So, all right. Forget about all that. We're going to talk about efficiency today. I have like a great task on my hands right now. A great task on my hands. It's huge pressure. Here's the plan. Here's what we're going to do tonight. Last talk of the day, and I hope you don't mind, we're going to be late five minutes.
05:04
Because I have to tell all this stuff about Norway, right? We're going to be late a bit. I have many slides. I have much to share with you. New content. And my hope is that by the end, by the end of this talk, all of us are going to have a sort of a mini imaginary hat, if you wish,
05:24
which is our efficiency hat. Whenever we wear it, we're going to be able to look at things in a slightly different way. Just slightly different, which is going to allow us to find and improve those points in our programs that are less efficient than they might be.
05:43
And you're going to see tonight how we're looking at a few absolutely old classic algorithms, literally from 50 years ago. And we're going to improve them together in ways that nobody knew how to for 50 years. I'm not kidding.
06:01
First I'm going to talk about sentinels. Who can tell us what a sentinel is? Who's got the Viking blood in them? Come on. What's a sentinel? Yes, please. It's a special value that you plant ahead of an operation
06:20
to make sure that you can arrange computation in a more efficient way. Thank you. So awesome, right? So we have this sentinel notion. And that's great because there's a joke which I hope most of you don't know yet. There's this thing: how does a programmer find an elephant in Africa?
06:43
Does anyone know? Great, so this is new to you. There's one guy. Well, you can just zone out, tune off for a second here. So the joke is: you put an elephant in Cairo, and you start from South Africa and you go up north, and if you find an elephant, you're good.
07:03
You look at: are you in Cairo or not? If you're in Cairo, then you didn't find any, because the one you found is yours. See what I'm saying? Same about find, right? Let's look at the find routine, which is very simple. And I'm going to use the D language for examples here because it fits on the slides.
07:22
Thus I define a generic function in the D language. You just put the type arguments first and then the regular function arguments. And I'm going to use indexed access just because it's easy. I'm going to talk a bit about that later. So we're going to have a loop as easy as pie. So I have for i equals zero all the way to the length of the range
07:43
which is kind of a sort of an array. You can think of a portion of an array or a pair of iterators if you wish. I'm incrementing this good guy and if I found the element I'm going to break out of the loop and now what I'm going to return is the range from that particular position all the way to dollar which is a symbol for the end of the range.
08:02
So if I have an array with numbers in it I'm going to look for it and I'm going to return the portion of the array positioned at that particular number that I found. What if I don't find the number what do I return? Talk to me. I don't find the thing. When does the loop break?
08:21
When i equals length, and here I'm going to return r from length all the way to length, which is the empty range, the sliver of a range just at the end, which is nothing. Just like in the STL, remember, you return end when you're done searching and you've found nothing; in this case you return the empty range.
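Sketched in D, the baseline routine he describes might look like this (a reconstruction with illustrative names, not the slide code verbatim):

    // Baseline linear find over a random-access range with slicing.
    // Returns the slice starting at the first match, or the empty
    // slice r[$ .. $] if e is not found.
    R find(R, E)(R r, E e)
    {
        size_t i = 0;
        for (; i < r.length; ++i)
        {
            if (r[i] == e) break;
        }
        return r[i .. $];
    }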
08:43
This function has been implemented millions of times in millions of languages; everybody knows it inside out. So today we're going to make it faster. How do we make it faster? Obviously by using sentinels, I presume, right? So, ideas?
09:02
Talk to me. Sorry? Oh, yeah. So save this guy in a variable. Yes, actually it turns out that the compiler is very good at doing that already. So we're in good shape, thank you. That's a good start. It's the right hat.
09:20
Other ideas? Sentinels. Right, how about I take e and put it at the very back of the range, in place of the last element there, and then I run the search. What do I save, Hubert?
09:40
Yeah, so I've got to save that. What computation do I save? Awesome, thank you. Yep, this is it. Right, I find my elephant, and in the end I'm going to look at it: is it my elephant, which I probably painted pink,
10:00
you know, whatever, so I marked it in some way. Yeah, it's a good point. It's a good point, but Hubert, let me do the talk here, I have the microphone. Alright. So I saved this work and this is important to save even though I add some more stuff here and here because this is my core loop. This is where it's going to be like
10:22
disproportionately often executed, right? Awesome, so let's do that. I'm going to save, as Hubert said, I'm going to save the last element of the range in a temporary c, and I'm going to push e, just put e in its place there, and then I'm going to do the loop. There's this whole scope exit thing,
10:43
which means whenever this scope is exited, I'm going to put the element back, just to restore the previous state of affairs. And then, this is my savings: this is the core loop, this is where the action is, this is where it's at, right? This is where it happens. So I have nothing here,
11:01
so I'm going to save time. And here I do that whole cleanup we were talking about. Ah! Is my elephant just the elephant in Cairo? Is it my elephant, is it pink, etc.? So this is the extra work that I'm doing, but I don't care because I do it once. Right? So this is the fix-up. So that's the prelude, that's the core, this is the fix-up, and I'm done.
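Putting the prelude, core loop, and fix-up together, a hedged sketch of that sentinel version (a reconstruction assuming a mutable random-access range; not the slide code itself):

    // Sentinel find: plant e at the back so the core loop needs no
    // bounds check, then restore the range and fix up the result.
    R findSentinel(R, E)(R r, E e)
    {
        if (r.length == 0) return r;
        immutable last = r.length - 1;
        auto saved = r[last];          // prelude: save the last element
        r[last] = e;                   // plant the sentinel
        scope (exit) r[last] = saved;  // restore even if a comparison throws
        size_t i = 0;
        for (; r[i] != e; ++i) {}      // core loop: no bounds check
        // fix-up: did we hit the sentinel or a real occurrence?
        if (i == last && saved != e) return r[$ .. $];
        return r[i .. $];
    }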
11:22
Terrific. Now, a few details. I do this whole scope-exit thing because, what if something throws an exception? I just want to preserve the range. What if the range doesn't support random access? What do I do? This doesn't apply, right?
11:41
It's not going to apply anymore if the range does not support it... I've got to fall back to a conservative version which doesn't do all of these tricks. So there's a number of things I'm not going to focus on, but if you do want to dig into those, I have a talk — just search for ACCU 2016
12:01
and my unpronounceable and unwritable last name and you're going to find that talk and it discusses these things in more depth. I eliminated that part because I want to share new material with you that's not online, it's nowhere to be found.
12:21
Great. So, it doesn't work for all types, but I'm going to focus on things like integers and sort of the basic, the civilized types, right? Very nice. Alright, question. Does it work or not? What do you think? How much improvement are you going to get? I don't know — who gives it like 1%? Who gives 1% or more?
12:43
Oh, okay. Who thinks it's going to be negative? Negative performance. Depends on the range. Depends on a number of things, such as the range size, because now I do a write; if the range has like 5 elements, a write is going to be a disaster, right? Okay, let's say we kind of focus on the longer side
13:01
of the like, you know, hundreds, thousands, millions of elements because that's where it's at, right? So, does this help or not? Who gives it like positive performance? More than zero for large, long ranges. Thank you. Yes, great. Who gives it 5%? A few hands. 10%.
13:21
Anthony's like, I can't believe this is happening. Alright, so actually it turns out it's pretty good. It's pretty good. We have time in milliseconds here, and we have like millions of elements here. We're talking floating point — searching in a floating-point array — and just to make it clear for you, I plotted the speedup in percentage
13:45
compared to the baseline, which is the conservative, the simple find. And guess what? We get like awesome speedup there, between like 20 and 60%. Pretty neat, you know? I could roll with that. It's something I could learn to live with.
14:04
Yes? Great. Excellent. You would think, however, so far I bored you because you knew all this stuff. Or worse, if you didn't know it, you didn't know it because it's not fashionable.
14:20
It was, I'm not kidding, it was fashionable in the 70s. It was like, oh, those sentinels, you know, put that in a core of ferrite, put a sentinel there like, you know, scribble it by hand with a knife or whatever, that's my sentinel right there. You can literally hold it in your hand, right? It was a very popular technique.
14:42
And then it fell out of use. It's still very good actually, it turns out. You just have to do the proper specialization and introspection such that you apply it only when it's applicable. For example, how about this code again? How about multiple threads execute this code?
15:02
You don't — that's a bad idea. I mean, the last talk just talked your ear off about that, right? It's not what you want. Look at it, yeah, it's not what you want. Definitely not. So there's a number of situations in which this doesn't apply, but you should apply it whenever applicable. So, you know, but these are like old things.
15:23
Let's talk about new things. How about the workhorse, partition? Partition, the workhorse of quicksort. Who can define what partition does? Partitioning, like, an array. Partition. I give you an array, I give you a number, partition it for me.
15:42
What does it do? Splits into two parts. Splits into two parts. Well, what are the properties of those parts? Somebody else. Yes, please.
16:02
Awesome, so choose a pivot and then put the little things on the left and the greater things on the right. And that's your partition. It's used by quicksort, and quicksort is like a very important function, right? It's also used by median, nth element, which is also very important. You know, a bunch of algorithms, right? So, okay, so it's nice to optimize partition, right?
16:23
That's all I'm saying. So let's look at the baseline partition that is already highly optimized. I'm using a label break here, by the way. Like in Java. You put a label here, you put a label break here. I just want to do like zero redundant computation.
16:42
That's why I need this. Otherwise, I would need to use a go-to. And believe me, I never use go-to, except when I do. I never use it, except when I do. Because when I do, it's awesome, right? So I do it like only for a reason, but I never do it. I never use it. Never in my life, except when I do, right?
17:03
So because of this label loop here, I can go like this. Well, here's how it works. I have low and high. So I have two variables. I got to mirror this for you. So I'm doing some awesome brain thing in real time. Like this is left, this is right. Alright, you see? I'm an average Norwegian.
17:22
I have the IQ, right? Okay, so I go from the left. Then I find something that doesn't fit. That doesn't, you know, it's too great. It's greater than the pivot. And I come from the right until I find something that's less than the pivot. What do I do then? It's not on the slide.
17:41
It's not on the slide. Talk to me. Swap them. Whoop. Put them in place. And then what do I do? Go on. What do I do then? Swap again. And then what do I do? And then they meet and I'm done.
18:00
This is your partition function. I mean, think of this. It was 1956 when it was invented. At that time it was like one of the first algorithms ever. You know, it was amazing that it's like Tony Hoare, like genius. And let me tell you this. He also invented quicksort. It took five years for people to get the correct implementation of quicksort.
18:23
They had bugs for five years in it. I'm not kidding. So it was very hard back then. It's not like, ah, you know, I know how to do this. Hopefully. Alright, I hope I have no bugs here. So this is the loop that goes from the low portion up until I find this guy.
18:42
Right? This is my pivot because I put it here. Swap. I'm swapping into the first position just to put it on the left. So this is my part. I found the left bound. And then I'm continuing with going from the right with the other bound. With the high bound. And I stop if I have the pivot greater than or equal to the element, the current element.
19:04
And I found the right bound. And then I'm swapping and making progress. And this is my function. At the end I just do a little fix-up to put the pivot back where it was, and I'm done. And this is my partition function. It's a classic implementation. You would find a variant of it in the STL.
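For reference, one way this classic Hoare-style partition might be sketched in D (a reconstruction from the description: the pivot is swapped to the front, so r[0] == pivot stops the descending scan without a bounds check):

    // Classic partition: pivot moves to the front, the two scans meet
    // in the middle, and a final swap puts the pivot in its place.
    size_t partition(R)(R r, size_t pivotIndex)
    {
        import std.algorithm.mutation : swap;
        swap(r[0], r[pivotIndex]);
        auto pivot = r[0];
        size_t lo = 0, hi = r.length;
        for (;;)
        {
            do ++lo; while (lo < r.length && r[lo] < pivot);
            do --hi; while (r[hi] > pivot);  // r[0] stops this scan
            if (lo >= hi) break;
            swap(r[lo], r[hi]);
        }
        swap(r[0], r[hi]);  // fix-up: pivot to its final position
        return hi;
    }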
19:23
Now, there's a few subtleties about it, which are interesting. Here I could use greater than and it would still be a correct algorithm. Because I go with lo until I find something that's strictly greater than the pivot. I can just go through the equal elements.
19:41
I'm fine. And there's a very subtle problem with it. What is that? Who can tell us? It's an extremely interesting problem. And similarly, I could use greater-than here instead of greater-than-or-equal, which means I go past the equal elements — I skip the equal elements, I leave them in place.
20:01
What is the problem there? Genius. If all elements are the same or you have a lot of elements that are the same, what's going to happen? Somebody else. Somebody else. Another student here.
20:21
It's an interview question. Well, what happens? Let's say everything is equal. What's going to happen then? Well, what I have here is: this lo is going to go all the way up. I start from the left and everything is the same, so everything just goes through. And I say, oh, so my partition partitioned the array in a completely useless way, all the way on the right.
20:43
Whereas I want partition to have some fairness in it. If they're all equal I want somewhere in the middle. Because that's a nice thing. Why is it a nice thing? Because when you quicksort, if partition goes all the way then, what do you get? Quadratic performance. Awful. So partition must be fair.
21:01
Meaning whenever you have equal elements they must be fairly distributed along the two halves of the array. Very interesting. It turns out there are people who had these bugs and it was awfully difficult to diagnose. Because this is going to pass all unit tests.
21:20
Unit tests are not going to help with this. Agile is not going to help with this. Scrum is not going to help with this. See how I'm going? So it's not going to be helped by those techniques. So you've got to kind of really — it's difficult. Anywho, let's make it faster. Thoughts, ideas?
21:42
I'll give you a hint. Let's use a sentinel. How do we use a sentinel in this case? It's a bit more complicated. It's a bit more complicated. So consider this. The first line there puts the pivot in the first position in the array. At the front of the array I have the pivot already. So I know that the first element is kind of less than or equal to the pivot and also greater than or equal to the pivot which is good information.
22:06
But on the other side I don't have a sentinel. So I have one on the left side but I don't have one on the right side. What do I do now? Put one on the right. Save the element. Do what you did with find. So we could do this. I did it.
22:21
It doesn't save anything. It just doesn't add up. The saving is like 0.5%. It's like nothing. The function is more complicated for no good reason. So now I thought about how to minimize work here, because this is what we're looking at. We don't have an agenda to plant sentinels in places.
22:40
Right? We have an agenda to optimize software. And this is what we want to do. This is what we're up for. Right? So we're looking at minimizing work here. So consider this, for example. Tell me what's overhead and what's legitimate work. For example, is this work?
23:05
No, it's overhead. I'm comparing limits that have nothing to do with the algorithm. Is that work? Is this work? Yes, I'm comparing element against the pivot. I must do this. There's no way to not do this. I must look at the elements and compare them against the pivot.
23:23
This is like whatever I eliminate, this has to stay. If I don't do this, I'm not looking at all the array. I'm not doing the right thing with all elements. So this is work. This is overhead. I'm incrementing things. I'm doing these things. These are overhead.
23:40
This is work. And the comparison between low and high is overhead. So I have a lot of overhead mixed in with the legitimate work. So we want to reduce these overheads. These comparisons and these indices. This is also overhead, but I don't care. It's just at the end. It's not in a loop.
24:00
So who gives a damn, right? Great. So let's reduce work. And here's an idea. Here's an idea. How about we plant a sentinel in the middle and instead of going all the way to one end, we kind of put a sentinel in the middle and we go and we meet in the middle
24:20
and it just stops there. You can't because you don't know where the middle is. You don't know where you're going to end up, where those guys are going to meet up. You don't know. So we can't do that. So I thought about that for an hour or so. And it's crazy, right? Everything I'm telling you here and I'm saying it was easy, it's like, you know, sleepless nights, friends.
24:43
You know how it is. You do the same, right? I mean, come on. This is group therapy, right? Let's be honest. Okay. So we can't plant a sentinel in the middle. So we need to put one at each end. Now here's a key idea. Let's put sentinels at both ends indeed,
25:00
and we're going to create a vacancy. Who knows what a vacancy is? Vacancy. Not at the hotel. What is a vacancy in nuclear physics? Vacancy. Anthony. It's a missing electron in a crystal.
25:22
And what happens if you have a vacancy in a crystal? Attractions, yeah. There's some, you know, sexiness there, right? Some attraction going on. So wherever there's a hole, another electron is going to be prone to fill that hole, that vacancy, because it's an imperfection.
25:42
They're going to, by attraction, they're going to just want to migrate there. What's going to happen when that electron is going to get into the vacancy? Another vacancy. So, you know, it's essentially moving the vacancies around. And essentially, physically, what happens is that vacancy starts to behave like an electron with a positive charge.
26:04
Like a positron, except it doesn't explode when it, you know. So it's like a positron. It's a vacancy. It moves the other way. So electrons move this way, and the vacancies are moving the other way. And guess what? This is how semiconductors work. Like, there's a bunch of vacancies floating around this laptop right now.
26:21
I'm not kidding. All right. So we have this vacancy idea. So here's how we apply this idea. We create an imperfection in the array. So we have an array of numbers, and we put in sentinels, and then we create a vacancy, a hole in the array. We kind of say, ah, this is kind of an empty position in the array.
26:41
And then what do we do? Once, let's say, I have a vacancy on the far right here. What's the first thing I want to do? Fill it with what? Appropriately. You can't go wrong with that, right? So we come from the left, and we're looking for a big element,
27:00
something bigger than the pivot. Once we find it, we fill the vacancy. What happens next? What's the next phenomenon that happens? I have a new vacancy here on the left now. Oh, what do I do now? Genius. So I come from the right, and I've got to fill the vacancy. Whenever I find something little,
27:20
I put it in the vacancy here. And now I have a new vacancy here, and this way I'm moving the vacancies. I'm moving the vacancy for how long? What's going to happen in the limit? I end up somewhere in the middle, at the right partitioning point.
27:42
And then what do I do? So the vacancies meet right there at the partitioning point. Everything is greater on one side, everything is less on the other. And I put the pivot back. And I'm done. So I've got to save the pivot, etc. So this is all a simple matter of programming once you've got the idea.
28:00
It's going to be like three slides. Yeah, three slides. It's a long function. I'm not expecting you to understand every comma of it. Right? It has complications. So this will be the prelude. I'm going to put the pivot in the first position. Well, I have the special case here.
28:20
I'm going to save the pivot. Save it. Because right now it's at the leftmost point of the range. So I'm going to save the pivot. I put it there. And now I plant the pivot at the end as a sentinel. So I'm going to save the old value at the last position. This is like length minus one. So it's the last element in the range,
28:41
the last element of the array. This is my last element. I'm going to save it. And then I'm going to, bam, slap the pivot onto it. And that's my hole. That's how I start with the vacancy on the far right. And then it's all, as I said, a matter of programming. And it turns out I get to save a bunch of work.
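One possible reconstruction in D of the vacancy-based partition he outlines — a sketch, not the three slides verbatim, assuming the pivot is already at the front and elements are cheap to copy; the fix-up here is a simplified version:

    // Vacancy partition: plant the pivot at both ends, then move the
    // hole back and forth with half swaps until the scans meet.
    size_t vacancyPartition(R)(R r)
    {
        auto p = r[0];                 // the pivot, assumed at the front
        if (r.length < 2) return 0;
        immutable n1 = r.length - 1;
        auto v = r[n1];                // save the real last element
        r[n1] = p;                     // right sentinel; slot n1 is the hole
        size_t lo = 0, hi = n1, h;
        for (;;)
        {
            do ++lo; while (r[lo] < p);        // right sentinel stops this
            if (lo >= hi) { h = hi; break; }
            r[hi] = r[lo];                     // half swap: hole moves to lo
            do --hi; while (r[hi] > p);        // r[0] == p stops this
            if (hi <= lo) { h = lo; break; }
            r[lo] = r[hi];                     // half swap: hole moves to hi
        }
        // fix-up: reinsert the saved element and the pivot
        if (v < p) { r[h] = p; r[0] = v; return h; }
        r[h] = v;
        auto t = r[h - 1]; r[h - 1] = p; r[0] = t;
        return h - 1;
    }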
29:00
I get to save a bunch of work. And I need to test only once for two of these long iterations. I save a bunch of work, which is pretty awesome. And let me give you one more detail, which I find is interesting here. All of this vacancy filling and stuff
29:20
you can think of as a half swap. Because people usually think in terms of comparing and swapping things when they think of sorting, partitioning, et cetera. Actually, the whole vacancy business is half a swap. And that's great because it's more economical. What's better: to do the temp and all those three steps, or to do half a swap here and the other half from another point?
29:43
Awesome. I'm seeing some nods here that reveal that I may get slowly, slowly to the average Norwegian IQ there, which is awesome. Thank you, Detlef. So getting there, right? So the work has become a matter of moving the vacancy around and in the end we're going to fill it back.
30:01
This is my core. It's become a lot cheaper. It's become a lot cheaper, and it does a lot more work with a lot less overhead. This is overhead, this is work, this is work, this is overhead, this is work, this is overhead, and this is work. It's almost 50-50, and some overhead you just can't escape, right?
30:22
Awesome. In the end it turns out, so I've kind of analyzed the thing and run through it, and in the end there's a bunch of fix up you need to do. But this is how it works. It's unit tested. What can go wrong, right? So yeah, there's a number of cases here you need to fix up at the end because you've kind of been running real crazy with these indices here,
30:42
just kind of low, low is high, plus two, and et cetera. So there's a large fix up part here. I don't care, it's done once. Alright, so we're done with this. Let's see how good it works, how well it works. So time, milliseconds.
31:00
Oh, that's pretty cool. We're looking at somewhere between like 3-5% all the way to like 25%. And guess what? This is a function that has not been improved in 50 years. It's a big deal. And moreover, this savings is going to transfer into sort because all sort does is partition.
31:23
Everything else is just smoke and mirrors. Most of the workhorse is partitioning, right? So with this it means I get to improve sort significantly. And that's awesome, and it's new, and I hope it convinces you that I've got to look at these sentinel things, right?
31:44
Alright, so I thought I was smart. I thought I was above the average Norwegian IQ. I'm not. I am not, because I did some research on the topic. I thought I was awesome. But actually, who's awesome?
32:01
There's like three other people. Two Indians and one Russian. So I found this paper from 2011 by Abhyankar and Ingle, which is very funny because it's one of the few papers I read which is a research paper and has the title Research Paper.
32:21
I'm not kidding. So it's a paper entitled Research Paper, which I love because it's obvious what it is, right? You can't go wrong with that. So it's a research paper, titled Research Paper, with the subtitle Engineering Quicksort Partition Algorithm, by these guys, and they describe the same vacancy idea. I think theirs is a bit inferior because you can't choose any pivot you want.
32:43
So it's just a bit more limited than our version. But this is a core idea. Also, Stepanov has a nice PDF on his site. If you just Google for Stepanov partition, you're going to find a nice PDF, which discusses both sentinels and vacancies.
33:02
He calls them holes. And the nice thing is they discuss sentinels in one chapter, discusses vacancies in the next chapter, and in the end there's an exercise. Please combine both. So I essentially implemented the exercise, if you wish. But it's awesome that what's interesting is not that this work exists,
33:22
it's that it's recent. It's not 1975, right? It's recent work. Like 2011, Stepanov wrote this like what? It was like the 2000s. Like this is recent work. And it's interesting because with the right attitude, if you wish,
33:43
with the right outlook on things, you're going to find opportunities to optimize things. You may think I'm cherry-picking. I'm not cherry-picking. There's a bunch of work you can apply sentinels to. Dot products of sparse vectors.
34:00
Awesome applications: set intersections, same patterns. You want to merge sorted lists — you know, the merge function, right? Merge sort. You can put a sentinel at the end. Lexing and parsing: you can put a sentinel at the end. You lex and parse a lot faster. So, a great many.
34:23
So we're at like 40 past. Awesome. Actually, I can spend a minute here to talk about these. Dot products of sparse vectors. You have sparse vectors, which are index and value. And they're sorted by index. And you go: whenever I find equal indices, I add the product to the sum, right? How do I apply the sentinel here?
34:44
So, dot product of sparse vectors. Most things are zero in a sparse vector. So you only store the non-zeros together with the index. So I have a sparse vector which has an element like 3.2 at position 42. And it has like 0.1 at position 100, and so on, right?
35:03
So this is my first sparse vector. I have another sparse vector which has the same structure. And whenever you multiply these guys, you need to find the same indices and multiply those guys together and add them up. That's the definition. Scalar product, right? So how do you apply a sentinel to that layout?
35:23
Yes. Stick something at the very end. It turns out there are subtleties. For example, I don't know — we've been talking about this on a forum. And essentially, one of the best approaches is to put size_t.max.
35:40
Like the largest size_t value, at the very end, as the index of the last entry. And then when you go through with it, you just have to do one test out of three instead of one on each branch. So there are subtleties, but it can be done. Same deal with set intersections and merges of sorted lists.
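Concretely, a hedged sketch of that sparse dot product in D (invented types and names; entries are assumed sorted by index, with an Entry(size_t.max, 0) appended to each vector as the sentinel):

    struct Entry { size_t index; double value; }

    // Sparse dot product with sentinels: the size_t.max entries at the
    // ends mean the loop needs only the three-way index comparison.
    double sparseDot(const Entry[] a, const Entry[] b)
    {
        double sum = 0;
        size_t i = 0, j = 0;
        for (;;)
        {
            if (a[i].index < b[j].index) ++i;
            else if (a[i].index > b[j].index) ++j;
            else
            {
                if (a[i].index == size_t.max) break;  // both at the sentinel
                sum += a[i].value * b[j].value;
                ++i; ++j;
            }
        }
        return sum;
    }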
36:01
Lexing and parsing — how does that work? Lexing, like lexing a large file. Like a C++ file after preprocessing, I may add. You know Hello World, how big it is? 36,000 lines. 80 megabytes, it's amazing. It's amazing how a compiler can go through with it. Like it's a miracle of technology; it can do it in two seconds.
36:22
Right? So you pre-process Hello World. I'm not kidding. You pre-process that. Oh, actually I have it. You know what? I'm going to show you. No kidding here. Alright. Is this even visible? Oh, by the way, I'm building my slides with LaTeX. And what you saw, the graphics and everything, the plots, they were generated this morning in real time.
36:42
It's part of my build. So the slides are built running the benchmarks. So that's why you don't want to do make -j. Because it's going to run two benchmarks at the same time. It's going to be messed up. Anyhow, what was I? What was I proving here? Oh, Hello World. Where are you? Hello.
37:01
hello.d. I don't know, word count. 37,000 lines. And 538,000 words, if you round up. And I don't know, let's see the characters.
37:20
That's like 1.2 megabytes. And that's Hello World. So that's like... So this is it. This is the program. I'm telling you, the sheer fact that Hello World compiles, gets into an object file, gets linked by the linker with libraries and stuff.
37:42
Nobody knows how a linker works. The sheer fact that it goes through with it, it's a miracle of technology. If I wanted to explain this to a guy who didn't know computing, he would say, you're kidding. This can't be right. This is like, what's next? e to the i pi is minus one, or what?
38:03
What's wrong with you, right? All right, how do you apply this to lexing and parsing this large Hello World program? So here's an idea. Let's say I read a file in memory, and I pop a zero at the end, which is an invalid token,
38:21
an invalid character in any C++ file. Right? And then any lexer is going to have a huge, big switch statement: depending on the current character, you're going to do different state things. Right? What do I save there? What am I saving by means of planting a zero in a big switch statement?
38:45
Huh? You save one comparison per character. Because otherwise you have a switch, and at the top of the switch you have: while I'm not at the end of the file. Right? So you have a big-ass if, and inside of it a big-ass switch.
39:02
And inside the switch you have a nice table and everything all the compiler knows how to optimize that. But outside that you have one if per character, which is crazy. So you triple the speed of things if you just put the zero as a case inside the switch because then the cost of testing for that case is divided by the size of that big table.
39:24
Right? This is awesome. So sentinels are very much usable outside of the obvious find — I'm going to plant an elephant there, et cetera. Right? So you may still think I'm cherry-picking here. So let me actually switch gears here
39:40
and continue with this hat on. Let me discuss a different algorithm. Who knows about selection? The selection problem. Who knows about nth element in the STL? nth element. Yes, please. I saw a hand. Oh, you just heard of it? You know of it. Okay, somebody other than the usual voice, Hubert Matthews, here.
40:06
nth element in the STL. nth element, I give you an array, I give you a number, you give me? Mark? Awesome. So I give you an array, I give you half the length of the array.
40:22
nth element is going to put the smaller things on the left and the greater things on the right, thus actually giving me the element right at the middle — what that element would be if the whole array were sorted. To rephrase: given an array and a number n, find what a[n] would be if the array were fully sorted.
40:43
You just give me one element. And you know, on the face of it, if you approach it sort of naively, just sort the array and you're done. Just so there you have your answer. But actually there's a cheaper way to do it. Right? There's a variant of it.
41:00
Also, like Mark said, you want to place everything that's less than a[n] to the left, everything that's greater to the right. There's a relationship between selection and the partition we just discussed. With partition, I choose the pivot, I partition things, but I don't know exactly where I'm going to end up. With selection, it's: I want to be exactly at this position, and I want to put everything less and everything greater around it.
41:20
Applications. Give me the top 10 salesmen. Give me the top 100 salesmen. Find me the median height of the Norwegian male, age 46. 180. This is it. I'm not kidding. Right? So computing the median is very important in a bunch of applications. Great.
41:41
Now, there are also much more important applications, such as shortest path and nearest neighbor. A lot of computational geometry algorithms end up computing a median at some point or another. So it's an important algorithm. So for those of you who don't know about quick select, I'm going to introduce it to you,
42:01
because I think it's a very interesting and fascinating algorithm. Who knows about quick sort? Quick sort. Most of us, right? Who knows about quick select? All right. I'm very happy to introduce quick select to you. Here's what quick select does. It's very clever. What quick select does is the partitioning part, like quick sort.
42:23
But, you know, quick sort then recurses to the left and recurses to the right, right? Okay, quick select partitions, gets some pivot position, and then it looks: what am I looking for? Is the index I'm looking for here, or here? And then lazily — kind of not lazily, cleverly —
42:40
only recurses on one side, either the left or the right. So whereas quick sort needs to fully sort the array, quick select is like, ah, I don't care about this half, because my index, what I'm looking for, is not there. So I'm just going to recurse on one side of the divide, of the pivot. Very interesting. So you recurse only once.
43:00
What do you think the complexity of quick select is? Complexity of quick sort is O(n log n), right? So quick select, which only does half as many recursions, is going to come out to...? Obviously, for those of you who are expert mathematicians.
43:21
Right? Linear. But, like with quick sort, if you choose a bad pivot systematically, you're going to end up quadratic. So in a way, the risk is even higher here, because you go all the way from linear to quadratic. In quick sort you go from n log n to quadratic
43:40
in the worst case, right? So with quick select, if you don't have a good pivot you're completely messed up. Great. So let me show you how nice and easy quick select is. I just use partition as a primitive and all I'm doing is I recurse either here or here. I don't even care to recurse. I put a loop around it.
44:00
I just reduce the range. A very simple function. Actually, I'm not kidding. This is production quality. So you don't even need to optimize it. As good as it looks, it's just production quality. You can't make it much faster. So that's great. Quick select. Awesome. So just to explain how it works,
44:22
partition is what we discussed. You choose a pivot. It gives you the partitioning. Quick select cleverly uses it repeatedly to get an exact partition where it wants to, usually around the median. So once you partition, you're good. Great.
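A sketch of quick select along these lines, reusing a partition routine like the one above (the naive middle-element pivot pick here is purely illustrative):

    // Quick select: partition, then iterate into the side containing n.
    void quickselect(T)(T[] r, size_t n)
    {
        assert(n < r.length);
        while (r.length > 1)
        {
            immutable p = partition(r, r.length / 2);
            if (n < p) r = r[0 .. p];                           // go left...
            else if (n > p) { r = r[p + 1 .. $]; n -= p + 1; }  // ...or right
            else return;                            // pivot landed exactly on n
        }
    }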
44:40
If quick select gets to eliminate a constant fraction of the array at each step, then you get linear time. Sometimes, if you end up all the way on the left or all the way on the right, it's not good. As we discussed, the pivot quality is a problem here. What techniques do you know for choosing a good pivot for quick sort?
45:02
The same problem applies. Ideas for quick sort, good pivot. Find something that's about in the middle. A good pivot for quick sort. The average between. You take the median between probably
45:22
the left, the right, and the middle. People do that. Right. You could randomize. You could say I'm going to randomize and choose some random element. Or choose three random elements, compute their median, and that's my pivot. So there are a number of heuristics that people use. What's the problem if you don't randomize?
45:44
If you don't randomize. You choose first, last, and middle, and you compute the median. Yes? You get degenerate inputs. Degenerate inputs. You get some patterns in the input that completely mess you up. Which are going to be quadratic, right? And actually it's plausible
46:00
because it can be machine produced. So actually there's an attack, you can imagine. So, not good. What are the problems if you do randomize? I choose a random pivot. You may end up choosing a bad one anyway. And even though — do you know this phrase in statistics?
46:21
It's almost never. I'm not kidding actually. You know what almost never means? It's a scientific term. For those of you who don't know the science thing. Almost never means the following. Let's say I throw a dice. Right? And again and again and again, many times.
46:41
And how likely is it that I throw a six every time? It gets smaller and smaller and smaller, and ad infinitum, it's almost never. It's possible — by the laws of physics, it's possible, it could happen. But it's going to happen almost never. And that's what almost never means in statistics.
47:02
The probability of you choosing a bad pivot by random sampling for an infinitely long array is almost never. But, in real life, let me tell you. Go back to earth. Folks, right? Go back to earth. In real life, what happens?
47:20
It's front-loaded. The first three. If you choose three bad pivots in the beginning, like Anthony said, you're messed up. Choose three bad pivots, you're done. Because that's the largest array. So if you're unlucky enough to choose the first three pivots badly, you're going to do a lot of work that you shouldn't. Your performance is going to be three times worse than otherwise, right?
47:42
So even with random pivots, I'm not cool. I'm not cool at all. So that's what people do. We discussed this. So, in 1961, five sacred monsters of algorithms: Floyd, Rivest, Tarjan, Blum, and Pratt.
48:06
So five sacred monsters, like very good algorithms people, they set out to prove you cannot do selection linear time, guaranteed. There's no guarantee you can do selection in linear time. So you can on average, but you can't in the worst case.
48:22
So they wanted to write a paper that would prove you can't. And they discovered an algorithm that does it. True story. So actually they're like, oh, so let's prove this can't be possibly doable. And they're like, oh, crap, man, look, it's working, it's working.
48:41
I can't believe it. So this is what almost literally happened. So they found an algorithm which is called median of medians. It's a miracle of human thinking. It's amazing, the human ingenuity that went into that algorithm. It's one of the most elegant algorithms I've ever seen. It's godly.
49:02
And it has only one little problem. We're not going to insist. It has a little problem. It's three to five times slower than anything else. And that's fine, because even in the worst case, it's linear. It's just that you have to wait five times longer than usual,
49:20
and you're going to be fine. So as a consequence, it's in all the books and in no implementations. Everybody reads about it, analyzes it — this is a good algorithm — and then doesn't use it at all. They just go with the random selections and stuff. So, 1961.
49:40
We have ten minutes to break the record. I am not kidding. We have ten minutes to break the record. We're going to implement a variant of this algorithm, median of medians, that's going to beat everybody. Kick the tushy. Everybody. And take names.
50:02
Okay? We have ten minutes. Are you with me these ten minutes? It's like three folks in the back just leave and it's like, ah, the hell with this. I'm not kidding. We have like what, you know? We have like a few more slides and we're done. Nine slides. Right?
50:20
We can do this. So let's start with median of medians, the classic implementation. First of all, I have a special case: if the length is less than five, I'm just going to do it by hand — it doesn't matter, it's a few elements. It's just a special case. But if I have more than five elements,
50:40
this is where I compute the median of medians, and the way the median of medians goes is — you don't need to look at the indices there. The way it works is very simple. Divide the array into groups of five elements. So imagine the first five, the next five, the next five, and so on. Compute a median for each of those groups of five. So now I have a fifth of the array in those little medians.
51:04
That's why it's called median of medians, because every group of five has its own median. And I compute that by hand, brute force. Right? I have an algorithm; I know how to do it for five elements. I do it by hand. Five elements, one median. Five elements, one median. Five elements, one median. Then I have a fifth of the array,
51:21
which is the medians of the groups. And then I recurse, computing the median of that fifth of the array. So I get the final result, the median of medians. Now here's the property of that number that I get. So I compute the medians of each group of five. And then I take that fragment of the array
51:42
and I compute its median. So I get a number. By definition, it's going to be greater than half of those guys, right? But for each of those, how many other elements are smaller than it?
52:00
Each of those, each of those like fifths, right, has two other guys that are smaller than it. Right? Are you with me? So I compute a median of that fraction of the array. And for each of those guys, I have two others that are smaller than it.
52:20
By symmetry, I also have two that are greater. So this median here is guaranteed — guaranteed — to sit between the three-tenths and seven-tenths marks of the array. It's not going to be at the end of the array. It's going to be within a fraction of the true median.
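The counting argument he is sketching can be written out compactly (a reconstruction of the standard proof, not his slide):

    At least half of the ~n/5 group medians are <= the median of medians m,
    and each such group median brings two more group elements <= it, so

        #{x : x <= m} >= 3 * (n/10) = 3n/10,

    and symmetrically #{x : x >= m} >= 3n/10. So m lands between the 30th
    and 70th percentiles, each step discards at least 3n/10 of the
    elements, and the recurrence

        T(n) <= T(n/5) + T(7n/10) + O(n)

    solves to T(n) = O(n), because 1/5 + 7/10 < 1.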
52:43
Remember when I told you a while ago that if I get to eliminate a finite fraction of the array at each step, I have linear speed? If I get to eliminate, and I get to eliminate at least three tenths of the array by this computation. And that's how the proof goes. You compute a median of five by hand,
53:02
you compute the median of those medians, and then you're guaranteed to be greater than three tenths and smaller than seven tenths, and you get to eliminate at least three tenths of the elements, right? And this is what this guy does. M5 is going to be a brute-force function. It just does it.
53:22
It takes, like, five elements — sorry, five indices. It takes a range and five indices, a through e. And it just, you know — I'm not kidding, I didn't write this. I posted on a forum and a guy did it for me. Right? I don't know his name, because there's just a pseudonym, like TN.
53:41
So there's a thread; I'm giving credit in the paper, but essentially it goes like this. I wrote: how do I compute a median of five with minimum swaps, minimum comparisons? And there are like ten folks there who, I think, are from Mars. They think a different way than me, because I can't do this. Other people know how to. I have no idea. And actually there's a guy who wrote a program
54:02
that generated these functions. So you go, okay, so how many swaps and how many comparisons do you want? Right? I'm not kidding. So there's one with no swaps and only comparisons and there's one with a bunch of swaps and fewer comparisons. So there's the spectrum there, right? Amazing. Anyhow, this works for five elements.
54:20
I tested it, and guess what? I can just test it on everything, because how many permutations of five are there? Factorial. What's the factorial of five? You should compute it, my friend. You should compute it during compilation. Right, 120. So you have 120 calls to nextPermutation. The unit test writes itself.
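The exhaustive test might look like this in D (medianOfFive is a stand-in name for the brute-force routine from the talk; it's assumed to leave the median at the middle of the five given indices):

    unittest
    {
        import std.algorithm : nextPermutation, sort;
        int[5] perm = [0, 1, 2, 3, 4];
        do
        {
            int[5] a = perm;
            medianOfFive(a[], 0, 1, 2, 3, 4); // assumed: median ends at index 2
            int[5] s = perm;
            sort(s[]);
            assert(a[2] == s[2]);  // middle element must be the true median
        }
        while (nextPermutation(perm[]));   // all 120 permutations
    }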
54:42
So this is our brute-force median of five, and we use it here as a primitive for the groups of five. Very nice. Now here's what you do. You take these groups of five, and the medians of those groups of five you put in the first portion of the array. That's why I have the swap here.
55:00
I'm going to increment j every time. And that's because I want to reuse a portion of the array instead of allocating a new array. Make sense? So I compute a median and then I'm putting it at the front of the array. And then the first fifth of the array is going to be the medians. And then I continue with recursion and everything. Awesome.
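A sketch of that driver loop (again with the assumed medianOfFive; small-array and remainder handling elided):

    // Classic median of medians: collect each group's median into the
    // front fifth of the array, then recurse on that fifth.
    size_t medianOfMedians(T)(T[] r)
    {
        import std.algorithm.mutation : swap;
        assert(r.length >= 5);  // small arrays handled separately (elided)
        size_t j = 0;
        for (size_t i = 0; i + 5 <= r.length; i += 5)
        {
            medianOfFive(r, i, i + 1, i + 2, i + 3, i + 4); // median at i + 2
            swap(r[i + 2], r[j]);  // move it into the front fifth
            ++j;
        }
        quickselect(r[0 .. j], j / 2);  // recurse: median of the medians
        return j / 2;                   // index of the pivot estimate
    }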
55:21
Now, let's make this faster together. Five minutes. So I gave you a background, median of medians, compute medians of five, put them in the front of the array, recurse, and you're done. And now let's make it faster. Let's make a breakthrough after fifty-five years.
55:45
Ideas. So this is optimized up the wazoo. I guarantee it. The M5 is good. So this guy is what we're looking at. What can we do better here? A sentinel.
56:00
No, it's not going to be a sentinel. Because — oh no, wait, I have everything here. This is the section title here: minimize indirect writes. So, because I want to minimize indirect writes, where are most indirect writes happening?
56:21
So I do the median five which is highly optimized. I guarantee it. And then I do the swap-a-roo here. And then what I'm going to do is I'm going to recurse to this fifth of the array and I'm going to do a lot more swap-a-roos. So that's where the most writes are going to happen. In this, the swapping here. So I'm going to essentially like make this, these medians, I'm going to compute them
56:41
and swap them and swap them again. It's just useless. It's a lot of overhead. So here's a key idea. Here's a key idea. You've heard of this guy, the bank robber? A journalist asked him, why do you rob banks? Because that's where the money is.
57:00
You sound unconvinced. It's like, better to rob a convenience store? Because they have, like, fifteen dollars fifty cents, right? How about this: my cars are faster because they have more horsepower. Ferrari. What should you say? To get more speed, do less work.
57:23
This is your hat. This is your attitude. This is your outlook. This is how you think in terms of doing less work. There are a few cases in which more work is more speed. Can someone... There's one case I know of, honest. One case. Speculation.
57:41
Speculative execution is more work and you may actually throw it away but on average it's going to be more speed. I don't know of any other case. In general, there's only so much work you can do in a finite amount of time and you want to do less work. Otherwise, you're going to have less speed. Let's do less work.
58:01
Here's an idea number one. We have three ideas to discuss. Idea number one. Better layout. We use this first fifth of the array where we put the medians. You don't even need to kind of understand every detail of the algorithm but essentially there's a lot of moving data around during this computation of the medians of five.
58:20
So it kind of misses the point. What do you want to do in a partition? Put the little things on the left and the bigger things on the right. So how about this. Remember our M5 routine, the brute-force one? What parameters did it take?
58:41
The array, the range, and five indices. So it's actually able to swap elements wherever they sit in the array. So how about this? Instead of choosing the first five elements, the next five, and the next five, we're going to choose like this: the first two elements, the middle element,
59:00
and the last two elements. So we have two on the far left, one in the middle, two on the far right. That's my first group of five. What's my next group of five? The next two elements on the left, right? The next guy in the middle there, and the next two on the right.
59:23
And I'm doing this median of five, which is going to cleverly swap things such that, statistically, the little things are on the left, the median-ish things are in the middle, and the greater-ish things are on the right. It doesn't rhyme, sorry.
59:42
So I'm choosing my indices, my decomposition, cleverly, instead of choosing the first five like an idiot, the next five like an idiot, and the next five like an idiot. I'm choosing like a smart guy; the average IQ in Norway, right? I'm choosing like a smart guy: first two elements, middle element, last two elements. And then the next two
01:00:00
elements, the next middle element, and so on. So I'm going to exploit the fact that I can compute medians at any five indices, not only consecutive or contiguous ones. So we divide the array into five big sub-arrays, compute the medians as I said, boom boom boom, right? So we're going to have, with the same
01:00:24
swaps, statistically divided the array into littler things, median things, greater things. And this is going to be part of computing the M5. I'm helping my next step with this step. Minimize work, right? That's where the money is. Minimize work. So after I'm done
01:00:47
with this, amazingly, the medians of five are already going to be in the middle quintile, the middle fifth of the array. So we're already done with one fifth of the job. We don't need to move those guys anymore. It's done. It's a done
01:01:00
deal. So the middle of the array is done in the first step, which was computing the medians of five. And then we have the littler things here and the greater things here, and I recurse on that middle fifth of the array, and then I do the fix-up. So, amazingly, I'm going to swap a lot fewer elements. If you do the math, it's one tenth of the array.
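A minimal sketch of that improved layout, under the same assumed median5 primitive (median lands at the middle of the five indices); the exact index arithmetic in the published implementation differs:

```cpp
#include <cstddef>

// Assumed primitive from before: the median of the five indexed values
// ends up at index i2 (illustrative, not the paper's exact code).
void median5(double* a, size_t i0, size_t i1, size_t i2, size_t i3, size_t i4);

// Improved layout: group k takes two elements from the left two-fifths,
// one from the middle fifth, and two from the right two-fifths. Each
// group's median lands in the middle fifth, so after the loop the middle
// quintile a[2*phi .. 3*phi) already holds all the medians in place --
// no extra swaps to collect them -- and smaller/larger elements are
// statistically pushed toward the left/right as a side effect.
void computeMediansInPlace(double* a, size_t length) {
    const size_t phi = length / 5;
    for (size_t k = 0; k < phi; ++k) {
        median5(a,
                2 * k, 2 * k + 1,                  // left two-fifths
                2 * phi + k,                       // middle fifth: median lands here
                length - 2 * phi + 2 * k,          // right two-fifths
                length - 2 * phi + 2 * k + 1);
    }
    // Next step: recurse on the middle fifth only.
}
```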
01:01:24
And we're kind of done there, which is amazing. So this was the big thing that took me the longest. I thought about this for weeks on end. I was like, there's got to be a way to minimize those swaps. And the key here is that you need to swap across non-contiguous portions of the
01:01:43
array, such that the little things go there, the big things go there, and then you're done. That's idea number one. It's more complicated, but not by much; this is the whole function. So that's idea number one. Boom: 50% speedup. Not there yet, not there yet,
01:02:05
not there yet. But you know what a scientist does? They never give up, right? So I've got to rig the competition now, right? I'm kidding. So now we've got to make it better. But we gained 50%, and moreover, we gained the right insight, because then every other optimization is going
01:02:21
to compound with this one, not compete with it. And on top of this optimization, which is layout, I'm going to be able to build more. And here's what I built. I did my research: Chen and Dumitrescu. Dumitrescu is a professor at the University of Chicago. He's a fellow Romanian, as you can tell from the last four
01:02:41
letters. He's a fellow Romanian. And they proved a very nice result. Actually, they disproved a conjecture by showing, you know what? You don't need groups of five. It's okay with groups of three or four. And that's much simpler, cheaper to compute. So they very cleverly
01:03:01
have an algorithm that they call repeated step. And I said, oh, this is exactly what I need: instead of a median of five, it's only a median of three, which is trivial. Boom. That's idea number two. Kind of a Google, Stack Overflow thing. It's not on Stack Overflow, but I looked for it, I looked at all the work in the area, and this was it.
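To make the repeated-step idea concrete, here is a sketch: a trivial median of three, applied to three groups and then once more to the three resulting medians, the "ninther" pattern mentioned later in the talk. Names and signatures are illustrative, not the paper's code:

```cpp
#include <cstddef>
#include <utility>

// Median of three: reorders so a[i1] holds the median of the three
// indexed values -- two or three comparisons, trivially cheap.
inline void median3(double* a, size_t i0, size_t i1, size_t i2) {
    if (a[i1] < a[i0]) std::swap(a[i0], a[i1]);
    if (a[i2] < a[i1]) {
        std::swap(a[i1], a[i2]);
        if (a[i1] < a[i0]) std::swap(a[i0], a[i1]);
    }
}

// Repeated step: median of three applied twice. The three group medians
// land at i1, i4, i7; one more median3 leaves their median at i4.
inline void ninther(double* a,
                    size_t i0, size_t i1, size_t i2,
                    size_t i3, size_t i4, size_t i5,
                    size_t i6, size_t i7, size_t i8) {
    median3(a, i0, i1, i2);
    median3(a, i3, i4, i5);
    median3(a, i6, i7, i8);
    median3(a, i1, i4, i7);  // median of the medians of three
}
```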
01:03:23
This was what I needed, and it compounds with that. So I wrote to the guys. I said, guys, did you implement this? Oh no, we did all the, you know, theory; it's just that the algorithm by itself was not fast. But it becomes fast because you compose it with this thing, with the better layout. So now you have the better layout and you have this thing. Awesome. We got to 200%.
01:03:43
We have, like, minus one minute; in about one minute we've got to beat these guys. We need something like 3x. If we get to 3x, we're done. Idea number three: adaptation. So I ran ideas number one and two, and it was great for the median.
01:04:02
It was terrible for anything but the median. If you want, like, the top 1,000 or the top one third, anything that's not in the middle, it was bad. So I worked on this for a long time, and then I said, well, you've got to specialize. Whenever you ask for something that's not straight in the middle,
01:04:22
the median, you use asymmetric groups, two out of four or two out of five. Use asymmetry to your advantage. If the index you're searching for is small, you use a group of, like, two out of four, and you get something that's biased to the left. If it's large, you get biased to the right. So, as a guy said on Twitter,
01:04:44
you choose the happy case of all algorithms. Right? So if it's the median, I do what I just said. If it's small, I do one of these tricks, right? I choose different fractions of the array, and if it's large, I choose, you know, length minus those guys.
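A sketch of that adaptive dispatch, with hypothetical helper names and illustrative thresholds; the published implementation picks its strategies and cutoffs differently:

```cpp
#include <cstddef>

// Hypothetical pivot strategies; names are illustrative only.
size_t pivotBiasedLeft(double* a, size_t length);    // asymmetric groups, left bias
size_t pivotBiasedRight(double* a, size_t length);   // mirror image, right bias
size_t pivotMedianOfNinthers(double* a, size_t length);

// Adaptation: pick the strategy whose happy case matches where the
// sought order statistic k falls within [0, length).
size_t choosePivot(double* a, size_t length, size_t k) {
    if (k <= length / 6)                  // far left: bias the groups left
        return pivotBiasedLeft(a, length);
    if (k >= length - length / 6)         // far right: by symmetry, length minus
        return pivotBiasedRight(a, length);
    return pivotMedianOfNinthers(a, length);  // near the middle
}
```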
01:05:01
Happy case of all algorithms. We're done. Awesome. So, Google for this: fast deterministic selection. You're going to find a very detailed draft paper on arXiv, I don't know how to pronounce it, A-R-X-I-V, dot org.
01:05:21
So you're going to find this guy, and it beats everybody to a pulp. Everybody. I did carefully optimized implementations of all existing algorithms; this guy beats them to a pulp. The most important graph you may want to look at here is the speedup compared to the classic heuristics.
01:05:44
Median of three, median of three randomized, ninther, and ninther randomized. The ninther is a sort of approximate median of nine; they call it a ninther. It's a famous heuristic. So these are the best ones in practice.
01:06:01
Median of three, median of three randomized. I also tried median of five; it's about as good as median of three, so I didn't include it, to avoid clutter. So what we have here are improvements between, like, what, 10% here and, like, 70% over the best in the world.
01:06:22
Over the best in the world, friends. The best in the world. Better than the best in the world. We did it. We did it. To conclude, if you want more speed, put the hat on. You've got to do less work. Forget the sentinels.
01:06:41
Sentinels are a means to an end; they're not the end. The end is less work. And we saw how sentinels are part of it, but in the second case it was simply less swapping. Just choose your layout properly such that you minimize the amount of data movement and access,
01:07:03
the amount of writes you do, essentially. Marshal the computation to benefit you most. Arrange your computation, you know, think of everything that influences your result, and make it such that everything flows toward what you need, what you want, with minimal work.
01:07:20
Do aikido, if you wish, in programming. You know, aikido: use your opponent's power, all that stuff, right? Okay, don't do it like the Vikings with the axe there, right? Use aikido. All right, and most importantly, do your research.
01:07:41
It was very important for me to find the algorithms that people like Stepanov implemented, so I had good baselines and could credit them properly. It's very important to do your research and know what people are doing in your field. And in this case, if I hadn't found the Chen and Dumitrescu paper, I would have been, like, blocked, because I didn't have their idea.
01:08:03
I didn't have their insight. And without that, I would have been: ah, that's interesting, but I'm not there; I can't get the performance. So there's treasure everywhere, as Calvin and Hobbes said; you just have to find it. So, an announcement: I'm writing a book about this kind of stuff.
01:08:22
It's called Fastware. It's a bunch of optimization techniques: how to benchmark, strength reduction, cache friendliness, indirect write elision, memoization and its opposite, which is obliviation, hoisting and lowering, and a lot more.
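As a taste of one of those named techniques, here is a minimal sketch of strength reduction, replacing a per-iteration multiply with a running addition; this is a generic illustration, not material from the book:

```cpp
#include <cstddef>

// Before: one multiplication per iteration.
void scaleIndexMul(double* out, size_t n, double step) {
    for (size_t i = 0; i < n; ++i)
        out[i] = i * step;
}

// After strength reduction: the multiply becomes an addition.
// (Accumulated floating-point rounding differs slightly from the
// multiply version; acceptable when that tolerance is known.)
void scaleIndexAdd(double* out, size_t n, double step) {
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i, acc += step)
        out[i] = acc;
}
```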
01:08:42
Thanks very much.