
A Muggle's Guide to Tail Call Optimization in Ruby


Formal Metadata

Title
A Muggle's Guide to Tail Call Optimization in Ruby
Subtitle
A magic-free tour of tail-call optimization in MRI Ruby
Number of Parts
66
License
CC Attribution 3.0 Unported:
You may use, adapt, copy, distribute, and make the work or its contents publicly available in unchanged or changed form for any legal purpose, provided you credit the author/rights holder in the manner specified by them.
Production Place
San Antonio

Content Metadata

Abstract
Submitted for your approval: a circle of torment unchronicled by the poets of old, a terror at the heart of many a horror story: SystemStackError: stack level too deep. Tremble no more! Conquer the well of eternity! Behold the secrets of tail recursion and tail call optimization in Ruby! Witness the metamorphosis of a simple function as we explore the hidden power of tail call optimization buried deep within the Ruby VM! Follow the transformation to the revelation of tail call optimization's most ghastly secret: in many ways it's really just a special type of loop construct! The horror!
Duration: 44:21
Transcript: English (auto-generated)
Morning, everybody. Gonna try to start just a few seconds early,
because chances are I will go over time, but I'd also like to give you guys an opportunity to ask questions, and in the event that you can't read any of the slides or anything like that, be able to shout at me and say, hey, could you read that for me, because lots of good stuff. All right, so the first thing I wanna ask you guys is, how many of you guys went to Aaron's keynote yesterday?
Nice, that's great, because I was so psyched about his keynote because it is such a great, like, introduction to a bunch of the things we're gonna talk about today. So without further ado, let's do that. So today, my presentation, as you guys all know, clearly, because you're here, is A Muggle's Guide to Tail Call Optimization in Ruby. I'll get into what I mean by that a little bit more in just a second, but first, let's consider
some alternate titles I was thinking of. So Harry Potter and the Well of Eternity, because system stack errors can be a nightmare, especially those back traces that go on for days. And I just kind of wanted to get this out up front: Harry's a Parselmouth, and his scripting language of choice is probably gonna be Python, not Ruby.
So if we can start to kind of come to terms with that now, that'd be great. So kind of actually in that vein, I wanna also give you a warning. Maybe not a warning, but like, for whatever reason, this topic, like, I guess, quite a few topics in our industry, gets pretty religious at times, whether it's like Vim or Emacs.
In this case, we run into that divide between functional and object-oriented languages. So if you'll indulge me for a moment, I'd like you to imagine, if you dare, a world without MINASWAN. And for you guys who don't know what MINASWAN is, it stands for "Matz is nice and so we are nice." And I don't know where that picture came from.
I actually can't find it on the internet anymore now that I found it, but it's Matz with a Matz puppet. What the hell? All right, so, from the blog of Guido van Rossum, who, for those of you who may not know, is the creator and inventor of Python. So back in 2009, there was a lot of talk going on about tail call optimization, and here are just a few select quotes,
comments from his blog related to it. "How very Pythonic of you to make up stupid reasons for not implementing a very simple optimization. This is very Pythonic because it shows poor decision-making, poor performance, and immature ideology." So I mean, very clearly, that's somebody who likes kind of more functional type things, and just can't believe that Python's not gonna include it.
Total opposite end of the spectrum: "Refreshing that you stuck with your intuitions rather than submitting to these TCO-requesting functional fiends." It blows my mind a little bit that a language feature can incite so much argument, but that said, the good news is no matter which side you are on in this particular matter,
because it doesn't matter, you're gonna be right, because everything, it's all about your point of view, right, so as Obi-Wan will tell you here, it's all, everything kind of depends on your point of view. And also I should warn you, there's a few Star Wars references in here, despite the Harry Potter theme of this talk, because I am so excited for next month,
but stay focused, stay on topic, move along. So, alright, good, I'm glad you guys left, because somebody was threatening that I would get a Twilight crowd, and I am not prepared for that, so what the hell is a muggle? So for those of you who may not know, a muggle in the Harry Potter books
is someone who lacks magic ability. Now if you're wondering why you care about this in this particular context, the way I want to present this talk is without magic. And by magic I kind of mean like, I think you most often hear it with rails, like rails magic, it's doing all these like magical things behind the scenes for you, and on the other side of that magic
emerges something wonderful, but you have to like suspend disbelief in that middle process. But if you're one of those people who loves magic, I don't mean to exclude you either, that works, but it's gonna be up to you to decide kind of where the line between knowledge and magic should fall,
because as much as I'd like to explain to you every minute detail of this, we only have so much time, and I just can't do it. So, let's get down to our subject. So, what is a tail call? Now if, when you guys hear the term tail call, you think about certain types of phone calls that happen between midnight and 3 a.m. in the morning,
I'm gonna need you to put down the Bud Light lime, focus, and let's do this. So, the kind of definition dictionary-wise of a tail call is a subroutine call performed as the final action of a procedure. So that's a little bit terse, so why don't we like look at this in action.
So here's kind of like your canonical example, mostly canonical because it is just about as simple as can be. You have this method, call_in_tail_position, and inside of it a call to another method that is very clearly the last thing the outer method does before it's complete. To contrast that with kind of your canonical counterexample, you have another method, not_a_call_in_tail_position, which calls other_method and then adds one to it. So this defies the definition we looked at, because other_method is not the last bit of action that happens in this method: you still have to add one. And I know there are some clever folks among you who are thinking, ah, but what if we do one plus other_method? Still not a tail call, because it's not that the call has to be the last thing at the end of the line, or on the last line; it really just needs to be the last thing that happens in that method before it's done. And in both cases, once the result of other_method comes back, we still have to add one to it.
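In Ruby, those examples look something like this (method names follow the slides; other_method's return value here is arbitrary):

```ruby
def other_method
  41
end

# The call to other_method is the very last thing this method does,
# so it is in tail position.
def call_in_tail_position
  other_method
end

# Here other_method's result still has 1 added to it after the call
# returns, so the call is NOT in tail position.
def not_a_call_in_tail_position
  other_method + 1
end

# Reordering doesn't help: the addition still happens after
# other_method returns.
def still_not_a_tail_call
  1 + other_method
end
```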
So the kind of interesting thing about this is it means there are only certain circumstances in which you can make a tail call. Some examples of those, they call them tail call sites, are these. In the first case, in the middle of a method, not even at the end, behind an if statement, we have this return some_call. This is actually a valid tail call because it's attached to a return, so really the last thing this method does is the call to some_call. Another one, kind of like those counterexamples we just looked at, is a call with an expression inside the arguments. This is different from the one we looked at before because by the time we actually call other_call, the expression my_int + 1 will already have been evaluated, so other_call really is our last operation there. You can also attach it to boolean operations: false or other_call works, as would true and other_call, because in both of those cases other_call is just the last thing that happens in the method. And the final example, and what gets us into some of the real power that comes from tail call optimization, is this recursive_count call, which just keeps calling itself until it gets down to zero, so you get a five, four, three, two, one, zero type of thing. Again, this one is a little closer to that first example we looked at in that it's not too complex: you can really see that recursive_count is the last operation that will happen in this method before it returns. And there are many more, which I'd be happy to talk about later if you're interested.
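A sketch of those tail call sites in Ruby (some_call, other_call, and my_int are stand-ins, as on the slides):

```ruby
def some_call
  :took_the_branch
end

def other_call(n)
  n
end

# A tail call can sit mid-method if it's attached to an explicit return:
def behind_an_if(flag)
  return some_call if flag
  :fell_through
end

# An expression in the argument list is fine: my_int + 1 is evaluated
# before other_call runs, so other_call is still the final action.
def with_arg_expression(my_int)
  other_call(my_int + 1)
end

# Boolean operators work too; the operand that runs last is in
# tail position.
def boolean_tail
  false || other_call(7)
end

# And self-recursion in tail position, the case TCO really pays off for:
def recursive_count(num)
  puts num
  return if num.zero?
  recursive_count(num - 1)
end
```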
So let's take a moment here to pause and reflect on recursion. If we look at that recursive_count again, you can see that its output is going to be, as I said, five down to zero. And if we unwind the stack, we can see that recursive_count gets called, it puts five, checks to see if that's zero, and then calls itself again with four, and it continues all the way down until you hit your base case at zero and exit afterwards. When you have recursion plus a tail call like this, you end up with what is called tail recursion. It's a special kind of recursion that's particularly useful when combined with tail call optimization, because it allows you to do some really neat things from a functional perspective. If you're familiar with Church's lambda calculus, one of the things it was able to take advantage of is this kind of tail recursion, so that it needed far fewer operators than some of the modern languages we use today. And as we'll see in a moment, you can actually totally replace loops if you have tail call optimization. So in this case, you have side by side two methods doing the same thing, but one uses recursion to implement the loop and tracks the state of the loop in the arguments to the call, versus iterative_count, which tracks the state inside the actual method body. So the approaches are equivalent, but there's a catch. The thing is, tail recursion heavily depends on having tail call optimization, because there are some things you functionally just can't do without it: you'll simply run out of stack eventually. Depending on the arguments your method takes in Ruby, whether it takes a block, uses special variables, and some of those types of things, it's usually somewhere under 10,000 frames, as configured by default, before you run out of stack.
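As a sketch of that equivalence, here is the same summation written both ways; the recursive version threads its loop state through the argument list, while the iterative one keeps it in a local variable (the names are illustrative, not from the talk):

```ruby
# Loop state (n and the running total) lives in the arguments,
# and the recursive call is in tail position:
def recursive_sum(n, acc = 0)
  return acc if n.zero?
  recursive_sum(n - 1, acc + n)
end

# The same state lives in locals inside the method body:
def iterative_sum(n)
  acc = 0
  while n > 0
    acc += n
    n -= 1
  end
  acc
end
```

Without tail call optimization, though, recursive_sum(200_000) would blow through MRI's default stack (roughly 10,000 frames, as mentioned above), while iterative_sum handles it happily.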
So for those of you kind of getting up in arms again about into our kind of flame war here, chill out, let's just continue. So why don't we actually get an idea of what tail call optimization is and perhaps you've also heard of this other phrase, what is tail call elimination? Well let's look at both of those.
So the trick is actually that they're really kind of the same thing. One is talking about, well so like tail call elimination though, I secretly want it to be kind of like a, I don't know, tail call destroying terminator. I don't know why someone on the internet made a Harry Potter terminator, I swear it was not me, but I guess I shouldn't be surprised.
So the trick is that the tail call elimination really is kind of the optimization. So it's getting rid of the additional stack frames that you'd need when making a tail call that is the optimization. But I'm getting a little bit ahead of myself here so let's actually pause for a confession before kind of digging into it.
So you'll remember before I mentioned that, so we had our recursive count function and again kind of as Obi Wan alluded to, this is only true from a certain point of view. And the point of view I've actually depicted here is an environment that already has tail call optimization enabled.
And the thing is, I didn't mention to you that Ruby is not tail call optimized by default. But we'll get more into that later. So what's really happening in reality is that each of those calls, as you can kind of see by the indentation here, is pushing another stack frame and allocating more memory and whatnot so it can execute that method.
Which is fine if you're counting down from five. But if you're counting down from even like 10,000, like I'm not sure who really thinks that this is a good way to move forward with trying to use recursion in any language really. So you can see here kind of your typical
stack level too deep error that you'll run into. And again we kind of come back to, oops, this quote, whoa. So a lack of tail call optimization here has proven that in this case we really can't use this recursive solution because we just run out of the stack. But onto actual tail call optimization to the rescue.
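You can reproduce that error in any stock MRI with a variant of recursive_count (minus the printing, since a million lines of output helps nobody):

```ruby
def recursive_count(num)
  return if num.zero?
  recursive_count(num - 1)
end

begin
  # Far more frames than the default stack holds:
  recursive_count(1_000_000)
rescue SystemStackError => e
  puts e.message  # => stack level too deep
end
```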
So let's actually figure out the secret magic that is tail call optimization and how it helps some of these situations. So as we discussed before, a tail call is by definition the final action of a procedure. Which means that we can make some assumptions about how our program is gonna work kind of after making that call. And the most important of these ideas
that we can derive from this is that nothing in the calling method is needed anymore. So let's hold onto that idea and kind of figure out why we might be holding onto it if we don't need any of it. So, and actually, so there's a guy, Guy L. Steele Jr., who I think was one of the original writers of the Scheme language,
has this quote that I think also kind of frames this up in a good way. While we may prefer to think of a procedure call as meaning go do this other module and come back, that is only one possible interpretation. We must recognize that there are situations where coming back doesn't matter and these situations may be exploited. And that again I think is kind of the crux of tail call optimization.
So again, kind of focusing on if we don't need this parent stack frame, why might we be keeping it around? There are two main reasons. One is, if you'll remember, Aaron yesterday mentioned that in the Ruby VM, keeping the stack around allows you to see your path through the VM. So if you do something like puts caller from IRB
or kind of a debugging session, that's what you're gonna see is really that stack frame that's being kept behind. The other thing it's useful for is keeping a record of where the result of each method needs to go so it can get passed eventually down to whoever it needs to get to. So of those two things, I'll say something perhaps controversial,
but keeping a record of the execution stack I think is really just a convenience. That's really nice for debugging, but in reality it's not something that's needed by any means for the program. So if we pretend like, you know if we skip all the argument and pretend like we don't care about the complete call stack, what can we do to try to avoid that
and maybe cut it down to only one reason we need to keep these other frames behind? So is there any way we can get around needing to know where to return the result of the tail call? As it turns out, no. Thank you guys for coming. Enjoy the rest of RubyConf. Sorry, I don't know, rhetorical questions. Who knows? So if the stack frame of the tail call needs to return its result to its parent anyway, why don't we just circumvent the whole stack and jump straight from your deepest call to where it needs to get to? And that is really kind of the magic of tail call elimination: you can avoid some of those stack frames
and save a lot of computation and memory needed and actually enable a lot of functional solutions. So I kind of went through that quickly. Questions right now? Tylenol, anything? Pause for a drink while you guys think of your questions.
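One way to see those eliminated frames for yourself (jumping ahead to the RubyVM::InstructionSequence compile options discussed later in the talk) is to record caller.size at each recursion depth. This probe is my own sketch, not from the talk; with tail call optimization enabled, the stack simply stops growing:

```ruby
code = <<~RUBY
  def depth_probe(n, sizes)
    sizes << caller.size  # how deep is the stack right now?
    return sizes if n.zero?
    depth_probe(n - 1, sizes)
  end
RUBY

# Compile the snippet with tail call optimization on (MRI only):
RubyVM::InstructionSequence.compile(
  code, nil, nil, 1,
  tailcall_optimization: true, trace_instruction: false
).eval

# Every recursive call reuses the same frame, so the recorded
# depths never grow: one unique value, however deep we recurse.
puts depth_probe(5, []).uniq.length
```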
All right, I mean, they're doing a really good job or I lost you guys all five minutes ago. You should have paused for, oh, sorry. So the question was, I said before that it was not enabled by default and the gentleman suggested compiling your own MRI to get it to work. We'll actually get into a bunch of ways
to make it work in just a moment. So if I can ask you to bear with me. So one thing to kind of clear up before moving on, why stop kind of with that? If you're going to just kind of circumvent the parent anyways, why create a new stack frame and then, I don't know, cut the parent out after that?
In practice, what really happens is you, I don't know, you don't end up circumventing the parent. You just reuse that same frame to do the next thing you're gonna do. So there's no, I don't know, something about elimination to me kind of suggests that there's this extra stack frame that you avoid creating or I guess this extra stack frame
that you get rid of. But the reality is you just, it's kind of like recycling. It's like green programming. So here's kind of our countdown function again. On the left, you can see what it kind of looks like without tail call optimization. You go three, two, one, zero. But on the right, with tail call optimization, you can see we're just reusing one stack frame
to go through every one of those iterations. So that's all well and good, but as this guy said, if it's not in Ruby, then no offense, but I really don't care. So about that: Ruby is tail call optimized, as it turns out, sort of. It's actually been built into the Ruby VM
since Ruby 1.9.2, but it's actually never been enabled by default. There was some discussion of perhaps enabling it by default around the time that Ruby 2.0 came out, but nothing ever came of it. So let's look at how we can get in there and play with it. So there should be a warning for you here as the dragon will illustrate.
I imagine not a lot of people are using this in production, so don't ship it to production and then write me later and say, listen man, you said this was gonna be great and it was not great. I don't know, test it out. I think there's a lot of opportunity for it and I've not run into any weirdnesses where you suddenly run into segfaults or anything like that,
but again, I would say use caution. So there are a few different ways that you can enable this. So as this gentleman back here suggested, you can compile your Ruby with the flag switched around so that it's enabled by default. You can get into RubyVM::InstructionSequence, which Aaron mentioned yesterday, and it's actually one of the optimization options in there. You can also create your own instance of RubyVM::InstructionSequence; there's kind of a global option set versus a per-instance one. Probably best not to screw with the global one, though sometimes it's handy, because things like require and load will use the global compile options to compile that stuff, whereas even if you set up an instruction sequence with tail call optimization enabled, if you call require or load from it, you're not gonna get tail call optimized code loaded. And the last one here, full disclosure: I am the author of this gem. I created it to play around with tail call optimization.
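As a minimal sketch of the RubyVM::InstructionSequence route (MRI only; the option names are the ones exposed by RubyVM::InstructionSequence.compile_option, and countdown is just an example method):

```ruby
ISEQ_OPTIONS = { tailcall_optimization: true, trace_instruction: false }

code = <<~RUBY
  def countdown(n)
    return 0 if n.zero?
    countdown(n - 1)
  end
RUBY

# Compile just this snippet with TCO turned on, then eval the
# resulting instruction sequence to define the method.
RubyVM::InstructionSequence.compile(code, nil, nil, 1, ISEQ_OPTIONS).eval

# Far past the default frame limit, and no SystemStackError:
countdown(200_000)  # => 0
```

Setting RubyVM::InstructionSequence.compile_option = ISEQ_OPTIONS instead flips the global default, which is what you'd need for code pulled in via require or load, but as the talk warns, messing with the global option affects everything compiled afterwards.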
There's a TCO method gem that allows you to kind of, I don't know, it's experimenting with different APIs to try to make it accessible and kind of feel normal. So I like to pause here for awe because, I don't know, there's something to me that I think is really astounding about a VM that you can like have tail call optimization enabled over here, but not over here, and I don't know,
that's pretty awesome, and I think really speaks to the power and as you kind of saw with some of the C code that Aaron dug into yesterday, the complexity of the VM that Ruby runs on top of. So, how to patch your Ruby to have tail call optimization enabled by default. I don't necessarily suggest that you just go out there and take anyone's patch from the internet
to modify your Ruby VM, but you can trust me, I'm a software engineer, what could go wrong? So this is really the whole diff: you just need to flip two flags. Something worth noting is that when tail call optimization is enabled, you can't use the trace instruction, which you may be familiar with through set_trace_func.
I don't think I've actually ever really seen anyone use it, but I'm sure there are reasons, I'm sure it's out there somewhere. So as I mentioned, Ruby VM instruction sequence is also kind of a gateway to getting into there. Won't go into great detail about that class itself, but there's a lot of like kind of neat things you can do to kind of like, I don't know,
get in there and like play with the VM and get an idea of what's going on. So I definitely encourage you to look at it. Aaron yesterday mentioned the book Ruby Under a Microscope by Pat Shaughnessy, and I would definitely recommend that too. There are a few things he gets into in there which I feel will help you better understand how you can use this
to like get an idea of what's going on under the covers, but also I think give you a ton of insight into how the Ruby VM is running and again, kind of one of those like awe type moments where it's really like, this is crazy. So this is how you could selectively compile code at runtime using Ruby VM instruction sequence
just if you wanted to go with the totally vanilla, not-going-to-pull-another-dependency-into-my-project approach. So hold on to this iseq options constant up here at the top, because I'm not going to repeat it later, just to make things easier for you guys to read. But again, you can see it's really the same thing we were doing in the diff, where we turn on tail call optimization
and have to turn off the trace instruction. And it's worth noting that even if you turn on tail call optimization but don't also turn off the trace instruction, you will not get tail call optimization. But oddly, I guess the trace instruction kind of like overrides it which is something to watch out for. There were many times where I was like, why doesn't this work?
But that's the trick. So in this case, it's actually like, see if we can look at like a real kind of example of how that works. So in this case, one thing I find kind of like clunky about this particular approach is that you may have noticed it in Aaron's talk yesterday. You have to like just pass it like the raw code strings
which this particular code highlighter's doing a good job with, or let's say a bad job actually, because it's highlighting a string as though it were code, but in Vim or something else like that, you'd probably just have a whole block that looks like a string there. But yeah, you have to take just this string and then pass it to a RubyVM::InstructionSequence object to evaluate it and actually turn it into
a runnable bit of code that would have tail call optimization enabled. So to try and combat that, I tried to encapsulate at least some of the stuff of actually generating that instruction sequence object. So again, here you still have to use code strings which I think are always going to be clunky but it gets, I don't know, I guess you at least
have to have a little less knowledge of what's required to actually make this work with tail call optimization. Another API I've been playing around with for this is to use like a mixin that does kind of like method decorators. So you can see here it adds this TCO method mixin which gives the class this TCO module method.
So after I've defined recursive factorial in this module, I can then use this like method decorator to say hey, that should actually be compiled with tail call optimization. And behind the scenes, I don't know, it's kind of hacky so again, probably not production quality stuff but it'll go find that method, find the source for it
and basically run through everything we just saw with the instruction sequence. Which I think personally is definitely prettier than a system stack error. And another kind of API I've been playing with, this one's kind of new but I think it's actually getting close to more idiomatic Ruby as you can just do a block of code and whatever is in that block will be evaluated with tail call optimization.
The big catch with the block approach, though, is that under the covers it's still turning the block into a string and evaluating it. So even though you have this block here, you're going to lose your block scope. You're not going to be able to access things outside of it, and vice versa: if you define something inside this block, things outside of it won't be able to get to it. But maybe that's a trade-off you're willing to make. That's actually part of the reason I treated it this way — I think it's really nice if you're using a strategy pattern, where you have a particular strategy for something that would benefit from tail call optimization. So in this case I'd use this TCO method, or with-TCO method, and then define a module inside it
to take the strategy-type approach. So let's talk again about what some of the benefits are here. Ruby is a multi-paradigm language, which really means that Ruby takes the best of a bunch of different languages and consolidates that into the awesomeness that we all know as Ruby. But with that multi-faceted approach you run into the same old flame war, where you have the very object-oriented people in Ruby — which makes a lot of sense, because Ruby itself is so object-oriented. But with the presence of things like procs and lambdas, you really have a lot of functional power too. So even within Ruby there is still this divide. On the functional side of things, tail call optimization enables a wider range of functional solutions by better enabling recursive solutions, including tail recursion. But that's not to say that there is not also
an object-oriented benefit. The example of it is a little too much to go into in this particular case, but again, the guy I mentioned before, Guy Steele, made a good demonstration in a blog post a few years ago that by having tail call optimization you can actually have better abstractions. The idea is that if you can call into something without worrying about running out of stack because of recursive calls, in many cases you need to know less about the internals of another object. The straw man he put together isn't something you'd use every day, but I think the fact that there is a demonstration of it at all shows that, at least to some degree, tail call optimization definitely does benefit object-oriented designs, not just functional ones.
I did some experimenting and found that tail call optimization also helps performance. What may or may not be obvious is that when you kill those stack frames that are otherwise just lying around, you can garbage collect their objects a lot sooner. I'm not sure what a good example would be, but if you had a method that did a lot of object allocations and ended up holding onto a lot of memory somewhere back in the stack, then even if you're never going to go back there, you're potentially still holding that memory and can't release it until you unwind the stack, get rid of that frame, and move on. You also get a performance benefit from reusing stack frames instead of creating new ones every time when you don't need to. I think there's also a good language-crossover opportunity here, because with ECMAScript 6, JavaScript is now supposed to be tail call optimized. I think it'd be really interesting to see how they make that transition.
I don't know that any of the major browser vendors have actually released an engine with tail call optimization enabled yet, but it's definitely in the spec, and I haven't heard anything about people trying to pull it out, so we'll see what happens. And now, on the flip side: why not tail call optimization? Again, the thing I kind of brushed over earlier: the stack frames that would be part of a tail call are eliminated from the stack, which makes debugging much more difficult. It actually affects certain gems — if you've ever used pry-byebug or pry-nav, or pry-stack_explorer in particular, I imagine these would probably segfault if you tried to run them against a Ruby compiled with tail call optimization enabled. So definitely not as nice a debugging story. As I mentioned earlier, there's also set_trace_func, which I haven't really seen in action, but it's something to watch out for breaking if you were to turn on tail call optimization by default. And as you saw with some of those examples, since you have to work with these strings of code, it can be pretty cumbersome to use as it stands. And again, as I mentioned, since it's relatively
unused and untested, despite its availability, it's something to watch out for. I actually spun up a version of Ruby 2.2.3 with tail call optimization enabled and tried to run the Rails test suite against it, and it was pretty good, but there were definitely things that failed that didn't fail with a vanilla 2.2.3. I'm not sure if they were really important things — I didn't dig into each of the failures — but I'd definitely say that's an argument for not running your Rails server with tail call optimization enabled. But again, I think there's still potential for situations where perhaps you want to use a strategy,
because I don't think you really need it everywhere, but there are cases where you can maintain a lot of elegance, get a lot of performance, and reach solutions that you wouldn't otherwise be able to get to without tail call optimization. In my experimentation I also found that there's still some weirdness around what constitutes a tail call. We looked at a few examples earlier, but it's hard for the VM to figure out when the rest of the method body isn't needed. In some cases you have to do kind of weird things to get to a point where it will actually know: okay, I don't need the rest of this method, I can leave. I don't have an example here, but in some cases, even though no code path could reach the end of the method, if I just put a nil there it would optimize it, whereas in other cases it would just say, I'd better hold onto this, I'm not sure what's going to happen. So some weirdness there, but it may be unavoidable
without changing some of how the VM works. That's it for the presentation. If you're curious about me, you can find me on GitHub. I wrote a bunch of stuff about this on my blog. Actually, if you enjoyed Aaron's walkthrough yesterday, where he was digging into some of the instruction sequence stuff, I went on a similar voyage to try to figure out the source of tail call optimization inside of Ruby. There's lots of C code, so beware, but I found it very informative, and I'm still kind of in awe of it — to me it seems pretty magical. I work at Datto in Boston. We do business continuity, which is a fancy way of saying backup. Here you have us with Hillary, because we're actually one of the companies that had Hillary's emails. So if you're interested, I can maybe hook something up, just let me know.
But that is it. Thank you all very much, and if you have any questions, I'm here, and we have time. The question is whether you have to do any Ruby recompilation if you use the mixin, and you do not, because — luckily, again, that's kind of what's awesome about it — you just tweak a setting and suddenly you can say, I don't need those stack frames. And then you can have things running in totally different places with those different ideologies — yes, I want the stack; no, I don't want it — and they work fine together. Right, yeah, exactly, because you can just isolate it. And so, he said, then you wouldn't crash Rails or whatever, and that's true, because it'd just be isolated to whatever scope you had compiled with tail call optimization enabled. Absolutely — so the question was, let's say you went the other way and compiled your Ruby with tail call optimization enabled by default.
Could you then have blocks of code that you compiled without tail call optimization? And yes, you definitely could. You'd just flip those options around — potentially turn the trace instructions back on, turn tail call optimization off — and then you should be able to compile it as you might expect. Well, yeah, I think that's a good point. Sorry — the question is how often, when you run into a stack error, tail call optimization is the solution. There are definitely a lot of situations where you've accidentally called something that's just infinitely calling itself in a loop, and obviously in those situations it's not going to help. But I think there are situations — for example, one of the ones we looked at was generating sequences like Fibonacci, or the other one in here, factorials. Although, as in the keynote yesterday afternoon: if you're doing that for work, I'd like to know where you work, because I don't know who's generating Fibonacci numbers and factorials for work. But I think there's definitely an argument that in some algorithms, being able to say we don't care about the stack here lets you proceed with a recursive strategy. Perhaps something like parsing HTML would be really good for this.
Anything that has a tree structure is naturally going to follow a recursive strategy really well, I think. And in those situations it might be nice to say, okay, let's change the game here and just recurse as much as we want. Yes — the question was whether I've benchmarked any of these comparisons. I have some, but frankly I didn't see a tremendous benefit. In my experiments — I think it was with generating large factorials — there was not a significant speedup. But I was mostly looking at execution time.
Potentially there's a benefit from allocating fewer objects and thus avoiding garbage collection pauses and things like that. But generally speaking it seemed fairly similar to me. Yeah, nice, good question — this is actually something I ran into. The question is whether there's any way to tell that a method is tail call optimized without actually forcing it into a stack overflow. This is a problem I hit while trying to test the gem, to make sure that when it compiles something it's actually doing it with tail call optimization. Depending on how you structure the method, there are a couple of different ways. Probably the most rigorous is using the RubyVM::InstructionSequence object. You can take a chunk of code and, as I think Aaron showed yesterday, get it translated into the YARV instructions that it'll run. And when it is tail call optimized,
there are certain flags that will appear in the instructions. It's not a bulletproof method, but I think it's the most reliable. The other thing you can do — without going all the way to the extreme of a stack overflow — is just call the method once and check your stack: if it worked, you should have eliminated the last stack frame, so the stack shouldn't really grow. It should stay static. That's another way you can test it, and it's the way I went with for the gem, just because I feel more confident actually seeing that the stack frames are eliminated than having the VM tell me it compiled with tail call optimization.
Other questions? The question is whether I've tried to force an exception with this, to see if something like the New Relic RPM agent would break because of a dependency on being able to analyze the stack. For New Relic I have not tried it. For Rails there definitely seemed to be some potential for it. From some of the exceptions I remember, it looked like some of the constant lookup uses the call stack to figure out nesting, because some of the failures were modules not knowing the right nesting. So that's potentially one place you could run into issues. Where else might it cause a problem? I think New Relic is a great candidate for a problem, because I would imagine it does some tracing of the call stack to figure out where and what your program is doing, to give you some of the metrics that it does.
The question is what I think would need to happen for the greater Ruby community to be interested in tail call optimization, and for it to be more easily accessible in MRI. That's something I've asked myself a lot in regard to this talk, because in light of there being some discussion of it around 2.0, it seemed like another opportunity to reflect and ask: is this something we want, does this benefit the community? Personally, I think the best way would be to add some kind of language primitive, so you could mark a method to be compiled with tail call optimization. To me that makes the most sense, because it's usually a particular method you want to work with. But then you potentially run into issues where that method calls things that aren't tail call optimized, so some parts of the stack get left behind and others don't, and I'm not sure how that would work. Actually, on the Ruby issue that discussed enabling it by default, there was a note saying that if they wanted to introduce some kind of language primitive, it would be somewhat difficult given the way the VM was constructed at the time. But personally I think that would be the way to do it: make it easier for people to explore and get curious about it. That's actually a lot of what drove me to learn about tail call optimization at all.
It was at a stand-up meeting at work, and someone said, oh, if Ruby had proper tail calls we could do this, this, and this. And I had no idea what they were talking about. But then a guy, Nathan Bacall, posted a blog post saying, hey, Ruby does have tail call optimization under the covers. Presented with that, I finally had somewhere I could go explore and ask: what is this tail call optimization? Personally, I think making it more available for people to explore their curiosity would really help us, number one, figure out whether we care about it, and hopefully push it forward as something in the language. A follow-up question — well, yes, there definitely are. I was going to say not that I know of, but really I just don't have personal experience with any of them. Sorry, for those who couldn't hear, the question was whether there are any other optimizations in the Ruby VM that we might be interested in playing with. I think Aaron showcased a couple yesterday, though it was never quite clear to me how those played into the stuff he talked about — but that might be my fault. There's a hash of options you can hand that particular call — I think it's eight or nine different things you can turn on or off. They're meant to be VM optimizations, so I think there's definitely something there that in the right use case could help out.
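That options hash is visible directly from Ruby; a quick way to see the full set of VM compile toggles (tailcall_optimization among them) is to inspect the defaults:

```ruby
require "pp"

# The VM's default compile options: a plain Hash of toggles. The exact
# key set varies by Ruby version, but it includes tailcall_optimization
# alongside the other VM optimizations.
pp RubyVM::InstructionSequence.compile_option
```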
The question is whether I know the status of tail call optimization in other Rubies, like Rubinius or JRuby. I don't actually know. One thing I learned in researching this is that Java is one of the major languages that got lampooned a bunch for not having tail call optimization. That said, that doesn't mean JRuby couldn't have it: JRuby is a VM that runs Ruby, so I would think they could definitely do it if it were something Headius and the team were into. I have less knowledge of Rubinius, but again, I would think all of these implementations are building a VM in some lower-level language on which they compile and run Ruby, so it's definitely possible — I just don't know if they've tried to achieve it. Mostly because even in MRI it's there but not really needed; it's not required for anything. Actually, a follow-up to that: you'd have to ask Charles Nutter more specifically, but I'm pretty sure I've seen him tweet that running the Ruby test suite against JRuby has been successful in places, or maybe fully — don't quote me on this — but there are tests to check that it works.
So if JRuby is really running the Ruby test suite on their VM, then I would say it's in there. But that's guesswork. Oh — the question is in what situation you might enable it in production. Certainly not a web server, given what my experience with Rails was, but there are probably native-type applications where, for performance or to minimize the amount of memory used by the Ruby VM, you might want to use it. That said, depending on your style of programming, if you're more functionally oriented you might want it so you can have all the niceties that come from a functional perspective. And if you have a particular algorithm that you think would benefit from being able to recurse indefinitely, that would be another situation for it. Again, anything related to trees — or linked lists, which are kind of like a tree with just one limb — works really nicely with recursive solutions, and if those are things you work with a lot, it might benefit you to try turning on tail call optimization. Last chance?
All right guys, thank you so much for coming. Thank you.