Profiling Ruby: Finding 10x Gains In RSpec and CRuby
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 50 | |
Author | ||
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/37485 (DOI) | |
Publisher | ||
Release Date | ||
Language | ||
Producer | ||
Production Place | Miami Beach, Florida |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Ruby Conference 201310 / 50
2
3
7
13
15
17
18
19
28
29
35
39
44
48
49
50
00:00
Product (business)Square number
00:37
Term (mathematics)Expert systemMatching (graph theory)
01:05
Hypothesis2 (number)NeuroinformatikProfil (magazine)Total S.A.Software testingMultiplication signPoint (geometry)Right angleOrder (biology)Memory managementDifferent (Kate Ryan album)Game theoryFigurate numberGrass (card game)Bit
02:23
WordDisk read-and-write headLibrary (computing)NeuroinformatikGame theoryWhiteboardProfil (magazine)Right angleBitTheory of relativityPoint (geometry)Universe (mathematics)TesselationQuicksortVideo gameMoment (mathematics)Goodness of fitCode
04:40
QuicksortCodeMobile appBootingMultiplication sign
05:34
BootingHypothesisMeasurementMultiplication signBitDisk read-and-write headProjective planeFigurate numberBootingHypothesis
06:06
HypothesisMobile appRight angleMultiplication signCASE <Informatik>NumberDifferent (Kate Ryan album)Structural load2 (number)Software testingBookmark (World Wide Web)Tablet computerGoodness of fitSlide ruleCircleSoftware frameworkFormal language
07:01
Storage area networkStatement (computer science)Right angleFiber bundleCodeBookmark (World Wide Web)QuicksortNamespaceProfil (magazine)Product (business)State of matterOnline helpMultiplication signEccentricity (mathematics)
08:13
BootingCodeLoop (music)Computer fileSoftware developerMathematicsLibrary (computing)Square numberLine (geometry)2 (number)Directory serviceText editor
09:13
Cache (computing)ExistenceProduct (business)2 (number)Coordinate systemMultiplication signFiber bundleIntegrated development environmentMobile appLetterpress printingQuicksortBookmark (World Wide Web)Software developerDefault (computer science)Routing
10:32
BenchmarkFunction (mathematics)Physical systemTotal S.A.Line (geometry)Graph (mathematics)Plot (narrative)Right angleQuicksortStatement (computer science)Potenz <Mathematik>Mobile appLibrary (computing)Multiplication signProgrammschleifeGraph coloringLine (geometry)NP-hardGraph (mathematics)DataflowPoint (geometry)Operator (mathematics)Structural loadGreatest elementCausalityBitBenchmarkSocial classSquare numberComputer fileCodeProgrammer (hardware)MereologyDifferent (Kate Ryan album)Function (mathematics)Theory of relativityCASE <Informatik>Patch (Unix)2 (number)NumberScaling (geometry)CurveDefault (computer science)PlastikkarteMoment (mathematics)Physical lawNeuroinformatikSystem callSensitivity analysisFile systemResultantGrass (card game)Computer animation
15:37
ComputerNeuroinformatikBitFunctional (mathematics)CodeOpen setComputer animation
16:04
CodecExtension (kinesiology)Multiplication signComputer fileFigurate numberPoint (geometry)Right angleCodeLaptopPattern languageBitMathematical optimizationBranch (computer science)Strategy gameMereologyReading (process)RoutingObservational studyCASE <Informatik>Instance (computer science)
17:49
Computer filePlot (narrative)Line (geometry)Function (mathematics)MathematicsGame theory2 (number)PlanningPatch (Unix)Hash functionProof theoryCodeComputer fileLine (geometry)Drop (liquid)Structural loadSet (mathematics)Multiplication signMobile app
19:20
Execution unitSoftware testingUnit testingSoftwareComputer fileSoftware testingINTEGRALExecution unitDatabaseCode
20:04
HypothesisBenchmarkComputer fileTotal S.A.Physical systemRadio-frequency identificationMultiplication signFigurate numberComputer fileUltraviolet photoelectron spectroscopySound effectNeuroinformatikGraph (mathematics)Doubling the cubeOnline helpExecution unitNumberExpected valueMatching (graph theory)Parameter (computer programming)Point (geometry)Attribute grammarCurveFrame problemPattern languageSoftware testingSystem callMetropolitan area network2 (number)QuicksortPurchasingMoving averageMathematicsStructural loadForm (programming)Electronic mailing listComputer animation
22:18
BenchmarkStack (abstract data type)BitMultiplication signGreatest elementSystem callStack (abstract data type)Kernel (computing)String (computer science)Declarative programmingAttribute grammarCodeProjective plane1 (number)Graph (mathematics)Scaling (geometry)Graph coloringQuantum state
24:01
Line (geometry)Stack (abstract data type)BenchmarkParameter (computer programming)Revision controlLine (geometry)Error messageCodeStack (abstract data type)AlgorithmCASE <Informatik>Multiplication signSystem callFactory (trading post)Generic programmingEntire function
25:34
Stack (abstract data type)BenchmarkMIDITotal S.A.Physical systemSystem callGraph (mathematics)Generic programmingCurveDivisorProfil (magazine)CodeCASE <Informatik>Core dumpAdventure gameLie groupDifferent (Kate Ryan album)TrailDirection (geometry)Point (geometry)Tracing (software)Sound effectNetwork topologyGrass (card game)
27:04
CodeGraph (mathematics)Projective planeMetropolitan area networkSound effectGoodness of fitNeuroinformatikMultiplication signFigurate number
28:45
Right angleNeuroinformatikSystem callLie groupTablet computer
29:13
HypothesisSoftwareAreaVideoconferencingEvent horizonComputer animation
Transcript: English(auto-generated)
00:17
Sorry for the late start. Thank you for your patience. My name's Xavier. I currently work at Square on
00:23
the production engineering team, and my details are up here. You can see them. And today I'm going to be talking about profiling Ruby. It's a thing that I've done a little bit of. I've sped up a lot of things. In some ways, this is kind of a weird talk for me to give, because I'm actually not really that much of an expert at
00:42
this, and I don't really even like it very much. But I really, really hate slow things, and that's been sufficient motivation to overcome the first two difficulties and actually being able to speed up some slow things, and so I wanted to talk through that with you guys.
01:02
To put it in Matt's terms from this morning, sometimes you just have to take out the garbage, right. So that's, that's what this is for me. Now this isn't, this isn't a tools talk. I want to preface that. If you've come to learn the latest and greatest about sampling profiles, this is not that talk. I do cover some
01:23
basic tools that you need to know about. But this is really about the mindset of how I, how I go into this problem and how I, how I approach it, and it's, it's worthwhile for me. So there are, there are three, three things that I'm always doing whenever I'm trying to figure
01:40
out why something is slow. I'm hypothesizing, why, why could this be slow? What is this thing doing? Computers are really fast, and if I'm sitting there for three seconds waiting for a computer to do something, I have heaps of time to think about, well, what's it doing? It's got three seconds. That's a lot of computer time, right. So coming up with ideas, figuring out how to isolate those in order to
02:01
test whether that's actually your problem or not, and then also figuring out how to instrument different things to figure out which bits are fast, which bits are slow, and then to prove that you actually sped a thing up. So I'm gonna keep coming back to these three points. I'm gonna work through a couple of examples of, of how I've applied this and there's pretty graphs and everything. It's gonna
02:21
be great. To begin with, though, I wanted to talk more about the, the mental game of this, because I've certainly, for me, it's, profiling has always been that kind of thing. It's like, ah, it just seems too hard, right. And some of the best advice I got in relation to this was actually not computer related at all.
02:40
This was probably third year university. I moved into a apartment with one of my friends, and sort of fairly early on in the relationship, he was like, hey, do you want to play a game of Scrabble? Oh, that sounds great. Do you guys know Scrabble, the board game? Hopefully everybody knows Scrabble. And I said, oh, great. Love Scrabble. That's a fun game I played as a kid. So, you know, we got
03:00
a couple of beers, got the Scrabble board out, played this game. He absolutely smashed me. Like, he got a gazillion points. I couldn't put any words on the board. And afterwards, he was like, oh wow, I really need to teach you how to play Scrabble. And that was kind of harsh, because I was like, it's just Scrabble. It's just making words. Like, no, there's like, turned out his mom was like a pro Scrabble
03:21
player, and all they did together was play Scrabble, and he really shocked me. But one of the best bits of advice he gave me. So we're, we're playing along, and there's this thing in Scrabble called a bingo, where if you put all your seven tiles, you get rid of your rack, you get a bonus fifty points. It's kind of a big deal. It's also kind of exciting, like, laying down all seven tiles. It's kind of, just feels really good, you know. But
03:42
like, making a seven-letter word in Scrabble is really hard, and I've never done it before in my entire life. And like, within the first couple of games, he's laid down a few of them. And I called him on it, and I'm like, that's, that's rubbish. Like, how did you manage to get all these seven-letter words? He's like, oh, you just need to look for them. I thought, well, that's useless advice, isn't it? But sure enough, I thought, OK,
04:00
well, I'll try it out. And the next game, like, every single rack I got, I just look at it and say, OK, can I make a seven-letter word? Like, no, no, no. And I kept trying, I kept trying, and then, then it was, yes, I saw one. And like, the very next game, I got my first seven-letter word ever. And I was so excited, and it all just came from, he's like, oh, you just need to look for it. And
04:21
I, this has applied so much to me for speeding up code. It's really easy to sort of in your head say, oh, it's a popular library. Like, much smarter people than me have looked at this. You know, surely, yeah, maybe there's some micro-optimizations, but I don't know how to find those. You really need to actually, that's a really negative mindset. You need to come in believing that you're
04:41
going to find something. You need to sort of walk into the code base, say, you know what, there's so much low-hanging fruit in here, I can, there's plenty of peaches, I just need to go in and find them. So you just need to go into a code base and just believe that you can find something. And honestly, more than off, more often than not, you probably will. This is, I've been absolutely surprised. Every single time I've managed to speed
05:02
something up, I have been shocked at, one, how easy it was, and two, the fact that it's like, wait, really? It's like that obvious that we can speed it up in this way. And so what I want everybody to come away from this talk with is like, hopefully I can inspire you, so tomorrow or Monday or whenever you're back, you can sit down and you, that thing that's been annoying you for ages, you just
05:22
sort of take it for granted. It's like, oh yeah, now my app just takes a minute to boot up. Well, what are you gonna do? Hopefully I can inspire you to sit down and take an hour to look at it. That's really what this talk is about. So let's start with my first example. I know this isn't a Rails conference, but does this resonate with anybody? Just checking. So this is not
05:44
Rails-specific. It just happens that I, it happens to be a good project to demonstrate some of these techniques on. So Rails boot time is slow. I've had this problem so many times. So let's sit down and figure out. We'll hypothesize. Well, what does it do? What could possibly be slow? We'll isolate it,
06:02
figure out which bits are slow, and then figure out how to instrument it and measure. So what does Rails do at startup? Loads Bundler and gems, loads up the Rails framework, runs your initializers. I guess it's, maybe starts a server. But this is kind of basically it, right. This is like three seconds of work, right. Be
06:21
like, well I guess it does that, I guess it does that, I guess it does that. So OK, well let's, that's a good place to start, right. So let's, let's see if we can isolate those things. This is the really, this is my favorite way to do an initial test of something. Just time Ruby, require some stuff, run some code. Turns out in this case that everything is slow. Which is good. It's good. It
06:41
means there's plenty of places to look, right. And this is on a fairly smallish Rails app. Doesn't really do anything. The, I actually, I have a number of different Rails apps, all the, these numbers are not consistent across slides, because I switch between apps. Because they all have something different to show. But, so this is the first way to isolate.
07:02
Second way to isolate, this is my favorite thing about Ruby and scripting languages, is you just throw a thing in the global namespace and go and litter it throughout your code, right. Obviously there's not something that you would share with anybody. You wouldn't ship it to production. But doing this sort of thing, where, you know, you grab a timestamp, write a method to put it out, and then you can just go in and start
07:20
putting, like, stamps all through your code. This, I find, is a really great way to just jump into a problem. The other issue I see, like, when I'm coming into a problem, it can seem kind of intimidating. Like, Rails, I don't really know how it all works. I don't really know what everything's doing. I can jump in and put statements in code with the best of them,
07:41
right. Like, this is what I call the naive approach. And this is my way of discovering a code base and figuring it out. Now, there are some advanced profiles that will sort of do this automatically for all of your methods, and I'll talk about them later. But that can also be kind of overwhelming, right. You get like this massive trace of, like, here is every method execution ever, and you're like, ah! Where
08:02
is this? You can just go in and say, OK, well, we're booting, we're loading Rails, we're doing some bundle stuff, et cetera, et cetera, et cetera. What's also cool about this is it helps you figure out where in the code things are happening. So, if you don't know about the caller method in Ruby, you need to learn it, because this is the other best thing in Ruby. I have a couple of best things in Ruby. And caller, combined with the fact,
08:24
so caller will give you the stack trace of the current spot where you are. Then you can copy this line, open it up in your editor, and it's like some gem somewhere, whatever. Change it, rerun your code. This is amazing. You can just go in and change third party libraries on the fly, which is really,
08:43
really scary, but really, really useful in development. So here is an example. So, initializers. So Rails takes every file in a directory, runs each one of them on boot. And so I, this was taking a little while, so I thought, oh, well, I could go into every initializer I have and stamp, stamp, stamp, stamp, stamp, but
09:01
I could just do puts caller, find the for loop that's running over the code. Like, yes, it's in Rails internal, but I can probably recognize a for loop when I see it. And, and then I can just put some extra stuff in there. And this was really interesting. So I did this on, actually this is one
09:21
of the first things that I did two and a half years ago at Square, where I went and I did this for all of our initializers, and something really stood out at me. And there's one here that you'll notice. It's taking about four or five seconds for initializer to run. Kind of a long time. Seems like too long a time. All right, and as you went in, it turned out that we were shelling out to a bundle exec, so
09:42
it was reloading the Rails environment again. It was like, oh, well, we just don't, we can just not do that. And that sped this up quite considerably. I have since either run this or advocated other people, do this, just this, literally exactly this. Like, open up the Rails code, print stuff out, and like sixty percent of people find just
10:02
stuff there that's slow. My favorite was, there was an app that, they were using Zookeeper, but they were just using it in production for some coordination stuff. But in development, they were trying to connect to Zookeeper, which wasn't there, so it would time out. And it would try again, and it would time out, and it would try a third time before it would continue on. So the app
10:21
was taking ages to start up, and with this it was sort of kind of obvious what was going on. So if you run a Rails app, you should do this, if you haven't already. It's, it's, you'll find something. Quick aside, I think Rails should do this by default for you and log slow stuff. So I made a pull request, if you think so too, you should check it out.
10:41
Big part of this, big class of performance problem is not actually a performance problem, but a visibility problem. Somebody said this to me and I can't remember who it was and I can't find it on my Twitter, but there's performance problems where you have to go in and change algorithms. And then there's just visibility problems, where you're doing something stupid, but you don't know about it. And this is what I consider the class of
11:01
visibility problem. And this is actually a large class of things that are slow. It's that you don't know what's going on. We have a deployment tool at Square that's written in Ruby, and it was flow when you ran it. And it turned out that it was like doing some git operations that we could optimize, but the biggest, that was really obvious once we just put in some logging around,
11:22
OK, well what am I doing now? What am I doing now? So making, if you, whenever you have long pauses of anything, it should be logging, your app should log what it's doing, so you can find this stuff without even having to go looking for it. So that's, so that's kind of the obvious stuff, right. So far we haven't
11:41
talked about anything hard or anything difficult. Put statements and finding stupid things. So you, you crunch on that a little while. Hopefully you'll find something. Then you get down into the kind of trickier situations, and you start having to do the other thing, where you need to prove that things are fast. So when you hypothesize that a thing is slow, that's great. Maybe you can prove that
12:02
it's slow. But if you can't find what's slow, you then need to take the opposite approach and start figuring out, OK, well what's fast? And how do I guarantee, how can I prove that this is fast? And so the most obvious one for me when you're loading up an app was, well, let's prove that require is fast. Just requiring files. So you can do this, there's a library probably most people are familiar with called
12:22
Benchmark in the standard library. It's really simple. It just, you don't even need to use it. It just loops over your code a number of times and tell you how long it took to run. In this case here, I'm creating ten thousand files on the file system, and then requiring it in chunks of, like, I'm requiring five hundred, then a thousand, then fifteen hundred, then two thousand, just to
12:42
see, you know, if there's any change between small and large or whatever. And of course this is gonna be really super fast, cause this is CRuby, and that's awesome. And then I ran it, and I was like, well, so it's, we got twenty-five thousand, twenty-five hundred files, they're not doing anything, and it's taking a second, a bit over a second to load. And that's a long time in computer, in
13:03
computery times. So anyway, the top one is the Benchmark output. The bottom one is a, a Ruby one-liner trying to be a Bash one-liner, except I don't know Bash very well. And Ruby's really good at one-liners. To just make a CSV, which we can then graph. And when I put a graph like this, now it's starting to look pretty scary to
13:22
me. And this is the thing that I was trying to prove was fast. Turned out, actually it's looking kind of slow. So, sorry the colors on this are terrible, but the top one, the top line is one point nine, Ruby one-nine load time. The bottom one is one eight seven. And there are two things that are horrifying about this graph. The first one is that kind
13:43
of exponential-like curve, right. Whenever you see this sort of exponential curve, you should kind of get scared. There are very few computery problems that we deal with day to day that should exhibit this sort of behavior. They should be relatively flat. And the second one that was interesting to me was the large gap between one eight seven and one
14:03
nine. So my Rails app at the time was about twenty-three hundred files. So another interesting Ruby thing, you can just monkey patch require if you want to measure stuff. That was kind of neat. So twenty-three hundred, which was taking about, you know,
14:21
a second. And I thought this was particularly interesting, cause one eight seven, if you look at it, has the same performance curve, but the absolute number, so the y scale is, over here is about two hundred milliseconds for one eight seven. So it's like, oh, maybe just nobody's noticed this before. This was in the fairly early days of one nine. So I was definitely noticing this one second load.
14:42
So the, the curve was scary and the difference between one eight seven and one nine was pretty scary. Now, when you, when you think about that, it kind of makes sense. The require behavior of one nine is much more sensible. It resolves relative paths for you, in case it's sensitive. It's really nice. So it is doing extra work. So it is justified in taking
15:00
some extra time. But this was too much for me. So anyway, now, now we're at a place that's really, really scary. Because require is slow, and that's CRuby. And that's C code, and I don't know a thing about C code. I'm a Ruby programmer who just writes Ruby every day.
15:21
And so this for me is kind of like the take a big, take a big breath moment, right. I have to talk myself up. And so I started with this sort of mantra of believe you're gonna find something, right. This is, you know, I wake up and repeat it to myself five times every morning. It's, it's that sort of mantra. And I have another one as well which I use, which is this one. Computers are
15:41
really stupid. They only do what you tell them. And C code, it's, it's still just code, right. There's function calls. There's variable assignments. I can, you know, if I take a deep breath and just like look in, I'm gonna feel a little bit like this guy, but I can probably figure something out. At least, that's, that's what I believe
16:02
before I open up the require code in Ruby and then close my laptop and, ahh! It's alright. Take a deep breath. It's just C code. It's just unfamiliar. I can work with this. And this actually took quite a bit of time for me. It was like calming myself down enough where
16:20
I could read through this, figure out what was going on. A big part of what I did is say, OK, well, I know it's slow. I don't know exactly where, why it's slow or where it's slow, but I can just try like commenting out some code branches and see what happens. Seriously. This is a really good strategy. So I actually got to a point where I could require basic files with none of the
16:42
extra fancy one nine stuff and got it down to one eight seven performance. Like, OK, well, now I know, like, this is my naive approach that I started with, right. Just going, just doing stupid stuff. But it's like obviously not gonna work. But it, it helps you figure out what's going on. It gets you involved in the code and, you know, you start seeing patterns. You
17:00
start, you know, being able to figure out what's going on. And there's some really neat optimizations in here. For instance, like if you require a file in Ruby, like, it looks for an RB extension. Sometimes it also looks for a dot SO extension, right. And so naively, you can say, OK, well give me all the files and then check. OK, is this an RB one or an SO? No, let's move to the next file. Move to the next file. Except whoever requires an SO file, like, basically
17:22
nobody, and so there's some code in here. This is, this is old one nine, so it's not like this anymore. But there was some code in here to say, well go through all the Ruby files first and then start looking for SO files. Which is, you know, pretty sensible. So this is actually a pretty good read. My, part of my approach for this is I really like to rewrite what I'm seeing
17:41
in something that I'm familiar with. So I went through and I rewrote this in Ruby. Or I rewrote the essence of it in Ruby. What does the require algorithm look like in one eight seven and one nine? And I came up with this. Which says, OK, well if we're gonna load a file, make sure we haven't loaded already by iterating through all the existing files, and
18:01
then, then loading it and adding it to the end. Now, this doesn't account for all of the exponential thing, but even just from looking at this, you should be able to say, OK, well, I can probably make an algorithmic change here to use a hash or a set rather than an array, and that should speed things up quite considerably, right. And so that's, that was my, that was
18:22
my game plan. It's gonna change, loaded, to be a hash. Now, so I wrote up a proof concept of this. It was terrible C code. It changed like every line in the file. And the Ruby guys were like, we can't apply that, which is totally justified. I wouldn't have done that either. But I was able to run this on our app, just as an RVM patch set, and
18:42
drop our, drop our load time from about seventeen seconds down to ten using this patch. So that was pretty cool. Thankfully, the, the Ruby guys said, well, we're not
19:01
gonna take your patch. We'll write a better one. And they did. And as you can see now, require, and two is much, much faster than it used to be. So that's awesome. And it's also looking much flatter, which is really cool. So anyway, that's, that was my experience with CRuby. The next one I wanted to talk about was,
19:23
was this problem. So does anybody, does this resonate with anybody? Yeah, so my problem here is actually different to what most people experience. Most people, when this is doing integration testing or something like that, my problem was my unit tests are slow. And when I say unit tests, I'm talking about, they're not doing
19:40
any database access, they're not doing any file or network. It's just a file of code that was taking too long. Now this is where, like, when I said I really hate slow things, like, I really hate slow things. I had unit tests that I thought should have been taking no more than a millisecond, that were taking ten milliseconds, which is unacceptable. So, so I
20:00
thought, OK, well, I can look at this, you know, believe I'm gonna go, go in and find a peach. This is RSpec. This was kind of scary at the time. I hadn't really dealt with that before. But applying these three steps. Hypothesize, isolate, instrument, figure out what's going on. So what does my RSpec test do? It creates some doubles. That's fine. It does some
20:21
verifying things, which should be no-oping in unit specs with RSpecFire, which if you're not familiar with allows you to stub stuff in a fairly safe manner if your things are loaded. So it basically stops you from making stubs that are just totally made up. And we actually just rolled this into RSpec3, which we released the alpha of yesterday.
20:42
Which is, sorry, the beta of. We changed what we were gonna call it. Because it's actually much more better, it's much better than an alpha. It's definitely a beta. You should try that out. If you wanted help with that, there's a number of RSpec guys around. We can help you out. But it's a really good release. So, and it has all the stuff that I'm about to talk about in it. So if you're really annoyed at your ten millisecond unit specs, you can
21:03
get on to RSpec3. Anyway, and there's some expectation matches. I've actually found performance improvements in all three of these, but I'm gonna talk about double creation to start with. So the first thing we do is we isolate. We say, OK. Well, there's not really that many parameters when you create a double. There's pretty much the number of attributes
21:21
that you have. And so I create this spec that, once again, you don't need to read it all, but a thousand times, it'll create a double with zero attributes. Then, with one attribute. Then two attributes. Then, then three, right. And this graph is what it looks like. Two things. One, the curve looks kind of terrible. And
21:40
three, we're creating a thousand doubles with nine attributes, which in my mind, like, it's not really doing very much. It's taking about three seconds. Like, that's a lot of computer time. So I was kind of offended by that. So I thought I'd go and start poking around. Now, I started with all the approaches that I've already talked about. But then, I got to the point where the, the one I was talking about is really useful
22:01
when you have a long thing. You just log some stuff and figure it out. This is, you know, much smaller time frames. Much harder to find, much harder to tease out patterns using that thing. So this is where I switched to using Ruby prof, which is a tool that instruments all of your methods automatically and gives you a really sweet call stack. Now, I want
22:21
to emphasize this is the first time in the talk that I've mentioned a tool at all. Up until now, it's just been me talking about how I do this. I want everybody to remember that. You don't need tools to, to speed things up. Sometimes they help. Sometimes they're useful. But you really need to approach it without them. And you know, don't let, don't let not
22:41
knowing a tool scare you away from this. So this call stack graph, obviously don't expect you to read all of it. But it gives you the amount of time spent in each of these different methods nested. Interesting ones for me. There's a lot of stuff going on in here, which seems like a lot to me. And more interesting,
23:00
there's a call down here to kernel dot caller, which is that awesome method I showed you earlier that shows the back strings. We're calling that five thousand times. Now we've got a double that we're declaring. So this is, sorry, a thousand creations with five attributes. So we're creating a double in one spot, and for that double, we're calling caller, which is gonna give the same value. We're calling
23:20
that five times. So I look at this. So I have no idea how this code works, why it exists, but I'm pretty sure that we should only call that a thousand times. This is my naive approach. So the very first thing I did was pulled naive up, pulled the caller up, and just called it a thousand times rather than five thousand. Five x improvement. It was amazing. Well, five
23:41
x to that bit. And so then I started thinking, OK, well why is caller slow? It's just giving me a back trace, right. And so I started poking around. I started, didn't quite get to the bottom of exactly why it's slow, but it's kind of, because the Ruby VM, like, does some fancy stuff, it needs to reconstruct it or magic. I'm sure there's people in the audience who know much more about that than I do.
24:02
But one thing I did discover that was interesting is there's a new argument to caller in more recent versions of Ruby where you can say, well don't give me the entire stack trace. Only give me the first two lines, or the middle two lines, or something like that. And so I ran some benchmarks on this because why we're doing it in RSpec, which I found out after reading through it, was we're trying
24:22
to get the first non-RSpec back trace line to say, hey, this is where the error was in your code. And generally that's in the top two to seven back trace lines. So I thought, well, what if, rather than pulling the entire back trace and searching it, we did like a chunking algorithm where we chunked it down. And so I wrote that out, and it's kind of what
24:41
it looks like. It's, you know, about twenty percent faster. In the common case it's much faster. So that was a pretty cool find. And so that was an interesting thing I learned about caller. The most interesting thing I learned was that, so you create a double where you're snapshotting the caller so we can use it later. Where, how can
25:01
you cause a failure? What would ever trigger that caller stack trace? Turns out, you can't. We were just using some of the generic RSpec infrastructure to create these doubles that used the caller, but in this particular case, it was never getting used. So we're spending all this time to capture a variable that we never used. So in the end, after optimizing caller, I managed to just get rid
25:22
of it completely. Which initially made me feel kind of stupid. But then I was like, well, this is like, you start with a naive approach, you learn about the code, and then, you know, you fix it up. And we use this elsewhere in RSpec anyway, so it still got faster. This is the same thing after I was done. Two things to note. One, it's a lot
25:40
simpler. Rather than using all the generic RSpec infrastructure, we added a new concept to make this easier. And there's no, there's, doesn't call out the caller. And that's what the graph looks like. It's pretty flat now. So that's kind of awesome. It's about a two or three x above, just using, you
26:00
know, just using open struct. So it's the same curve just for the high cost factor. So we can probably get a little better, but that's, I'm feeling pretty good about that. Anyway, so I just mentioned Ruby prof as one tool. There's some other tools here. Perftools, Dtrace, Strace. I haven't really used them very much, so I'm not gonna speak about them. I'm just throwing some names up, if you're interested. Some different things here. Perftools is a sampling profiler.
26:23
So Ruby prof, if you run it against code, it slows it down a lot, because it instruments it a lot. Perftools will just look at your code every now and then and be like, what are you doing now? What are you doing now? What are you doing now? Adds very little overhead. So in some cases you can actually run that in production, too. Strace you can track down system calls. If you want to see some cool graphs,
26:40
search for Dtrace and flamegraph. Some people have done some cool stuff for visualizing their things. And this is also exposed in a Ruby 2 API. I think it's TracePoint, I think. I can't remember. But something like that. Anyway, just a dump of names for you to look into if you're interested. So that's, that's kind of my adventure and
27:01
the things that I've learned about. Are there any, any questions about this? I've got a couple minutes left. Pretty straightforward, then. Awesome. I, I could get some nice graphs in there. Yeah, come back.
27:32
I did it, and they were like, this is great. Fix up some things. I did. Yeah, and then it, then I actually ended up getting
27:42
more involved in the project. So it was actually a good, a good way in. So I'm now a committer on RSpec and helping out there a lot. It was, if you ever want to get, like, the ha, like, the best, most awesome code review you've had in your life, submit a pull request to RSpec. Get Mairi to look at it. That man, are you, is he here? Where is he? This man
28:02
here, he seriously gives, like, the best code reviews. It's amazing. So I highly recommend. Would submit again. Yes. How long did it take for me to, the require refactoring? Most of my nights for
28:25
two weeks. A lot of that was just like figuring out, like, you know, an hour or two a night for a couple of weeks, just to figure out C code mostly. So. Yeah. I don't know. That's, if I was actually working full time on it, it would have been a lot quicker. Cool. All right. Well,
28:43
to just quickly finish up, the things I want you to remember. One, the computer only does what you tell it. Right. Just remember this. Whenever you're like getting scared or getting intimidated, you know, CRuby is still just C code, right. Might take some time, but you'll be able to figure it out. It'll just, it'll just function
29:01
calls in the end. Believe you're gonna find something. You know, just visualize, visualize that low-hanging fruit, how tasty that peach is gonna be. If you just believe you're gonna find something, you probably will. And these three things. Hypothesize, isolate, and instrument. Thank you very much.