3x Rails: Tuning the Framework Internals
Formal Metadata
Title |
| |
Title of Series | ||
Part Number | 83 | |
Number of Parts | 89 | |
Author | ||
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/31493 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Transcript: English (auto-generated)
00:11
This talk is titled 3x Rails, but what does this title mean? This talk is about speeding up the Rails framework,
00:22
but I'm so sorry, I kind of failed to bring something like, "Hi guys, I brought a magical patch that makes Ruby on Rails three times faster, so let's just merge this and release Rails 5 now." I kind of planned to do this on stage,
00:43
but I'm sorry, I failed. So instead, I'd like to discuss some like, possibilities or points of view, right? So again, what does the title 3x mean?
01:03
Actually, this title is inspired by Matz's keynote at RubyKaigi last year, and RubyConf, I think. In that keynote, Matz promised that Ruby 3.0 is going to be three times faster than Ruby 2.
01:27
What's happening? So, actually, it's so easy to make Ruby on Rails three times faster, so easy, because all we need to do is just
01:50
to introduce no more performance regressions on the Rails side, and wait for Ruby 3. Then, run your Rails applications on Ruby 3.
02:03
That obviously should be a three times faster Rails. Yeah, win. So anyway, my name is Akira. I'm on the internet as a_matsuda, like this.
02:26
I work on some open source projects like the Ruby language and the Rails framework. I also author and maintain some gems, like Kaminari, the pagination library,
02:43
active_decorator, motorhead, stateful_enum, et cetera, et cetera. And I run a local Ruby user group called Asakusa.rb in Tokyo. So, Asakusa.rb was established in, I think, 2008.
03:06
We meet up every Ruby Tuesday, and we've had 356 meetups so far. So, we have so many Ruby core committers among our members,
03:23
like more than 30 people. And we had attendees from, like, about 20 different countries from all over the world. So, it's quite a global local group, right?
03:41
We welcome visitors from any other country, like, I mean, countries that are not listed here. So, if you're interested in visiting our user group,
04:00
and if you have a chance to visit Tokyo, please contact me and come to our meetup. Also, I'm organizing a Ruby conference in Japan named RubyKaigi. RubyKaigi aims to be the most technical Ruby conference, focusing on the Ruby language itself.
04:27
Last year's RubyKaigi was like this. And this year, we're having another Kaigi, in September, in Kyoto. Please note that
04:41
the conference is not in Tokyo this year. Kyoto is an ancient capital of Japan. There remain so many historical, like, temples and shrines, gardens, and so on, as shown in these pictures.
05:01
I just Googled for Kyoto. This is the result. So, I think Kyoto is the most beautiful city in Japan. So, if you haven't been to RubyKaigi before, and you're willing to, I think this year's one is a really good chance
05:21
to enjoy both the conference and your trip. So, please consider joining the conference. This year's venue looks like this. This is the picture of the main hall, the second hall.
05:49
The venue has a nice-looking Japanese garden. So, we're already selling the tickets, and the CFP is already open.
06:02
So, please check out this official website, and submit your talk, or buy your ticket. So, anyway, let's begin the actual talk.
06:21
As I told you, this talk is about speeding up the Rails framework, not your Rails application. To speed up software, firstly, we need to know its speed.
06:42
And, in order to measure the speed, we usually use benchmarking software, like, for example, benchmark-ips, or Ruby's built-in Benchmark library.
07:02
I prefer benchmark-ips. For example, if you actually want to measure the performance of your Rails application, you can do something like this.
07:22
I made a monkey patch on the Rails application's call method, and it runs benchmark-ips there. It actually kind of runs the request, like, 100 times.
07:44
I know it's a horrible, horrible idea, but it kind of works. And it benchmarks purely the Rails part, right? I mean, it skips the browser side.
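A minimal sketch of that kind of measurement, assuming a plain Rack-style lambda stands in for the Rails application and a hand-rolled timing loop stands in for benchmark-ips. Both are illustrative substitutes, not the monkey patch shown on the slide, and the names `app` and `iterations_per_second` are made up for this example.

```ruby
# Stand-in for the Rails application: anything that responds to call(env).
# (Hypothetical; in a real app this would be the Rails application object.)
app = ->(env) { [200, { "content-type" => "text/html" }, ["<h1>Hello</h1>"]] }

# Call the app repeatedly for a fixed time window and report iterations per
# second. This mimics the kind of score benchmark-ips prints, without the gem.
def iterations_per_second(app, env, seconds: 0.2)
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  count = 0
  started = clock.call
  while clock.call - started < seconds
    app.call(env)
    count += 1
  end
  count / (clock.call - started)
end

env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/" }
ips = iterations_per_second(app, env)
puts format("%.1f requests/s (no browser, no network)", ips)
```

Because the request is driven in-process, the number reflects only the code behind `call`, which is the point of the approach described above.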
08:02
So, this outputs some score. So, how can we improve the score? That's the topic of today's talk.
08:23
My first trial is, of course, Ruby GC, because everyone knows that Ruby GC is so slow. Okay, I believe that just, like, stopping GC will improve performance by, like, 30%.
08:44
So, let's do this first. So, observe the GC. We have GC.stat in the core library, and we have gc_tracer, which is made by Koichi.
09:01
So, for example, adding GC.stat calls to the previous module, it shows something like this. Like, it iterates 45 times in five seconds. And it outputs some, like, GC.stat result.
09:23
It shows that GC is surely happening there, like, 50 times, right? So, let's stop this with GC.disable, then run the benchmark again.
09:40
Then I got this result: 50 iterations per five seconds, up from 45. So, the GC adds about 10% overhead in this benchmark.
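The GC.stat and GC.disable steps can be sketched as follows. This is only an illustration under stated assumptions: the `workload` lambda is a made-up allocation-heavy stand-in for serving a request, and `gc_runs_during` is a helper name invented here.

```ruby
# Run a block and return how many GC runs happened while it executed.
def gc_runs_during
  before = GC.stat(:count)
  yield
  GC.stat(:count) - before
end

# Allocation-heavy stand-in for "serve one request" (hypothetical workload).
workload = -> { 200_000.times { "x" * 64 } }

runs_with_gc = gc_runs_during(&workload)

GC.disable
begin
  runs_without_gc = gc_runs_during(&workload)
ensure
  GC.enable # never leave GC switched off after the measurement
end

puts "GC runs with GC enabled:  #{runs_with_gc}"
puts "GC runs with GC disabled: #{runs_without_gc}"
```

Comparing a throughput score for the same workload with and without GC gives the overhead estimate; disabling GC is a measurement trick here, not something to ship.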
10:02
I think that's because Ruby GC has been improving recently, like this. We had so many improvements in the GC module. So, GC is actually no longer a 30% overhead. It's, like, just about 10% overhead,
10:22
which is, I think, not a big deal. It's acceptable, in my opinion. So, I'd like to thank Koichi for doing this amazing work, and keeping on doing this amazing work,
10:42
and also thank you, Heroku, for supporting his activity. Thank you very much, Koichi and Heroku. By the way, let me now talk a little bit more about a new feature in Ruby 2.3, somewhat related to garbage collection,
11:04
about strings. Strings in Rails used to be a big concern of the community. And there actually was a trend, like,
11:21
sending a pull request with .freeze, .freeze, .freeze, .freeze in Rails, showing some microbenchmark, which aims to make Rails faster.
11:43
Honestly, I didn't like that kind of pull request, because it kind of pollutes the code base, right? It just looks ugly to me. So, I proposed a magic comment to Ruby
12:02
to freeze all string literals in the file, just in order to stop the .freeze pull requests. It's like this: # frozen_string_literal: true.
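A small sketch of what the magic comment does. Normally it sits at the top of a source file; here, as an assumption made only so both settings can be shown in one runnable snippet, each case is evaluated as its own piece of source with `eval`.

```ruby
# The magic comment applies per source unit, so each snippet is evaluated
# separately here to show both settings side by side.
with_comment = eval(<<~RUBY)
  # frozen_string_literal: true
  "hello".frozen?
RUBY

without_comment = eval(<<~RUBY)
  # frozen_string_literal: false
  "hello".frozen?
RUBY

puts "with the magic comment: frozen? => #{with_comment}"
puts "with it set to false:   frozen? => #{without_comment}"
```

With the comment set to true, every string literal in that file behaves as if `.freeze` had been appended, which is what makes the explicit `.freeze` calls unnecessary.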
12:20
It's already introduced in Ruby 2.3, it's already available, so if you're interested, you may try. Actually, I have not tried myself yet, but maybe this will add some performance, like several percent, three or five percent, I guess.
12:44
Maybe. Anyway, let's stop caring about these strings now. It's an already solved problem, I think.
13:02
And another Ruby myth is that Ruby is slow because it's a scripting language. We have to parse and compile every time. So it's slower than compiled languages. Is it true? I think it is true, but Ruby 2.3 has a new feature
13:27
that lets you precompile Ruby code into a binary, and then load the binary. I'm not gonna talk about this in detail, because it's gonna be described by Koichi,
13:44
the implementer himself. So, don't miss Koichi's talk tomorrow about this. So, which part of our simple Rails application takes time?
14:01
Let's profile. To measure the whole performance, we use benchmarking software. To profile which part is actually slow, we use profiling software, like stackprof, or like rblineprof.
14:22
But again, I'm not gonna describe them in detail in this presentation, because you may already know these tools. These are so powerful, and so popular.
14:42
Maybe you may have heard of this before. And also we have TracePoint, which is a built-in library in Ruby. Again, Koichi's work. You can simply count the number of method calls,
15:02
and you can put a hook into every Ruby method call. So, you can count the method calls like this.
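A hedged sketch of such a counter, written as a Rack-style middleware. The class names `MethodCallCounter` and `Greeter` are invented for this example, and a tiny lambda stands in for the Rails application so the snippet runs on its own.

```ruby
# Rack-style middleware that counts Ruby method calls made while the
# wrapped app handles a request. Names here are illustrative.
class MethodCallCounter
  attr_reader :last_counts

  def initialize(app)
    @app = app
    @last_counts = {}
  end

  def call(env)
    counts = Hash.new(0)
    # :call fires for methods defined in Ruby; add :c_call to include C methods.
    trace = TracePoint.new(:call) do |tp|
      counts["#{tp.defined_class}##{tp.method_id}"] += 1
    end
    # enable with a block traces only while the block runs and returns its value.
    response = trace.enable { @app.call(env) }
    @last_counts = counts
    response
  end
end

# Tiny stand-in app so the sketch runs without Rails.
class Greeter
  def hello(name)
    "Hello, #{name}"
  end
end

greeter = Greeter.new
app = ->(env) { [200, {}, [greeter.hello("a"), greeter.hello("b"), greeter.hello("c")]] }

counter = MethodCallCounter.new(app)
counter.call({})
counter.last_counts.sort_by { |_, n| -n }.each { |name, n| puts "#{n}\t#{name}" }
```

Sorting the counts in descending order gives the kind of "most frequently called methods" list discussed next.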
15:27
This is an example Rack middleware that counts every method call happening inside that Rack middleware stack. So, with this middleware,
15:40
I get this output from my scaffolded Rails application. The most frequent method calls are SafeBuffer's html_safe?, html_safe, HTML escaping,
16:00
attribute something, things like this. However, these are just theories, and I'm sorry, I'm gonna talk about something different today. Through my experience, I know some weird parts of Rails,
16:24
weak parts of Rails, slow parts of Rails. I'm gonna talk about some of these in the rest of my time. So, Rails consists of MVC. Which one do you think is the heaviest part?
16:45
How about Action Pack, the C part? Action Pack sits on top of so many Rack middlewares that would make the method call stack very deep.
17:02
Maybe that would be a bottleneck, and actually Rails 5 introduces a new feature called Rails API in order to reduce this Rack middleware depth, I think.
17:23
So, let's measure. This is, again, a very roughly written Rack middleware benchmarking tool. This outputs how long it took
17:43
for each Rack middleware. And I got a result like this. Less than 0.00-something seconds for every middleware, right?
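One rough way to take such a measurement is to wrap each layer in a timer. The sketch below is an assumption-laden illustration, not the tool from the talk: `TimingWrapper` and `PassThrough` are made-up names, and two do-nothing middlewares stand in for the default stack.

```ruby
# Wraps any Rack-style callable and accumulates the wall-clock time spent in
# it, including everything it calls further down the stack (inclusive time).
class TimingWrapper
  def initialize(app, label, timings)
    @app = app
    @label = label
    @timings = timings
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @app.call(env)
  ensure
    @timings[@label] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# Do-nothing middleware standing in for a real one.
class PassThrough
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  end
end

timings = Hash.new(0.0)
endpoint = ->(env) { [200, {}, ["ok"]] }

stack = TimingWrapper.new(endpoint, "endpoint", timings)
stack = TimingWrapper.new(PassThrough.new(stack), "inner middleware", timings)
stack = TimingWrapper.new(PassThrough.new(stack), "outer middleware", timings)

status, = stack.call({})
timings.each { |label, seconds| puts format("%-18s %.6fs", label, seconds) }
```

Since each figure is inclusive, the cost of a single layer is its own figure minus the figure of the layer directly beneath it.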
18:04
So it turns out there's no slow middleware in the default stack. I don't actually see any other particularly slow part in Action Pack, besides route
18:26
resolution and URL helpers, which I'm not gonna talk about today. So, let's leave Action Pack. And let's see this list again.
18:42
There are some, like, SafeBuffer things and HTML escaping things, which are obviously Action View. Action View actually has some performance problems. I know that. So, Action View consists of,
19:01
like roughly these processes. It looks up the template, compiles the template, and renders the template, then returns the HTML strings to the browser. So, let's start with the template lookup.
19:23
The current implementation of template lookup is like this. It calls Dir.glob for every single template lookup.
19:41
So, the resolver queries the file system on each request, actually on each render: render layout, render partial, each render. Couldn't we speed this up?
20:04
So, I tried to make a more optimized resolver than the default optimized resolver. The concept is like this. Just read the whole file system once and cache that.
20:21
Cache all the template file names in memory. So, this is the trial implementation, which is already on GitHub. This basically just scans through the view path directory only when the application gets its first access,
20:43
then caches all the file names, then it performs the view file name comparison in memory, as I told you. And here is a benchmark proving the speed,
21:01
and the result is like this. My version of template resolver is 18 times faster than the default resolver. In a very carefully crafted microbenchmark.
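The scan-once-and-cache concept described above can be sketched without Rails. This is not the resolver published on GitHub; `CachedTemplateIndex` and its key format are assumptions made for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Scans a view directory once, on first lookup, and answers later lookups
# from an in-memory hash instead of hitting the file system each time.
class CachedTemplateIndex
  def initialize(view_path)
    @view_path = view_path
    @index = nil
  end

  # Returns relative paths of templates matching e.g. ("users", "index").
  def find(prefix, name)
    index.fetch("#{prefix}/#{name}", [])
  end

  private

  def index
    @index ||= build_index
  end

  def build_index
    Dir.glob("**/*", base: @view_path).each_with_object({}) do |relative, found|
      next unless File.file?(File.join(@view_path, relative))

      # "users/index.html.erb" is indexed under "users/index".
      key = File.join(File.dirname(relative), File.basename(relative)[/\A[^.]+/])
      (found[key] ||= []) << relative
    end
  end
end

Dir.mktmpdir do |views|
  FileUtils.mkdir_p(File.join(views, "users"))
  File.write(File.join(views, "users", "index.html.erb"), "<h1>Users</h1>")

  templates = CachedTemplateIndex.new(views)
  p templates.find("users", "index")
  p templates.find("users", "missing")
end
```

The trade-off is that files added after the first scan are not seen until the cache is rebuilt, which is usually acceptable in production but not in development.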
21:26
So, another issue, I think, is render partial. Render partial is basically slow because it creates another buffer for each render partial.
21:47
But in some cases, we don't need a new view context for each partial, like simply rendering footer, header, et cetera.
22:02
So, we probably can do something like PHP include and simply concatenate the partial into the parent template. And the implementation is, I'm sorry,
22:20
still a work in progress. This wasn't as easy as I expected. So, another idea is we can pass the full path file name into the render partial call,
22:44
so that the template resolver doesn't have to look up all the view paths. The API will look like this. Render path with a full path file name, or render relative, like require_relative in Ruby.
23:05
The implementation is, again, not yet done. Another idea about rendering is render parallel.
23:22
So, we can parallelize render collection. So, if you have a collection of 100 items, maybe we can make the render collection 100 times faster by using threads, right? I actually tried this, but I saw so many
23:44
"too many connections" errors from Active Record. It's obvious. So, this turns out to be a failure, I think. Another render method is render remote,
24:06
which performs rendering via Ajax, particularly for a very heavy partial. Here's an implementation,
24:21
which I did like two, three years ago. I found a repository. I looked at the repository like yesterday, but I forgot what the name means.
24:43
Anyway, the API is like this. Very simple. Add remote: true to your render call. Then, this would perform the render partial call through Ajax.
25:02
It kind of already works. I'm sorry, but I'm not using it. So, another topic is encoding support in template rendering.
25:24
The current implementation of rendering the template into a Ruby method is like this. It first dups the given template source,
25:41
the whole template string, and calls force_encoding on the source text to make it binary, then dups the given template source again for detecting the magic comment,
26:02
the encoding magic comment, then calls force_encoding again for some reason, and finally encodes it in ERB. So many encoding conversions,
26:21
but who needs this feature? Who actually writes a non-UTF-8 view file in their application? If any one of you does, please raise your hand. Wow, you do. No? No? Okay, so nobody in this room actually uses this feature.
26:50
Actually, we, sorry?
27:03
Okay, that might be possible, I think. But the actual use case is probably for Japanese people, because I see test cases like Shift_JIS,
27:23
Shift_JIS, which I think were written by Yehuda, but I'm sure nobody does this in Japan. It's just ridiculous. So, the current state is nobody needs this feature,
27:45
so we can just remove this. So, here's my suggestion.
28:00
Do this. So, here's a benchmark for this new version of the ERB handler, and this is the result.
28:25
It kind of shows some improvement, but it's only 1.5 times faster, because in this case it includes the whole compilation
28:46
process on the ERB side, not just the encoding conversions. And moreover, this would reduce the memory consumption, I suppose, so let's profile that with memory_profiler.
29:11
The code looks like this, benchmarking the memory consumption inside, again, the benchmark-ips block
29:23
that repeats the whole template resolution, and the result is like this. It kind of shows some memory reduction in string objects,
29:48
and in my opinion, memory usage is very important. It's about speed, actually, because if we could reduce this, then we could put more containers,
30:02
I mean, web workers in the web application container. So, this really is about speed, all right? So, I'd like to propose removing the encoding support, maybe in Rails 6, so, by the way,
30:26
this is about the ERB handler, so if you're using Haml, we have some alternative implementations like this,
30:41
so please try using these instead of the official Haml. The next topic is ActiveSupport::SafeBuffer. As we saw in the method calls graph, we call this so many times,
31:00
and it's currently a very ad hoc implementation. It has a flag inside the string object and flips the flag on and off, so I tried to use Ruby's built-in tainted flag,
31:24
but I failed. But maybe we could make a faster version of SafeBuffer somehow, maybe as a C extension, I guess. The next topic is I18n.
31:42
Sorry, I have only five more minutes, so I'll speed up my talk. Again, it's not yet done, but I have some work in progress on this machine, which I'll probably publish within a few days.
32:06
The next topic is Active Record, and I have four minutes for Active Record. So, my main concern about Active Record is Arel objects when building queries.
32:25
It just builds so many Arel objects, Arel node objects. So, what if we directly build SQL strings from the find or where parameters
32:45
for very simple queries, like just where name equals something, or find by ID? It's still not published, but it's almost working.
33:05
And the product is called Arenai. So, this is the implementation, the example. Like, if the find call accepts some complex parameters,
33:29
then it will pass the query to super. But, for the simple ones, like find by ID or find by ID string, it directly compiles the SQL query.
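The fall-back-to-super structure described above can be sketched in plain Ruby. Nothing here is Active Record or the speaker's unpublished code: `RegularFinder` and `DirectSqlFind` are hypothetical names, and the SQL string is only an example of what a direct path might emit.

```ruby
# Stand-in for the regular finder, which builds a tree of query objects
# before producing SQL (hypothetical, no Active Record involved).
class RegularFinder
  def find(id)
    [:via_query_builder, id]
  end
end

# Prepended fast path: simple integer-like ids are turned into SQL directly;
# anything more complex falls through to the original implementation.
module DirectSqlFind
  def find(id)
    case id
    when Integer, /\A\d+\z/
      [:direct_sql, %(SELECT "users".* FROM "users" WHERE "users"."id" = #{id.to_i} LIMIT 1)]
    else
      super
    end
  end
end

RegularFinder.prepend(DirectSqlFind)
finder = RegularFinder.new

p finder.find(42)
p finder.find("42")
p finder.find([1, 2, 3])
```

Using `prepend` keeps the original method intact, so only the inputs the fast path recognizes bypass the query builder. Converting the id with `to_i` before interpolation keeps arbitrary strings out of the SQL.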
33:46
This is actually very cheap. It's cheaper than compiling the cache, I mean the Arel node cache for,
34:05
what's that, what's the name? AdequateRecord. I'm gonna skip this part. So, my next topic is model.present?.
34:25
My advice about model.present? is: never, ever call present? on a model, because it causes massive method calls inside. Like, if you call, for example, current_user.present?, how many method calls will occur?
34:43
So, this is the answer. I see 85 method calls just for user.present?, which is ridiculous. So, I suggested a patch fixing this situation,
35:01
but it was turned down, because the Rails core team expects you not to do this. So, please, please don't call the present? method on your Active Record model,
35:20
or put something like this in your application. I think I have no time to run through all these slides.
35:44
This is about speeding up the Rails initializers. This is about not requiring pry-doc, pry-byebug, pry-anything in your Gemfile.
36:01
This is about squashing all bundled gem files into one directory, which is currently not yet working.
36:21
Using require_relative instead of require, which didn't show any significant speed improvement. Detecting autoload, which causes some
36:41
speed regression in the production environment. Actually, I found two occurrences of autoload in production in Rails 5, which happen inside Rack 2. So, please fix this, Aaron.
37:03
On speeding up tests. Previously, our application took one minute on CircleCI, just for preparing the schema,
37:20
inserting 600 versions into the schema_migrations table, one query each. So, I changed this to one single query, which makes it, in our case, 600 times faster.
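The idea of collapsing many single-row inserts into one statement can be sketched as follows. This is an illustration, not the code committed to Rails: the helper name is invented, and the table and column names are assumed.

```ruby
# Builds one multi-row INSERT for all migration versions instead of issuing
# one INSERT per version. Table and column names are assumed for illustration.
def schema_migrations_insert_sql(versions)
  rows = versions.map do |version|
    raise ArgumentError, "not a numeric version: #{version.inspect}" unless version.to_s.match?(/\A\d+\z/)

    "('#{version}')"
  end
  "INSERT INTO schema_migrations (version) VALUES #{rows.join(', ')};"
end

puts schema_migrations_insert_sql(%w[20160101000000 20160102000000 20160103000000])
```

The database then handles one round trip instead of one per version, which is where the speedup in the schema preparation step comes from.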
37:40
This is already committed into Rails 5. So, it's available in Rails 5. I'm gonna skip this. Some slow parts in Active Support, like Multibyte and TimeWithZone.
38:05
So, Multibyte consists of Multibyte::Chars and Multibyte::Unicode. It loads the whole Unicode database, version 8, which sits inside the Active Support library.
38:21
But do I actually need this? I'm not sure. And I suppose, at least we Japanese don't use this. So, we can just remove this, in our case, and make the framework smaller and make the boot time faster.
38:47
The next one is TimeWithZone. Here's a benchmark for Time versus TimeWithZone. The result is that TimeWithZone
39:05
is 25 times slower than the built-in Time. So, if you're sure you don't need TimeWithZone, you can just replace your TimeWithZone with Time.
39:20
I mean, if you're 100% sure what you're doing. We can also boost some slow parts of Rails with C extensions. Here are some examples, like CGI.escape, CGI.escapeHTML, fast_blank, HashWithIndifferentAccess.
39:43
Some of these are already introduced into recent versions of Ruby. So, please just use new versions of Ruby, which will bring you the speed, okay? Sorry for going over time. The conclusion.
40:03
So, there is really no one single performance bottleneck for everyone, for every Rails application.
40:25
Some apps might have 1,000 models, some apps might have 3,000 lines of routes.rb, and the bottlenecks will change. So, in my opinion, Rails is omakase,
40:43
which is nice, but in some cases, we want to customize certain points of the Rails framework. Maybe what we need is more flexibility,
41:01
like Merb used to have. So, there remain so many slow parts in Rails, and there can be more, like, alternatives
41:20
to these parts of Rails. So, I would suggest we, like, make Rails more flexible, to be like Merb a little bit, and I hope everyone here will reveal their hacks
41:45
and bring more, like, modularity and diversity into the Rails community. Thank you, thank you very much.