Improving TruffleRuby’s Startup Time with the SubstrateVM

Video in TIB AV-Portal: Improving TruffleRuby’s Startup Time with the SubstrateVM


Formal Metadata

Title: Improving TruffleRuby’s Startup Time with the SubstrateVM
Author: Menard, Kevin
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Publisher: Confreaks, LLC

Content Metadata

Ruby applications can be broadly split into two categories: those that run for a short period and those that stick around for a while. Optimizing performance for one often comes at the expense of the other. Over the years, alternative Ruby implementations have demonstrated remarkable performance gains for long-lived applications -- so-called peak performance -- but often lose out to MRI for short-lived applications. In this talk, I'll introduce the SubstrateVM and show how we use it to massively improve TruffleRuby's startup time with minimal impact on peak performance.
I'll go ahead and get going here. Hi, everyone. My name is Kevin Menard, and I work at Oracle Labs, which is a research group within Oracle. In particular, I work on a team that specializes in virtual machine and compiler technologies. I'm here today to talk about TruffleRuby, which is our implementation of the Ruby language, and how we improved our startup time with a new tool called the SubstrateVM. Before I get started, I need to inform you that what I'm presenting today is the research work of a research group and should not be construed as a product announcement. Please do not buy or sell stock in anything based on what you hear in this talk today.

Okay, so: "Improving TruffleRuby's Startup Time with the SubstrateVM". It's kind of a verbose title, and I'm not super creative when it comes to these things, but it is quite descriptive. If you're new to Ruby, or you don't keep track of all the various Ruby implementations out there, you might wonder what TruffleRuby is. TruffleRuby is, as I mentioned, an alternative implementation of the Ruby programming language. It aims to be compatible with all of your Ruby code but provide new advantages compared to other Ruby implementations. It's relatively young; it's about four years old now. I like to think of it as a best-of-breed approach: we're actually pulling code from JRuby, Rubinius, and MRI. Like JRuby, the core of TruffleRuby is written in Java, so there's a natural ability for us to share some code there. Like Rubinius, we want to implement as much of the language in Ruby itself, so we were able to leverage a lot of the work the Rubinius team previously did by implementing the core library in Ruby. Then we pull in the standard library from MRI, and more recently we've begun being able to run MRI's C extensions, so we can actually run MRI's OpenSSL implementation and zlib.

We're currently 97 percent compatible with Ruby core, based on the core library specs from the Ruby Spec project, and we hit 99 percent on the Ruby language specs. These test suites are really nice, but they're not as comprehensive as we would like, so we've also spent a fair bit of time testing the top 500 gems out there. ActiveSupport is one that's really popular, so I use that as an example here: we're 99 percent compatible with that. We don't quite have the database drivers yet, but that's something we are working on, so we can't run all of Rails yet, but we're closing the compatibility gap quickly. TruffleRuby is
implemented in Truffle. Truffle is a language toolkit for generating simple and fast runtimes. With Truffle, you basically just build an AST interpreter for your language. AST interpreters are about the simplest way you can implement a language: they're very straightforward to develop, they're easy to reason about, and they're very easy to debug. The problem with AST interpreters is that they tend to be slow. We fix that by pairing Truffle with another project called Graal, which is our JIT compiler. Graal is a compiler for Java, written in Java, and it has hooks from Java; Truffle uses these to call into Graal and optimize these AST interpreters through a process called partial evaluation.

This is a big deal, because a lot of languages start as an AST interpreter, then they hit a performance wall and find they have to start building a language-specific VM. Building a VM is a lot of work, it requires a lot of expertise, and it's hard to get right. Ruby went through this itself: Ruby up through 1.8 was a simple AST interpreter, and Ruby 1.9 introduced the YARV instruction set and virtual machine. What we want to do with Truffle is say that your language can stay in the realm of AST interpreters, where it's really simple, and we'll take care of the performance part of that with Graal.

In addition to that, as a language-building toolkit, Truffle provides some additional features like a debugger, a profiler, and general instrumentation, so things that all languages need you get for free out of the box. That's in addition to some JIT controls, such as inline caching and being able to prevent methods that you know are going to JIT poorly from compiling at all. Finally, Truffle has this polyglot feature, and it's a first-class feature of the framework, so all languages implemented in Truffle can call into and out of each other. Because they all inherit from this base Node class hierarchy, Truffle nodes from one language and another can be mixed together very easily, and when that's submitted for compilation with Graal, we're able to eliminate that cross-language call boundary. So, for instance, you can call JavaScript code from Ruby, and when it gets optimized, there is no performance penalty for calling into JavaScript. This is because it's a first-class feature of Truffle, and to reinforce that, some of Truffle's functionality is actually implemented
as languages, domain-specific languages, within Truffle.

If you have been following TruffleRuby, you might wonder what we've been up to over the course of the last year. We actually spun out of JRuby. We used to ship as an alternative backend in JRuby, and at that time we were called JRuby+Truffle; we're now TruffleRuby. We began running C extensions from MRI. Last year Chris Seaton, who is on our team, gave a talk outlining a blueprint for how we could run C extensions from MRI and some of the work we were doing there. Since then we've run extensions such as OpenSSL and JSON, and we're working on some of the database adapters, so this approach is working and the results are really promising. We now have Java interop, so you can call Java methods on Java objects from Ruby using a nice syntax. If you've used JRuby and its Java interop, it looks very similar. And we've been working on improving our native calls. Ruby has this rich interface for making underlying POSIX calls. Truffle has a new feature called the Native Function Interface, which is provided as one of these DSLs within Truffle, and it provides almost Ruby-FFI-like functionality for Java, and for Truffle languages in particular.

So we've been making some really good progress. In the short time the project has been around, we've achieved a high level of compatibility and performance, but we've had one sticking problem, and it's really been startup time. Applications typically go through three phases: startup, warm-up, and steady state. Startup time is the time the interpreter uses to basically get itself into a state where it's ready to start running your code. Warm-up is when it initially starts running your code. At this point the code is cold, so it's going to be slow, and the idea is that as it's executed multiple times it should get progressively faster, to the point where we call it hot, and thus "warm-up". If you have a JIT, this is when you'd be profiling the application, submitting things for compilation, and actually compiling them. But even if you don't have a JIT, your application could still have a warm-up phase: the operating system is going to do things like populate file system caches, you'll be populating cache lines in the CPU, and things of that nature. Eventually you hit a steady state. This is where your application spends most of its life. Most applications hit some kind of an event loop or something like that, where they'll remain until they basically stop executing; very few applications actually run forever.

So you can broadly classify applications into two types: those that are long-lived and those that are short-lived. In a long-lived application, you spend most of the time in the steady state. TruffleRuby, like a lot of language implementations with a JIT, will have kind of a slow startup and warm-up time, and we make that trade-off because it will generate very fast code for the steady state. The idea is that your application will spend so much time in the steady state that spending the upfront time to generate faster code will more than pay for itself.

But a lot of applications are short-lived. We use irb and pry, we have test suites, and in this case the bulk of your running code can actually be in the startup phase. Here in this graph, the startup didn't actually get any longer; it just now accounts for a larger percentage of the overall application's life cycle. And then that warm-up phase could largely be wasted work, because you spend so little time in the steady state before you exit that you don't really gain any benefit from warming up.

TruffleRuby has optimized for long-lived applications and hasn't spent a whole lot of time optimizing short-lived ones, so we want to improve our startup time. In order to improve something, it's helpful to see what the current status is. So I ran a very simple Hello World application, and what we can see is that MRI is hands
down the fastest: it runs in about 38 ms. JRuby runs it in 1.4 seconds, and TruffleRuby lags behind at 1.7 seconds.

Of course, nobody really runs Hello World in production, but it was a nice way to quickly illustrate things. To get a better sense of what a real-world use case would be, I'm turning back to Ruby Spec. This is the test suite I mentioned at the beginning that we use to evaluate language compatibility. What's nice about Ruby Spec is that it's modular: it's broken up into different components, like tests for core language features, tests for the standard library, and so on. The idea is that multiple Ruby implementations can pick various subsets of this test suite in order to progressively add functionality. If you start off with a new implementation, you're not going to run all of Ruby, so you pick a subset, or one of the components of the spec suite, that you can start running and evaluate progress that way. So what's nice is that this is a way of testing something that will run on multiple Ruby implementations and not favor one or the other.

Another interesting aspect of Ruby Spec is that it ships with a test runner called MSpec that looks and feels a lot like RSpec, so you're going to have matchers, you even have .should, and things like that. Looking in particular at the language specs: this isn't the largest test suite in the world, but it's not the smallest. It has about 2,100 examples and 3,800 expectations, so I think this is a pretty good proxy for what we can expect from running an application's test suite, and MSpec looks a lot like RSpec.

Running the same examples here on the various Ruby implementations, again MRI is hands down the fastest. It can do all of those in about a second and a half. JRuby comes in at 33 seconds, and TruffleRuby again is at the end at 47 and a half seconds. We're proud that we're making great strides in improving our compatibility, but a lot of people running test suites and things like that discount us out of hand because this part is a bit too slow.

You might be asking yourself:
well, if TruffleRuby is so slow at running these test suites, due to startup time, where are the benefits? Our advantages are in peak performance. To evaluate peak performance, I'm turning to the Optcarrot benchmark. If you saw Matz's opening keynote, he presented some numbers here in the context of MJIT. Optcarrot is a benchmark that the core Ruby team is using to evaluate its progress on its Ruby 3x3 initiative. Basically, it's a Nintendo Entertainment System emulator written in Ruby; it runs NES games and presents a score, which is the number of frames rendered per second. Ruby 2 runs these at basically 20 frames per second, so if Ruby 3 hits its 3x speedup objective, it'll run at 60 frames per second. As an interesting aside, that's actually the frame rate that real NES hardware uses, so coincidences all around. Matz had indicated that MJIT can run at 2.8 times what Ruby 2.0 can run, so they're closing in on that 3x goal.

Running Optcarrot with the same Ruby implementations, what we can see is that MRI 2.3 runs at about 23 frames per second, JRuby roughly doubles that at 46, and TruffleRuby runs about eight and a half times faster, at 197 frames per second. So we've made the trade-off where startup time hasn't been as great as we would like, but our peak performance is really, really nice.

Now, I've been presenting this in the guise of short-lived versus long-lived applications, in terms of application profiles, but we can make this a more human problem by considering it as a development versus production issue. You typically run irb or pry in development, or run your test suites in development, and this is where you really have short-lived applications, but in production you tend to have a long-lived application profile. Balancing between the two can be problematic.

So I'm actually going to take a step back here and relate some of the experiences I had running JRuby in production, which is something I did before starting work on TruffleRuby. JRuby has some of the same problems: its startup time isn't as nice as MRI's, but it does have a peak performance advantage over it. So as a team we had to decide what we wanted to balance for. One option is to always optimize for developer time, the idea being that if the development team can move quicker and they're happier, they can deliver more value for your customers. At the other end of the spectrum, you could say, okay, we're going to always optimize for peak performance, so we're going to use JRuby, in this case, everywhere. The idea is that we'll deliver more customer value by having a faster production product, but we must accept that the development team is going to incur some additional cost just running tests and things like that.

A third option is running a hybrid model. I was actually never able to get this to work, but I'm aware of teams that did. What you do in this situation is run MRI locally, so you get that fast startup time, and deploy to JRuby in production, so you get peak performance there. But there is an inherent risk: JRuby is a different runtime, and while it has very high compatibility with MRI, in edge cases you may see different behavior. If you're deploying to a different environment in production than you run locally, you may have some subtle, hard-to-find bugs. CI can certainly mitigate a fair bit of this, but it's a risk nonetheless. And you may actually hit some technical hurdles, because things like a Ruby environment variable or, say, a global may have different values, you could have native extensions that are different in C than they are in Java, and so on.

Having experienced that, this is something that, when I came to TruffleRuby, I was really interested to know if we could handle better, and I think we can. So at this point I'll introduce how we do that with a new project called the SubstrateVM. TruffleRuby, being implemented
in Java and Ruby, runs on the JVM, but we can also target another VM called the GraalVM. On the JVM you get just the interpreter; you don't have our JIT, since Graal isn't in there. You'll still run through the JVM's own JIT, but it won't be as fast. The point, though, is that we can target the JVM and it will be functionally correct. We can also target the GraalVM, which is the JVM packaged up with Graal, and you get that optimizing compiler, which is how we delivered those Optcarrot numbers. The idea here is to slot in yet another target, the SubstrateVM, so the same codebase can be used on multiple VMs.

The SubstrateVM is a really different kind of beast. It has two core components: it has an ahead-of-time compiler, also known as the native image generator, and then it has VM services that it will link into the application. The ahead-of-time compiler takes a Java application, it takes the JDK, and it takes any additional JARs for libraries that your application relies on, and it compiles all of that directly to native machine code. The program in this case is the TruffleRuby interpreter. So the ahead-of-time compiler is really treating Java just like it were C, C++, Go, or what have you. What you get out of this is a static native binary, and the JVM is completely gone. That binary will then have the SubstrateVM linked into it, so you still have garbage collection, hardware abstraction, and some of the features you expect from a virtual machine.

To help illustrate the point a bit more, what I have here at the top is a simple add method in Java. It takes two ints and calls Math.addExact on them. If you compile this for the JVM using javac, you get the code on the left. It's hard to see, but that is Java bytecode. If you've ever worked with Java before, you'll know they have these things called .class files, and they contain Java bytecode. You feed these into the JVM, which has a bytecode interpreter, so it actually interprets your class files. It'll run in the interpreter until it determines that something is a hot code path, and then it will submit it for JIT compilation. On the right-hand side, though, what we have is actually machine code. This is the output from the native image generator for that method written in Java.

What the native image generator does is perform a closed-world static analysis of the application. That's a bit of a mouthful, but it's actually fairly simple if you break it down. Every Java application has a main method. If you've never done Java or C, it's basically the entry point into an application. It's a bit different from Ruby, where you just run a script and it starts executing things. In Java we have this notion of static methods, static fields, and static initializers, and they're roughly equivalent to Ruby class variables, class methods, and just code you execute in your classes. If you open a class in Ruby and start executing code in it, you can run things outside of methods and so on. Java main methods are static, and so the analysis starts there. It determines all the classes the program actually uses and all the methods used in those classes, and then it throws everything else away. The JDK, which is Java's standard library, is quite massive. We wouldn't want to pull the entire thing into a static image at the end of this, so we throw out the code we don't use. It's closed-world because at the end of this there is no JVM that can dynamically load classes, so everything your application could possibly use needs to be supplied as input to the native image generator, and the analysis needs to be
a bit conservative. If you have an interface or an abstract class and the analyzer can't determine the concrete subtype for it at a given call site, it needs to actually compile in all the classes that could be candidates for it. You need to be careful with that, because you could inadvertently pull in the whole JDK, so we run this with every push to the TruffleRuby repository to make sure that we don't accidentally pull in stuff we don't need. If you're interested in learning more about how that process works, Christian Wimmer gave a talk at the JVM Language Summit this year where he gets into the nitty-gritty details of that.

What's interesting for us on the TruffleRuby team is that we can take advantage of the static analysis. Because the image generator executes all the static blocks when it loads Java classes, we can push computation into the ahead-of-time compilation process. As I mentioned, we actually implement a fair bit of Ruby with Ruby. We have our core methods implemented in Java; once we bootstrap the runtime, we then implement, for instance, all of Enumerable in Ruby. The downside is that every time you start up the TruffleRuby interpreter, we need to load the files and parse them and so on, and they simply never change. Or rather, they change when we update them, but then we would issue a new release of TruffleRuby at that point. So there is a lot of duplicate computation every time you run things. With the ahead-of-time compiler, we can push that computation into the native image generator, do it precisely once, and then when you start up the static binary that is the output from the image generator, all of that's already calculated for you.

For instance, we're doing this to pre-parse the core Ruby files. We parse them, we get the AST nodes, and we just store the ASTs, essentially as a binary blob, in the binary. When we start the binary, we just get them back out of memory, so we don't need to spend time doing file system operations or building ASTs. We go a step further and include all the encoding tables. Ruby has support for 110 encodings, and each of those has a fair bit of metadata that goes with it. It's the same thing with transcoding mappings: if you want to be able to convert the encoding of a string, you go through these transcoding mappings. These are things that hardly ever change, so we gain a lot by being able to put those into the ahead-of-time compilation process. And we pre-construct strings that are used in multiple places, so we don't need to dynamically allocate the bytes for the underlying string.

All right, so the whole point was to improve our startup time, so let's go ahead and take a look at the results. As you may recall, this is what we were looking at with Hello World: TruffleRuby is all the way at the end here at 1.7 seconds. Run this again on the SubstrateVM and it goes down to 100 ms. So, not quite as fast as MRI, but we're closing the gap.

Again, nobody really runs Hello World in production, so let's take that test suite again. That was 47 and a half seconds; on the SubstrateVM we drop to just below 7 seconds. MRI still has us beat by multiple times here, but I think for a development team trying to make the decision whether they can accept this in development or not, there's a difference between waiting the better part of a minute for tests to complete, when MRI can deliver results in a second and a half, and only having to wait under seven seconds. We want to get as fast as MRI, and we're going to continue to reduce this number, but I think we're now into the realm of acceptability for a lot of people.

Now, the question is:
to be sacrificed peak performance I kind of pitched this initially is we have slow start of time so we can have faster steady-state performance so if we look at those op hair numbers again from were substrate VN we actually do you take a bit of a hit so he dropped from 197 frames per 2nd 269 other still makes is about 8 times faster memorize so this is a a pretty you'd advantage in I think probably decent trade off to reduce such start time but it is a 15 per cent reduction in there's no inherent reason for that to happen so what happened why did we take a performance hit so up here is kind of a demon and for running things it it basically out decodes these opcodes corresponding to instructions to the NES hardware and then in a tight loop it uses the opcode is an index to dispatch table and then uses metaprogramming t dynamically dispatch of a sound file using sound us at 2 things are going on here a 1 its flats the results the dispatch table of the and at this generates a very or creates a very small array in the substrate VM doesn't optimize the creation of small arrays are quite as well as the role of does so crawl we can actually avoid the application of the array in some cases in just access the members directly in the 2nd thing is is a descent call becomes metamorphic very quickly which means that we can't use are inland caching which is a way of that basically all the review mutations are able to optimize a method calls so if you Thom Peruvian particular and visible to you take a step further and optimize their programming with inland caching I which none of the other connotations are able to do and I gave a talk on this a couple years ago at 3 because you're interested in how that works the point is is because this goes metamorphic we're not it will use that inline but even at that level so we have to do method location and as it turns out are calling functions as a tad slower and the substrate VN is on the as for these things the substrate team is aware of 
and intends on fixing. So in summary, I don't think TruffleRuby's startup time is fixed. We're not as fast as MRI, and that is our goal, but I think the path forward is a viable one. And I'm personally really excited to say that the SubstrateVM is now publicly available. If you've been following TruffleRuby at all, one of the things we're often criticized about is our startup time. It's a fair criticism, and I don't disagree with it. But whenever we addressed it, we would often say: hey, we've got this thing on the side, the SubstrateVM, that will just make startup time faster, don't worry about it. It was beginning to look like vaporware. But it's now publicly available, you can use it, and we're relying on it. The other thing I think is also missed about the SubstrateVM is that it helps validate the approach we're taking with TruffleRuby, which is building on top of this Truffle AST interpreter framework. In addition to getting things like a debugger and profiler for free, now we get this awesome virtual machine that solves our startup time problem. And from the perspective of the TruffleRuby codebase, we really don't have to do anything special to take advantage of that. There are maybe half a dozen different code paths where we need to disable something because they rely on reflection and things like that, which aren't available on the SubstrateVM, but for the most part the same code can be used to target the GraalVM and the SubstrateVM with very few modifications. So there is some future work here. The SubstrateVM currently doesn't support compressed oops. This is something they're aware of and they're going to be fixing. If you're not familiar with it, it's an optimization in the JVM: on a 64-bit JVM, pointers are 64 bits wide, but if you have a heap smaller than 32 gigabytes you can actually represent those pointers in 32 bits. The SubstrateVM hasn't copied that optimization over yet, but when it does, we'll consume less memory, cache-line usage will improve, things
like that. So they're looking to improve there. On our end, as I already mentioned, there are things we can do on the TruffleRuby side to take better advantage of the SubstrateVM. Currently we're only building the core library into the image. We could do the standard library as well. There are a few hurdles there, but they're things we can clear. There's more stuff we could precompute and push into the native image generation process that we're not doing. Both of those would help improve our startup time further. We'd also like to reduce the overhead of calls to native functions. I mentioned we have this Truffle NFI for making native calls, and that's really cool, but since we're building a static image, we ought to be able to just bind native functions and call them directly and avoid some of the overhead of dispatching those calls. And to wrap up a bit: we're often criticized for two things. One is our startup time and the second is our memory consumption. We actually haven't really spent any time looking into memory consumption yet, but we believe that the SubstrateVM can also solve that problem for us. It's something that we need to look into more. I have some slides here that I'm making available so you can look at them. This one tells you how to run the TruffleRuby SVM binary. Here's the information on our benchmark environment. I provided some links to related talks if you're interested in learning a bit more. And this here is a
picture of some of the Graal team. The Graal
team at large is actually a little over 50 people now. Oracle has invested some significant resources into the various projects here. So these are a lot of the people who are involved, but not all of them. We've had some alumni, we've had interns, we have university collaborations. The basic point is that this is a lot of work, and it's much more than what I alone was doing. So I'd like to just acknowledge the efforts of all the other contributors, particularly the TruffleRuby team: Chris Seaton, Petr Chalupa, Duncan MacGregor, Benoit Daloze, and Brandon Fish. And from the SubstrateVM team, [names unclear], who were really helpful whenever I got stuck. Here's my contact info. I love to talk about this stuff, so if you're interested in learning more about TruffleRuby, please reach out. If you're interested in trying to run your application or a library with TruffleRuby, we're always looking for use cases. I'm happy to work with you and see if yours works with us, and if not, we can try to get that resolved. TruffleRuby is completely open source, so you can take part in the project. And yeah, that's it. If there are any questions, I think we have a few minutes left. The question was: how's Rails
support? So that's coming along. The problem for us is the database adapters. TruffleRuby is a new implementation, and the database drivers are basically shipped as extensions. There's a pure-Ruby version of the Postgres driver, but for the most part the database drivers have a native component to them. MRI has C extensions for all the drivers, and JRuby has Java extensions for the drivers. The other problem is that the extension API isn't really an API. An MRI extension is literally taking MRI's internal functions and allowing you to call into the runtime, and the same is true of JRuby Java extensions. TruffleRuby has its own runtime. We're not implemented the way MRI is and we're not implemented the way JRuby is. So our options were to pick one of those and become compatible with it, or convince the ecosystem to adopt yet a third extension API. We decided to go down the path of compatibility with MRI, which means we're now taking functions that implement MRI, pretending they're an API, and stubbing in our own implementations of them. That's a work in progress and it's progressing nicely, but it's a really large surface area. Because there's no defined public API, we need to just figure out what people are using and support that. That's how we've gotten OpenSSL and the JSON extension running, and so it's really just a matter of time at this point. The way we're doing it is with yet another Truffle language called Sulong, which is an LLVM bitcode interpreter. We use Clang to compile the extensions down to LLVM bitcode, and then rather than generating machine code from that, Sulong interprets the bitcode and generates AST nodes, and we use Truffle's interop functionality to combine those with the TruffleRuby AST nodes, and it just all works together. So we have issued our first SQL call with the MySQL adapter, we're making progress on the Postgres one right now, and someone's been chipping away at the SQLite3 one. Once that's done we
really should be close to running Rails. The rest of it, like Active Model, Active Support, and the like, we handle well. Spring is going to be problematic. I'm not sure we'll ever really [unclear]; we might have to do it with the SubstrateVM image. Yeah. I'm not sure if the mic picked that up, so basically the question was: if startup time dominates the application profile, why did the hello world application show a larger effect than running a test suite, which you would think pays the startup time once and then just runs the tests? It turns out that the test suite does forking and exec-ing, so you actually pay the startup time multiple times throughout the course of the run. A secondary effect is garbage collection. When running hello world you exit quickly, so there isn't an opportunity to generate a whole lot of garbage. The SubstrateVM garbage collector is different from the JVM one and we haven't tuned it quite as well. So the question was: if you're statically linking the JDK into the resulting binary, does that limit the Java classes you can make use of? We do have to forgo the Java interop feature I mentioned, because that relies on runtime reflection. The SubstrateVM actually recently gained limited reflection capability, so we might be able to start providing interop for classes that are already linked. We can't dynamically load classes. So if you have a Java background, you might be accustomed to dropping in a JAR and implementing a new interface or something like that, and you wouldn't be able to do that dynamically. But from the perspective of TruffleRuby, what we need in the JDK is known in advance, so we don't run into problems that way. So the question is: what does Oracle hope to get out of this? I work for Oracle Labs, which is a research group within Oracle, and it's a bit different from what you get from the product groups. We're supposed to identify and investigate new technologies that could
be useful to other product groups. I think it's best to maybe look at this as the Graal team as a whole. We have implementations of JavaScript, R, and Ruby. We have the Sulong project for LLVM bitcode. We have the native function interface with Truffle. We have Graal, the JIT compiler. We have the SubstrateVM. And all these things can work together with one another. The various languages help improve the development of both Truffle and Graal. If we only had one language, you'd risk overfitting and things like that. Graal is actually now shipping as part of Java 9, so some of this has already started to trickle its way back up into other parts of the organization. That's about it. I don't know if that's the answer you were looking for. OK, so the question is: the time that's spent doing the native image generation, how does that compare to the time you save from startup? I think it's best to think of this like what you do with C: you would compile MRI and then you don't compile it again until a new version is released. So you're not going to compile TruffleRuby with the SubstrateVM every time you run an application. What we're compiling is the interpreter, not the application being run in the interpreter. We actually ship precompiled versions with the GraalVM distribution I mentioned previously. If you wanted to build it yourself you could, but you would only rebuild if you actually modified the core implementation files. So the question is: because we implement Ruby in Ruby, does that limit the ability for application code to monkey patch core classes? Right. So there is no difference once we start running user code, because all we would have done otherwise was parse the code and generate the AST anyway. We're just cutting out that parsing and AST generation. When we start running user code, at that point our core libraries are already initialized, and if you want to monkey patch them, that works just fine. So the question is: how do we,
or have we, looked at using something that keeps a VM running in the background to make things start faster? I guess I haven't looked at that too much. There are tools like that for Java applications in general, and you could probably do that for TruffleRuby. But our approach was to just try to deal with startup time directly. I guess even with MRI you can do this with Spring and Rails. But we're going the SubstrateVM route. Maybe that would work, but I haven't tried it. Thank you, everybody.
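As a rough illustration of the dispatch pattern described earlier in the talk (an opcode indexing into a dispatch table, followed by a splatted `send`), here is a minimal Ruby sketch. The `TinyCPU` class, its opcodes, and its handler names are invented for this example and are not taken from optcarrot's source. It only shows the shape of the code that produces a small array per dispatch and a single call site that sees many different method names.

```ruby
# Hypothetical miniature of the dispatch pattern discussed in the talk.
# Class, opcodes, and handler names are assumptions made for illustration.
class TinyCPU
  attr_reader :acc

  # Each entry is [method_name, *args]; the opcode is the array index.
  DISPATCH = [
    [:load_acc, 5],   # opcode 0: set the accumulator to 5
    [:add, 3],        # opcode 1: add 3 to the accumulator
    [:shift_left, 1], # opcode 2: shift the accumulator left by 1 bit
    [:nop]            # opcode 3: do nothing
  ].freeze

  def initialize
    @acc = 0
  end

  def run(opcodes)
    opcodes.each do |opcode|
      # The splat unpacks a small array into the arguments of `send`, and
      # this one call site receives a different method name on almost every
      # iteration, which is what makes it hard to cache.
      send(*DISPATCH[opcode])
    end
    self
  end

  private

  def load_acc(value)
    @acc = value
  end

  def add(value)
    @acc += value
  end

  def shift_left(bits)
    @acc <<= bits
  end

  def nop; end
end

puts TinyCPU.new.run([0, 1, 2, 3]).acc
```

With only four handlers the call site would likely stay cacheable in practice. An emulator with a couple of hundred opcodes is where a single `send` site would be expected to go megamorphic.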
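The 32-gigabyte figure mentioned for compressed object pointers can be checked with a little arithmetic. The sketch below assumes 8-byte object alignment, which is the HotSpot default. The constant names and the `decode` helper are made up for this example and do not correspond to any JVM or SubstrateVM API.

```ruby
# Back-of-the-envelope arithmetic for compressed object pointers.
# Assumption: objects are aligned to 8 bytes (the HotSpot default).
OBJECT_ALIGNMENT_BYTES = 8
COMPRESSED_POINTER_BITS = 32

# A 32-bit value can name 2**32 distinct aligned slots in the heap.
addressable_slots = 2**COMPRESSED_POINTER_BITS
max_heap_bytes = addressable_slots * OBJECT_ALIGNMENT_BYTES
max_heap_gib = max_heap_bytes / 2**30

# Hypothetical decode step: scale the 32-bit value by the alignment
# (a left shift by 3 bits) and add the heap base address.
def decode(compressed, heap_base = 0)
  heap_base + (compressed << 3)
end

puts max_heap_gib
puts decode(0xFFFF_FFFF)
```

Because every object starts on an 8-byte boundary, the low three bits of a real address are always zero and do not need to be stored, which is why 32 bits are enough for heaps below that limit.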



