
Packing your Ruby application into a single executable


Formal Metadata

Title
Packing your Ruby application into a single executable
Title of Series
Number of Parts
69
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Recent languages like Go compile a project into a nice executable, so why can't good ol' Ruby? We have built a packer for Ruby to do just that. It is 100% open-source, and can produce executables for Windows, macOS and Linux individually. Packing makes distributing Ruby apps extremely easy, with the added benefit of intellectual property protection. Auto-updating is also made easy, in that the executable only needs to download and replace itself. So, how did we do it? How do you use it? What goes on under the hood? What future will this bring to Ruby? That's what will be unraveled in this talk!
Transcript: English (auto-generated)
Good morning, RubyConf, I'm excited. How are you doing today? My name is Minqi Pan, I come from Beijing, China,
and I'm a hacker of Ruby and C++. I'm also a Node.js collaborator, so I do a lot of open source development, and this is my GitHub page.
And those are the links. Oh, I can take this off, or not. And today, we'll be talking about how to compile your Ruby application into a single executable. So where does the story come from? You know, the other day I was installing the GitLab CI runner on a Windows machine,
because our project is in C#, and I thought, geez, GitLab is written in Ruby, so it seems very hard to install on Windows. Probably I need to set up the Ruby environment and do some gem installs and stuff. But it turned out that that's not the case.
They made a single executable for me to download. It's 46 megabytes. After I downloaded that executable, I just executed it and it installed a service on my CI machine, and voila, it works.
So why is that? Because they're not writing the CI runner in Ruby, they're writing it in Go. So Go has this nice feature that can build your program into a single product that you're free to distribute. It feels so nice. So in order to solve this distribution problem, people will just need to drop Ruby and use Go.
Thanks all for coming. But seriously though, some companies do that indeed. Like Heroku, I remember some time ago when I installed the Heroku CLI tools on my machine.
On my Windows machine, there was an installer based on RubyInstaller that basically installed Ruby along with Heroku's project code onto my machine. But they changed that. The last time I tried, the Heroku CLI was no longer in Ruby. I guess it's in Go or Node.js or something like that. So I'm kind of jealous of the Go language.
I love Ruby. I've been writing Ruby for years. I think we all do love Ruby, right? It's so sweet. We love the language. Out of this jealousy, I'm trying to bring that feature into the Ruby world. So what problem are we trying to solve here?
Before Ruby Packer, how did we distribute our programs? Well, somewhat like this. We have to first, well, assume that the user does not have Ruby installed on their machine.
We have to first let them install the Ruby environment, so probably install RVM and then install Ruby. And those all take a long time to complete. I recorded this video in China, where the connection is even worse than in the US.
And that takes a long time. It's definitely not friendly to the end user. And also finally, after we got the Ruby environment set up on user's machine, we need to let them do some gem install stuff. And there are a lot of dependencies to download.
I think overall, that's not a good experience. And if you consider Windows, it's a disaster. You have to, yeah. So the problem is installation is slow. Tons of files to download. And if in China, the Great Wall somewhat helps.
It even makes the thing worse. And sometimes people forgot to use sudo. And it's error prone, especially with native modules. And you have to care about the Ruby runtime version. Say if you use the lonely operator that was introduced in Ruby 2.3
and the machine was installed with Ruby 2.1, then your program won't run. And also, updating. After you distribute your program, you want to keep it up to date. So how do we do that? We have to do the same gem install thing again.
And the user does not even know that you have a new version, right? And on Windows, you have to install it again. So, more version checks. It's also cumbersome to update, as cumbersome as to install, I guess. So, after I tried this idea of packing your project
into a single executable, it turned out to be much better. Look, you can compile your Ruby project into a single file. This is one project that I compiled. It weighs 40 megabytes.
And you can just execute it. Dot slash your program and it runs. And also, on Windows, the user no longer has to run those installers anymore. You drop an EXE to them, 33 megabytes.
And, well, they cannot double-click it because usually it's a command line interface, so you just bring up your CMD and it runs. And this is the best part. Updating has been made so much easier once you pack it into a single executable.
It feels so nice that I want to show it to you in the demo. So, this is the program that I packed. And when it runs, it will check for a new version of it. And if it finds a new version, it will download a new version and replace itself. Well, the download step happens
in a temporary directory. It downloads the file to some temporary location, and moves that file to this single place to replace itself. The network connection is a little bit slow, so we are probably not doing it. But I've prepared for this, so I have a recording.
It was faster when I recorded it. Yeah, it inflated the thing and replaced it. Look, it resumed the execution and you've now got 0.4.
And it happens on Windows as well. So you can sort of just put a self-check for versions at the point where your program begins. And the updating process is so easy because it's just one file, it just replaces itself. That's the idea.
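The self-update flow described in the talk (check the version, stage the new file in a temporary directory, then move it over the running executable) can be sketched in Ruby roughly as follows. This is only an illustration and not the code of Ruby Packer or its update library: the helper name `self_update`, its parameters, and the choice to pass the downloaded bytes in as a string (so the sketch needs no network access) are all assumptions.

```ruby
require "rubygems"   # for Gem::Version
require "fileutils"
require "tmpdir"

# Hypothetical helper: replace the file at exe_path with new_bytes when
# latest_version is newer than current_version. Returns true if replaced.
def self_update(exe_path, current_version, latest_version, new_bytes)
  # Gem::Version compares dotted versions segment by segment, so 0.10 > 0.9.
  newer = Gem::Version.new(latest_version) > Gem::Version.new(current_version)
  return false unless newer

  Dir.mktmpdir("self-update") do |dir|
    # Stage the download in a temporary location first.
    staged = File.join(dir, File.basename(exe_path))
    File.binwrite(staged, new_bytes)
    File.chmod(0o755, staged)
    # Then move the staged file over the old executable.
    FileUtils.mv(staged, exe_path)
  end
  true
end
```

A program packed this way would call something like this at startup and then re-execute itself; how the real tool resumes execution after the swap is not shown here.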
And the tool that I made will help you produce executables like this. You can get the source code of the tool from this address. It's on my GitHub page. It's called Ruby Packer.
And I even made a homepage for it. It's used to prove the idea that this thing works. Because this tool itself was written in Ruby and this tool is used to compile Ruby projects, it must be able to compile itself. So this is an example of the thing that it compiles.
It actually can generate those files on three platforms, Windows, Mac, and Linux. You can just download it and it works out of the box. So, how to use this tool? Well, there are several scenarios.
The first scenario is, I have this tool installed at this location. If I don't give it any arguments, it will just try to produce a single Ruby executable that is blank. It's just the single Ruby interpreter executable. Here's how it looks after it's finished.
I have some examples here. This is the final product. And if you execute it, it's just a Ruby executable. So you can print one plus one. And it's cool. So I wish Ruby was distributed in this way. Actually, it's not. Currently, Ruby is distributed in source code form.
And there's almost no binary distribution. There is one for Windows, the RubyInstaller. But that distribution contains so many files. The standard library is so huge. What if we can distribute Ruby
in just one single executable? That would be so much better, I think. But you might say that Ruby contains so many things, it's not just one single executable. It contains IRB. Well, IRB is inside this.
I will tell you how it works later. Let's see another scenario where you can use this tool. It's the ordinary scenario where you want to compile your own project. Say if you have a Ruby project, like the one that I got here,
you just give the entrance of that project to this tool, like bin/rubyc, and it will begin compiling. And after that, you will get an executable for your Ruby project. And that's the main scenario it is supposed to be used in.
And since this Ruby conference is totally on Rails, you cannot forget about Rails. So if you have a Rails project, you can use this tool with the entrance set to bin/rails, and it will produce a single executable
for your Rails project. I have already compiled one here, so we can show it. Like this executable, you can see it's 38 megabytes. And I can run it.
And it will say you have to add another command. Let's run it with server. Rails always starts with this one. Then you go to localhost:3000.
Yay, you're on Rails. And the fourth scenario is you can actually pass in a gem to let it compile the gem. And also, there are parameters for you to specify the update URL to check for new versions,
so that the auto-updating part is built in. And that's because I made this tool on top of another library called libautoupdate. That library basically abstracts away the difference between Windows and Linux, and uses the socket functions to communicate with the server to check for new versions.
So what happens on the whole? How do we make this work? The basic idea is we put your project into something like a mounted disk in your memory, along with the Ruby interpreter.
And we mount it at a special location called /__enclose_io_memfs__. You have probably already seen that in the backtrace. That is like a virtual file system in memory. And if you...
Let me do a demo here. I'll put a... We can use the other one, the Ruby interpreter, the blank Ruby interpreter
that was compiled. And we can give it an entrance, and that entrance can even reference a file inside the virtual file system. And that's the reason why we can bring up IRB. And if you look at the load path, that is all in the virtual file system
that is actually in memory, part of the executable. So when you require something, it will actually search inside this virtual memory to give you that. And you can do all kinds of file system operations on it,
and it all just works. Like if I wonder what is on the load path in this virtual memory. There are bin, include, lib and share, so the Ruby standard library is actually embedded inside this super executable. It's inside lib, Ruby.
Well, you know, you used to have this huge library installed on your system, but now it's embedded inside this executable. So all the global level gems are there. This is very similar to what you would install on your machine.
And you can even read a file out of it, like File.read, give it a virtual path in IRB. Look, that's the IRB that we just invoked. So that's the idea. And also, the normal files,
you can also read normal files. Like, if I want to see what is on the user's disk. Oops. And those are read from the outside. So it's a combination of an in-memory file system
and the real file system. The key is if the path starts with this thing, it goes into the executable itself, otherwise it goes outside. That's the idea. And all kinds of file operations are supported on this virtual file system.
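The routing rule just described (a path under the special mount point is served from the embedded image, anything else goes to the real disk) can be modeled in a few lines of Ruby. This is a sketch under stated assumptions, not the tool's implementation, which does this in C below the Ruby level: the constant `MEMFS_PREFIX` and its exact spelling, the helper names, and the use of a plain Hash as a stand-in for the embedded file system are all hypothetical.

```ruby
# Assumed spelling of the mount point; treat it as illustrative.
MEMFS_PREFIX = "/__enclose_io_memfs__"

# True when the path should be served from the embedded file system.
def memfs_path?(path)
  path == MEMFS_PREFIX || path.start_with?(MEMFS_PREFIX + "/")
end

# Hypothetical dispatcher. `memfs` is a Hash of virtual path => contents,
# standing in for the image embedded in the executable.
def read_file(path, memfs)
  if memfs_path?(path)
    # Virtual path: look it up in the in-memory image.
    memfs.fetch(path) { raise Errno::ENOENT, path }
  else
    # Ordinary path: fall through to the real file system.
    File.binread(path)
  end
end
```

Checking for the prefix followed by a slash, rather than the bare prefix, keeps an unrelated directory whose name merely begins with the same characters on the real-disk side.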
Like, I can read this file, and I can get its, it's 11 hours. I can get its metadata at the time that I compiled this. You see, all that information is stored when I compile this project.
So that's the idea. Whatever starts with this path goes to your memory. The others go to your disk. So where is your project? We put your project into a special location under this virtual directory called local.
If you use this tool to compile your project, you will find your project here at this location. So, to wrap it up, we then hard-code that entrance. Because you used to run your project via the Ruby interpreter,
and put your entrance at the argv[1] location. But now, since we have compiled this into one single executable, we just preset argv[1] to your location, so that when the user fetches your final executable, it just runs from this location.
It just works. So how do we do this? There are so many file APIs, we cannot hack them one by one to add this special functionality, right? It's just not maintainable. Ruby is subject to change, 2.5 is coming up. So, we did not change much on the Ruby level.
Yes, we hacked some code on the Ruby level, but in a limited way. Like, Ruby's io.c is the source code for lots of IO operations. We didn't change much, we just included an extra header to intercept some of the system calls.
And most of the logic is actually inside this small library, it's called libsquash. What is libsquash? That is actually the core part of it. Well, I want to take this opportunity to thank a few people,
this library is not solely made by me. Dave Vasilevsky made a library called squashfuse. It is an implementation of the Squash file system on FUSE, and I took lots of code from him and made it into a library.
And also, this is my good buddy, he's called Shengyuan Liu, and he helped me write this library as well. So, this library basically implements SquashFS. How many of you have heard of this file system?
Oh, thank you, very few people. SquashFS is a compressed, read-only file system that was used by Live CD versions of many Linux distributions. Like, I used to play with the Ubuntu Live CD, and the Live CD was actually a SquashFS image that can be mounted by the kernel when you run it,
and it's also used by some router companies for their firmware. So, we were trying to invent this tool to compile your project. We were looking for a data structure to hold it,
and I was looking at different options, and I think this SquashFS data structure fits our needs. And it has been there for a while, it's stable, so we used that, and we tested the effectiveness of it. If your project weighs over a hundred megabytes,
after making it into a SquashFS file, it's been compressed into just like 16 megabytes, and there are tools for it. It's called mksquashfs, it compiles your project. Meanwhile, the final data structure
can be accessed arbitrarily, randomly, so it's a good data structure. The compressing part is important. You don't want the user to have a huge file when you distribute it to them. You cannot let them download something like multi-hundred megabytes, that is unfriendly as well.
It takes a long time to download and update, so we chose one with compression in it. And that has actually been part of the Linux kernel since 2009, but we cannot just use that
because it's GPL licensed, and the code is part of the kernel. You cannot use it on the application level because things like malloc are kmalloc there, they're not at the user level, it's the kernel level. So we cannot use the original implementation, and that's when I found squashfuse.
You know, it's a FUSE implementation, so it's in user space, and it's MIT licensed, so I took his code and made another library called libsquash that removes the FUSE part from it, because we don't wanna make
any assumptions about the user's environment. We do not want to assume that they have FUSE, we don't need that, so it does not depend on FUSE, it's just a library implementing the SquashFS access part of it, and it's MIT licensed because the forked project is also MIT licensed.
So that's actually an important point. I'm not gonna expand on that because time is running out. And also it compiles on three platforms, Windows, Linux, Mac, so that if you use this, your application is super distributable.
And this is the important part, I designed this library to mirror a lot of system calls. So it can just come as an in-place replacement for a lot of system calls, the file system ones. And this is another important design. You know, a lot of system calls do not have a path in them.
I have to distinguish paths that begin with underscore underscore, that special string, from paths that do not. Some system calls, like open, read, write, they actually work on file descriptors. They're just integers.
How do I know whether a file descriptor was opened on a virtual file or a real file? So I have to come up with something like a virtual file descriptor. Like when you do the open call the first time, it has a path in it, and I will distinguish if this is a real path or a virtual path. If it is a virtual path, I issue you
a virtual file descriptor, and that is actually generated by this system call. I dup the zero file descriptor to make it look like a file descriptor, and let it coexist with other FDs issued by the system. But I have it in my record, I have a global data structure
called the global FD table, so that I remember which FDs are issued by me and which are not. So after you get that FD and make some calls after that, like read, then I will check if this is my FD or the system's FD. If it is my FD, then I will read the data for you from the SquashFS.
Otherwise, it is sent to the operating system. So this is how the header, the magic header works. I just intercepted a lot of system calls and redirected them to this library. And we spent a lot of time hacking Windows as well.
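The virtual file descriptor idea can be modeled in Ruby as a small table that remembers which descriptor numbers it issued itself. This is a toy model under stated assumptions, not libsquash's C code: the class name `VirtualFdTable` and its methods are hypothetical, a Hash stands in for the SquashFS image, and where the talk describes duplicating file descriptor 0 to obtain a number, this sketch opens the null device instead, which has the same effect of reserving a genuine descriptor number that cannot collide with ones the operating system hands out.

```ruby
# Toy model of a table of "virtual" file descriptors (hypothetical API).
class VirtualFdTable
  def initialize(memfs)
    @memfs = memfs   # Hash of virtual path => contents (stand-in for SquashFS)
    @issued = {}     # fd number => { io:, data:, pos: } for FDs issued here
  end

  # Returns an integer descriptor for either a virtual or a real path.
  def open(path)
    return IO.sysopen(path) unless @memfs.key?(path) # real file: OS descriptor

    placeholder = File.open(File::NULL) # reserves a real, unique fd number
    @issued[placeholder.fileno] = { io: placeholder, data: @memfs[path], pos: 0 }
    placeholder.fileno
  end

  def virtual?(fd)
    @issued.key?(fd)
  end

  # Reads up to `length` bytes, dispatching on who issued the descriptor.
  def read(fd, length)
    entry = @issued[fd]
    if entry
      chunk = entry[:data].byteslice(entry[:pos], length) || ""
      entry[:pos] += chunk.bytesize
      chunk
    else
      # Not ours: hand the call to the operating system.
      IO.for_fd(fd, autoclose: false).sysread(length)
    end
  end

  def close(fd)
    entry = @issued.delete(fd)
    entry ? entry[:io].close : IO.for_fd(fd).close
  end
end
```

The point of reserving a real descriptor number is that later calls such as read carry only an integer, so the table lookup is the only way to tell the two kinds apart.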
Look how ugly they are. We intercepted those calls as well. So it works on Windows as well. So what about native extensions? Well, we made some compromises on this part. We could make it perfect by recording
all the native extensions at compile time, but that is really hard. For the moment, when you call dlopen, or LoadLibrary on Windows, we actually extract that file to a temporary location, but it works. For now, it's a temporary solution.
We extract it to a temporary file and then do the dlopen on the temporary file, so it still works. And when the process exits, we delete that file, so native extensions work. So what about Rails? Why does Rails work? Rails is special.
It will create some files in the root of your project, but SquashFS is read-only. You cannot write anything to it. Like Rails will write a lot of logs into the root of your project. It will create a lot of temporary files.
So how do we solve that? Actually, we also redirected some of those calls to a temporary location. So when it tries to write files to the SquashFS, we actually redirect that call to a temporary location and remember that file and also delete that temporary file when we exit.
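The write redirection can be pictured as a path rewrite: a write aimed at the read-only project root is mapped to the same relative path under a writable directory. The Ruby below is only a sketch of that mapping with hypothetical names (`redirect_for_write`, `write_file`, and the `project_root` and `workdir` parameters); the real tool intercepts the underlying calls in C, and the cleanup of temporary files at exit is left out here.

```ruby
require "fileutils"

# Hypothetical helper: map a path under the read-only project root to the
# same relative path under workdir. Paths elsewhere are returned unchanged.
def redirect_for_write(path, project_root, workdir)
  prefix = project_root + "/"
  return path unless path.start_with?(prefix)

  File.join(workdir, path[prefix.length..-1])
end

# Hypothetical write wrapper: creates missing directories under the
# redirected location, writes the data, and returns the path actually used.
def write_file(path, data, project_root, workdir)
  target = redirect_for_write(path, project_root, workdir)
  FileUtils.mkdir_p(File.dirname(target))
  File.write(target, data)
  target
end
```

With the writable directory set to the current directory, a Rails log written to the project's log folder would land in a log folder next to the executable, which matches what the demo shows.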
So if you want to see the files that got created, we have a special environment variable for it. Let me create a new folder so that you see, and I move the Rails executable inside of it. And if I run it with the IO workdir variable, and I say this dot is the current workdir,
then some things will appear in there. I'm sorry, it should be, you can put it there. Something already popped up. We can delete that first. And run it again with server.
You will see that the log folder and the tmp folder are just created. Because it's trying to write to the root of your project, but we redirected that call to a temporary location and that temporary location was specified as the current directory. So the log is there, and your config is there.
So if you're really going to distribute your Rails app in this way, this is an opportunity for you to split out the config to make it configurable. So that it distributes with the config folder,
but not your code anymore. That's how it works. So in summary, this is your project. And we use mksquashfs to compile that to yours.squashfs. And we compile the two libraries, libsquash as the runtime access to that file,
and libautoupdate to give you an autoupdate feature. And we compile the Ruby runtime. Then, as last time, we compile the runtime. So it's sort of like the PPAP process, right? So it's a... I have a project. I have a libsquash.
Your project libsquash. I have a pen. I have pineapple. Apple pen. Pineapple pen. Your project libsquash with autoupdate with runtime.
Your project libsquash with runtime. We statically link them together. It becomes your.exe, and you distribute and enjoy your executable. Thank you very much.