Optimizing sandbox creation with a FUSE file system
Formal Metadata
Number of Parts | 490
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/46900 (DOI)
FOSDEM 2020, 155 / 490
Transcript: English (auto-generated)
00:05
We have Julio Merino here. He's gonna talk about optimizing sandbox creation with a FUSE file system. Is that right? All right, give him a warm welcome. Thank you.
00:20
Is the timer working? Okay, great. Hello, everyone. As I've been introduced, I'm Julio Merino. I work for Google. I'm on the Bazel team. It's my first time at FOSDEM, so I'm giving a talk now about Bazel. So today, I wanna talk to you about Bazel and specifically about sandboxing and how we've been trying to optimize it to be faster by using a FUSE file system.
00:42
We have 15 minutes, as you know. I'll fill them up with my talk. If you have questions, we have to wait until later, but you can find me. I'll let you know how. Okay, so before we get into sandboxing, I want to recap a little bit what Bazel is. Even here, either last year or the year before, our team had a booth, and I know it was very popular,
01:03
but this year, we don't have one, so if you don't know what Bazel is, just go to the website at the bottom, bazel.build. Basically, it's Google's build system, and Bazel itself is the external version of it, the open source version, which essentially lets you build and test any kind of project that you have, right?
01:21
And it's specialized in integrating trees, source trees that have many different languages, and the goal is to build anything very quickly and reliably. And by reliably, I mean you want your builds to be deterministic, so if you build the same thing twice in a row, they should give you the same results. And that's actually where sandboxing comes into play.
01:41
But before we get into sandboxing, we have to tell you a little bit about how Bazel actually models things. So the basic concept we have to understand for this talk is Bazel actions. A Bazel action is essentially a command invocation. Like if you're familiar with any other build tool,
02:01
like for example, Make, any command that Make runs essentially becomes an action in Bazel. And Bazel represents this in memory with a data structure called Action, of course. And the action contains a command line. In this case, we have an example for a cc compile that takes a source file and generates an object file. And as part of the action, we register in memory
02:20
what the inputs of that action are and what the outputs that we expect from it will be. Now you will notice that the inputs here contain the compiler itself. I've simplified it by just listing the binary, but of course that includes any libraries that we may depend on, et cetera. But the important thing is to see that we have the source file, parser.c, as well as any includes that the source file
02:41
might have inside, right? In this case, the parser.c file has an include of parser.h. So that becomes part of the inputs of the action. And then when we run this command, we expect that the compiler will generate just one single .o file in the same directory where we run the command. Now this is great. It works.
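As a rough illustration of the in-memory structure just described, an action could be modeled as below. This is a minimal sketch, not Bazel's actual class: the `Action` name, its field names, the `/usr/bin/cc` path and the exact command line are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical model of a build action: a command plus declared files."""

    argv: tuple         # command line to run
    inputs: frozenset   # every file the command is allowed to read
    outputs: frozenset  # every file the command is expected to write


# Example mirroring the talk: compile parser.c into parser.o.
# "/usr/bin/cc" stands in for the compiler and the libraries it depends on.
compile_parser = Action(
    argv=("cc", "-I.", "-c", "parser.c", "-o", "parser.o"),
    inputs=frozenset({"/usr/bin/cc", "parser.c", "parser.h"}),
    outputs=frozenset({"parser.o"}),
)
```

Anything that is not listed in `inputs`, such as another header sitting in the same directory, is invisible to the build tool's change tracking.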
03:00
But now the problem is, look at that -I. there. Right, that was our in-memory structure; when we put the file system into play, the file system has more things. And in this case, in the same directory, besides the parser file and its header, we have another header called lexer.h, right?
03:20
There is nothing preventing the compiler from reading that file, right? If your parser.c source file contains an inclusion of these other header files directly or indirectly, and you haven't told Bazel about this, right? It's not part of the memory data structure, then things will not work eventually because if that header that you didn't know about
03:40
changes, then Bazel doesn't know that it has to rebuild this action and then your build will not be correct in the end. So we want to prevent these kinds of things. And the way we do this is with sandboxing. Now with sandboxing, we have two things that we have to take into account. And the first thing is actually isolating the process. So when we run the compiler, it can only do
04:01
the things that we think it should do, right? So here we have our process. Now it's a more clever version of cc and we put it inside a sandbox. Now the sandbox will prevent things like, you know, it happens that this compiler wants to check the host name of the machine. Or for some reason, it wants to access the internet.
04:21
Or it wants to access a file that we didn't declare in our inputs, right? So the sandbox will block all those accesses or mock the result and make sure that the process behaves only in the way that we thought it should. On Linux, we implement this today with user namespaces. And on macOS, we use this deprecated tool called sandbox-exec. And you have there a couple of links at the bottom
04:41
that explain all of this in more detail. Now, okay, that's how we actually prevent the process from doing things, but I'm not gonna touch any more of this in this talk. What we want to look into is how we actually prepare the file system for this to work. Right, because we have to run this command somewhere. And the way we do this is we create
05:02
kind of a chroot environment for the command. So essentially we have the same data structure as before, but now when we want to run this command, right, the cc binary, instead of running it in the source tree, we create a separate sandbox directory that contains only the things that we want the compiler to have access to and see.
05:22
And then we create the sandbox before we run the action. We run whatever it is inside there. We don't use chroot, but essentially it's the same idea, right, we just execute the command in a directory. And then we extract the outputs that we generated in that directory and put them back where they belong. In this example, you can see that they go into the workspace;
05:41
that's not exactly true, they go in a different location, but you get the idea. Now, the problem is that the sandbox directory today is created with symlinks: all of these things in between are symlinks that point back to the workspace, or wherever the outputs are.
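The symlink-based flow can be sketched roughly as follows. This is an illustration under assumptions, not Bazel's code: the function names are made up, and outputs are moved straight back into the workspace for simplicity even though, as noted, the real location differs.

```python
import os
import shutil
import tempfile


def create_symlink_sandbox(workspace, inputs):
    """Create a fresh directory holding one symlink per declared input."""
    sandbox = tempfile.mkdtemp(prefix="sandbox-")
    for rel_path in inputs:
        link = os.path.join(sandbox, rel_path)
        os.makedirs(os.path.dirname(link), exist_ok=True)
        # One symlink system call per input file.
        os.symlink(os.path.join(workspace, rel_path), link)
    return sandbox


def collect_outputs(sandbox, workspace, outputs):
    """Move the expected outputs out of the sandbox, then delete it."""
    for rel_path in outputs:
        dest = os.path.join(workspace, rel_path)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(os.path.join(sandbox, rel_path), dest)
    shutil.rmtree(sandbox)
```

An undeclared file such as lexer.h simply does not exist inside the sandbox directory, so the compiler cannot pick it up by accident. Because the entries are symlinks, though, a write through one of them would still land on the original file in the workspace.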
06:00
So when I put the read only question mark, it's like we would like those files to be read only, but with symlinks we cannot do that. Anyway, the main problem here that we have, and the performance issue that I want to talk to you about, is that, in big builds, actions tend to have thousands of inputs, right? So then this process of creating the sandbox
06:21
for every action becomes extremely costly. We have to do one symlink system call for every input, and when you have, you know, any kind of perturbation in the timing for the action, and the action's in the critical path, any increase in time there will result in a big impact on the whole build time.
06:41
So we want to minimize that. So the idea here is we will use a FUSE file system to actually replace all those system calls with just one kind of RPC, right? We introduce a process in between Bazel and the file system that's called sandboxfs, and this process receives calls from Bazel that tell it what to do.
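To make the idea concrete, here is a toy in-memory model of such a daemon. It is only a sketch under assumptions: the class name, the method names and the mapping format are hypothetical and are not sandboxfs's real protocol, and nothing here talks to FUSE. The point is that one call registers all the files of a sandbox, and later calls can add more sandboxes without remounting anything.

```python
class SandboxTable:
    """Toy model of a daemon that maps sandbox paths to real files."""

    def __init__(self):
        self._sandboxes = {}

    def create_sandbox(self, sandbox_id, mappings):
        """Register a whole sandbox in one call.

        `mappings` is a list of (path_in_sandbox, real_path, writable).
        """
        self._sandboxes[sandbox_id] = {
            path: (real_path, writable)
            for path, real_path, writable in mappings
        }

    def destroy_sandbox(self, sandbox_id):
        """Forget a sandbox once its action has finished."""
        del self._sandboxes[sandbox_id]

    def resolve(self, sandbox_id, path):
        """Return (real_path, writable), or None if the path is not exposed."""
        return self._sandboxes[sandbox_id].get(path)


# Hypothetical usage: a writable root plus read-only sources.
table = SandboxTable()
table.create_sandbox("action-1", [
    ("/", "/tmp/out/action-1", True),
    ("/parser.c", "/src/parser.c", False),
    ("/parser.h", "/src/parser.h", False),
])
# A second action is just another call; the first sandbox is untouched.
table.create_sandbox("action-2", [("/", "/tmp/out/action-2", True)])
```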
07:02
So in this case we want to run the same action as before, right, with different files, I don't know why, but we have an RPC called create sandbox that says please create a sandbox for action one, and I want to put these files in, and I actually want the root directory of this action to point to a specific location that will be writable, and the source files have to be put inside
07:21
in read-only mode. These are not symlinks, these are just actual real files that will be put in the directory. So then sandboxfs comes in, does some in-memory operations only based on this data, and exposes you those files in the file system. So we can then, with that, go into there and run the command. If you're familiar with Linux or Unix or whatever,
07:44
you may find this very similar to mount --bind on Linux or null file systems (nullfs) on BSD, and that is essentially the same thing, right? What we have implemented lets you do a bind mount for multiple things
08:00
into the same location, not with just one source. The other main thing that this does, which is different than bind mounts, is that we can have a second action coming in, and we'll have to do the same process, but look there, we didn't have to remount the file system to apply those changes, right? We just sent another RPC to this daemon that's running,
08:21
and it just added more files into the sandbox with different permissions, different paths, whatever, and we didn't have to remount it. So the performance there can be better. So it can be, it's not yet, but we'll get into that now. So how does this behave? Well, so I ran some measurements a year and a half ago on macOS, and this is mostly about macOS
08:42
because that's where we had the most performance problems; on Linux, this is pretty good. And we got these numbers. I don't have more recent numbers because I've been having trouble getting things to work properly with more modern builds, but this is just a proof of concept for now. So here we just have three different builds. Two of them are building Bazel with itself,
09:01
and another one is building one of our pretty large iOS apps, and then you have the times for the total build time when we run it without sandboxing. Now when we enable the symlink sandbox, the original one, we got these timings. So for Bazel itself, we see a tiny increase, which is expected because any kind of sandboxing
09:22
will have some overhead, but then for the iOS app, the increase is massive, that's just not acceptable. We cannot have this cost. When you want to do interactive development, you cannot have this cost because people will not enable this feature, and actually they'll be disabling it because of this. So with sandboxfs, we've got, for the moment, to this.
09:41
The overheads for Bazel itself remain, but for iOS apps, which happen to have actions that are gigantic, they have many, many, many inputs, we've gotten the cost to only 50% for now. I'm sure that can be cut down much more, but at the moment, that's what we get. All right.
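For reference, a cost figure like this can be read as relative overhead, assuming it means extra build time compared with the unsandboxed build. The helper below is a small sketch with made-up timings; they are not the measurements shown in the talk.

```python
def sandbox_overhead(baseline_seconds, sandboxed_seconds):
    """Relative extra build time caused by sandboxing (0.5 means 50%)."""
    return (sandboxed_seconds - baseline_seconds) / baseline_seconds


# Hypothetical timings: 100 s without sandboxing, 150 s with it.
print(sandbox_overhead(100.0, 150.0))  # prints 0.5
```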
10:00
So something that's fun that went on in this project is that it was originally written in Go, and I went through a rewrite in Rust just because, basically. But I want to tell you a little bit of what we found there or what I found. So first, we started with Go. I had an intern that came to write this project, and he did a very good job, and it was working by the end of his internship.
10:22
We found that VS Code, for example, has very good support for Go. It was very nice, because he didn't know Go, so just having code completion was very useful to get into the language. But then at some point, we hit some scalability issues. The Go runtime didn't like the way the FUSE libraries for Go work, and it was not behaving properly.
10:43
We were hitting some very significant performance issues. And the code started to become pretty hard to maintain. That's my own critique of Go, I would say. There is no accepted way of adding annotations in your code, like assertions or, I don't know, thread annotations.
11:01
So things, you know, had a lot of comments saying how the code was supposed to behave, but the compiler cannot enforce anything. So at some point, I just wanted to learn Rust, and this is my side project, so I said let's learn Rust by rewriting this thing. So I did that. The rewrite was very difficult. I mean, learning Rust and getting up to speed with it
11:21
is hard, but I think it pays off. Something specifically that I found is that VS Code has lots of support, and I kind of like VS Code for the reason I mentioned before, to learn the language, but for Rust, it's very slow. Compile times are slow, as you may know, and they get in the way, even for tiny edits. To get the red squiggles under the code,
11:40
it takes a while, so that's annoying. On the other hand, the code that we have today is much more sane. I feel much more confident that it's doing the right thing, where in the past, I had to look at it and maybe trust it. But more interestingly, and the thing that kind of shocked me in the process is that, so as part of the rewrite, I was trying to copy the same logic that we had
12:00
from Go into Rust to avoid having to change anything, make sure that everything remained the same, but in doing this, Rust didn't let me write those same ideas in the same way. The compiler just refused that kind of code, and it turned out that the old code had many threading bugs that were not visible in Go or actually running the tests that we have,
12:20
but the Rust compiler would just catch them and not let me write that kind of buggy code. I kind of wrote down my experiences with the rewrite in that post there. You can take a look at it later. And some common things about this process: I would just mention that pprof, for example, is a profiling tool, also from Google.
12:40
It integrates extremely well with Go. It's super easy to use. It was very useful in finding the performance issues. It works also for Rust binaries with some more effort. It's also very useful in that case. The main problem is that the FUSE bindings for both Go and Rust are not first class, right? FUSE is a C project, and the FUSE bindings that have been written for these other languages
13:01
are kind of like written from scratch. I wouldn't say they are very actively maintained. They are missing some features. You file bugs about performance, and they don't get fixed. So that's a very big problem for where we are at. I don't know what the solution is, really, except, yeah, we'll see.
13:20
Other things that I would like to do here, and I would say this is an open source project. Wink, wink. I would like help if anyone is interested. Basically, the main problem today is that, as I said, we have a 50% cost in performance, but I'm pretty sure it can be brought down. And one of the problems today is that the protocol that we use to send data between Bazel and sandboxfs is pretty inefficient. It's very chatty.
13:41
It sends very big messages. We could just make that smaller. Another thing I would like to do personally is I have this other tool called pkg_comp, or Package Compiler, which builds any kind of software from pkgsrc, which is the NetBSD packaging system, in a sandbox. And in the past, I used bind mounts, and it was very complicated to get them to work on macOS, and blah, blah, blah.
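One possible way to shrink such messages, shown purely as an illustration, is to send each repeated directory name once and refer to it by index. This is an assumption about how the protocol could be made smaller, not a description of what sandboxfs does; the function names and message layout are invented.

```python
import json
import os


def compact_mappings(mappings):
    """Replace repeated directory names with indexes into a shared table.

    `mappings` is a list of (path_in_sandbox, real_path) pairs.
    """
    prefixes = []
    index_of = {}
    entries = []
    for path, real_path in mappings:
        directory, name = os.path.split(real_path)
        if directory not in index_of:
            index_of[directory] = len(prefixes)
            prefixes.append(directory)
        entries.append([path, index_of[directory], name])
    return {"prefixes": prefixes, "entries": entries}


def expand_mappings(message):
    """Inverse of compact_mappings."""
    return [
        (path, os.path.join(message["prefixes"][i], name))
        for path, i, name in message["entries"]
    ]


# Hypothetical action with many inputs that share one long directory.
mappings = [
    (f"h{i}.h", f"/very/long/workspace/dir/h{i}.h") for i in range(100)
]
message = compact_mappings(mappings)
print(len(json.dumps(mappings)), len(json.dumps(message)))
```

The saving grows with the number of inputs that share a directory, which is the situation described for large actions; for an action with only a couple of inputs the table itself can cost more than it saves.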
14:01
So actually, the original idea of sandboxfs came from this project. I wanted to do sandboxfs for this project, but then I never had the time, and I was just lucky enough to kind of sell it as, you know, we would use it for Bazel instead. So then I could do it at work as a 20% project. So that was good. So I would like to integrate it there. And other things we could look into
14:21
is like Microsoft has come up with their own way of sandboxing. They call it BuildXL. And instead of enforcing things, they actually let the code run as it is, and they check what the code did. Like, they audit, basically. They don't prevent. That's very interesting. It can offer much better performance, but we have to look into it. Maybe we can borrow the same ideas.
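The audit idea can be sketched as a check that runs after the action finishes. This is a minimal illustration and not BuildXL's API: the function name is invented, and how the observed accesses get recorded is left out entirely.

```python
def audit_accesses(declared_inputs, declared_outputs,
                   observed_reads, observed_writes):
    """Report file accesses that the action never declared."""
    return {
        "undeclared_reads": sorted(
            set(observed_reads) - set(declared_inputs)
        ),
        "undeclared_writes": sorted(
            set(observed_writes) - set(declared_outputs)
        ),
    }


# Hypothetical run: the compiler also read lexer.h, which was not declared.
report = audit_accesses(
    declared_inputs={"parser.c", "parser.h"},
    declared_outputs={"parser.o"},
    observed_reads=["parser.c", "parser.h", "lexer.h"],
    observed_writes=["parser.o"],
)
```

Here nothing is blocked while the command runs; the build can fail or warn afterwards if either list in the report is non-empty.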
14:40
And finally, you should know that FUSE for Mac is kind of not open source anymore, and kernel extensions on Mac are going away at some point. So these two things are problematic for using FUSE on Mac, and it's unclear what's gonna happen. With that, I'll just give you a couple of links.
15:00
Here you can go to the Bazel web page, the sandboxfs project page, or you can just contact me below. If you have any questions, I'll be here today and tomorrow. You can find me or not. It's very difficult to find someone. Just ping me if you want on Twitter, and then we can meet anywhere. With that, I'm done. Thank you. Okay, thank you for your talk.