Fuzzing Device Models in Rust: Common Pitfalls
Formal Metadata
Title: Fuzzing Device Models in Rust: Common Pitfalls
Title of Series: FOSDEM 2023
Number of Parts: 542
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/61781 (DOI)
Language: English
FOSDEM 2023, 497 / 542
Transcript: English (auto-generated)
00:06
Hello everyone, my name is Andreea and today I'm going to talk about fuzzing in rust-vmm. This presentation is not about fuzzing itself, but rather about how we failed at it. So before I start with the big list of pitfalls, I will quickly go through fuzzing itself.
00:27
I hope some of you already know about it, because I don't have a lot of time. So fuzzing is basically an automated testing technique: the idea is to just send random inputs to a program to see how it behaves in that case.
00:44
And how it works is that you typically use a tool called a fuzzer that is going to generate random input for you. Then you call some functions with that random input, and the fuzzer is going to record the findings.
01:03
And if it finds any interesting input files, it's going to write them to the corpus. Findings in this case can be crashes, can be hangs, but can also be timeouts. When you first do fuzzing, you typically start from an empty corpus.
01:21
But as you run fuzzing, you're going to generate some interesting inputs, which is helpful because in the next runs you can just reuse those inputs instead of starting from scratch. This helps with finding interesting things faster. So in rust-vmm, we implemented fuzzing for vm-virtio.
01:41
We have three fuzz targets: one for the virtio queue, one for the serialization of the virtio queue, and one for the virtio vsock packet. In the rust-vmm project we only have an implementation for the vsock packet, so that's what we fuzzed. During fuzzing we discovered three crashes, and only one of them is triggerable by a potentially malicious driver.
02:05
And what we have right now is that we are able to run fuzzing for the pull requests that are submitted to the vm-virtio repository in rust-vmm. The fuzzing is implemented using libFuzzer. And besides the fuzzing that is happening in rust-vmm itself, the folks from Cloud Hypervisor are also running fuzzing.
02:28
And they also discovered a timeout there. So this actually brings me to our first pitfall. So what is it... it should actually be... yeah, okay, let me fix it afterwards.
02:45
It's a typo, and that is on me. The first pitfall is that you actually have to use a timeout that is appropriate for what you are fuzzing.
03:02
Because the default, for example, for the fuzzer that we are using is actually 20 minutes. And since we are just working with virtio queues and packets, there is nothing that could possibly take 20 minutes to process. So we adjusted the timeout to 60 seconds in our case, and this was something that was recommended by the folks from Cloud Hypervisor.
03:28
Now, how we run fuzzing in rust-vmm is at the library level. The advantage of this is that it's easy to set up, and being easy to set up is really important.
03:40
It is a good thing. People are like, oh, but you're running fuzzing at the library level, so you don't have the kernel, it's so easy, so simple. And it's like, yeah, it's great, easy is a good thing. It's also good because you can run it on almost any host: you just have to have the fuzzer installed and the repository cloned, and then you just run fuzzing.
04:04
And it all runs in user space. There are also disadvantages, of course. The first one is that you cannot cover your whole setup, so that means there are going to be some things that are not fuzzed.
04:20
And then, because you are fuzzing in user space, you may need to mock the driver, and this tends to be a bit complicated. You can also find false positives. With the false positives, the idea is that you will find crashes that otherwise would not be triggered by a driver,
04:40
because maybe you have some other check in place. I would say that it's still important to fix these as well, because you never know how you're going to change your code and whether it might end up actually triggering those issues in the future. And for the mocking of the driver, how it works, and this is very simplified here,
05:04
is that the driver is writing something in memory, and then the device reads what the driver wrote in memory and does stuff with the data. The part that we want to fuzz in rust-vmm, and the part that is implemented in rust-vmm, is this device side.
05:21
And then what we need to mock is actually the driver side of the communication. In rust-vmm, what we did is that we started this mocking of the driver from the beginning. We needed it anyway to run some unit tests, and we needed it for other kinds of testing as well, so we had an initial mock interface from the beginning.
05:40
And when we wanted to do fuzzing, we just evolved the mock driver to support that as well. Okay, so at a high level, how it happens right now in rust-vmm is that we parse the random bytes and we initialize the mock driver with the data that was parsed from the fuzzer input.
06:01
So at a high level, we end up with some descriptors and some queue functions that have random input they need to process. Then we create the queue and we call those functions with the random input, roughly as in the sketch below.
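To make that flow concrete, here is a minimal sketch of what such a libFuzzer target can look like; the MockDriver type, its methods, and the byte layout are made up for illustration and do not match the actual vm-virtio fuzz targets.

```rust
// A minimal, self-contained sketch (not the actual vm-virtio fuzz target).
// Assumed dependency: libfuzzer-sys = "0.4".
#![no_main]
use libfuzzer_sys::fuzz_target;

// Hypothetical stand-in for the mock driver from the test utilities.
struct MockDriver {
    // (guest address, length) pairs "written by the driver".
    descriptors: Vec<(u64, u32)>,
}

impl MockDriver {
    // Deterministically interpret the raw fuzzer bytes as descriptors.
    fn new(data: &[u8]) -> Self {
        let descriptors = data
            .chunks_exact(12)
            .map(|c| {
                let addr = u64::from_le_bytes(c[0..8].try_into().unwrap());
                let len = u32::from_le_bytes(c[8..12].try_into().unwrap());
                (addr, len)
            })
            .collect();
        MockDriver { descriptors }
    }

    // Stand-in for "create the queue and call its functions with random input".
    fn exercise_queue(&self) {
        for (addr, len) in &self.descriptors {
            // The real target would pop descriptor chains here and process them.
            let _ = addr.checked_add(u64::from(*len));
        }
    }
}

fuzz_target!(|data: &[u8]| {
    MockDriver::new(data).exercise_queue();
});
```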
06:26
And yeah, the second big pitfall is that if you try to start fuzzing only when the project is already mature, it's going to be a bit difficult; you might find it very complicated to retrofit. Now, I know that it's not necessarily viable to start fuzzing when you start a project.
06:44
But what you can do instead is that you can keep fuzzing in the back of your head. And then when you create some mock objects or some unit tests, you can think about how you can actually use them in fuzzing as well.
07:01
Which is what we did, but not very well. One of the crashes that we found was that the mock driver itself was crashing on invalid input, so we had to adapt it to return errors. Even though it is just test code, it couldn't simply panic on invalid input anymore.
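A minimal sketch of that kind of change, with made-up names: the mock code validates the driver-controlled value and returns an error instead of panicking, so the fuzz target can simply ignore it.

```rust
// A hypothetical sketch of the change; names are made up.
#[derive(Debug)]
enum MockError {
    InvalidDescriptorLen(u32),
}

// Before: the mock driver panicked on driver-controlled values it didn't expect,
// which shows up as a (false positive) crash during fuzzing.
fn add_descriptor_panicking(len: u32) {
    assert!(len <= 0x1000, "descriptor too large: {}", len);
}

// After: the mock driver returns an error that the caller (a unit test or a
// fuzz target) can handle or simply ignore.
fn add_descriptor(len: u32) -> Result<(), MockError> {
    if len > 0x1000 {
        return Err(MockError::InvalidDescriptorLen(len));
    }
    Ok(())
}

fn main() {
    let _ = add_descriptor(u32::MAX); // no panic, the fuzz target just moves on
    let _ = add_descriptor_panicking; // the old variant would abort the fuzzer
}
```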
07:24
So the idea is to return errors at the level where you want to be fuzzing, errors that can then be handled at the higher levels, so the fuzzing can keep going. And now for structure-aware fuzzing.
07:43
So without structure-aware fuzzing, how it works is that the fuzzer is going to generate some random bytes and then you have to interpret those as the inputs that your library needs. With structure-aware fuzzing it's really nice, because there are some tools that are basically going to
08:02
interpret the random bytes directly as the structures that you actually need. So it's super nice, and it significantly reduces the code that you need to write. In Rust, the crate that does this is called arbitrary, and that is what we used initially. Now we had to change it, unfortunately.
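As a rough illustration of the structure-aware approach (the input type and its fields are hypothetical, not the real vm-virtio ones):

```rust
// A minimal sketch of structure-aware fuzzing (hypothetical input type).
// Assumed dependencies: libfuzzer-sys = "0.4",
// arbitrary = { version = "1", features = ["derive"] }.
#![no_main]
use arbitrary::Arbitrary;
use libfuzzer_sys::fuzz_target;

// The raw fuzzer bytes get turned into this structure automatically.
#[derive(Debug, Arbitrary)]
struct QueueFuzzInput {
    queue_size: u16,
    descriptors: Vec<(u64, u32, u16)>, // (addr, len, flags)
}

fuzz_target!(|input: QueueFuzzInput| {
    // Build the queue from `input` and exercise it here.
    let _ = (input.queue_size, input.descriptors.len());
});
```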
08:22
But before we did that, we had only around 270 lines of code, and now we have around 740 lines of code for the fuzzers. Unfortunately, it came with some problems, and that's why we had to actually replace it. The most important one is that it's not reproducible.
08:42
So you can't reuse the corpus that you had in previous runs. Which was a big problem for us. Because basically what happens is that arbitrary is introducing some randomness.
09:03
And that basically means that you cannot reuse the corpus from previous runs. The big problem here is that we just went ahead with incremental improvements to the fuzzer,
09:21
and we didn't check that what we wanted to implement could actually be implemented with arbitrary. A better approach would have been to first check that we could reuse the corpus that we generated. Okay, and now about when fuzzing actually fails.
09:43
So we had a PR in rust-vmm. At this point we were already running fuzzing for pull requests, and there was a PR that was actually introducing an overflow. The overflow here is that adding the packet header size to the packet length can overflow, because the packet length is set up by the driver.
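A hypothetical sketch of this class of bug and the usual fix with checked arithmetic; the names and the header size constant are illustrative only, not the exact vm-virtio code.

```rust
// Illustrative only: driver-controlled length added to a fixed header size.
const PKT_HEADER_SIZE: u32 = 44;

// Buggy: `pkt_len` is driver-controlled, so the addition can overflow
// (panics in debug builds, silently wraps in release builds).
fn total_size_buggy(pkt_len: u32) -> u32 {
    PKT_HEADER_SIZE + pkt_len
}

// Fixed: checked arithmetic lets the device reject the packet instead.
fn total_size_checked(pkt_len: u32) -> Option<u32> {
    PKT_HEADER_SIZE.checked_add(pkt_len)
}

fn main() {
    assert!(total_size_checked(u32::MAX).is_none());
    let _ = total_size_buggy; // calling this with u32::MAX would overflow
}
```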
10:05
This bug I actually found during code review, which was a bit unexpected, because I was hoping that the fuzzer was going to find it, and that was not the case. So after some time I realized that running fuzzing for just 15 minutes might not actually be enough,
10:29
because this bug could be triggered with fuzzing, but only after running it for 40 minutes instead. So how we fixed that is that we added a fuzzing session that is optional and that runs for 24 hours.
10:41
This one is to be started manually by the rust-vmm maintainers, and it should only be started when there are pull requests that actually touch the fuzzed implementation. This is because we are also consuming a lot of resources when doing fuzzing,
11:02
and also you don't want to block all the pull requests for 24 hours. Typically the rust-vmm CI takes only a short time to execute, so blocking it for one day might not be reasonable for all pull requests. So the pitfall here was not fuzzing for long enough,
11:22
and instead we had to find a way to not block pull requests, but at the same time provide more fuzzing time. Now, coverage for Rust. In Rust you can actually get coverage information by running the LLVM coverage tooling on top, and you only get line coverage.
11:45
So basically this was the starting point of the presentation. I was thinking I was going to come here and show you how great it is to run fuzzing for 15 minutes, then for more minutes, then with a corpus, and all these really extravagant things.
12:01
And so we ended up with: fuzzing for 15 minutes gives some missed regions and a coverage of around 82%. So I was like, well, that's okay, that's good. So then let's just run with some minimal corpus as well. This is a corpus that we generated from unit tests.
12:20
Let's just feed it to the fuzzer and see how this changes. There was no change actually. So I was like okay, not bad, not bad. Let's just run for two weeks. So what do you think is going to happen now? So actually, sorry.
12:44
So at this point I was like I have to change my presentation. So it's not what I expected. But instead I learned something. So you can't actually use coverage to decide when to stop fuzzing. So instead what you can do is that you can use coverage information to see what parts of your code are not actually covered.
13:10
And yeah, well that's about it actually. This is a summary of the pitfalls that we ran into. And I think now we have time for questions.
13:24
Did you look at like how the fuzzer works and then like what areas were not covered and try to figure out why it wasn't covered in those areas? Yeah, so the question was if we looked at how the fuzzer works and what areas were not covered. Yes, we did and I have a slide for that.
13:41
Thanks for the question. Okay, so actually I have two slides for that. There were some functions that we were not calling on purpose. Because in virtio-queue, for example, we have some functions that are just iterating over descriptor chains and then doing something with the data.
14:03
And at the virtio-queue level you can't do anything with the data, so it's like, okay, this needs to be fuzzed at a higher level, at the device implementation level, so we're not going to call these functions. Which is a bit hilarious, because that's where Cloud Hypervisor actually found the timeout problem.
14:23
Which we were not able to reproduce with virtio-queue, but still. And we actually added this one function that we thought shouldn't be called during fuzzing, and then I reran the fuzzing, and yeah, it's a bit better, but it's still not great. And then I looked into what, well, actually you can't see it very well.
14:46
Yeah, so I looked into what is actually not covered, and you can't really see it there, so you have to trust me: these are actually errors.
15:01
So, the printing of errors to files. Since in the fuzzer we're not actually initializing the logger, these paths cannot be triggered by fuzzing. So there is a lot of error printing to a file that is not exercised during fuzzing.
15:22
Yeah? What steps have you taken to actually make sure the coverage covers everything, given that it's clearly only covering certain areas? I didn't understand the question. Which areas... what steps have you taken for the errors that weren't covered in the fuzzing?
15:42
Are they covered in future fuzzing or unit tests? The question was what measures we are taking in order to make sure that the code that was covered before is going to still get covered in the next iterations. None?
16:02
So right now we're not doing anything. This whole coverage thing is just something that I did for the presentation, and it's not automated in any way. This is actually a good point for future improvements, to make sure that we keep covering the code. What it would help with as well is that we could make sure that new functions that we are adding to the code
16:25
are also covered by fuzzing. So it's a great point, thanks. Yeah? You were talking about the structure-aware fuzzing, and you mentioned that you cannot reuse the corpus. Can you explain a bit more about that?
16:41
Yeah. Okay, so the question was about structure-aware fuzzing and the fact that you cannot reuse the corpus. Let me see if I actually have it here. Okay, so the idea is that what we were using, which is arbitrary,
17:02
when it was taking the input from the fuzzer, it was also adding some randomness to it. So because it was random, basically every time it was writing the corpus, the file was getting some randomness introduced into it. So when the same input gets read again, it doesn't behave the same.
17:24
So where does the randomness come from? Where does the randomness come from? This is just how arbitrary decided to implement it. There is actually an issue in arbitrary, so they are aware of the problem; it just doesn't seem like they are fixing it, for some reason.
17:42
So what we ended up doing is some custom serialization instead, with a crate that is also very well known in the Rust ecosystem. It's not much more difficult, and it is reproducible.
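A sketch of what that kind of custom, deterministic serialization can look like, assuming serde and bincode; the input type is made up and this is not the exact rust-vmm implementation.

```rust
// Illustration only (not the exact rust-vmm code); assumed dependencies:
// libfuzzer-sys = "0.4", bincode = "1", serde = { version = "1", features = ["derive"] }.
#![no_main]
use libfuzzer_sys::fuzz_target;
use serde::{Deserialize, Serialize};

// Decoding this with bincode is deterministic, so the same corpus file
// always replays to exactly the same input.
#[derive(Debug, Serialize, Deserialize)]
struct FuzzInput {
    queue_size: u16,
    descriptors: Vec<(u64, u32, u16)>,
}

fuzz_target!(|data: &[u8]| {
    // Inputs that don't decode are simply skipped.
    if let Ok(input) = bincode::deserialize::<FuzzInput>(data) {
        // Exercise the queue with `input` here.
        let _ = input;
    }
});
```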
18:03
Yeah. When you discover a bug with this fuzzing, does it get transformed into a unit test afterwards? The question is, when we discover a bug, does it transform into a unit test? Yeah, the way that we fix this kind of problem is that we always add a regression test for it, just to make sure that it doesn't regress.
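For illustration, a hypothetical regression test derived from a fuzz finding, reusing the made-up overflow example from earlier:

```rust
// A hypothetical regression test; names do not match the real code.
const PKT_HEADER_SIZE: u32 = 44;

fn total_size_checked(pkt_len: u32) -> Option<u32> {
    PKT_HEADER_SIZE.checked_add(pkt_len)
}

#[test]
fn regression_pkt_len_overflow() {
    // Replay the kind of input that used to crash: a driver-controlled
    // length close to u32::MAX, and assert it is now rejected cleanly.
    assert!(total_size_checked(u32::MAX).is_none());
}
```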
18:23
There was another question. I was wondering about the computational requirements: how many cores are you using? How many cores we are using? Did you pass it? Yeah. Okay, so when we ran this for two weeks, we actually used 96 cores.
18:46
For the CI runs, I'm not sure, I don't know exactly how many; I think it's one, I don't know. But for this we've been running on 96 cores. There was another one.
19:01
One minute. What kind of test case does it try to shrink it to, by replacing smaller steps, breaking up a very long step? Oh, this is... you know what, let's leave it for afterwards. Thanks.
19:21
Thanks.