Rustifying the Virtual Machine Introspection ecosystem
Formal Metadata
Number of Parts | 490
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/47423 (DOI)
Transcript: English (auto-generated)
00:05
Well, hi everyone, thank you for being here today. My name is Mathieu Tarral, I'm here with Dorian Eikenberg, and our talk is Rustifying the VM Introspection Ecosystem. So I know you are all tired from FOSDEM, because it's afternoon and it's Sunday already,
00:20
but I have a quick poll to do. Raise your hand if you've already heard about VM introspection. That's interesting, okay, okay. Sorry? VM introspection in general, the technology. Today, in this talk, we're going to talk about VM introspection, and the goal of the presentation is to
00:44
present why we believe it's a valuable technology, why it has already changed some areas like malware analysis, and why we think that it still has some unmet potential in areas like debugging, OS hardening, or fuzzing, for example.
01:04
And we use Rust to do this, to explore these areas and try to unlock the potential of VM introspection. So first a disclaimer: we are not expert Rust developers, and you are not going to learn anything new about Rust in this talk.
01:24
It's just a project we made with Rust, okay? So, we're introducing VM introspection in general, presenting the VMI ecosystem today, and then presenting why we built a library in Rust to do VM introspection.
01:43
So I'm handing over to Dorian. Yeah.
02:08
Yeah, so, hi. Since Rust 1.0 came out in 2015, a number of projects around virtualization have emerged.
02:21
You may know Firecracker, for example, because it's one of the biggest projects written in Rust. Rust is a systems programming language that promises to be memory safe, so actually it's a perfect fit for the virtualization community.
02:41
Yeah, Mathieu has accumulated some information about virtualization in general, so different projects, papers, not only Rust of course, but everything virtualization related, so yeah, go check it out. Yeah, so
03:02
what is VM introspection? Briefly said, it's deriving the execution context of a virtual machine from the hypervisor by querying its hardware state, so quite a mouthful. So what do we mean by that?
03:20
Well, this you may know. You have your virtual machine here, running on its virtualization layer, and underneath it the hypervisor. This is where our introspection agent comes into play. It's actually on the host, and it talks to the hypervisor via an API.
03:42
So what can you do? Well, you can register for hardware events, for example, access to a certain area of memory, interrupt events, register change events, and when one of these events happens, the VM is halted,
04:04
the control flow is transferred to your application, and you can do whatever you want. You can read memory, write memory, read registers, or write registers, and so on and so forth. So yeah, what are the core strengths of
04:22
VM introspection? Well, for one, you have a full hardware view, so you can see the kernel space, and you can see the user space as well. Also, you're operating at hypervisor-level privilege.
04:40
Since you're between the hardware and the VM, you're also able to modify what the operating system is able to see of itself. Yeah, so in which scenarios is this useful? Well, nowadays the main application
05:03
for virtual machine introspection is dynamic malware analysis, if you want to be stealthy. But you can also, for example, use it for debugging scenarios, like maybe if you want to debug a nested hypervisor, or maybe
05:20
you have a compromised operating system that you want to inspect, so you can't trust the information you're receiving from the kernel. So now that I've talked about where you can apply it, let's talk about what you have to implement in order to be able to introspect a guest.
05:42
So from a high-level view, we have our introspection agent, and let's say I want to break on a function, for example kernel32!WriteFile, and of course we don't want to do this for every process, so let's say cargo.exe, and
06:03
when this function call happens, we want to simply log its parameters. Since you're outside of the VM, from the outside it's just a bunch of bytes. So you have to have a semantic engine that is able to derive
06:24
the context for you. For example: where are certain kernel structs, or where is kernel32 actually located? With this you're able to retrieve the address of WriteFile, and then
06:40
you have your breakpoint manager, which actually inserts a debug interrupt into the function and then registers a callback, an interrupt callback, for you. This is where the event dispatcher comes into play. So the event dispatcher
07:04
actually registers the event with your hypervisor, and it makes sure that, if the interrupt event is hit, the response is funneled back to your breakpoint manager.
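The flow described here (register a callback for a hardware event, then route the event back to whoever registered it) can be sketched roughly as follows. This is a minimal illustration, not the design of any existing VMI library: `EventKind`, `Event`, `EventDispatcher`, and `demo` are hypothetical names, and the hypervisor side is left out entirely, so events are injected by hand.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Hardware events an agent might subscribe to (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum EventKind {
    /// A debug interrupt, e.g. one planted by a breakpoint manager.
    Interrupt,
    /// A write to the CR3 register.
    Cr3Write,
}

/// One event as it would be reported for a paused guest.
#[derive(Debug, Clone, Copy)]
struct Event {
    kind: EventKind,
    /// Guest address associated with the event (arbitrary in this sketch).
    addr: u64,
}

type Callback = Box<dyn FnMut(&Event)>;

/// Routes each incoming event to the callbacks registered for its kind.
#[derive(Default)]
struct EventDispatcher {
    handlers: HashMap<EventKind, Vec<Callback>>,
}

impl EventDispatcher {
    fn register(&mut self, kind: EventKind, callback: Callback) {
        self.handlers.entry(kind).or_default().push(callback);
    }

    /// Invokes every callback for `event.kind`; returns how many ran.
    fn dispatch(&mut self, event: &Event) -> usize {
        match self.handlers.get_mut(&event.kind) {
            Some(callbacks) => {
                for callback in callbacks.iter_mut() {
                    callback(event);
                }
                callbacks.len()
            }
            None => 0,
        }
    }
}

/// Registers a breakpoint-style callback and feeds two hand-made events.
fn demo() -> Vec<u64> {
    let hits: Rc<RefCell<Vec<u64>>> = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&hits);
    let mut dispatcher = EventDispatcher::default();
    dispatcher.register(
        EventKind::Interrupt,
        Box::new(move |event: &Event| {
            sink.borrow_mut().push(event.addr);
        }),
    );
    // Only the interrupt has a subscriber; the CR3 write is ignored.
    dispatcher.dispatch(&Event { kind: EventKind::Interrupt, addr: 0x1000 });
    dispatcher.dispatch(&Event { kind: EventKind::Cr3Write, addr: 0x2000 });
    let result = hits.borrow().clone();
    result
}

fn main() {
    println!("interrupt hits: {:x?}", demo());
}
```

In a real agent the `dispatch` call would sit in the loop that receives events from the hypervisor, and the guest would stay paused until the callbacks return.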
07:20
So, okay, yeah. Last but not least, you have your virtual address translation, because the hypervisor itself has no knowledge of how the virtual memory is organized inside the VM, so you have to identify and walk the page tables yourself.
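As an illustration of what walking the page tables involves, here is a small sketch assuming x86-64 4-level paging (the talk does not say which paging mode its guests use). Guest physical memory is stood in for by a closure that returns the 8-byte entry at a physical address; `translate` and `demo_memory` are hypothetical names, and the table layout in `demo_memory` is made up for the example.

```rust
use std::collections::HashMap;

const PRESENT: u64 = 1; // bit 0: entry is valid
const LARGE_PAGE: u64 = 1 << 7; // bit 7: entry maps a 1 GiB or 2 MiB page
const ADDR_MASK: u64 = 0x000F_FFFF_FFFF_F000; // bits 51:12: physical address

/// Translates a guest virtual address by walking PML4 -> PDPT -> PD -> PT.
/// `read_u64` reads one page-table entry from guest physical memory.
fn translate(read_u64: impl Fn(u64) -> Option<u64>, cr3: u64, vaddr: u64) -> Option<u64> {
    let mut table = cr3 & ADDR_MASK;
    // Each level is indexed by 9 bits of the virtual address.
    for shift in [39u32, 30, 21, 12] {
        let index = (vaddr >> shift) & 0x1ff;
        let entry = read_u64(table + index * 8)?;
        if (entry & PRESENT) == 0 {
            return None; // not mapped
        }
        // A large page ends the walk early at the PDPT or PD level.
        if (shift == 30 || shift == 21) && (entry & LARGE_PAGE) != 0 {
            let page_mask = (1u64 << shift) - 1;
            return Some((entry & ADDR_MASK & !page_mask) | (vaddr & page_mask));
        }
        table = entry & ADDR_MASK;
    }
    // After the PT level, `table` holds the 4 KiB frame base.
    Some(table | (vaddr & 0xfff))
}

/// A made-up set of page tables mapping exactly one 4 KiB page.
fn demo_memory() -> HashMap<u64, u64> {
    let mut mem = HashMap::new();
    mem.insert(0x1000 + 8, 0x2000 | PRESENT); // PML4[1] -> PDPT at 0x2000
    mem.insert(0x2000 + 2 * 8, 0x3000 | PRESENT); // PDPT[2] -> PD at 0x3000
    mem.insert(0x3000 + 3 * 8, 0x4000 | PRESENT); // PD[3]   -> PT at 0x4000
    mem.insert(0x4000 + 4 * 8, 0x9000 | PRESENT); // PT[4]   -> frame 0x9000
    mem
}

fn main() {
    let mem = demo_memory();
    let vaddr = (1u64 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x56;
    match translate(|paddr| mem.get(&paddr).copied(), 0x1000, vaddr) {
        Some(paddr) => println!("{:#x} -> {:#x}", vaddr, paddr),
        None => println!("{:#x} is not mapped", vaddr),
    }
}
```

The CR3 value passed in is the same register the agent reads from the vCPU, which is why CR3 shows up again later when context switches are discussed.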
07:41
So you may see that there's a lot of complexity in there, so yeah, I'm gonna hand it over to Mathieu to talk about where VMI is today. Okay, thank you.
08:01
Yeah, that's great. Thank you, Dorian. Now let's talk about the VMI ecosystem, and by ecosystem I mean the applications and libraries that exist today, because introspection has been around for a while, and libraries have been developed since that time. You have seen the complexity before: you are dealing with a lot of moving parts, and it's hard to come up with
08:26
an introspection agent that works for many machines and many different systems, and that's part of the problem that we're going to see right now. First, we need to talk about hypervisor support, right? Because if you want to listen to hardware events,
08:45
you need the hypervisor to support them: you need the hypervisor to provide you a VMI API to query the events and to listen to hardware events, right? So that's the state of hypervisor support today. You have Xen, which has been the leading hypervisor for this technology; it has supported VMI APIs for 10 years,
09:04
and today, if you just take stock Xen, you already have it built in, and you can use the API right away. And for the rest of the hypervisors here, you can see that we have, let's say, a trend: two or three years ago, people started to build projects because they were interested in VMI on VirtualBox, Hyper-V, KVM, or QEMU.
09:25
So that's really interesting. These are community projects. And now on the state of VMI projects themselves: these are some of the most popular projects in the community that I tried to list here. And the thing that I want to demonstrate here is that they are working in silos.
09:44
The reason for that is that most of them target only one hypervisor, and there is a historical reason for that: sometimes the authors are academics who want to write a thesis, so they rush to write something that works,
10:01
and then there's no abstraction layer behind it to adapt to multiple hypervisors, right? So we end up in this situation where projects are adopted, but they're not sharing the work they have been doing on their own. So you have silos, and you have a lot of duplicated work between them.
10:21
So our goal today is to break these silos and to make the communities work together and solve the same problems together to build better VMI apps, right? So our idea is basically to unify this ecosystem. What we want is to insert a library here that will have this abstraction layer, enough abstraction to deal with
10:46
any hypervisor or emulator behind it, and to be compatible enough to be linked with any of the projects already existing. That's the general idea that we want to build in Rust.
11:01
So we have constraints for this to work. First, we need to have speed because this abstraction layer will have a cost, and if you come to one of these projects and say, hey, we have this abstraction layer, you can talk to any hypervisor now, but yeah, it's a bit slower. As you have seen,
11:20
VMI will have an impact on the guest, and the more time you spend processing your events, the more impact there is on the guest. So your library needs to be really fast to be convincing. So our first constraint is speed. The next one is compatibility. If we want to write this layer, we need to provide a C API, of course, because all of these projects are written in C or C++.
11:43
So they need to link to our library, and so we need to provide C compatibility. And then we need to be cross-platform. We have some projects for Linux, some projects for Windows, and some projects for both. We need to be easy to maintain on Windows, Linux, or even macOS in the future.
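To make the C compatibility constraint concrete, here is a rough sketch of how a Rust library can expose functions over the C ABI using an opaque handle. All names (`DemoContext`, `demo_init`, `demo_read_physical`, `demo_destroy`) are hypothetical and are not the C API of the library discussed in the talk; the "guest memory" is just a byte vector filled with a test pattern.

```rust
use std::slice;

/// Opaque handle handed to C callers; stands in for a real driver.
pub struct DemoContext {
    memory: Vec<u8>,
}

/// Creates a context whose fake memory holds the pattern 0, 1, 2, ...
#[no_mangle]
pub extern "C" fn demo_init(size: usize) -> *mut DemoContext {
    let memory = (0..size).map(|i| i as u8).collect();
    Box::into_raw(Box::new(DemoContext { memory }))
}

/// Copies `len` bytes starting at `paddr` into `buf`; returns false on
/// null pointers or out-of-bounds reads.
///
/// # Safety
/// `ctx` must come from `demo_init` and `buf` must point to `len` writable bytes.
#[no_mangle]
pub unsafe extern "C" fn demo_read_physical(
    ctx: *const DemoContext,
    paddr: u64,
    buf: *mut u8,
    len: usize,
) -> bool {
    if ctx.is_null() || buf.is_null() {
        return false;
    }
    let ctx = unsafe { &*ctx };
    let mem_len = ctx.memory.len() as u64;
    let end = match paddr.checked_add(len as u64) {
        Some(end) if end <= mem_len => end,
        _ => return false,
    };
    // Both bounds are <= memory.len(), so the casts cannot truncate.
    let src = &ctx.memory[paddr as usize..end as usize];
    unsafe { slice::from_raw_parts_mut(buf, len) }.copy_from_slice(src);
    true
}

/// Frees a context created by `demo_init`.
///
/// # Safety
/// `ctx` must come from `demo_init` and must not be used afterwards.
#[no_mangle]
pub unsafe extern "C" fn demo_destroy(ctx: *mut DemoContext) {
    if !ctx.is_null() {
        drop(unsafe { Box::from_raw(ctx) });
    }
}

fn main() {
    let ctx = demo_init(16);
    let mut buf = [0u8; 4];
    let ok = unsafe { demo_read_physical(ctx, 4, buf.as_mut_ptr(), buf.len()) };
    println!("ok={} buf={:?}", ok, buf);
    unsafe { demo_destroy(ctx) };
}
```

A C or C++ project would see these as ordinary functions taking a pointer to an incomplete struct type, which is what lets existing tools link against a Rust library without knowing it is written in Rust.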
12:02
So it needs to be, well, yeah, easy to maintain, easy to work on. That's another constraint, because if I have a library on Linux that is not portable to Windows, it's not going to be usable. And then there is a quality that is desirable today: being memory safe. If you think about another use case for this introspection agent, let's say we want to monitor an entire cloud with thousands of VMs.
12:31
We want to be memory safe because otherwise we would just introduce a new attack surface, right? Before, the attack surface was the hypervisor and the emulated devices,
12:40
but now you have the introspection agent, because it's processing untrusted inputs from the guest. So I don't want to process these inputs in C. So, yeah, we have Rust. Rust gives us speed and C compatibility, makes it easy to write a cross-platform library, and is memory safe.
13:02
So that's why we decided to write it in Rust. And the objective that we want to achieve is a bit like this. We have on the left the VMI apps that are already existing, or the future VMI apps for dynamic analysis, OS hardening, or even fuzzing.
13:20
And we want to adapt them to any hypervisor or emulator, to give this flexibility. And for that, we started with this library, libmicrovmi, which is just a unified low-level API to read memory and listen to hardware events, and then later on we want to build new crates to perform the tasks of semantic engine,
13:43
address translation, and breakpoint manager. That's the end goal of this project. So what is the state of libmicrovmi? What have we achieved today? We can read the physical memory.
14:00
That's all the VMI API you can see on the left, pretty much. We can partially read vCPU registers, and we started to work on listening to CR3 events. And on the right, you can see that we have a C API, thanks to Dorian, and also a LibVMI integration.
14:21
We started working on that to integrate into existing tools and libraries. And regarding the drivers, we have drivers for Xen, KVM, and VirtualBox. We have this compatibility. So what does it look like? I have a small demo.
14:41
OK. I wanted to increase the quality, but it shouldn't be increased automatically. This is a memory dump example. So I'm running Windows XP here, and I'm running a memory dump example on the left.
15:02
So I'm dumping the memory of Windows XP, and then I'm displaying the state of the registers. And then I'm going to do the same thing for... that was Xen, sorry. This is KVM, and I'm doing the same thing with the same API, right? That's what we want on KVM. So memory dump, connecting to KVM via my API,
15:21
and dumping the domain, and printing the registers. And then we're going to do the same for VirtualBox, with the same API, right? Because it's unified, and that's it. Actually, we dumped the domain on these three hypervisors.
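The point of the demo, one dump routine working unchanged across several hypervisors, can be sketched with a trait that each driver implements. This is only an illustration under stated assumptions: the `Driver` trait, its method names, and `FakeDriver` are hypothetical and do not reproduce libmicrovmi's actual interface, and the three "hypervisors" are in-memory stand-ins.

```rust
/// Hypothetical unified low-level interface that each driver implements.
trait Driver {
    fn name(&self) -> &str;
    /// One past the highest readable guest physical address.
    fn max_physical_addr(&self) -> u64;
    fn read_physical(&self, paddr: u64, buf: &mut [u8]) -> Result<(), String>;
}

/// In-memory stand-in for a real hypervisor backend.
struct FakeDriver {
    name: &'static str,
    memory: Vec<u8>,
}

impl Driver for FakeDriver {
    fn name(&self) -> &str {
        self.name
    }

    fn max_physical_addr(&self) -> u64 {
        self.memory.len() as u64
    }

    fn read_physical(&self, paddr: u64, buf: &mut [u8]) -> Result<(), String> {
        let start = paddr as usize;
        let end = start
            .checked_add(buf.len())
            .filter(|&end| end <= self.memory.len())
            .ok_or_else(|| format!("{}: read out of bounds at {:#x}", self.name, paddr))?;
        buf.copy_from_slice(&self.memory[start..end]);
        Ok(())
    }
}

/// Dumps guest physical memory page by page through any driver.
fn dump_memory(driver: &dyn Driver) -> Result<Vec<u8>, String> {
    const PAGE: u64 = 4096;
    let max = driver.max_physical_addr();
    let mut dump = Vec::with_capacity(max as usize);
    let mut paddr = 0;
    while paddr < max {
        // The last chunk may be shorter than a full page.
        let chunk = PAGE.min(max - paddr) as usize;
        let mut page = vec![0u8; chunk];
        driver.read_physical(paddr, &mut page)?;
        dump.extend_from_slice(&page);
        paddr += chunk as u64;
    }
    Ok(dump)
}

fn main() {
    let drivers: Vec<Box<dyn Driver>> = vec![
        Box::new(FakeDriver { name: "xen", memory: vec![0xAA; 8192] }),
        Box::new(FakeDriver { name: "kvm", memory: vec![0xBB; 5000] }),
        Box::new(FakeDriver { name: "virtualbox", memory: vec![0xCC; 4096] }),
    ];
    // The same dump routine runs against every backend.
    for driver in &drivers {
        match dump_memory(driver.as_ref()) {
            Ok(dump) => println!("{}: dumped {} bytes", driver.name(), dump.len()),
            Err(err) => eprintln!("{}", err),
        }
    }
}
```

Application code only ever sees `&dyn Driver`, so adding another hypervisor means adding another implementation of the trait rather than touching the tools built on top.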
15:42
That's what we wanted to achieve in the first place. And we have, OK, what do you do that? No? OK. And this is a demo about catching CR3 events,
16:01
because we want to show how we catch a context switch in the scheduler. When you do a context switch, you switch your CR3 register, because it contains the physical address of the page directory, right? So when you have a context switch, you change the CR3 register, and you can catch that as a hardware event. So on this Windows XP, I will generate some noise or some activity
16:25
by enumerating what's on the disk, and on the right, I'm going to catch plenty of CR3 events here. That's it. And it's really fast, because you can see, well, I will tell you. I actually caught...
16:41
The font is not big enough, but... OK. I actually caught 4,000 events per second. 4,000 context switches per second. So that's quite fast. Fast enough, right? So what do we want to do in the future for VM introspection? We want to build this OS-independent hooking framework,
17:03
because there are unexplored areas where it can be applied: intrusion detection, debuggers. If you want to debug a kernel or have the full state of the system to view the entire system from the host, or just build a new layer of hardening, you can watch some critical kernel areas in your introspection agent
17:23
and kill the process when they are being corrupted, right? Or you can also do snapshot-based fuzzing to fuzz from the hypervisor. All of this enabled by VM introspection and by having a unified API. And of course, at the end, making VM introspection a commodity,
17:41
usable by everyone. And one last thing. We will propose libmicrovmi for GSoC this year. So if you are a student and you are interested in VM introspection, virtualization, and building interesting apps, you can propose your application. We are part of the Honeynet organization. And you can either improve an existing driver,
18:02
using KVM, VirtualBox, or even Hyper-V. You can add support for emulators: QEMU, Bochs, or even Unicorn. You can propose a stealth reference implementation based on the EPT, the extended page tables. Or you can solve this little issue on bindgen, adding support for libloading, so you can generate bindings
18:21
that are using libloading to load the library at runtime. That will be really great. So as a conclusion, we would like to improve VM introspection by improving the ecosystem, and to let this technology exploit its full capabilities.
18:40
Thanks to the community on Reddit, because they helped me a lot in building this talk. Thank you for your attention. That was it. Thank you. Do we have time for questions? Yes.
19:01
Does this work on non-x86 platforms? For the moment, it's only x86. Yes, it's hard-coded. But it's really a small project. Later on, we will add support for other architectures. So there is at least some provision for that? Some, there is some provision in the API.
19:22
Yes, yes. Yes. Is there also support planned for VMware? VMware doesn't provide a VMI API. Can you repeat the question before answering? Yes. Is there support for VMware? The problem is that VMware doesn't provide this VMI API. There is no support for them.
19:40
They don't provide you the headers, and they don't open an API for you to work on it. But in the future, they will use the Windows Hypervisor platform, because tomorrow everyone has to use Hyper-V. And if in the future this Windows Hypervisor platform provides a VMI API, you can plug into it and be compatible with Hyper-V,
20:00
VMware, and VirtualBox on Windows, Windows 10. Yes.
20:37
Yeah, or if you have an Android system which is virtualized,
20:40
can you use it first? Well, with VMI, you need virtualization to enable debugging.
21:02
I think that's another topic, or it's another project separated from VMI, I believe. But it could be possible to have an abstraction layer on both.
21:24
Juan? Thank you.