
VUOS: Give Your Processes a New VU


Formal Metadata

Title
VUOS: Give Your Processes a New VU
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
VUOS is a different perspective on namespaces, anykernels and related concepts. The main idea behind VUOS is that it is possible to give processes their own "view" using partial virtual machines. A partial virtual machine intercepts system call requests and operates like a filter: system calls can be forwarded to the kernel of the hosting system or processed by the partial virtual machine hypervisor. In this way processes can see a mix of resources provided by the kernel (on which they have the same view as the other processes) and virtual resources. It is possible to mount file systems, load networking stacks, change the structure of the file system tree and create virtual devices. The hypervisor is just a user process, so while it gives processes a new perspective, it does not widen the attack surface of the kernel.
Transcript: English (auto-generated)
So, I come from Bologna. I'm at the University of Bologna, and this icon here is VirtualSquare, a team that develops solutions, tools and ideas in the field of virtualization. VUOS: give your processes a new VU. VUOS is not an acronym: VU must sound like "view".
So, what is the point? What can a process view? A process running on an operating system is able by itself to execute machine operations and to access its own memory, but everything else must be obtained through system calls. So the environment, the view, the panorama that the process sees outside its own world is given by the replies that the system calls give back. Okay. So, namespaces in the kernel are ways to provide processes with different views: different views of the file system, of networking, of process IDs, user IDs, or whatever. What we are designing with VUOS is something similar, but running completely in user space, with just user permissions, without the need to be root.
This can dramatically reduce the attack surface, because we don't use the namespace features of the kernel. So you can imagine running a kernel with namespaces disabled, and at the same time we provide solutions that are available to standard users. Nowadays it happens so often that solutions and tools are provided for system administrators only. But that is another way to enlarge the attack surface, as faulty programs and bugs can damage the system.
So, the idea comes from a need. Why is it not possible for a user to mount a file system? If I have a file system image, it's a file of mine. So mounting a file system image and changing the contents of the files included in that image, once mounted, is just a sophisticated way of editing a file of yours. Why is it forbidden in a standard system? Just because the idea of mount is the idea of a global operation on the system. This operation is either provided as a global operation, as in the past, or provided just to a subset of processes, but by the kernel, through namespaces. Instead, it must be possible to mount file systems as users. In this presentation I want to show you first what and why, and then at the end how: the other way around.
Likewise, with other applications I may need a virtual device. Say I want to create a RAM disk for my application. If it's a RAM disk for my application, why do I need to be root, why do I need the kernel services? Or I want to use a different network stack, or I want to see a file which is here over there, like a bind mount, or I want to remount the entire file system in COW mode, in copy-on-write mode. I want to change the name, the time, the user ID, whatever, but I want to do everything in user space, with user permissions.
So, I have slides with the demos, but I prefer to show you a live demo. Here I have a file system image. I would like to do this, because this is the natural way to mount a file system. Actually, the error does not come from the kernel; it is an error of the tool. It's the mount tool that says you're not root. But even if you try to call the system call directly, you get the error. How can you do that? You start a VUOS machine. This window is now running inside the machine. Again, if I try to mount this, I get the error.
But if I add the module that provides file system virtualization, and mount this just specifying the type of file system... As a workaround for the error of the mount command, I have to use vumount, which is a mount command (you can use vumount to mount everything else too), but it tries the system call directly. What is nice here? Now I have mounted the file system, but just for this machine. Here it is mounted, but over here this is another file system, as you can see. So I've mounted my file system over there. This is just an example. Let us start another virtual machine.
I have here some commands to paste. This time, if I can pick up the command, I add two modules. One is for virtual devices, one is for virtual file systems. And with this command I create a virtual device named /dev/ramdisk, which is a 100-megabyte RAM disk. So, as a user, I can create a file system on the RAM disk. Then I can mount the file system, and now on /mnt I have the new file system, which lives in the RAM disk. I want to point out that I'm using the same commands that I would use as an ordinary system administrator outside the virtual machine. So one of the goals of the project is to let you use the natural commands to do what you need. Let us go further.
This is a common problem I have. I always forget the commands to loop-mount a partition of a file system image. For example, if you have a Raspberry Pi, the image has two partitions. If you want to mount one partition to change one file, it is a mess. And second, you need to be root. Now I'm showing you how to access the partitions of a Raspberry Pi image as a user. So again, I add the two modules. Now I use another virtual device module, which is partx. And you can see here an image I've just downloaded from the Raspberry Pi site. And now I have my disk in /dev/stx. So I can... okay, I'm not root, so I need... you can see the partitions. Okay, and given that there are stx1 and stx2, I can mount stx2. So I can mount the second partition. Here I have the root partition of the Raspberry Pi. I could change the image, do whatever I want.
Or let's continue. Okay, now let us play with another toy: networking. The module for networking is vunet. Now, let me try... okay, let's try this experiment. This is quite a new development. It's still unstable, but I want to show it; this is a development I have created together with the new camera.
This is picoTCP as a user-mode stack in VUOS. And given that picoTCP is connected to VDE networking, I've used slirp as the networking interface, which is the tool used by virtual machines to provide user-mode networking. Now, in VUOS we can provide different stacks at the same time, so you need a way to decide which one to use. You can see I've mounted the stack on /dev/net/picox. This is in order to say: I want to use that network. Because now, here, I still have my usual interfaces. But if I say vustack, I open a bash, and in this bash I have the picoTCP interfaces. It's a design choice: picoTCP uses hashed IDs, so it's quite different from standard networking, because usually the interfaces are numbered zero, one and so on. But now I can self-configure the network using a standard DHCP client. Okay, now I've got the address. And again, this is my machine, my computer in Bologna. I've used the stack in the virtual environment.
Okay, final demo. This is the current time. Now I can start a VUOS machine. This time I add another module, vumisc, and I start another clock. Now, let me move this clock... now I mount the submodule vumisctime on /mnt. It is like a proc file system, so /mnt contains some fake files, and I can use them to change the view of the processes regarding time. For example, if I put 2 in /mnt/frequency, as you can see, we have relativistic machines: the time in the virtual machine runs twice as fast as the time in normal life. Okay, so as you can see, the idea of VUOS is to provide the means to give processes the view we need to solve problems.
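The time demo can be pictured as a clock that scales elapsed real time by a frequency factor. The following is only a minimal sketch of that idea in Python, not the code of the module shown in the talk; the class name `ScaledClock`, its methods and the injectable `real_clock` parameter are assumptions made for illustration.

```python
import time


class ScaledClock:
    """Illustrative virtual clock: virtual = base + frequency * elapsed real time."""

    def __init__(self, frequency=1.0, real_clock=time.monotonic):
        self._real_clock = real_clock      # source of "real" time (assumed injectable)
        self._frequency = frequency        # 2.0 means twice as fast as real time
        self._real_base = real_clock()     # real time at the last rebase
        self._virtual_base = self._real_base

    def now(self):
        """Return the time as seen from inside the virtual view."""
        elapsed = self._real_clock() - self._real_base
        return self._virtual_base + self._frequency * elapsed

    def set_frequency(self, frequency):
        """Change the speed without making virtual time jump."""
        real_now = self._real_clock()
        # Freeze the virtual time reached so far, then restart from it.
        self._virtual_base += self._frequency * (real_now - self._real_base)
        self._real_base = real_now
        self._frequency = frequency
```

Rebasing inside `set_frequency` keeps the virtual time continuous, so a process only sees the clock speed up or slow down from the moment the frequency changes.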
So we have started from the solution, from what is useful. Now I can show you... okay, all the demos I've just done are in the slides, so you can test all of them by yourself after the talk if you like. This is the structure of umvu, which is an implementation of VUOS. We may provide further implementations in the future. There are processes in user space, and we use system call interposition to decide which system calls must be forwarded to the kernel, in case you are accessing parts of the system which are real, and which system calls are completely implemented in the hypervisor code. In order to achieve better results with parallelism, we use a technique that we have named guardian angels. Each process running in user space has a thread in the hypervisor which provides the virtualization for that thread or process. So each process in user space has a guardian angel thread in the hypervisor. So if a process runs an open, the guardian angel checks whether the path is in a real or in a virtual part.
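As a rough picture of the guardian angel idea, the sketch below gives every supervised process its own thread, which inspects each intercepted call and decides whether the path is real or virtual. It is a toy model in Python under stated assumptions, not the hypervisor's code: `VIRTUAL_PREFIXES`, `GuardianAngel` and `supervise` are hypothetical names, and system calls are simulated as `(name, path)` tuples rather than actually intercepted.

```python
import queue
import threading

# Hypothetical: subtrees that are currently virtualized.
VIRTUAL_PREFIXES = ("/mnt",)


def is_virtual(path):
    """True if the path lies inside one of the virtualized subtrees."""
    return any(path == p or path.startswith(p + "/") for p in VIRTUAL_PREFIXES)


class GuardianAngel(threading.Thread):
    """One thread per supervised process; handles only that process's calls."""

    def __init__(self, pid):
        super().__init__(daemon=True)
        self.pid = pid
        self.requests = queue.Queue()   # simulated intercepted system calls
        self.decisions = []

    def run(self):
        while True:
            request = self.requests.get()
            if request is None:         # sentinel: the process has exited
                break
            syscall, path = request
            verdict = "virtual" if is_virtual(path) else "forward-to-kernel"
            self.decisions.append((syscall, path, verdict))


def supervise(trace_by_pid):
    """Run one guardian angel per pid over a simulated trace of calls."""
    angels = {pid: GuardianAngel(pid) for pid in trace_by_pid}
    for angel in angels.values():
        angel.start()
    for pid, calls in trace_by_pid.items():
        for call in calls:
            angels[pid].requests.put(call)
        angels[pid].requests.put(None)
    for angel in angels.values():
        angel.join()
    return {pid: angel.decisions for pid, angel in angels.items()}
```

Because each process has a dedicated thread, a slow virtual operation for one process does not block the decisions for the others.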
If it's a real part, it simply tells the process to forward the call to the kernel. Otherwise... the guardian angel calls the choice function, which is responsible for the choice between virtual and real, and in case it's virtual, it returns the module which is responsible for the virtualization. Then, for each system call, there are wrappers that get from the user memory all the parameters needed to perform the virtualization, and the system call is forwarded to the module. The modules we have seen in the demos, vufuse, vudev, vunet, which implement file systems, networking, devices or whatsoever, really perform the actual action to produce the result. I have two points to note now. One is: I have created a virtual device and, inside the virtual device, a virtual file system. So this was a nested virtualization. So try to think that we open a file in the virtual file system. That call is forwarded to vufuse for the virtualization of the file system. How can we achieve the virtualization of the device under it? The hypervisor uses a self-virtualization method. purelibc is an overlay on libc that is able to catch all the system calls generated by the process itself. So even the read or write issued by vufuse is routed back to the choice function. And if it sees that there is a further virtualization in place, it calls, in this case, the device virtualizer, for example the one for the RAM disk, and it provides the correct answer. Let me point out that the interface to the modules is clean and simple, let me say KISS.
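The nesting described above can be illustrated with a dispatcher whose modules issue their own reads back through the same dispatcher. This is a simplified Python sketch, not the real hypervisor or libc overlay: `Dispatcher`, `RamDisk` and `TinyFs` are hypothetical stand-ins, and the "file system" is just a table of fixed extents inside the image.

```python
class RamDisk:
    """Stand-in for a virtual device module: the device is bytes in memory."""

    def __init__(self, data):
        self.data = bytearray(data)

    def read(self, path, offset, length):
        return bytes(self.data[offset:offset + length])


class TinyFs:
    """Stand-in for a file system module stored on an image path.

    Hypothetical layout: each file is one (start, size) extent in the image.
    """

    def __init__(self, dispatcher, image_path, extents):
        self.dispatcher = dispatcher
        self.image_path = image_path
        self.extents = extents

    def read(self, path, offset, length):
        start, size = self.extents[path]
        length = max(0, min(length, size - offset))
        # The module's own read goes back through the choice step, so the
        # image may itself live on a virtual device (nested virtualization).
        return self.dispatcher.read(self.image_path, start + offset, length)


class Dispatcher:
    """Decides, for every read, whether a module or the real kernel serves it."""

    def __init__(self):
        self.mounts = {}
        self.log = []

    def mount(self, prefix, module):
        self.mounts[prefix] = module

    def choice(self, path):
        """Return the longest mounted prefix covering path, or None if real."""
        matches = [p for p in self.mounts if path == p or path.startswith(p + "/")]
        return max(matches, key=len) if matches else None

    def read(self, path, offset, length):
        prefix = self.choice(path)
        if prefix is None:
            self.log.append(("kernel", path))
            with open(path, "rb") as real_file:   # real part: use the real system
                real_file.seek(offset)
                return real_file.read(length)
        module = self.mounts[prefix]
        self.log.append((type(module).__name__, path))
        return module.read(path, offset, length)
```

Reading a file under the mounted file system produces two log entries: first the file system module, then the device module that holds its image.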
Keep it simple, stupid. The modules receive just the system calls. So if there is a read here, read(file descriptor, buffer, length), over there the module receives a read(file descriptor, buffer, length). Okay, so a module is simply created by registering its service in the hash table. So one module can register: I am responsible for this subtree of the file system, or I am responsible for that file, or I am responsible for this address family, or I am responsible for that system call. And then it has to provide the implementation of the system calls. If somebody wants to access that kind of file and there is a read, what must be the answer?
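A module registry of this kind could be sketched as a dictionary keyed by what the module claims. The Python below is only an illustration under assumptions: the kind labels (`"path"`, `"syscall"`), the `Registry` and `GreetingModule` classes and the simplified `read(fd, length)` signature are invented for the example and are not the project's API.

```python
class Registry:
    """Hash table mapping (kind, key) to the module that claimed it."""

    def __init__(self):
        self._table = {}

    def register(self, kind, key, module):
        # kind is a label such as "path", "family" or "syscall" (assumed names).
        self._table[(kind, key)] = module

    def lookup(self, kind, key):
        return self._table.get((kind, key))

    def lookup_path(self, path):
        """Return the module owning the longest registered prefix of path."""
        if not path.startswith("/"):
            raise ValueError("absolute path expected")
        candidate = path
        while True:
            module = self._table.get(("path", candidate))
            if module is not None:
                return module
            if candidate == "/":
                return None
            # Drop the last component and try the parent directory.
            candidate = candidate.rsplit("/", 1)[0] or "/"


class GreetingModule:
    """Hypothetical module: every read on its subtree returns a fixed text."""

    CONTENT = b"hello from a module\n"

    def read(self, fd, length):
        return self.CONTENT[:length]


def dispatch_read(registry, path, fd, length):
    """Forward a read to the owning module, or return None if the path is real."""
    module = registry.lookup_path(path)
    if module is None:
        return None  # not virtual: the real kernel would serve this call
    return module.read(fd, length)
```

The module sees the same arguments the process passed, which is what keeps the interface small: registering a key and implementing the calls is all a module has to do in this model.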
That's all. Okay, I have no time to show the code, but everything is available. We have a wiki site; we are redesigning the wiki site, and this is the most important part: here is the set of repositories, and there is also an infrastructure for a tutorial. We provide a disk image and scripts to make it easy for whoever wants to try the tools to have the whole infrastructure ready to perform the experiments I did five minutes ago. So feel free, please download the image, try the tools, and if you like it, if you want to participate, the project is open to all contributions. Thank you.

How do you compare VUOS with gVisor or User-mode Linux?
Sorry? How do you compare VUOS with gVisor? User-mode Linux... I can compare User-mode Linux with VUOS, or umvu. User-mode Linux creates a complete virtual machine, so it runs an entire kernel. We don't run any kernel; we just put the system in a condition to catch the system calls and divert their execution to modules if required. So, two points. We use BPF to accelerate system call catching. Now we can avoid the second call. Each time a system call is caught by ptrace, you receive two calls, one before and one after. Using BPF, we can now avoid the second call whenever the system call is completely real or completely virtual. We have system calls that are at the same time real and virtual, for example open: we have a virtual open, but at the same time we force the process to make a real open, because we have to allocate the file descriptor. We would like to offload many parts of the decision process to the kernel, but we would need eBPF for seccomp, which is a long discussion among the Linux kernel developers. So if you're interested and you like the approach, help us convince the Linux kernel developers to add eBPF to the seccomp system call.
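The saving described in this answer can be put into numbers with a small counting model. This Python sketch only encodes the rule as stated in the talk (two tracer notifications per call with plain ptrace; one when BPF is used and the call is purely real or purely virtual; two for mixed calls such as open). The function name and the `"real"`, `"virtual"`, `"mixed"` labels are assumptions for illustration, and nothing here traces a real process.

```python
KINDS = ("real", "virtual", "mixed")


def tracer_notifications(calls, use_bpf):
    """Count tracer notifications for a sequence of classified system calls.

    calls: iterable of "real", "virtual" or "mixed" (labels assumed here).
    use_bpf: whether the second notification can be skipped when possible.
    """
    total = 0
    for kind in calls:
        if kind not in KINDS:
            raise ValueError(f"unknown kind: {kind!r}")
        if use_bpf and kind != "mixed":
            total += 1   # only the notification before the call is needed
        else:
            total += 2   # one before and one after the call
    return total
```

For a workload dominated by purely real or purely virtual calls, the model approaches half the notifications of plain tracing, which is the motivation given for using BPF.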
Thank you.