Linux containers and OpenVZ
Formal Metadata
Number of Parts | 84
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/40032 (DOI)
Production Year | 2012
FOSDEM 2012 (37 / 84)
Transcript: English (auto-generated)
00:03
Thank you, thank you. Yeah, my name is Kir. I'm working on OpenVZ. This is what I'm going to talk about. Since it's my first time at FOSDEM and in Brussels, it's a pretty generic introductory talk, but there will be some features that I especially like to talk about. So the agenda for this talk is:
00:26
first of all, we'll see different virtualization approaches. Then we'll talk about containers in particular, and OpenVZ and its components and kernel stuff.
00:41
Then I'll have some performance slides for you, and I'll end up talking about a few new features we are currently working on. So, first of all, basically, in the context of this talk,
01:00
virtualization is a technique that lets you divide one big piece of hardware into multiple smaller ones. So this is basically partitioning: you partition a big system into smaller ones.
01:22
And there are a few ways to do that. First of all, there's hardware emulation, which is also virtualization, actually: you can emulate a CPU and run anything on top of that, or emulate the whole system. Then there's paravirtualization,
01:43
which is what Xen or KVM or VMware is doing. Then there are containers, which is what I'm doing. And finally there is multi-server virtualization, which is out of context for this talk, but it's when you do the opposite: you
02:03
combine multiple pieces of hardware into one super piece, and then you break it into smaller ones. But this is out of context. So, with paravirtualization, what those guys do is: on top of the hardware they have a layer called a hypervisor, or a virtual machine monitor,
02:27
which lets you create multiple instances of virtual hardware, and on top of those instances you run your operating systems. So this is what
02:40
Xen, KVM, VMware and other guys do. As opposed to that, what we do is modify the operating system kernel, in our case the Linux kernel, to provide
03:01
multiple instances of the user space, which we call containers. And it's not just OpenVZ: there's also LXC. Well, FreeBSD jail is kind of a precursor to containers. It's not a full solution, but it's there.
03:21
Then Solaris Zones. And in AIX 6 they have workload partitions, which are basically containers too. So these are all container solutions. So the comparison between hypervisors and containers is:
03:40
with a hypervisor you have multiple pieces of virtual hardware, and you can run multiple operating systems. That means you can run different operating systems, Windows, Linux, whatever. But it also means you have lower density, and by density here I mean how many
04:03
such VMs you can have on a particular piece of hardware. And you have to pay some performance penalty, and there are some scalability problems, like one VM cannot have all the 64 CPUs that you have on your system. And
04:26
actually, those lower density, performance and scalability problems in the case of hypervisors are being fought by hardware vendors like Intel and AMD.
04:41
They're introducing features to help hypervisors cut corners here and there, so those performance problems are being mitigated and dealt with. As opposed to hypervisors, with containers we still have one piece of hardware and
05:02
one kernel, not multiple kernels but just one kernel, and on top of that multiple user spaces. What this means is we cannot run any other operating systems. So it's the Linux kernel, so it's Linux user space.
05:20
But this also means that we do it much more efficiently. That is, much higher density, so you can have more of those containers than you could have VMs. That means native performance, pretty much no performance overhead. I mean, you would have a hard time measuring the overhead
05:44
in this case. And as an added bonus, there's dynamic resource allocation, in terms of memory for example: you can give containers more memory or less memory, and you can do it all at runtime.
06:00
So personally I consider containers a natural step in operating system evolution. We have multitasking operating systems, we have multi-user operating systems, and now we have multi-container, multi-user-space operating systems. I
06:23
mean, if you think about it a bit, it's a pretty obvious idea. So, let's go and see what OpenVZ consists of. Of course, most of the stuff is in the kernel.
06:41
And it's the kernel which provides the ability to have many containers on top of it. And the means that the kernel uses to do that are, first of all, namespaces, which provide virtualization and isolation between those containers. Then there are cgroups, which is a resource management and control mechanism. And
07:08
then there is checkpoint/restart. It's actually an auxiliary feature, but it's pretty big, so I put it on its own. That means that you can freeze the container and
07:23
unfreeze it later. It's like hibernation for your notebook, but just for a container. Then there are some tools and what we call templates. Basically, these are pre-created images that are used for fast container provisioning,
07:41
images of different Linux distros, which you can use as a base for containers. So let's take a deeper look into the kernel part. Each container, from the point of view of the container's owner, is a separate entity. It has its own files.
08:02
It has its own processes, its own network device with all the, you know, iptables firewall rules and routing rules and metrics. Each container has its own devices and
08:20
all the other stuff that the kernel provides, for example IPC, inter-process communication, which is shared memory, semaphores and messages. Each container has its own set of IPC objects, so they don't, you know, interfere with each other. And
08:43
that stuff is done using so-called namespaces. And the easiest and, historically, the first namespace is the chroot system call. If you know it, what it does is:
09:04
it makes some directory on the system your new root, so you cannot go up from there. So actually chroot is a precursor to containers. It's a file system namespace: it lets you see just some subset of the files, not all of them.
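To make the "you cannot go up from there" property concrete, here is a minimal Python sketch. It is only a toy model of path confinement, not how the kernel implements chroot, and the function name `resolve_in_root` and the directory `/vz/root/101` are hypothetical names chosen for illustration.

```python
import posixpath


def resolve_in_root(new_root: str, requested: str) -> str:
    """Map a path as seen inside the confined tree to a path on the host.

    Toy model only: the real chroot system call does this in the kernel.
    """
    # Treat the request as absolute inside the new root. normpath drops any
    # ".." that would climb above "/", which models "you cannot go up".
    inside = posixpath.normpath("/" + requested.lstrip("/"))
    # Prefix the confined path with the directory acting as the new root.
    return posixpath.normpath(new_root.rstrip("/") + inside)


if __name__ == "__main__":
    # /vz/root/101 is a hypothetical container directory on the host.
    print(resolve_in_root("/vz/root/101", "/etc/passwd"))
    print(resolve_in_root("/vz/root/101", "../../etc/passwd"))
```

Both calls resolve to a file under the container directory, so a process confined this way only ever sees a subset of the host's files.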
09:20
So if we take this chroot idea and apply it to everything else that the kernel provides to programs, we have containers. So chroot is a file system namespace. Then we have a process ID namespace, so the container only sees its own processes.
09:43
We have a network namespace, an IPC namespace, and so on and so forth. So this is how we provide isolation between containers. But that's not all of it. The problem is there is one kernel and one set of resources like memory, disk and so on, so
10:08
there is a need for a resource management mechanism to keep the containers within their limits, the ones that we set in terms of memory, CPU, disk space,
10:24
network and so on. So we have four mechanisms in OpenVZ. The first one is historically called user beancounters, which is
10:40
a set of 20 resource control parameters, mostly memory-related, but there are some auxiliary ones like the number of open files or the number of processes that you can have in a container. So for each container there is that set of limits.
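The idea of a per-container set of limits can be sketched as a small accounting model. This is a simplified illustration under stated assumptions, not the OpenVZ implementation: the class name `Counter`, its fields, and the two example resources with their limits are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """One resource counter for one container (toy model)."""
    limit: int         # maximum number of units the container may hold
    held: int = 0      # units currently in use
    failcnt: int = 0   # how many requests were refused so far

    def charge(self, amount: int = 1) -> bool:
        """Try to account for `amount` more units; refuse if over the limit."""
        if self.held + amount > self.limit:
            self.failcnt += 1
            return False
        self.held += amount
        return True

    def uncharge(self, amount: int = 1) -> None:
        """Release units, never going below zero."""
        self.held = max(0, self.held - amount)


if __name__ == "__main__":
    # Hypothetical limits for one container; names and values are illustrative.
    container = {"processes": Counter(limit=3), "open_files": Counter(limit=2)}
    for _ in range(4):
        print(container["processes"].charge())
    print(container["processes"])
```

Each container gets its own dictionary of counters, so one container hitting its process limit does not affect the counters of another.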
11:02
The second thing is the fair CPU scheduler. The usual CPU scheduler schedules the CPU between tasks, so it just takes a task from a run queue and gives it a CPU time slice. And
11:22
this would not be fair, because different containers have different numbers of processes. So we have to schedule between containers first: we pick the container to give a CPU time slice to, and then we pick a task within that container. And
11:43
that way we can achieve fair CPU distribution between containers. Of course, there are weights for each container, so you can have high-priority and low-priority containers in terms of CPU. And
12:01
there are hard limits, like no more than 10% of CPU time no matter what. Yet another mechanism is the two-level disk quota. Because the container file system is a directory on the host system, we have to limit that space, so we have
12:22
a per-container disk quota, per directory, and inside the container the user can use the usual Linux quotas. And finally there is disk I/O priority. It's a common disk for many containers, so they can affect each other in a bad way.
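The two-level CPU scheduling described a moment ago (pick a container first, then a task inside it, with per-container weights) can be sketched as a toy simulation. This is an illustrative model only: the class and function names are hypothetical, the selection rule is a simple weighted-fair pick rather than the actual OpenVZ algorithm, and hard limits are not modelled.

```python
from collections import deque
from fractions import Fraction


class Container:
    def __init__(self, name, weight, tasks):
        self.name = name
        self.weight = weight        # higher weight means a larger CPU share
        self.tasks = deque(tasks)   # runnable tasks, in round-robin order
        self.vtime = Fraction(0)    # weighted CPU time received so far


def schedule(containers, slices):
    """Return the (container, task) pair chosen for each time slice."""
    order = []
    for _ in range(slices):
        runnable = [c for c in containers if c.tasks]
        if not runnable:
            break
        # Level 1: pick the container with the least weighted CPU time.
        ct = min(runnable, key=lambda c: c.vtime)
        # Level 2: round-robin among that container's own tasks.
        task = ct.tasks.popleft()
        ct.tasks.append(task)
        ct.vtime += Fraction(1, ct.weight)
        order.append((ct.name, task))
    return order


if __name__ == "__main__":
    a = Container("A", weight=1, tasks=["a1"])
    b = Container("B", weight=1, tasks=["b1", "b2", "b3"])
    for name, task in schedule([a, b], 8):
        print(name, task)
```

With equal weights, container A with one task and container B with three tasks each receive half of the slices, which is the fairness property a plain per-task scheduler would not give.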
12:44
So you can isolate the bad guys by giving them low I/O priority. And a fifth item here would be networking, but it's pretty much solved without us. There's a nice tool called tc, which gives you the ability to have
13:04
QoS, traffic shaping and all the other fancy stuff to control the networking resource. Now, the third thing is checkpointing and migration. As I said before,
13:21
that means that a container can be frozen and dumped into a file on disk. We get the complete state of the container: all the running processes, open files, network connections, memory segments, buffers, whatever. And at a later stage we can restore it from the file back to memory
13:44
and restart it, so it will continue to run. And the nice thing about that is we can restore it on a different system. So we freeze the container, checkpoint it, move the dump file to a different box,
14:01
and then we restore the container on the different system. That is called live migration, so you can migrate containers between two boxes, and I usually demo it using two notebooks. So, I mean, unlike the similar solution from VMware,
14:22
there is no need for any fancy hardware for that; you can do it with any two OpenVZ boxes. So that was all about the kernel. Just one slide about tools.
14:41
vzctl is a high-level tool to operate on containers. Here you can see a typical container lifecycle. We create the container and specify the distro, using that template I was talking about. Then we set the IP address; the same way you can set other parameters, like memory.
15:01
Then we start the container, go inside, and see that there is the usual set of processes inside, the usual process tree. Then we can stop it and destroy it. And all of this you can do in about two minutes. Compare that to the time that you need to set up a physical server. I
15:26
mean, whenever you need something, whenever you need a new server, you can fire up a new container in two minutes. So, a few interesting performance slides.
15:44
This slide is showing LAMP throughput. LAMP is Linux, Apache, MySQL and PHP, right? So this is the DVD Store test, a test done by Dell that emulates a big web store selling DVDs.
16:05
And what we did: we ran 20, 40 and 60 containers and VMs, and we compared how many requests per second they can do. The rightmost bars, the red ones, are the OpenVZ ones.
16:24
As you can see, it's more scalable. If you look at 20, then 40, then 60, all the hypervisors are going down at 60. That means they reach their density limit. There is no more
16:41
ability, hardware ability, to have that many VMs, and OpenVZ is still growing. Apart from the fact that it shows that containers are
17:00
faster, that graph also shows that they are denser. The next slide is the same test, but here we see the response time, so the lower the better. And the others are VMware ESXi,
17:23
Hyper-V and XenServer. Sorry guys, no KVM here. The next slide is vConsolidate. It's a benchmark by Intel that is used to measure consolidated workloads. It runs some heavy Java apps inside VMs, and
17:45
those VMs are grouped into so-called CSUs. So we have the cases for one CSU, five CSUs and ten CSUs. The two last items, orange and red, are OpenVZ with different kernels.
18:05
The orange one is the kernel based on RHEL 5, and the red one is the kernel based on RHEL 6, which is better in terms of SMP scalability: they fixed quite a few
18:21
locks and contention points. So again you can see that it performs better. And I would like to show a slide that shows that containers are worse, but I was not able to find such a test.
18:42
So that's enough for performance. If you want to see more of that stuff, and detailed descriptions of the test cases, it's available on our wiki. The page is called Performance.
19:00
Now I want to tell you about a few features we've been working on recently. The first feature is called vSwap, or virtual swap. So, when I was telling you about resource management
19:20
and those user beancounters, there are 20 parameters. The good thing is that you have a lot to configure, and the bad thing is that you have a lot to configure. Some of those 20 parameters are interdependent, and you can come up with an invalid configuration if you are not careful,
19:41
and it's a bit complicated to manage. So we thought it out, and our latest kernel comes with this feature called vSwap, which is the answer to the user beancounters configuration problem.
20:00
Now you only have two parameters, RAM and swap. So for each container you can say: one gigabyte of RAM, two gigabytes of swap. And the thing is that the swap is not real, it's virtual. What happens if the container is over its RAM limit?
20:21
The kernel moves some pages out to swap, but not to the real disk swap, to some virtual swap. Actually, it's a swap cache. And then it slows down the container artificially to emulate the effect of swapping. So if you are over your RAM limit, you will be penalized; you will feel that you are swapping.
20:46
But no actual disk I/O occurs. And then, if there is a real global RAM shortage, that virtual swap actually goes out to disk, to a real swap.
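The accounting side of this behaviour can be sketched as a small model. It is a deliberately simplified illustration, not the kernel's logic: the class `VSwapContainer`, its fields, the page-based units and the per-page delay are assumptions made for the example, refusing the allocation when both limits are exhausted is a simplification, and the global-shortage case where pages really go to disk is left out.

```python
from dataclasses import dataclass


@dataclass
class VSwapContainer:
    ram_limit: int           # pages of RAM the container may hold
    swap_limit: int          # pages of virtual swap it may use
    in_ram: int = 0
    in_vswap: int = 0
    penalty_ticks: int = 0   # artificial delay charged so far

    def allocate(self, pages: int, delay_per_page: int = 1) -> bool:
        """Account for `pages` new pages; return False if over both limits."""
        to_ram = min(pages, self.ram_limit - self.in_ram)
        to_swap = pages - to_ram
        if self.in_vswap + to_swap > self.swap_limit:
            return False  # simplification: the request is simply refused
        self.in_ram += to_ram
        self.in_vswap += to_swap
        # Pages that landed in virtual swap cost an artificial slowdown
        # instead of real disk I/O.
        self.penalty_ticks += to_swap * delay_per_page
        return True


if __name__ == "__main__":
    ct = VSwapContainer(ram_limit=4, swap_limit=2)
    print(ct.allocate(3), ct)
    print(ct.allocate(2), ct)
    print(ct.allocate(2), ct)
```

The point of the model is that the only tunables are the two limits, and going over the RAM limit shows up as a delay rather than as disk traffic.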
21:07
This feature is pretty complicated, and it took us like three years to put it into production, but now it's
21:22
working very well, it's in the stable kernel, and we recommend everyone to use it. The next feature: I actually did a presentation about this feature alone. I mean, I can talk for hours about it, but I
21:41
tried to cut it down to have it in just one slide. The feature is called ploop, or the other name for it is container in a file. Remember I was telling you that a container is a directory on the host system. So all the container's files are actually host system files, just in a subdirectory of it.
22:06
This is the way containers usually work. But what the VM guys do is they usually have one big file, and inside this file the whole VM lives, its file system lives. And we found out that in some scenarios
22:22
it is good to have everything in one big file. It's easier for backups, it's easier for migration, and there are some other scenarios. Like if you want to migrate a container into a VM, you know, transfer a container into a real virtual machine or back.
22:44
It's good that you have it in one big file. So for that we kind of re-implemented the Linux loop device. But we did it in a modular fashion, so it's, let's say, not a stupid little device
23:02
but an intelligent one that understands different file formats, like those expandable formats. Like, you have four gigabytes inside the container, but outside the file is only the size of the actual files, and when more space is needed, the file grows.
23:25
For that we need to keep track of the mapping between the inside blocks and the on-file blocks, so there is a module that supports the translation. And you can have different formats, like the one used by KVM and QEMU, and so on.
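The mapping between inside blocks and on-file blocks can be illustrated with a toy growable image. This is a sketch of the general technique under stated assumptions, not the ploop on-disk format: the class `ExpandableImage`, the tiny block size and the in-memory `bytearray` standing in for the image file are all hypothetical.

```python
class ExpandableImage:
    """Toy growable image: a virtual block is stored only once it is written."""

    def __init__(self, virtual_blocks: int, block_size: int = 4):
        self.virtual_blocks = virtual_blocks
        self.block_size = block_size
        self.block_map = {}          # inside block number -> on-file block index
        self.backing = bytearray()   # stands in for the image file on the host

    def _check(self, vblock: int) -> None:
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("block outside the virtual device")

    def write(self, vblock: int, data: bytes) -> None:
        self._check(vblock)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        if vblock not in self.block_map:
            # First write to this block: grow the backing file by one block.
            self.block_map[vblock] = len(self.backing) // self.block_size
            self.backing.extend(bytes(self.block_size))
        start = self.block_map[vblock] * self.block_size
        self.backing[start:start + self.block_size] = data

    def read(self, vblock: int) -> bytes:
        self._check(vblock)
        if vblock not in self.block_map:
            return bytes(self.block_size)   # never written: reads as zeros
        start = self.block_map[vblock] * self.block_size
        return bytes(self.backing[start:start + self.block_size])


if __name__ == "__main__":
    img = ExpandableImage(virtual_blocks=1000)
    img.write(900, b"abcd")
    print(len(img.backing), img.block_map, img.read(900))
```

The device looks 1000 blocks large from the inside, while the backing store only holds the blocks that were actually written, which is the "file grows when more space is needed" behaviour.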
23:45
Then on the lower level there is an I/O module, which can talk to a file system, which can talk to the VFS layer, meaning any file system, or which can talk directly to the NFS server, to cut corners here and there and
24:02
make it, you know, perform better. And one interesting feature here is that you can have a few images layered on top of each other. So I was talking about just one image file,
24:21
but you can add another one on top of that, make the lower level read-only and the upper one read-write, and this way you can create an instant snapshot of your container. So the lower level is read-only, and this is the state of the container file system at the moment
24:43
that you took the snapshot. And this feature is not upstream yet, but we hope we'll be able to merge it into the Linux kernel. And finally, yet another slide
25:01
that I'd rather have a full presentation on: it's our newest project, called CRIU. When I was talking about checkpoint/restore, when you checkpoint a container and restore it on a different box, this is all done in the kernel, and it's pretty complicated, advanced stuff. It is rocket science, and
25:27
it has to know everything about processes in the kernel, their resources and the way they interact, and therefore it's a lot of complicated kernel code.
25:41
And we tried to merge it upstream a few times, and every time we tried we failed. Then we decided: well, let's keep it off the mainline. And then there was another guy, called Oren Laadan, who spent like three years of his precious time
26:01
trying to merge the checkpoint/restore functionality into the mainline, and he failed too. So we decided that the only solution to putting it into the kernel is not putting it into the kernel. Now we're trying to do it from user space:
26:20
all the complex logic is in user space. We still need something from the kernel, but we can cope with some minimal changes, like exporting a few pieces of information that we need through proc files, or letting us do something that user space is usually
26:43
not interested in doing. Therefore it's a huge user space project with some minimal kernel interaction. And I'm very happy to say that Andrew and Linus, it was
27:02
about a month ago, they accepted the first pile of kernel patches from us, with the following comment: this is a project by various mad Russians to perform C/R mainly from user space. And Andrew is actually pretty positive about this thing, as you can see in this comment,
27:26
because, you know, there were also unsuccessful attempts to merge it. And this project, if it ends up, you know, in a working state, will do the same thing, and a lot of people want checkpoint/restore.
27:45
And this is the only way to have it in the kernel, I guess. So that's pretty much it. What we have is...
28:02
Well, actually I forgot to tell you about this: unlike hypervisors, which are very platform-dependent, as they have to know everything about the processor architecture and so on, containers are a platform-agnostic solution. So OpenVZ works on Intel,
28:25
ARM, MIPS, SPARC. I'm not a kernel developer myself, but when I got an ARM board as a present, I just decided: why don't I port OpenVZ to that thing? And I did it in like three days.
28:40
So it's really platform-independent, and there is no problem supporting other architectures. And there are none of the problems with scalability or disk I/O that hypervisors suffer from. That means that if you have a huge server with, I don't know,
29:05
a thousand CPUs and a terabyte of memory, you can assign it all to one container, and your container can use all of those resources. There are no limits like 4 gigabytes of RAM per container or no more than four CPUs.
29:27
And you have the best possible performance, pretty much the same performance as a usual Linux system gives you. Finally, since it's done on a different level, it plays well with others. What I mean here is you can use
29:44
Xen and OpenVZ on the same box, or KVM and OpenVZ on the same box, so you can have lightweight containers and you can have full-scale VMs: the best of both worlds.
30:02
That's it for today. Any questions? There's a microphone here, so it's better if you use it. Unfortunately, it's not wireless.
30:48
interested in checkpoint and restore in user space, and I was wondering whether it would be possible in the future to integrate it with LXC, the Linux containers in the kernel? Yeah, I mean, when checkpointing is in the mainline,
31:08
it will be used by any software, so LXC will get it automatically, the same way LXC got the process ID namespace and the network namespace from us. So it means in the future LXC will be able to do live migration.
31:23
Yeah, I mean, currently... Speaking about LXC, I actually have a backup slide here. I was expecting this question.
31:43
What I really want to say is it's not like OpenVZ versus LXC. What I want to explain is that OpenVZ is a project that historically we had off the mainline. We have been working on it for more than 10 years already, and it was off the mainline,
32:05
but seven years ago we realized it's a bad thing to do and we need to merge upstream. We started that effort seven years ago, and we're still working on it, merging bits and pieces of container functionality into the mainline Linux kernel. And
32:23
the last time I looked, there were about one thousand five hundred patches from us in the mainline kernel. This is mostly the namespaces and various cgroup controllers. And we are still working on that, and ploop and vSwap and
32:41
CRIU are all going to be in the mainline. We are very actively working on that. And when we merge something into mainline, we use it from the mainline and we drop our own code. And this code is also used by LXC, right?
33:01
And so it's not one versus another. It's just like an ecosystem, like a symbiosis, I don't know. But currently the difference is that OpenVZ is very stable. It's in production and has lots of users.
33:21
And LXC is more like a work in progress, and currently it's not a ready replacement for OpenVZ. But we hope that in the future we'll have everything in the mainline, so OpenVZ will probably become a set of user-space tools.
33:44
But, you know, the stuff that we have merged so far took us seven years. I'm not sure how many more years we need to push the rest of it upstream. So that's the thing about LXC. Any questions? Any other questions?
34:05
We have a lot of time.
Hello. I'm trying to execute a graphical program within a
34:22
container. How can I do it?
A graphical program, for example?
I want to sandbox, let's say, Firefox, for example, or some graphical program.
Yeah, right. Well, currently the X server requires direct access to the hardware,
34:44
so you cannot run it inside the container unless you make the container really insecure and give it access to /dev/mem and other nasty stuff. But there is a way: you run a so-called Xvnc server, which is a virtual X server.
35:07
You run it inside the container and you connect to that server using a VNC client. It's like a remote desktop. And that way you can run Firefox
35:21
inside a container, and all sorts of GUI apps. We have a page about that, and it's at wiki.openvz.org/X_inside_CT (CT is a container). So it's all explained there. I just tried it
35:43
two weeks ago, and it works. I mean, the last time I tried it was two weeks ago.
What are the contributions made by IBM for LXC and
36:02
Parallels for OpenVZ? Is there true support for this project?
IBM? Well, I had a slide on that, but it's not here. What was IBM actually doing? Okay, let me tell you a big story.
36:22
About 10 years ago they acquired a French company, Meiosys. They had a project called MetaCluster, which was actually sort of containers. They had containers in order to run Oracle and
36:42
in order to migrate Oracle, so they had containers for checkpoint/restore capability. Then IBM acquired that company. The end result is that they have containers in AIX, and they also tried to work on the checkpoint/restore front.
37:02
But as I said before, all the attempts to merge anything failed. So the contributions from IBM, if I remember right, are in the resource management stuff. Well, they also had a project
37:21
similar to our user beancounters, or similar to cgroups and resource controllers. It was called CKRM, and it failed to merge into the mainline, and that project is now dead. So historically they did a lot of that, but not all of that stuff was
37:42
accepted into the mainline. What was accepted, if I remember correctly, is some stuff with resource management, cgroup controllers. We were working together with the IBM guys on that. And also,
38:00
the guy who maintains the LXC user-space tools is a Red Hat employee.
Hello. Yeah, coming back to LXC versus OpenVZ:
38:22
Am I right in thinking that LXC is less stable and a less good option than OpenVZ at the moment, or is it more feature-wise?
38:45
Well, LXC is kind of a re-implementation of OpenVZ in the mainline, so it is lagging behind. In terms of features, yes, and in terms of stability, yes, also.
Okay, yeah.
39:04
So, I mean, I don't want to say bad things about LXC, but it's a very young project, and it's not as stable, you know, as OpenVZ, because we have 12 years behind us, and that shows
39:22
in stability and features too. Anyone else?
There are two approaches in terms of
39:43
networking virtualization: bridging, and a routed venet. And I've read a lot of things about the difficulty of getting a secure environment through virtual bridges.
40:00
Will there be more documentation pages on the wiki in order to have a better approach, mainly for IPv6, to manage bridge issues in terms of security? Because
40:21
plenty of people want to use bridges to gain performance in terms of latency, to do telephony through containers. And do you know if there is work under way to have an advanced level of support for IPv6
40:44
through bridges?
I mean, IPv6 works fine with both venet and veth. But you are right that it's not as well documented as it should be. I take this as a bug report.
Okay. Okay, thank you.
Are there plans to support SELinux?
SELinux? Well,
41:38
once we supported AppArmor, which is
41:42
Novell's implementation of SELinux, but it was easier because its architecture is easier. What we think about SELinux is that it's a security model, and containers are another security model, and
42:04
whenever you need that privilege separation, you just have a separate container and run anything in a separate container. So this is how we approach it. But, I mean, if there is a serious push towards adding SELinux support,
42:23
we'll have to do it, but for now we haven't really seen great demand for that. I mean, because, yeah, SELinux is making walls between applications, and we are making walls between applications too.
42:40
So you can use our walls if you need that.
Right, all right. Thank you very much.
Yeah. Could you comment, security-wise, on the separation between the containers, like, supposing in a shared hosting environment or something?
Yeah, right.
43:04
The thing is, OpenVZ and its commercial counterpart, the product based on OpenVZ, is used by, I don't know, 80 to 90 percent of all the hosting service providers, and they sell those cheap VPSs for like 10 euros a month, and
43:25
you have root access. So if it were insecure, all these guys, all the huge guys like 1&1 and Host Europe, would go out of business very soon. And this is a practical,
43:42
you know, answer to your question: it is secure because people are using it. I mean, actually, this is one answer, right? The other answer is that we did a security audit of our code by a very good security expert, Solar Designer, and
44:02
we hired him to do that security audit, and he found three bugs in the mainline code (not our bugs, security bugs) and one minor bug in our code, which we fixed, and we fixed all those three too. So theoretically it is
44:24
secure too. Practically it is secure, and the security is mostly coming from, you know, constant care. It's more like a process. Security is not something that you can take for granted.
44:42
You have to think about it; you have to be aware of all the possible concerns and so on. And since most of our customers are in the hosting service provider business, we take security very, very seriously.
I
45:05
see. And would there be a difference with LXC then?
Sorry?
Would there be a difference between LXC and OpenVZ security?
I really can't comment on LXC security at the moment. I mean, you know,
45:22
aside from development of OpenVZ, we do a lot of in-house testing. We have a huge farm, we have a big team of QA engineers, and we do a lot of automated testing for all of that, and
45:41
I'm not sure if someone is doing that for LXC. So it needs to be tested a lot before, you know, before you can say that it is stable and secure.
46:17
Hello, just one question. If I buy some cheap $10 VPS,
46:22
which might be run by OpenVZ, can I run OpenVZ in it? Can I run OpenVZ on top of OpenVZ?
So, you can do it with hypervisors, because they create virtual hardware, right? But we don't do any
46:42
emulation, and it's only one kernel. Okay, and you cannot run OpenVZ inside OpenVZ. So this is the short answer. Okay, and the long answer is: theoretically you can have nested namespaces, a namespace inside a namespace inside a namespace.
47:02
We played with it a bit, so you can create, you know, sub-containers within your container. We played with it, and we found out that it brings bad performance penalties. So we gave up this idea, and we think that a flat hierarchy of containers is better in terms of performance.
47:26
But some of the namespaces, for example the process ID namespace, are actually nested, so you can have process groups inside your container which are not visible to each other. I mean, that could be done, but we think it's not practical to do that.
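The nesting the speaker describes for process ID namespaces can be illustrated with a small toy model. This is only a sketch written for this transcript, not kernel or OpenVZ code: the `PidNamespace` class and the names `host`, `container`, `group_a`, and `group_b` are hypothetical. It assumes the usual Linux behavior that a process gets a PID in its own namespace and in every ancestor namespace, so a parent sees its children's processes while sibling namespaces do not see each other.

```python
from itertools import count


class PidNamespace:
    """Toy model of a nested PID namespace (illustration only, not a kernel API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._next_pid = count(1)  # each namespace numbers its processes from 1
        self.pids = {}             # process name -> PID as seen from this namespace

    def _self_and_ancestors(self):
        ns = self
        while ns is not None:
            yield ns
            ns = ns.parent

    def spawn(self, proc_name):
        """Create a process here; it gets a PID in this namespace and in every ancestor."""
        for ns in self._self_and_ancestors():
            ns.pids[proc_name] = next(ns._next_pid)

    def visible(self):
        """Names of the processes that can be seen from inside this namespace."""
        return sorted(self.pids)


# Hypothetical layout: a host, one container, and two nested process groups.
host = PidNamespace("host")
container = PidNamespace("container", parent=host)
group_a = PidNamespace("group_a", parent=container)
group_b = PidNamespace("group_b", parent=container)

container.spawn("init")
group_a.spawn("worker_a")
group_b.spawn("worker_b")

print(container.visible())  # the container sees its own and both groups' processes
print(group_a.visible())    # group_a sees only its own process
print(group_b.visible())    # group_b sees only its own process
```

In this model the two groups are isolated from each other while the enclosing container still sees both, which is the property mentioned in the answer. It says nothing about the performance cost of nesting that the speaker cites as the reason for preferring a flat hierarchy.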
47:50
Hello. And what about support for systemd, or how would you estimate the plan for implementation?
Actually, we have it running with a few patches.
48:04
It needs some support from cgroups, so we are getting there, actually. And I can also tell you that Fedora 16 with systemd, patched by us (there are a few patches), runs fine.
48:23
So it's already done. We just need to send those patches upstream.
Okay, thank you.
Thank you very much.