New Developments and Advanced Features in the Libvirt Management API
Formal Metadata
Title: New Developments and Advanced Features in the Libvirt Management API
Author: Daniel Berrangé
Number of Parts: 199
License: CC Attribution 2.0 Belgium. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/32583 (DOI)
Transcript: English (auto-generated)
00:06
So, my name is Daniel Berrangé. I've been working on the libvirt project for about seven or eight years now. The time just flies when you're having fun. So I'm here to talk to you a little bit about the libvirt project.
00:25
I'm going to talk about a number of features that we've developed over the last year or so that are probably less well known, but interesting and useful to application developers building virtualization applications.
00:43
So I'm going to assume a little bit of knowledge of libvirt and virtualization, but for those of you who've never heard of libvirt, it's at its core a stable C library API with a number of language bindings to languages like Perl, Python, Java, PHP, OCaml,
01:10
most of the ones that you do care about and some that you don't. We try to be a pretty simple-to-use API.
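As a taster (an illustration, not something from the talk), here is a minimal sketch using the C API that connects to the local QEMU/KVM driver and prints the names of the running guests:

    /* Minimal libvirt C API sketch: list running guests on the local
     * QEMU/KVM driver. Build with: cc list.c -lvirt */
    #include <stdio.h>
    #include <stdlib.h>
    #include <libvirt/libvirt.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        if (!conn)
            return 1;

        virDomainPtr *domains = NULL;
        int n = virConnectListAllDomains(conn, &domains,
                                         VIR_CONNECT_LIST_DOMAINS_ACTIVE);
        for (int i = 0; i < n; i++) {
            printf("%s\n", virDomainGetName(domains[i]));
            virDomainFree(domains[i]);
        }
        free(domains);
        virConnectClose(conn);
        return n < 0 ? 1 : 0;
    }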
01:24
We obviously, one of our big selling points is that we are a stable API. And that means in the entire eight years that libvirt has been going, we've never broken the API in an incompatible manner. So if you write an application today, the goal is you should be able to run that same application
01:42
against libvirt in ten years' time without problems. It's a cross-platform API and it's a cross-hypervisor API. We support most of the hypervisors you can name, so KVM, QEMU, Xen,
02:02
both the open source and commercial versions of Xen to some extent. Hyper-V, VMware ESX, VMware Desktop, all the other VMware variants that use the same API as those two. The IBM Power hypervisor, LXC containers, Parallels.
02:25
There's probably more that I'm forgetting there, but you get the message. We're cross-hypervisor portable and we're LGPL licensed. As for the libvirt architecture, there are basically two modes in which libvirt works.
02:43
There's what we call the stateless architecture. And this is where you're just using the libvirt library. And it's talking to some other external system that maintains the virtualization state. So this architecture is used most notably for the VMware ESX driver
03:04
and the Microsoft Hyper-V driver because in both cases you've got an external management server that is maintaining all the information about the virtualization hosts. So we just talk to that and let it maintain all the state for us.
03:21
The other type of architecture is what we call the stateful architecture. And this is what we use when there is no other component in the stack that's maintaining state. So we use this for QEMU, KVM, LXC, and the open-source Xen integration.
03:42
And in this case the libvirt library is talking to the libvirt daemon and the libvirt daemon maintains state about the virtualization host. So you can see in this example the application talks to the libvirt library.
04:00
The libvirt library uses our generic RPC mechanism to talk to the libvirt daemon. And the libvirt daemon then talks to the QEMU processes. In the case of QEMU it talks to QEMU via the QEMU monitor interface.
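Which of the two architectures you get is determined by the connection URI the application passes to virConnectOpen(); a few representative examples (host names are placeholders):

    qemu:///system              # stateful: local libvirtd manages QEMU/KVM
    lxc:///                     # stateful: local libvirtd manages containers
    esx://host.example.com/     # stateless: state lives in the ESX server
    hyperv://host.example.com/  # stateless: state lives in the Hyper-V host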
04:20
So that's a very high level view of the architecture of libvirt in general. And now I want to get on to talking a bit about some of the interesting features that you may or may not be aware of depending on how familiar you are with libvirt. When you are running virtual machines,
04:43
the vast majority of the time you have some storage attached to those virtual machines. And unless you're running a cluster file system inside your virtual machine, you don't want to have two VMs using the same disk at the same time. Because if you have an ext3 file system inside your guest
05:01
and two guests write to that at the same time, you haven't got any data left at the end of that. So that's a dangerous scenario. The other dangerous scenario involving disks is you're a single VM and you're doing say live migration from one host to another. You want to make sure that one virtual machine
05:21
doesn't end up running on both hosts at the same time. Because again bad stuff is going to happen to your data. So libvirt has a notion of access methods associated with each disk. A disk can either be set up so it's read only, in which case it's safe to share amongst as many guests as you like.
05:43
The disk can be set up as shared writable, in which case again it can be attached to multiple virtual machines. But if you're using a disk in shared writable mode, you're going to be using a cluster file system or some other file system that's aware of the fact that you can have multiple writers at the same time.
06:05
Or the default method of configuring disks is read write exclusive. In this case only one VM can access any one disk image at a time. So those are the access modes. Now the dirty little secret that you may or may not be aware of
06:23
is that libvirt never really enforced this very well. You could set up your disk as read-write exclusive and libvirt was never going to stop you running two guests using that same disk. So we had these access modes but they weren't really doing anything. They were just there for show.
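For reference, the three access modes map onto the guest XML roughly like this (paths and device names are illustrative):

    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/reference.img'/>
      <target dev='vdb'/>
      <readonly/>                <!-- safe to attach to many guests -->
    </disk>

    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/cluster.img'/>
      <target dev='vdc'/>
      <shareable/>               <!-- shared writable: cluster FS required -->
    </disk>

    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/root.img'/>
      <target dev='vda'/>        <!-- neither element: read-write exclusive -->
    </disk>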
06:42
So in the past, well, two years ago really now, we introduced a new bit of infrastructure in libvirt for disk lease or lock management. And this is a way of actually enforcing the disk access modes. The first implementation we did of this was using a technology called sanlock
07:03
which was developed by the oVirt project. Sanlock uses a disk Paxos algorithm for maintaining active leases on virtual disks.
07:23
The actual sanlock locking mechanism uses storage on the side. So it's not actually locking your disk images directly. You've got a quantity of storage that's set aside for holding leases. And it's up to the management application how they associate a lease with a disk image.
07:44
But anyway, with the sanlock project, although you can use it with storage on NFS, the sanlock maintainers really don't like you doing that. They really want you to use SAN storage for maintaining leases.
08:00
And the way we've integrated it into libvirt, there's two ways it can work: what we call the manual approach and the automatic approach. Now in the manual approach, the management application, say oVirt, is responsible for saying this lease is associated with this disk image. And libvirt will just trust it when it gets told that information.
08:24
So when the guest starts up, libvirt will acquire all of the leases that are associated with that VM. And only if it manages to acquire all leases will VM startup actually succeed.
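A sketch of what such a manual lease looks like in the guest XML, under the devices element (the lockspace and key names are whatever the management application chooses):

    <lease>
      <lockspace>pool-A</lockspace>
      <key>disk1</key>
      <target path='/var/lib/libvirt/sanlock/pool-A' offset='0'/>
    </lease>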
08:41
In the automatic mode, you don't have to do any special configuration. In automatic mode, libvirt will automatically create one lease for each virtual disk you've got associated with your guest. There are pluses and minuses to using automatic mode versus manual mode.
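Turning sanlock on in automatic mode is host-level configuration along these lines (file names from the stock packaging; treat the details as illustrative and check your installed files):

    # /etc/libvirt/qemu.conf
    lock_manager = "sanlock"

    # /etc/libvirt/qemu-sanlock.conf
    auto_disk_leases = 1                          # one lease per virtual disk
    disk_lease_dir = "/var/lib/libvirt/sanlock"   # where leases are stored
    host_id = 1                                   # must be unique per host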
09:04
If you're an application like oVirt or OpenStack, then manual mode is probably what you're going to be using because it gives you much greater control over exactly how your leases are stored and maintained. But if you're just doing a lightweight virtualization management application
09:22
and you don't want to have to worry about this too much, then the automatic mode will do what you need most of the time. One last thing about sanlock: it's an active lease mechanism, so the leases are being continually refreshed.
09:42
So if there are any I/O problems refreshing the lease, then that is detected immediately and the virtual machine is immediately fenced, i.e. the QEMU process is killed. So that gives good response time to storage failures
10:01
or locking problems, whatever they may be. Now, one of the limitations of the sanlock approach I mentioned a few minutes ago is that the sanlock developers only like you using it with SAN storage.
10:21
So if your storage is all NFS-based or some other shared file system like Gluster or Ceph or whatever you might be using, sanlock probably isn't the solution you want. So we developed a second locking plug-in for libvirt called virtlockd.
10:41
And this is intended to be the default locking mechanism for libvirt when you deploy it on a host in the absence of any other configuration. And this just takes locks using the POSIX fcntl() locking mechanism. So it obviously requires that this POSIX feature is supported by your file system.
11:02
The majority of file systems support this, but you get the odd case where it's either not supported or the file system developers don't like you using it. I think Oracle OCFS2 was the last file system I heard of where they don't really like you using the fcntl() locks. I can't remember the exact reason why,
11:22
but in the majority of cases, this is going to be workable for your shared file system. At this point in time, it only works in automatic mode. This is where libvirt automatically determines what the locks are.
11:40
And the way we do that is based on the file path of the virtual disk backing store. We will take a SHA-256 hash of the file path, and that's the default mechanism. But you can also tell it that if you're using
12:01
LVM storage as your virtual disk, you can tell it to do locks based on the LVM UUID. Or if you're using Fibre Channel or some other SCSI storage mechanism, you can tell it to do locks based on the SCSI unique ID of the LUN.
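Configuration is again host-level; a sketch (directive names as I recall them from qemu-lockd.conf, so double-check against your installed file):

    # /etc/libvirt/qemu.conf
    lock_manager = "lockd"

    # /etc/libvirt/qemu-lockd.conf
    # Default: locks keyed on a SHA-256 hash of the image path.
    file_lockspace_dir = "/var/lib/libvirt/lockd/files"
    # Optional: key locks on the LVM volume UUID instead.
    lvm_lockspace_dir = "/var/lib/libvirt/lockd/lvmvolumes"
    # Optional: key locks on the SCSI unique ID of the LUN.
    scsi_lockspace_dir = "/var/lib/libvirt/lockd/scsivolumes"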
12:22
And that's slightly better than doing it based on the file path, because if your storage appears as a different file path on different hosts, the latter two mechanisms are stable across hosts, so they're slightly safer. So just looking at the architecture, how does that change
12:41
when you're adding the virtlockd daemon? The answer is: not really very much. The QEMU driver inside libvirt just talks to the virtlockd daemon using a simple RPC mechanism. So whenever you start a guest, the first thing it does is it talks to virtlockd and says acquire locks for all of these disk images.
13:03
And only if that succeeds will the QEMU process then actually be started. And these locks are also released and reacquired whenever you pause the virtual machine, which is key to making migration work.
13:24
So that's enough about disk locking. The next thing I want to talk about is access control. Historically, libvirt had a very simple access control mechanism. If you're talking to libvirt over a Unix domain socket,
13:42
you can either talk to it over what we call the read-only socket or the read-write socket. And that basically does exactly what it sounds like it does. If you talk over the read-only socket, you can get information about your virtual machines and your host, but you can't make any changes. And if you talk to the read-write socket,
14:01
you can do whatever you like with no restrictions whatsoever. This is fine for many applications. OpenStack, oVirt, those applications basically want to be able to do anything at any time, so that's not a problem. Other applications,
14:21
say virt-top, which is a monitoring application, only ever want read-only access to query I/O stats and the like. But every now and then, people crop up on the mailing list saying, well, we want to be able to do finer-grained access control. So I want to be able to say,
14:41
user Frank can access the virtual machine blah, and he can do X, Y, and Z operations on it. So we developed an access control mechanism in libvirt, which allows you to express rules like that. And this access control mechanism operates across all of the drivers that live inside the libvirt daemon.
15:03
So that's KVM, QEMU, LXC, user mode Linux, if anyone really uses that still. The access control mechanism doesn't affect the stateless drivers like VMware or Hyper-V,
15:20
because that would be really pretty pointless to do. I mean, that would involve doing access control in the libvirt client, and you could get around that access control just by talking directly to the VMware server. So we don't even attempt to do access control for those. We only do access control where libvirt is the exclusive mode of access to the functionality.
15:44
The access control mechanism was done in a pluggable manner, because we anticipate that over time, people will want to integrate with different access control mechanisms. But we explicitly don't want to allow closed source out-of-tree plugins.
16:03
All of the access control plugins in libvirt, we want them to be open source and maintained as a normal part of the libvirt code development process. So although we have a pluggable framework for this, it's not a free-for-all for anyone to do whatever they like.
16:20
If you have other requirements for access control mechanisms, come to the libvirt mailing lists and propose them, and we can work them into the core libvirt release. So the first, and currently only, access control mechanism we have is based on polkit.
16:45
Every libvirt API has one or more permissions associated with it. When you go to the API documentation, it'll tell you exactly what permissions are required for which API. And then we map those permissions into polkit actions.
17:04
So if you want the start permission on the domain object, that gets mapped into a polkit action called org.libvirt.api.domain.start. And there's a whole bunch of these permissions,
17:21
which you'll again find in the online API documentation for libvirt, so you can figure out what the mapping is for any API. Now, that's only part of the information you need to know. You've also got to identify the object you're managing, so the virtual machine, for example.
17:41
There are three unique identifiers for a virtual machine. There's an integer ID value, there's a human-friendly name, and there's a globally unique identifier, the UUID. So there's a variety of different ways to identify the objects that you're wanting to control.
18:06
And finally, you've got to identify the user that you're trying to restrict access to. And currently, due to limitations of polkit, we can only identify local Unix users. So this mechanism is only useful if the thing you're trying to control
18:22
is running on the same host as libvirt, talking to it over the Unix domain socket. So we need to know the local Unix user that's calling the APIs. But once you have all that information, the permission, the object, and the user,
18:40
then you can go about defining some rules for managing it. Polkit has a JavaScript backend, so your actual access control rules are written in JavaScript, and there are a number of objects provided to you.
19:00
So there's an action object, which tells you basically what API is being invoked. There's a subject object, which tells you the user who's invoking it. And then the action object has a number of properties to identify the object. So in this short example, we're looking at an API call requiring the getattr permission.
19:26
And the user is berrange, myself. And we're looking at a guest running on the LXC hypervisor with a name of demo. And if all of those things match, then we allow access. If they don't match, then we deny access.
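Reconstructing that example as a polkit rules file (e.g. dropped into /etc/polkit-1/rules.d/), it would look roughly like this:

    polkit.addRule(function(action, subject) {
        // Only consider libvirt domain API calls made by user "berrange".
        if (action.id.indexOf("org.libvirt.api.domain.") != 0 ||
            subject.user != "berrange")
            return polkit.Result.NOT_HANDLED;
        // Allow the getattr permission on the LXC guest named "demo".
        if (action.id == "org.libvirt.api.domain.getattr" &&
            action.lookup("connect_driver") == "LXC" &&
            action.lookup("domain_name") == "demo")
            return polkit.Result.YES;
        return polkit.Result.NO;
    });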
19:43
This is obviously a bit of a trivial example that is not really the way you'd do it in the real world, because if you had to define rules for every single individual permission, for every individual object, you'd get a JavaScript file tens or hundreds or thousands of lines long.
20:03
So if you're doing this in the real world, you'd probably set up roles: define a set of users which are all in the same role and a set of objects which you want to manage in the same way, and then write your rules so that you're mapping roles to groups of objects. That would dramatically compress the number of rules you have to write.
20:27
We don't provide anything to particularly help you do that at this time, because this is a fairly new functionality. We're really looking for people to try it out and give us feedback on what works and what doesn't work,
20:41
and what extra things it would be helpful for libvirt to provide in this area. The other reason why we chose to use polkit as the first engine for access control
21:02
is because we have the idea that if you can just write JavaScript backends, well, that means you can write a bit of JavaScript to integrate with LDAP. So if you want to define all of your rules in an LDAP database and then just query them, you can just write a bit of JavaScript glue code to connect from polkit to your LDAP rules database,
21:21
or whatever other databases of access control rules you might have. So again, we're looking for feedback on whether this actually works out in practice, or whether we need to write a dedicated LDAP authentication backend as an alternative. So we need feedback on this area.
21:42
That's enough about access control for now. And now I want to talk a little bit about sVirt. sVirt is the generic term for our virtualization security layer.
22:00
This started out with an implementation for SELinux. And the idea here was that you're running lots of virtual machines. Each virtual machine is a QEMU process. QEMU, well, it's attempting to be secure. And if they've got their code exactly perfect, then it might be secure.
22:22
But history has shown us that the QEMU code isn't actually perfect. This may have come as a surprise to some people, it may not. The idea with SELinux is that we have an extra line of defense. In the event that there is some flaw in QEMU that allows the guest operating system to break out into the host,
22:42
SELinux will be used to confine that breakout within the QEMU process. So they can compromise QEMU, but they can't then go on to compromise the entire host. And your QEMU processes are also running as the same user ID by default.
23:01
So if you are able to compromise one QEMU, you can then easily compromise all the other ones. So SELinux also protects against that: one guest can't compromise another guest. So this has been around for quite a while.
23:20
But in the past year or two, we've made this a bit more flexible. So we've given more choice over the SELinux domain that can be used. So this actually now works for both KVM and QEMU emulation mode. And we've also added the ability to define custom overrides for the labeling.
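In the guest XML those overrides look roughly like this (the type name and path are illustrative; the per-disk form is explained next):

    <!-- Custom base label for the whole guest process -->
    <seclabel type='dynamic' model='selinux'>
      <baselabel>system_u:system_r:svirt_custom_t:s0</baselabel>
    </seclabel>

    <!-- Per-disk override: leave this image's label untouched -->
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/special.img'>
        <seclabel model='selinux' relabel='no'/>
      </source>
      <target dev='vdb'/>
    </disk>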
23:42
So if the standard SELinux policy doesn't work for you, you can write a custom SELinux policy and tell libvirt to use that one instead. We've also made it possible to override the labeling on individual disk images. So if you have some disk images you want to have labeled one way,
24:01
and other disks you want to have labeled a different way, you can now set up those kinds of rules. Further developing the sVirt framework, we've now introduced a proper discretionary access control mechanism.
24:20
So a few minutes ago I said every QEMU process runs as the same user ID. Well, now it's possible to give them all their own unique user ID. So you can rely on traditional Unix permissioning to separate your QEMU processes securely.
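A sketch of what that looks like in the guest XML (the uid:gid values are arbitrary; the '+' prefix forces numeric IDs):

    <seclabel type='static' model='dac' relabel='yes'>
      <label>+701:+701</label>  <!-- user:group this QEMU process runs as -->
    </seclabel>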
24:42
Currently you have to assign those user IDs per guest statically, but libvirt will take care of dynamically setting the ownership of the disk images to match whatever user ID your guest runs under. Slightly related, but also not entirely related, is audit logging.
25:10
And if you're wanting to keep track of who's doing what on your virtualization host, you want to know what operations have happened.
25:22
So the audit log provides a way to find this out. So whenever libvert starts or stops a virtual machine, it will generate an audit record for that operation saying when it started, what SELinux domain it's running under, the UUID of the guest, and a few other pieces of information.
25:42
It will also tell you how many virtual CPUs that guest has, how much memory was assigned, all of the disk images that were assigned to that guest. So you can look back in your audit log and say, well, which guest was accessing this disk image at what time? You can find out what networks it was connected to when it started
26:02
or when hot plug operations were done. Or you can find out what cgroups access control settings were done for block storage. So there's quite a lot of audit information recorded about a virtual machine anytime any change is made to it. So if you have an exploit, you can go back in the audit logs and find out
26:21
what that guest was allowed to do, and that may help you diagnose the problem. There's also general debugging, debug logging. Historically, we just send this all to syslog, but now systemd is available in many distributions.
26:44
We've integrated with the systemd journal. So we send all of our log information to the journal by default, if it's available, in a structured format. That makes it a lot easier to extract information from the logs programmatically and match on anything right down to individual source file line numbers.
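For example, a couple of hypothetical journalctl queries (the structured field names here are my assumption of what libvirt records; verify with journalctl -o verbose):

    # Everything logged by the libvirtd service
    journalctl -u libvirtd
    # Filter on structured fields, e.g. one source file
    journalctl SYSLOG_IDENTIFIER=libvirtd CODE_FILE=src/qemu/qemu_driver.c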
27:08
The last thing I've probably got time to talk about is our cgroups integration. Now, libvirt has integrated with cgroups for quite a long time,
27:21
but the way we did that integration was not really too useful, it turns out. For a start, the way we laid out cgroups in a very deep hierarchy caused a lot of pathological kernel performance problems, to the extent that it was completely unusable if you had large SMP guests or lots of guests running.
27:44
Now the kernel guys have thankfully fixed most of the kernel performance problems, but at the same time we simplified the way libvirt uses cgroups to avoid tickling those kernel problems in the first place. The top example there shows the original way we did it.
28:02
We always had three levels deep of cgroups. In the new way, if you're not using a systemd host, we've got one naming convention. If you are using a systemd host, then we rely on systemd to create the cgroups for us. So we've got the systemd naming convention.
28:23
The key takeaway is that at the very top level, you've got an arbitrary group, and at the next level you've got your virtual machines. So you can now easily set up arbitrary groups of virtual machines and apply resource controls for whole groups of VMs at a time.
28:43
And the way you do this in the XML configuration for your guests, you tell it a resource partition, and on a non-systemd host, we just map this into a cgroup directory using a fairly straightforward naming convention.
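In the guest XML, you request a partition like this (the partition name itself is arbitrary; this example follows the libvirt documentation):

    <resource>
      <partition>/virtualmachines/production</partition>
    </resource>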
29:05
On a systemd-enabled host, we map it in using the systemd naming conventions. So the VM groups have .slice appended to their name, because that's the systemd name for a generic resource grouping.
29:23
And I've got a typo there. The virtual machines have .scope appended onto the end of their name. Actually, no, sorry, that's not a typo. Sorry, that one's showing two levels deep of grouping.
29:41
So you can have multiple levels of grouping virtual machines. And once you've set up your cgroups, there's a whole bunch of performance tunables that become available to you. You can set up relative CPU weighting, which is the CPU shares tunable,
30:01
or you can set up absolute time slices, and those are done by setting a quota and a period. Those are both in microseconds, if I remember rightly. So related to tuning the CPUs, you can set up named CPU models.
30:21
If you don't set up a CPU model, you're going to get generic defaults that KVM thinks is applicable, and you're not going to be making best use of your Intel or AMD CPU features. So you really want to set up named CPU models, which as closely match your physical CPUs as possible.
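A sketch combining both CPU tunables with a named CPU model (the values and model choice are illustrative):

    <cputune>
      <shares>1024</shares>    <!-- relative weight against other guests -->
      <period>100000</period>  <!-- enforcement window, microseconds -->
      <quota>50000</quota>     <!-- max CPU time per period, microseconds -->
    </cputune>
    <cpu mode='custom' match='exact'>
      <model>SandyBridge</model>  <!-- pick the closest match to the host -->
    </cpu>

Alternatively, <cpu mode='host-model'/> asks libvirt to pick the closest named model to the host automatically.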
30:40
You want to squeeze every last ounce of CPU performance. Tuning memory. This is another very important thing if you want to maximize the utilization of your hardware, particularly if you've got a NUMA machine. If you're not doing NUMA placement,
31:00
then you're throwing away all the benefits of your NUMA machine. So you can control this manually by telling libvirt what memory nodes you want the VM to run on, or you can tell libvirt to do it automatically. And in the automatic case, libvirt will talk to something called numad. And this is just a very simple daemon that runs on a host and says,
31:24
this NUMA node has got a lot of resources free, put the VM over there. You can also have control over whether you want to use huge pages. Again, this gives you a bit of a performance benefit.
31:42
Although with current upstream kernels, you now have automatic huge page support. So there's not as much benefit to doing huge pages manually anymore. You can also turn on and off memory sharing. So if you have lots of virtual machines all running the same software stack,
32:03
chances are they've got a lot of memory pages which have the same data in them. And so there's a feature called KSM (kernel samepage merging), which will identify those memory pages which are identical and merge them. So you only have one copy of this memory page shared amongst multiple virtual machines.
32:22
So you get higher density of virtual machines because you can squeeze more virtual machines into your RAM. And at the bottom there, you can also define various limits on how memory is used by virtual machines, how much physical RAM they have,
32:42
how much physical RAM they're guaranteed to have, and a few other things. On the virtual disk side, you can set a whole bunch of policies against virtual disks. So you can set how many IO operations per second they're allowed,
33:00
how many bytes per second they're allowed. And you can also set this on a VM level as a whole. So if the virtual machines are using physical block devices, you can set a policy against individual physical block devices that will apply to all disks that that VM uses on that block storage.
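Pulling those memory and disk knobs together into one illustrative guest XML fragment (all values made up):

    <numatune>
      <memory mode='strict' placement='auto'/>  <!-- let numad place the guest -->
    </numatune>
    <memoryBacking>
      <hugepages/>       <!-- back guest RAM with huge pages -->
      <nosharepages/>    <!-- opt this guest out of KSM merging -->
    </memoryBacking>
    <memtune>
      <hard_limit unit='KiB'>4194304</hard_limit>        <!-- RAM cap -->
      <min_guarantee unit='KiB'>2097152</min_guarantee>  <!-- guaranteed RAM -->
    </memtune>
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/root.img'/>
      <target dev='vda'/>
      <iotune>
        <total_iops_sec>2000</total_iops_sec>           <!-- IOPS cap -->
        <total_bytes_sec>104857600</total_bytes_sec>    <!-- 100 MiB/s cap -->
      </iotune>
    </disk>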
33:29
This is the last slide. So you've got network tuning, where again you can set up various policies on bandwidth utilization, which just delegates through to the Linux traffic shaper.
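An illustrative interface fragment (units are, if I remember the docs rightly, kilobytes per second for average/peak and kilobytes for burst):

    <interface type='network'>
      <source network='default'/>
      <bandwidth>
        <inbound average='1000' peak='5000' burst='1024'/>
        <outbound average='1000'/>
      </bandwidth>
    </interface>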
33:43
And that's basically it. That's a whirlwind tour of some of the features of libvirt that have arrived in the last year that are useful for application developers to know about if you want to get the best out of your hardware. So now we've got five or ten minutes for questions.
34:02
Five minutes for questions if anyone has any. And there's a microphone coming there. Okay, one question. I've actually got three questions. The first is about, you're talking about applications. I've got one simple application which is virsh,
34:21
which I like as a sysadmin. It's the best. I can script it with the shell. And I can also use virt-manager if I'm lazy. And about this whole thing you told us about, how is it related to virsh? Do you implement any new features in virsh as well,
34:44
like you said about the locking thing and stuff like that? For an application like virsh, the goal of virsh is just to directly expose the libvirt functionality to the administrator. So we don't want to put any policy in that.
35:01
We want to leave the full range of control up to the admin. So if the admin wants to make use of disk locking, they have to explicitly specify that in the configuration they provide to the virtual machines. But there won't be something like an XML option where you can say do locking,
35:23
like in the domain specifications? Not in the domain. Locking you can actually turn on and off for the host as a whole. As an administrator, you can turn on locking on the host, and all virtual machines will then have their disks locked properly. But when I'd like to do it per virtual machine,
35:44
like you said about the automatic locking thing? Then you have to explicitly specify the locks in the XML configuration. But that's possible. That's planned. And then the second, I'll try to keep it short: you spoke about polkit.
36:04
Have you thought about using PAM, pluggable authentication modules? Yes, we have. But it doesn't really do what we need it to do, as far as I can tell.
36:22
Although if you think otherwise, feel free to raise that on the mailing list. But I don't think it's sufficiently flexible for what we need. And then you spoke about the UNIX user backend.
36:40
You said explicitly local users, but if I were using the yellow pages (NIS), then I can't use it? Or why did you say local? When I said local users, I meant any application connecting over the UNIX domain socket, as opposed to an application connecting over the TCP socket.
37:00
If you use the UNIX domain socket, we can query what the user ID on the other end of it is. And the third, I'll keep it quite short, is about the CPU models, about the features like, let's say, MMX, for example. Are all those things multi-threaded per CPU?
37:21
So when I've got one CPU with, let's say, eight cores, and I've got 16 VMs, can I use the feature on all 16 VMs, or just on eight VMs? I mean, if the VMs are actually doing work, they're going to contend for resources on your host CPU, but they can all see the same feature set.
37:41
But can they use them at the same time? Yeah, I mean, the virtualization takes care of that. KVM does all that. But just from the point of view of a sysadmin, can I use, let's say, SSE 4 point something for all 16 machines? Yeah, okay, that's it.
38:02
And I think we're out of time for questions. Come and find me in the hallways afterwards if anyone else has got questions.