Video in TIB AV-Portal: Ganeti

Formal Metadata

CC Attribution - NonCommercial 2.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
Ganeti

Ganeti is a system for managing clusters of virtual machines. The talk will introduce Ganeti, its usage, and its architecture. The main focus will be on changes and new development in the last year.

Ganeti is a management software for clusters of virtual machines based on Xen, KVM or LXC. It is an open source project funded by Google which has been around 7 years now. It has grown to the size of about 100,000 lines of Python and about 40,000 lines of Haskell code. Besides being used in Google’s internal infrastructure, the project has a lively open source community. Among our biggest users and contributors are OSUOSL and GRNet.

In our talk, we will recall, in a self-contained way, the steps to set up and maintain a Ganeti cluster, to monitor it, and to deal with failures. We will also recall the architecture and the interfaces to the utilized open-source components. The main focus of the talk will be on changes and new features of Ganeti, predominantly those that happened in the last year.

Speaker: Klaus Aehlig, Helga Velroyen
Event: FrOSCon 2014 by the Free and Open Source Software Conference (FrOSCon) e.V.
Keywords Free and Open Source Software Conference FrOSCon14
Thank you so much. As mentioned, Ganeti is about managing clusters, so let's recall what a cluster is in Ganeti terms.

The whole point of a cluster is to have some virtual machines; that's the reason why you operate the whole cluster. In Ganeti terms they are called instances, so an instance is some sort of virtual machine. Of course, we need physical machines to run the whole thing on, which in Ganeti terms are called nodes. You use some form of hypervisor: Xen is the traditional one, KVM is pretty popular, and LXC is coming into the mix. And you need some storage: plain logical volumes or, as the traditional choice, DRBD, the distributed replicated block device, where you have two copies of each disk.

Ganeti adds some management. The first part is a lot of convenience, because it provides a uniform interface to all the various hypervisors and all the various storage solutions you might be using. It also helps to ensure that the physical nodes are used equally. It enforces some policies and keeps track of little things like "please don't run these two servers of mine on the same physical node". And it also helps the cluster stay in a good state.

Part of the reason why you have redundant resources is that you can cope with physical nodes breaking. So Ganeti provides commands to move the instances over and redistribute them over what remains of the cluster when nodes are powered off, whether or not you planned it.

Also, at the policy level it already takes care of possible failure modes. If you use DRBD as storage, you typically have two nodes holding a copy of the disk, so if the instance's node breaks, you can start the instance immediately on the other node. Ganeti keeps enough memory reserved there for the instance to fail over, so that you can start it without much delay. OK, so how does it look and feel?
Basic usage of Ganeti is that you start by initializing a cluster. You use the command gnt-cluster init and provide the name of the cluster. That name is not just a name to refer to it; it is supposed to resolve to a valid IP address, which is the IP under which you can then always reach the cluster, no matter which node is currently managing it. That host name will then be the primary way to communicate with the cluster. You can also provide the secondary IP address of that node, because Ganeti typically has at least two networks involved: one network where you reach the nodes, and a separate network which only carries traffic between the nodes, namely the replication traffic for the disks. You usually want to keep that separate, so you provide a secondary IP address. The instances typically communicate with the outside world over yet another network.

Adding a node is also a single command. You again provide the name of the node, which resolves to its primary IP, you provide the secondary IP, and you do that for all the nodes.

Then you add an instance. You provide all the data you need for an instance: which storage solution to use (typically -t for the disk template, for example drbd), the size of the disks, and so on. You might provide some tags. Depending on which tags you provide, you can say at that level already that this instance provides a certain service, and make sure that instances providing the same service don't end up on the same physical node. You give the name of the instance, and you provide the operating system.

Ganeti is a tool to manage virtual machines and knows nothing about what you do with these machines, so the operating system has to come from the outside. Ganeti just provides an interface for specifying how to get the operating system onto the disk of the virtual machine. It is quite a simple interface: you have a directory for each OS definition, and in it a couple of scripts. The most important one is the create script, which gets all the information it needs through environment variables: the name of the instance and the disks of the instance (the number of disks, which devices, their sizes, etc.). It is then up to the script to get whatever data you want onto the disk before the instance is actually started. There are a couple of other scripts; some of them are optional.
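As a rough illustration of the create-script idea, here is a minimal Python sketch of how such a script might collect its inputs from the environment. The variable names INSTANCE_NAME, DISK_COUNT and DISK_<n>_PATH are quoted from memory of the Ganeti OS interface documentation and should be checked against the installed version; the function name and the example values are made up for this sketch.

```python
import os


def read_instance_info(env=os.environ):
    """Collect the instance name and disk devices a create script would need.

    The variable names are assumptions based on the Ganeti OS interface
    documentation; verify them before relying on this.
    """
    disk_count = int(env.get("DISK_COUNT", "0"))
    disks = [env[f"DISK_{index}_PATH"] for index in range(disk_count)]
    return {"name": env["INSTANCE_NAME"], "disks": disks}


# Hypothetical environment, as a node might prepare it for the script.
example_env = {
    "INSTANCE_NAME": "web1.example.com",
    "DISK_COUNT": "1",
    "DISK_0_PATH": "/dev/xenvg/web1-disk0",
}
info = read_instance_info(example_env)
print(f"would install an OS for {info['name']} onto {info['disks']}")
```

A real create script would go on to partition, format and populate the listed devices; that part is entirely up to the OS definition.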
Creating a cluster is one thing; maintaining it is another piece of work. As an example, take a planned maintenance: say there is a node where you want to replace a disk. The way you would do it with Ganeti is that you modify the node and change its status, in this case to drained. A node can be in one of three possible states. Online is hopefully the normal one: the node is connected and reachable, everything is running, everyone's happy. A node can be offline, meaning: don't even try to connect to that node, and assume that nothing will come back from that node and nothing is running on it. Drained comes in between. Technically the node is fully operational, but it's a policy decision that I intend to get rid of it. In particular, Ganeti will not place any new instances on that node.

This is then taken into account when you balance the cluster with the command hbal. It will also consider the cluster unbalanced if instances are running on nodes that are planned to be emptied, so it will move all the instances away from that node. The name might be a bit confusing: it is hbal for balancing because it's part of a tool set on top of Ganeti which is called the htools, for historical reasons. With the command, you tell it to ask the cluster directly for the current state of the cluster, and not just to tell me how to balance the cluster but to actually carry out the commands. That will move all instances away from the node which is marked as drained.

Once there's nothing on it, you can tell Ganeti: don't try to connect to that node. Then you just power it off, or do whatever you want to do for the maintenance. Once you're done, you set the node online again and tell hbal to balance again, so that the node is actually used. So that's the interaction.
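The maintenance procedure above can be written down as a command sequence. The Python sketch below only builds and prints the commands; it does not run anything. The flags (`--drained`, `--offline`, `hbal -L -X`) are given as I recall them from the gnt-node and hbal man pages and should be treated as assumptions to verify; the function name and node name are hypothetical.

```python
import shlex


def maintenance_plan(node):
    """Return the command sequence for a planned maintenance of one node."""
    return [
        # 1. Policy decision: no new instances on this node.
        ["gnt-node", "modify", "--drained=yes", node],
        # 2. Rebalance; drained nodes get emptied. -L: ask the cluster,
        #    -X: actually execute the moves instead of only printing them.
        ["hbal", "-L", "-X"],
        # 3. Tell Ganeti not to contact the node any more.
        ["gnt-node", "modify", "--offline=yes", node],
        # (power off, do the hardware work, power on again)
        # 4. Bring the node back and let the balancer use it again.
        ["gnt-node", "modify", "--offline=no", node],
        ["hbal", "-L", "-X"],
    ]


for command in maintenance_plan("node3.example.com"):
    print(shlex.join(command))
```

Keeping the plan as data rather than executing it directly makes it easy to review before anything touches the cluster.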
What happens behind the scenes? There, things have changed quite a bit since last year, so it might also be interesting for people who were here last year. First of all, these commands I was just showing you, gnt-cluster, gnt-node and so on, don't do that much; they are just a command-line interface. What they do is parse the parameters, do some very basic sanity checks, JSON-encode the request and hand it over a Unix domain socket to the main daemon. It's called luxid, because the protocol it speaks is LUXI: it sends JSON over a Unix domain socket. They tell the daemon to do it and then wait for reports on how the job progresses.

In fact, you don't even have to wait on the command line. You can just say: submit it, and I'll look later and see whether it succeeded or not. So you can tell it to just submit the command; it then prints the job ID that was assigned, and you get information on that specific job later by its ID. So that's the simple interface to the main daemon.
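To make "JSON over a Unix domain socket" concrete, here is a small self-contained sketch. It uses a local socket pair in place of the real daemon socket. The message layout (a JSON object with "method" and "args", terminated by an end-of-text byte) and the reply fields are illustrative assumptions, not a verified description of the LUXI wire format.

```python
import json
import socket

END_OF_MESSAGE = b"\x03"  # assumed message terminator for this sketch


def encode_message(payload):
    """Serialize a JSON payload and append the terminator."""
    return json.dumps(payload).encode("utf-8") + END_OF_MESSAGE


def read_message(sock):
    """Read bytes until the terminator and decode the JSON payload."""
    buffer = bytearray()
    while not buffer.endswith(END_OF_MESSAGE):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before end of message")
        buffer.extend(chunk)
    return json.loads(buffer[:-1].decode("utf-8"))


def demo():
    """Play both sides: a thin client and a stand-in for the daemon."""
    client, daemon = socket.socketpair()
    with client, daemon:
        # Client side: submit a job and return immediately.
        client.sendall(encode_message(
            {"method": "SubmitJob", "args": [[{"OP_ID": "OP_CLUSTER_VERIFY"}]]}))
        request = read_message(daemon)
        # Daemon side: answer with the job ID that was assigned (made up).
        daemon.sendall(encode_message({"success": True, "result": 42}))
        reply = read_message(client)
    return request, reply


request, reply = demo()
print("daemon saw:", request["method"], "client got job id:", reply["result"])
```

The client can disconnect right after receiving the job ID and ask about that ID later, which is the "submit and look later" pattern described above.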
What does the daemon do from then on? The first thing it does is write the job description to disk, so that the daemon can be restarted and no information is lost. Since the whole point of a cluster manager is also to have some redundancy and to stay in good shape if something goes wrong, it makes sense to also have this information replicated. So it replicates it to a couple of nodes, which are called master candidates, because they have all the information needed to take over the role of the master should something happen to the actual master. On the local disk it writes directly, with the usual dance of writing to a temporary file, syncing, and renaming. For the remote nodes, it tells the node daemon there: please write this file to disk.
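The "write to a temporary file, sync, rename" dance mentioned here is a general technique for crash-safe file updates. A minimal Python sketch follows; it is a generic illustration, not Ganeti's own code, and the file name in the usage example is made up.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) atomically.

    Readers see either the old or the new content, never a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # force the data to stable storage
        os.replace(tmp_path, path)     # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as queue_dir:
    job_file = os.path.join(queue_dir, "job-42")
    atomic_write(job_file, b'{"ops": ["OP_CLUSTER_VERIFY"]}')
    with open(job_file, "rb") as handle:
        print(handle.read())
```

For full durability one would also sync the containing directory after the rename; that step is left out to keep the sketch short.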
Then the job is in the queued state, meaning we have accepted it, but we're not doing anything with it yet. There are several reasons why a job can stay here. The main reason is that you want to have a limit on the number of jobs running concurrently. That used to be a hard restriction when all the handling was done in one huge multi-threaded process, which at some point reached its limit, so you had to keep the number of threads low. That's no longer the case, but still you don't want to have too much going on on the master node, because typically the master node also hosts instances. All these operations happen in domain 0, and all the resources used there are taken away from the instances. So it's kind of a policy decision how many jobs you want to run in parallel. It's also a run-time tunable now.

The other reason is that jobs can depend on each other. You can say: this job should only be run once another job has finished, or even only if the other job succeeded, and if it doesn't succeed, don't even try this job. You can also have some ad-hoc rate limiting, which I will be talking about later, saying: I'm happy with a lot of jobs running, but of that kind there should only be a few at the same time. I'll give an example later.

Once that's all done, so the dependencies are fulfilled and there is capacity so that we can start the job, the job gets forked off as a new process, and its state is waiting.
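The scheduling rules just described (a cap on concurrent jobs, plus dependencies that may require success) can be sketched in a few lines of Python. This is a simplified model for illustration only: the class, the function and the "canceled" status used for jobs whose required dependency failed are assumptions of this sketch, not Ganeti's actual data model.

```python
from dataclasses import dataclass, field

FINISHED = {"success", "error", "canceled"}


@dataclass
class Job:
    job_id: int
    status: str = "queued"  # queued, waiting, running, success, error, canceled
    # Each dependency is (job_id, must_succeed).
    depends_on: list = field(default_factory=list)


def start_runnable(jobs, max_running):
    """Move queued jobs to 'waiting' where dependencies and capacity allow."""
    by_id = {job.job_id: job for job in jobs}
    active = sum(job.status in ("waiting", "running") for job in jobs)
    started = []
    for job in jobs:
        if job.status != "queued":
            continue
        deps = [(by_id[dep_id], must) for dep_id, must in job.depends_on]
        # A required dependency finished without success: never run this job.
        if any(must and dep.status in FINISHED and dep.status != "success"
               for dep, must in deps):
            job.status = "canceled"
            continue
        # Some dependency is still in progress: stay queued.
        if any(dep.status not in FINISHED for dep, _ in deps):
            continue
        if active < max_running:
            job.status = "waiting"
            active += 1
            started.append(job.job_id)
    return started


queue = [Job(1, "success"), Job(2, "error"),
         Job(3, depends_on=[(1, True)]), Job(4, depends_on=[(2, True)]),
         Job(5, depends_on=[(2, False)]), Job(6)]
print(start_runnable(queue, max_running=2))
```

Calling the function again after running jobs finish would pick up the jobs that were held back only by the concurrency limit.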
The reason for waiting is that the job might need some locks, because it manipulates some entity in the real world, like an instance, and you don't want two jobs changing it at the same time. Or it is waiting for resources, because some operations are quite heavy on resources in domain 0, like copying disks, and you want to limit that, since again domain 0 only has limited resources.

The job still needs the configuration to know what is running where, which nodes to lock, etc. The configuration is just a description of what the cluster looks like: which entity is in which state, and which state it should be in. It is accessed by several jobs, which are now separate processes, so there is a separate daemon that takes care of the configuration. It serves the configuration and handles all the updates. Updates like "I'm waiting for that lock", or "the job I depended on actually failed, so I'm not supposed to run", are all made by the jobs themselves.

Then the job is in the running state, which means it does what it's supposed to do, except that it doesn't do it personally, because things happen on various nodes: apart from the master node, you have lots of other nodes hosting instances. Again, the way it works is that it contacts the node daemon of that node, telling it: please create that logical volume, please run that DRBD command, please tell that to the hypervisor, etc. While issuing these commands and thus communicating with the nodes, it also records in the configuration that we actually created that disk on purpose and that it is not just left over. We update the configuration, again just by calling the configuration daemon.

Then, hopefully, we end in the state success. The job might go to other states as well: there could be an error. The idea in Ganeti is that we do one level of error recovery: if something goes wrong, we try to undo what we've done before. If that fails as well, the administrator is supposed to pick up the pieces. A job can also be canceled during that process: I submitted the job, but I don't want it anymore.
I mentioned ad-hoc rate limiting earlier, and before I can talk about that, there's another concept which I have to introduce, and that is the reason trail. It's not always the case that jobs do all the work themselves. If you have tasks that can and should be done in parallel, the job just says: to do this operation, like verifying the cluster, run all the following jobs, which I submit for you, like verifying every node group in parallel. Another command that expands to a lot of jobs is node evacuation: please get this node free of instances, because it failed or something is going on with it. That expands to other jobs that can run in parallel, like one for every instance.

It's not just that jobs expand to other jobs. You also have a lot of high-level commands that submit lots of jobs to Ganeti. We saw one before: balancing the cluster usually involves moving a lot of instances around, and this is all done by submitting lots of jobs. You can have other tools on top of the htools that do even higher-level management of clusters, so htools might not be the top layer.

In all this maze of daemons and tools, to keep track of what's actually going on and why a particular job was submitted, all jobs are annotated with reasons why they are executed. The reason trail is just a list of triples: a timestamp, which is obvious; the source, which is the tool that added this entry, so to begin with the tool that submitted the job, or the tools that picked it up and transformed it; and the reason, some human-readable text on why we actually want to have this operation on the cluster. The idea is that every entity that touches the job in some way or another, like the command line or the daemons taking it over and handing it off to other daemons, can extend that reason trail, and usually they do. If there are a lot of tools involved before the request actually gets to Ganeti, they can also extend the reason trail. And the main point: when a job expands into other jobs, the reason trail is inherited.

As I said, sometimes you want to do ad-hoc rate limiting. A typical example: you have a node group, and you want to send the whole node group to repair, or you expect a power-down in that data center, and you want to move all the instances of that group to different groups. That's one command, but it's a lot of operations, and you want to limit the number of instances being moved simultaneously in order not to overload the switch on top of that node group, or something like that.

That's why a recent addition to Ganeti says: we can use the reason trail, which already groups together operations that belong to the same task, to also do some rate limiting, at least in an ad-hoc fashion. You do this by having reasons whose text starts with rate-limit, a colon, and then a number n. Then the reason is a rate-limit bucket, the whole string is the name of that bucket, and the scheduler will make sure that only n jobs belonging to that bucket run at the same time.

To pick up the example: you want to evacuate a node group but only move a few instances at the same time, because otherwise there would be too much load on the switch. You just specify the reason why you are doing it: you start with rate-limit, then the number, and then some unique, human-readable information on why you are evacuating the group in the first place. It could be a reference to some planned maintenance, or to some bug.
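The reason trail and the rate-limit buckets built on it can be modelled compactly. In the Python sketch below, the triple order (source, reason, timestamp), the second colon after the number, and all function names are assumptions made for illustration; only the "rate-limit", colon, number prefix and the "whole string names the bucket" rule come from the talk.

```python
import re
import time
from collections import Counter

# Assumed syntax for this sketch: "rate-limit:<n>:<free text>".
RATE_LIMIT = re.compile(r"rate-limit:(\d+):")


def extend_trail(trail, source, reason, now=None):
    """Return a new trail with one more (source, reason, timestamp) entry."""
    stamp = time.time() if now is None else now
    return trail + [(source, reason, stamp)]


def buckets(trail):
    """Map bucket name (the whole reason string) to its limit n."""
    found = {}
    for _source, reason, _stamp in trail:
        match = RATE_LIMIT.match(reason)
        if match:
            found[reason] = int(match.group(1))
    return found


def may_start(candidate_trail, running_trails):
    """True if starting the candidate keeps every bucket within its limit."""
    in_use = Counter(name for trail in running_trails for name in buckets(trail))
    return all(in_use[name] < limit
               for name, limit in buckets(candidate_trail).items())


# A child job inherits the parent's trail and adds its own entry.
parent = extend_trail([], "gnt-group evacuate",
                      "rate-limit:2:planned maintenance of group1")
child = extend_trail(parent, "job expansion", "move one instance")
print(may_start(child, running_trails=[child]))         # one running: allowed
print(may_start(child, running_trails=[child, child]))  # two running: blocked
```

Because child jobs inherit the trail, every job spawned by the evacuation lands in the same bucket without any extra bookkeeping.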
A more fine-grained job control is being planned.

The other thing I want to talk about is locking, because there have been some changes there as well. One of the tasks of Ganeti is to keep the cluster reasonably well balanced, so the same number of instances, or the same amount of resources used, on every node. It is best to start with that already when adding new instances, so that we don't have to do much rebalancing later. The way to do it is, again, to have an external interface to some program. The interface works like this: you call the program, you give it a description of the cluster and of the instance you want to place on the cluster, in some JSON encoding, and you expect to get back an answer saying: place the instance here. Ganeti itself also ships with an instance allocation tool, which is called hail. I think it is the most popular one among all Ganeti clusters, and it has the same notion of balance as hbal, which we have already seen; they use the same library.
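To illustrate the shape of such an allocator interface (JSON description in, JSON placement out), here is a toy allocator in Python. The field names in the request and reply and the "most free memory wins" rule are simplifications invented for this sketch; hail uses a much richer cluster description and a proper balance metric.

```python
import json


def allocate(request_json):
    """Pick a node for one instance; input and output are JSON strings."""
    request = json.loads(request_json)
    needed = request["instance"]["memory"]
    candidates = [
        (info["free_memory"], name)
        for name, info in request["nodes"].items()
        if info["free_memory"] >= needed and not info.get("drained", False)
    ]
    if not candidates:
        return json.dumps({"success": False,
                           "info": "no node has enough free memory",
                           "result": []})
    _free, best = max(candidates)  # most free memory wins
    return json.dumps({"success": True,
                       "info": "allocation successful",
                       "result": [best]})


example_request = json.dumps({
    "instance": {"name": "web1", "memory": 2048},
    "nodes": {
        "node1": {"free_memory": 1024},
        "node2": {"free_memory": 8192, "drained": True},
        "node3": {"free_memory": 4096},
    },
})
print(allocate(example_request))
```

Because the contract is only "JSON in, JSON out", the allocator can be any external program, which is what makes it replaceable.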
OK, so all that brings us into the realm of locking, because when I plan where to put the instance, I need to be sure that once I have made the decision, the resources are still there. So if I want a perfect balance, I need to lock all the nodes where I could potentially put the instance, then see what the best choice is, and release the remaining locks. The corollary of that allocation strategy is that all instance additions are done sequentially, because the next instance allocation has to wait until I have decided on the placement.

There already is a solution, which has seen small but significant improvements: opportunistic locking. You can tell Ganeti: only opportunistically acquire locks when trying to place the instance. Just give me some node locks, and from those choose where to place the instance. The result might not be as balanced, but at least I can create instances in parallel without waiting for all the locks beforehand.

That of course gives a new form of error: I got some node locks, fine, but none of the nodes I got has enough resources to actually host the instance. Then I can't tell whether there is room on the cluster or not; I just say: try again.

A small but important change was added recently. Instead of just grabbing some locks, we know the number of locks we need: at least one node lock from the list of usable nodes, so we wait until at least one of those is available, or two in the case of DRBD, for primary and secondary. There are plans to take this further, but that's kind of future work. That's all I wanted to tell about basic operation and locking.
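The opportunistic variant can be sketched with non-blocking lock acquisition. The Python example below is a simplified single-process model: node names, memory figures and the function name are made up, and it only covers the first form described above (grab whatever is free, report "try again" if nothing fits), not the later refinement of waiting until the needed number of locks is available.

```python
import threading


def place(node_locks, free_memory, wanted_memory, needed=1):
    """Choose `needed` nodes among those whose locks are free right now.

    Returns the chosen node names (their locks stay held for the caller),
    or None if the nodes we happened to get cannot host the instance.
    """
    # Grab every node lock that is available without waiting.
    held = [name for name, lock in node_locks.items()
            if lock.acquire(blocking=False)]
    fits = sorted((name for name in held if free_memory[name] >= wanted_memory),
                  key=lambda name: free_memory[name], reverse=True)
    chosen = fits[:needed] if len(fits) >= needed else None
    # Release everything we do not need for the actual creation.
    for name in held:
        if chosen is None or name not in chosen:
            node_locks[name].release()
    return chosen


locks = {name: threading.Lock() for name in ("node1", "node2", "node3")}
memory = {"node1": 8000, "node2": 2000, "node3": 4000}

locks["node1"].acquire()           # another job is busy on node1
print(place(locks, memory, 1000))  # best of the remaining nodes
```

A None result does not mean the cluster is full; node1 might have had room. That ambiguity is exactly the new kind of error mentioned above, and the caller simply retries.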
That is all I wanted to tell about basic job operation and locking. Helga will now tell about how to use Ganeti in a large-scale way. Hello also from my side. I won't go
into the basics. I will present a number of features that are not that well known, because we figured we already gave a talk last year. Most of the features are more interesting if you run big clusters, but some are already quite useful if you run a small cluster.
One of the features from the old days is the REST API, which we call the RAPI. There are client libraries, and this is what you use when you have more than one cluster and want to build tools around that, for example for instance creation on several clusters or for management tasks, and
then you might not want to only ssh into machines and run the commands on the master node. As just mentioned before, that is actually what we need the cluster name for: the RAPI daemon runs on the master node, and we try to keep the cluster running no matter what happens, so you can fail over the master role to another node, and the cluster name moves along, so that the RAPI can still be accessed. But then you
need credentials, but only for writing. So if you
just read something about the state of the cluster, you don't need credentials. That means if you run the RAPI, you might want to secure your network in a way that not everybody can read all the information of the cluster. This is an example of the
Python client that we also ship. You just import one library and then you create the cluster client. In this case we just want to read something, so we only need the cluster name, and there is also the port, which you give if you have configured a non-default one; otherwise it takes the default. In this case we want to read the list of instances that are running, and the output is some
JSON, which is not easy to read, so if you want to have it printed nicely, there is some pretty-printing here. As said,
if you just want to read, you don't need credentials. But if you want to write (writing means doing any operation that changes the cluster state, like moving instances, changing names, tags, whatever), it works this way: you again give it the cluster name and also username and password, and then you can do things that write. But if you don't want to use the Python client and you want
to write your own scripts, you can just use the bare REST API: you can use curl, wget or whatever you want. Here is an example where you just read the list of nodes from the cluster; you get the IDs, so the node names, and you also get a URI, and if you want to know more about a particular node you can just call the cluster again with this. And here is an example of a POST request, where you want to change something. In this case you want to change the state of a node, setting master candidate to false, so that it is no longer a master candidate. You need to make a POST request, which can be achieved with this statement.
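The read and write requests just described can be sketched with the Python standard library alone. The slides themselves are not part of this transcript, so everything concrete here is an assumption to be checked against the RAPI documentation: the default port 5080, the resource paths `/2/nodes` and `/2/nodes/<name>/modify`, the `master_candidate` field, and the example host and credentials. The helper only builds the requests and does not send them.

```python
import base64
import json
import urllib.request

RAPI_PORT = 5080  # assumed default RAPI port


def rapi_request(cluster, path, method="GET", body=None,
                 user=None, password=None, port=RAPI_PORT):
    """Build (but do not send) an HTTPS request addressed to the cluster name."""
    url = "https://%s:%d%s" % (cluster, port, path)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Accept", "application/json")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if user is not None:
        # Credentials are only needed for operations that change state.
        raw = ("%s:%s" % (user, password)).encode("utf-8")
        req.add_header("Authorization",
                       "Basic " + base64.b64encode(raw).decode("ascii"))
    return req


def pretty(raw_json):
    """Re-indent a JSON reply so that it is readable."""
    return json.dumps(json.loads(raw_json), indent=2, sort_keys=True)


# Reading needs no credentials (hypothetical host name).
read_nodes = rapi_request("cluster.example.com", "/2/nodes")

# Writing needs a user and a password (hypothetical values).
demote = rapi_request(
    "cluster.example.com", "/2/nodes/node1.example.com/modify",
    method="POST", body={"master_candidate": False},
    user="admin", password="secret",
)
```

Sending would be `urllib.request.urlopen(read_nodes)` against a real cluster, with TLS verification set up for the cluster's certificate, followed by `pretty()` on the reply.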
You could write something like this, for example, for sampling and monitoring.
We have another tool called hspace, which is also part of the htools set. When you run a cluster, at some point you want to do some future planning: somebody is asking you, we have a lot of new people coming in who need test machines, are there enough resources on our clusters, or how long will our resources last? hspace is a tool which helps you do capacity planning on a cluster. It
simulates the resource consumption: it adds new instances, virtually, until it runs out of resources, and tells you which resource you run out of first. It uses internally the same allocation mechanism as if you really added the instances. You can specify the biggest instance size that you plan to run; it starts with the biggest instance and adds as many as it can, and then it goes on to the next smaller instance size. There is an example here. You don't
really need to read all of this. You just run hspace on the master node, and it tells you a bit about your cluster: it has three nodes, and the overall capacity, like disk and CPU. Then it starts adding instances: it adds instances of the bigger size, then it still has some space left for more, smaller instances, and in the end it runs out of disk.
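The simulation described above can be sketched as a greedy loop: add instances of the biggest size until none fits, continue with the next smaller size, then report which resource is short. This is a toy model under stated assumptions, not hspace's algorithm: the resource names, the placement rule (the node with the most headroom) and the way the exhausted resource is determined are all choices made for this sketch, and every requested amount is assumed to be positive.

```python
def simulate_capacity(nodes, specs):
    """Virtually fill a cluster with instances, biggest size first.

    nodes: list of dicts with the free 'mem', 'disk' and 'cpu' per node.
    specs: list of instance sizes (same keys), ordered biggest first.
    Returns (instances placed per size, resources short on every node
    for the smallest size).
    """
    free = [dict(node) for node in nodes]  # work on a copy
    placed = []
    for spec in specs:
        count = 0
        while True:
            fitting = [n for n in free if all(n[r] >= spec[r] for r in spec)]
            if not fitting:
                break
            # Place on the node with the most headroom for this size.
            target = max(fitting,
                         key=lambda n: min(n[r] / spec[r] for r in spec))
            for r in spec:
                target[r] -= spec[r]
            count += 1
        placed.append(count)
    smallest = specs[-1]
    exhausted = sorted(
        r for r in smallest if all(n[r] < smallest[r] for n in free)
    )
    return placed, exhausted


nodes = [{"mem": 16, "disk": 100, "cpu": 8} for _ in range(3)]
specs = [{"mem": 4, "disk": 40, "cpu": 2}, {"mem": 1, "disk": 10, "cpu": 1}]
result = simulate_capacity(nodes, specs)
```

With these made-up numbers the three nodes take six big and six small instances, and disk is the resource that runs out while memory and CPU are still available.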
It also gives you the cluster score, a metric that tells you how balanced the cluster is. If it is 0, your cluster is perfectly balanced, but that is rare in reality. Here you can see how it changes. It depends, because the metric covers not only disk but also other resources like CPU and memory, so the cluster can actually get less balanced when you fill it up to the limit. So it shows you: OK, if you hit this limit, you still have plenty of RAM around, so maybe you should buy more disks; it might be sufficient to just add storage. This
works on the cluster, on the master node, but sometimes you don't have a cluster yet: you need to set up a new cluster and
want to do the same thing, and for that there is a simulation mode, which
is based on a description of the planned cluster that you give it. Then it does the same thing: you specify the instance sizes, and it tries to fill the cluster up and tells you which resource runs out. You can do this for several node groups. Ganeti has this concept of partitioning the nodes into different node groups, for example for different purposes or different hardware architectures, and you can model this in the simulation as well. For example, you might have some nodes with more disk or slower hardware, and you can
include this in the simulation. For this you
have to give it a description of the cluster, which is kind of cryptic; this is the format for it here. First there is the allocation policy, which is actually on node-group level: you might have a node group that is the preferred place for new instances, but maybe also a last-resort node group of old computers that are still running somewhere, and only if everything else fails do you use that one. Then come the number of nodes, the disk and the other resources; this is what you give here. You also have to give the disk template, which matters because, for example, if you use DRBD you need twice the amount of storage, since you always have replication. And you give it the size of the instances. Tiered allocation is a mode where you start with this size and it is reduced in steps once nothing of that size fits any more. It does the same thing here: in this case you see it gets 33 instances, and then it can add 3 more with fewer CPUs, and you see which resource it ran out of.
Another use case that we
encountered: sometimes you want virtualization, but not quite. We have users that want virtual machines, but they don't want to share disks. They don't want their virtual machine to get into trouble because some other virtual machine running on the same disk is busy, so they actually want to be unaffected by other services. For this you could say: OK, I
just give them a bare-metal machine. But of course you might still want to have the benefits of virtualization: there might be a different operating system than your standard operating system, and you still want to be able to migrate easily, but you also want to make sure that no two virtual machines share the same disk. We call this mode Ganeti dedicated, and it is
actually quite easy to set up in Ganeti. You use LVM storage, and when planning the volume group you have to make sure that no two physical volumes share the same physical device, because Ganeti only sees the physical volumes, and if you place two of them on the same disk, it cannot know that you did this. Then you have to tell Ganeti: OK, I want to use this exclusive storage flag. If you set it up this way, Ganeti will make sure that no two instances use the same physical volume, and it also respects this if you do any cluster balancing or
capacity planning; this will be taken into account. And since this is set on node-group level, you can actually have normal and dedicated node groups in the same cluster. Then another feature.
Ganeti historically was developed with DRBD storage in mind, but in the last years there was more and more demand for shared and distributed storage. There are a few things which we support in a more integrated way, like RBD, or you can just mount some distributed file system on all the nodes and use shared-file storage. But maybe you have some really cool, expensive storage appliance, and for performance reasons you don't want to only mount it somewhere but to be directly connected to it. And we cannot possibly support all the different
appliances with all their configurations, so we have the ExtStorage interface: you provide a few scripts for your type of appliance, and Ganeti just calls them at the right moment. It is a generic way to access external storage while keeping its performance advantages, and I have some examples here.
It works in a way that you have to provide a few scripts for the typical operations. That means: create an instance disk on the appliance, grow it, remove it; then, if you want to actually place the instance, you need to attach the disk to a node and detach it from a node; you might need to set some metadata; and there are additional parameters, as I said, so there is a script to verify them, whether a value is a string or whatever you expect. The parameters are transmitted through environment variables; this is a similar mechanism to the
instance OS definition scripts that Ganeti already has implemented. This is an example here. The idea is that you have
two different external storage providers. Then you can do something like this: you have your normal instance add command, you choose the external disk template, and then you specify the size as for any normal disk, but you also have to tell it which provider to use, so that Ganeti knows which scripts to run. And this actually works in a way that you can run instances with two disks on different appliances. It is really well integrated into the Ganeti workflow, so you can modify things on the disk, like growing, depending on what the appliance supports; you can migrate
instances, and since it assumes that all your nodes are connected to the storage, it is actually quite easy to migrate between nodes. And as I said, you can add more parameters that Ganeti doesn't know; it does not interpret them in any way, it just forwards them to the scripts. So this is just an example.
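How a provider script might pick up its parameters from the environment can be sketched as follows. The variable names `VOL_NAME` and `VOL_SIZE` and the `EXTP_` prefix for the forwarded extra parameters are given here as I recall them from the ExtStorage interface documentation; treat them, the assumption that the size is an integer, and the helper name itself as assumptions of this sketch, and check them against the documentation of your Ganeti version.

```python
import os


def read_volume_request(environ=os.environ):
    """Collect what a hypothetical 'create' script needs from its environment.

    Returns (volume name, size as an integer, extra provider parameters).
    The extra parameters are passed through uninterpreted, as lower-case keys.
    """
    name = environ["VOL_NAME"]
    size = int(environ["VOL_SIZE"])
    prefix = "EXTP_"
    extra = {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }
    return name, size, extra
```

A real script would then call the appliance's own tooling with these values and signal success or failure through its exit code; a separate verification script is the place to reject malformed extra parameters.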
So these were features that are interesting if you deploy Ganeti on a higher number of nodes. Now I want to go through the current developments and future developments. Last time, I think, we were at 2.9, so I start with 2.10. 2.10 is the current
version that you get. For KVM there were contributions to support hotplugging and more direct access to storage. We improved cross-cluster instance moves: you could already move instances between two clusters, but there were a few problems with performance, and some features
were just missing. Now you can just tell it: OK, move the instance into this cluster, and it will figure out which node on the destination cluster to put the instance on; before that you actually had to give it a specific node. It can also convert the disk template on the way. Imagine a setup where you have some test cluster, where, because you don't have so much storage, you just use plain LVM, and then you want to move the instance on to the production cluster. This disk conversion shouldn't be too complicated, but it was essential for that. Then there is the tool that we were talking about, hbal, which does the cluster balancing; it now also considers things like storage and memory, not only the number of CPUs. Then another feature is Ganeti upgrades, and I have some slides
on that. So far there was no mechanism to upgrade within Ganeti itself, and since you have many nodes, it is kind of tedious to upgrade a cluster. Many people were frustrated with that, because you don't want to take your instances down and bring them back. We tried to make this easier. This is the old way: you have to stop the Ganeti daemons, which means the instances actually continue running, but if anything fails, of course, Ganeti will not notice it. Then you install the new packages, you have to
update the configuration, you have to restart all Ganeti services, and then you have to redistribute the configuration, and depending on the version there is more to do. And if anything went wrong, there was no way to get back easily, so you had to fix this manually. From 2.10 on you can use
the gnt-cluster upgrade command. That means the earliest case where you can use it is to upgrade from 2.10 to 2.11. It works like this: you install the new package with your distribution's tools, and then you just say gnt-cluster upgrade, with the new version to upgrade to. If something goes wrong, you can use the upgrade command to roll back to the previous version. The current stable version of Ganeti is 2.11. There we worked on RPC security. We use some internal RPC protocol for the master node to talk to the other nodes,
and that part was not that secure, because all the nodes used the same certificate. That means if you have this certificate from one compromised node, you can actually pretend to be the master node. So now there is some improvement: there are individual certificates, and not just every certificate will be
accepted. For the instance moves, that is the cross-cluster instance moves, you can now configure compression, for slow network setups. We have some Gluster support integrated now; it is flagged as experimental, it is fairly new, and we don't have much experience from real-life setups, so if you want to try that out, we would be happy to get any feedback on this. And there is another tool in the htools set, hsqueeze, which was also
requested by an external user. You can have a cluster that has a high fluctuation in load. For example, you might have a cluster with virtual machines for your developers to test on, but your developers are all in one time zone and go home at night; or you have a setup where you only need a lot of virtual machines on a few days every couple of weeks, because you need them for something then, and otherwise you don't actually need that many instances. So what we wanted to do is: whenever the load is low,
you can shut down some instances and move the remaining instances to a few nodes, to use those as well as possible, and then physically shut down the remaining nodes, so you save some money, or some energy for the environment. To make it easier to move the instances together and to know which nodes to shut down, you can use this tool, hsqueeze. It is actually huddling your instances together. This is contrary to cluster balancing, where you try to spread the instances fairly equally over all nodes; this is like: OK, pack as many as I can on a few nodes and drain the rest. It gives you a migration plan for the instances. Of course this expects shared storage, because otherwise you would actually need to move the storage as well. It drains as many nodes as possible, but not too many, because you don't want to cause problems for the remaining instances, so hsqueeze leaves some reserve on the few nodes that stay on. At first only the planning was implemented, so it actually gives you a list of things to do and leaves it to you to do them; but in 2.13 you can just say: OK, hsqueeze, move the instances according to this plan, and then you only need to shut the nodes down. And of course, if you want to use this, you have to make sure that your remaining environment handles this properly; say, your monitoring system should not raise alerts when you shut down the machines. That, of course, you have to take care of.
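The packing idea described above can be sketched as a greedy plan: try to empty the least loaded nodes first, move their instances to the fullest node that still keeps a reserve, and power off every node that could be emptied. This is a toy illustration, not hsqueeze's algorithm: it looks at memory only, and the `reserve` fraction, the data layout and the function name are assumptions of the sketch.

```python
def plan_squeeze(capacity, instances, reserve=0.2):
    """Plan which instances to move so that some nodes can be powered off.

    capacity:  {node: total memory}
    instances: {instance: (current node, memory)}
    reserve:   fraction of each remaining node that must stay free
    Returns (moves as (instance, source, target), nodes to power off).
    """
    used = {node: 0 for node in capacity}
    location = {}
    for inst, (node, mem) in instances.items():
        used[node] += mem
        location[inst] = node
    online = set(capacity)
    receivers = set()  # nodes that took instances are never emptied later
    moves, powered_off = [], []
    for victim in sorted(capacity, key=lambda n: (used[n], n)):
        if victim in receivers:
            continue
        targets = sorted(online - {victim})
        trial = dict(used)
        trial_moves = []
        for inst in sorted(i for i, n in location.items() if n == victim):
            mem = instances[inst][1]
            fitting = [t for t in targets
                       if trial[t] + mem <= capacity[t] * (1 - reserve)]
            if not fitting:
                break  # this node cannot be emptied; leave it as it is
            target = max(fitting, key=lambda t: trial[t])  # fullest node
            trial[target] += mem
            trial_moves.append((inst, victim, target))
        else:
            if targets:  # always keep at least one node online
                used = trial
                used[victim] = 0
                for inst, _, target in trial_moves:
                    location[inst] = target
                    receivers.add(target)
                moves.extend(trial_moves)
                online.discard(victim)
                powered_off.append(victim)
    return moves, powered_off
```

The returned list of moves corresponds to the "list of things to do" mentioned in the talk; executing the moves and actually powering nodes off are separate steps.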
Two more features. We have been asked about LXC support. So far LXC support exists, but it was flagged as experimental, and the reason was that we do not have many users of it, so we don't know how mature it is. This year we took part in Google Summer of Code, which you maybe know: Google gives some money to students that work on open-source projects over the summer, and they have to implement a reasonably sized feature. We had a student who worked on the LXC support, to make sure that it actually works, to find any
bugs, and to set up the QA so that it properly tests this for us, so whenever we now change something we see if it breaks. This targets LXC 1.0 and will probably come in Ganeti 2.13. Live migration is still experimental, which is due to some limitations in LXC. Another
Summer of Code project was this: Ganeti supports all these different types of storage, but it was not always that easy to convert between the different disk templates, and we had a student work on that, to convert in more directions. This is also going into one of the next releases. Now to the future: these are things we work on or would like to work on in the future. As Klaus already mentioned, there is work on the job queue. We want to get the network part more flexible, and especially make it work with IPv6. We work more on shared storage, because this is a growing area of interest. We have some interest in heterogeneous clusters; Ganeti can actually work in a heterogeneous setup, but it was rarely tested that way. Also,
whenever somebody has a setup like this, where you only have, like, three very different machines, and something doesn't work, we would like to know about this. And we continue working on the cross-cluster instance moves and on some improvements in SSH key handling as well. And with that
I am done. You can check out our project. Usually we are at a number of conferences a year, though this year not that many any more. Next week we are in Portland at a small conference, which is also our own conference. And that is it from our side.
OK, let's see. The first question was whether there is any intention to support Docker. So far we have not had any requests yet, but I guess, as with LXC, at some point we might integrate it. The next question was about the size of the external contributor base. The majority still comes from people working at Google. There are two big groups that are contributing. One is GRNET, a company providing infrastructure for the Greek academic community; they use a lot of KVM, so a lot of KVM-related patches are contributed by them, and gnt-network is also a big part that came from them. The other big external contributor is Apollon, who is the Debian maintainer for the Ganeti package. We have a quite good relation, because if he finds something that does not work in his setup he also sends patches, since he is using it himself. But a large share of all patches comes from people working at Google, not necessarily from the core team: a lot of people do internships and then work on Ganeti, and at Google some people also use their 20 % time to contribute. The next question was whether we would recommend using Ganeti for small clusters; the example was four- to six-node clusters. We use it ourselves for such small clusters, typically in offices, to provide name servers and similar basic infrastructure. So yes, we use it for small clusters as well. People also report that one of the reasons why they like Ganeti is that it is easy to get started with even a small cluster, and it scales to big clusters as well. The next question was what the biggest clusters are that we know of. We
will actually get some numbers next week. I think the biggest we have internally is something like a couple of hundred nodes, which runs maybe 1300 instances, and we have several clusters like that. From external users I don't have any numbers. The next question was about networking. We have gnt-network, which does some things, but as far as we are aware it is not that sophisticated yet. There is this network work I mentioned; I just showed some topics here, of course, and could not go into detail. More questions? OK, so thank you, and if you have questions you can mail our mailing lists or find us on IRC.

