Autoscaling best practices

Video in TIB AV-Portal: Autoscaling best practices

Formal Metadata

Autoscaling best practices
How did we survive the peak
Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
This talk will cover the basics of autoscaling, different types of auto-scaling, and how you can use your metrics to take good auto-scaling decisions. Targeted to entry level to mid level auto-scaling users.
* What is autoscaling
* Different kinds of traffic peak scenarios
* Autoscaling reactive vs proactive
* Autoscaling with external tools - Rightscale, Autoscale API, Heat, Ceilometer
* Autoscaling with your metrics - Graphite, Provisioning, Configuration Management
Right, so this morning we are going to talk about autoscaling. This presentation is oriented toward beginner and mid-level users of autoscaling, just to give you some ideas and explore what kind of tools you can use to autoscale your application.
First, introductions, of course. My name is Marc Cluet. I've been a sysadmin for more than 16 years now, from racking modems to setting up networks; anything you mention, I've done it. I worked at Canonical in the past, where I was one of the founding members of Juju (I might apologize for some of that), and right now I'm working at Rackspace [inaudible]. I also like programming in my free time, and I like walks on the beach.
So what is Rackspace?
Rackspace is a hosting company, basically, but the thing that sets Rackspace apart is our fanatical support; we are fanatical about everything. It's the second-biggest public cloud provider in the world after Amazon, a distant second, but still second. And we were one of the co-founders of OpenStack, together with NASA, which is a super exciting thing to say.
So what is autoscaling? If we look at your normal hosting environment, you have your physical servers. In the old days, you would calculate how much traffic you get to your website, and you would put in as many servers as needed to cope with the peak scenario.
So what's the problem with that? You can't scale up and down very nicely, and you're basically wasting a lot of money, because all the time that a server is doing nothing, you're paying for nothing at all. It's not convenient, especially if you're a startup; that's not something you want to do.
So the ideal scenario would be something more like this, where your platform grows up and down based on your traffic, with enough leeway so you can cope with small peaks.
I took this from Wikipedia and modified it, sorry. Autoscaling: you would consider it to be any kind of resource that you pull in on demand to be able to cope with your service.
In order to understand autoscaling within your application, we need to look at the traffic patterns that you have. Traffic patterns will define exactly how you need to scale, what kind of things you can and can't do, and the tools you need to scale properly. These would be the most basic ones. As you can see here, there is on-and-off, fast growth, variable and consistent traffic.
On-off traffic is the typical application that you just turn on at night. Let's say you need to run your analytics: you turn on the whole infrastructure at midnight, calculate all the log rows of the day, and then maybe at 3 or 4 AM you finish and shut it down. The problem, if you have physical servers and physical hosting, is that the servers are doing nothing all the rest of the day, unless you give them some other function. It's also the typical thing that happens to banks. They do all these calculations on huge platforms, and they have a few data centers in London, New York and elsewhere, and they don't use the servers for the rest of the day. I've gone to banks and talked with them, and they have all these halls full of machines.
And again, that's just for the nightly calculation; they do nothing all the rest of the day.
Then we have the fast-growth scenario. That would be for events like conference tickets, that kind of scenario. You have a very high amount of traffic in a burst, for a very short time; it could be one day or two days. It could also be that your business is a success: you created the new toaster and everyone just wants to come to your website, so your traffic is growing immensely, which is a good thing for you, right? [inaudible], and we all know what that ends up with.
Then you have variable traffic. In this case it's mostly news organizations. If you go to the web page of CNN or the Guardian, you will see that when there is a very important news event, they reduce the amount of objects on the website in order to be able to cope with the amount of traffic they're getting, which is one of the remediation methods, right? But normally what you would like is that, even if something big happens, every single user gets the full version of your website, because that would drive more traffic to the rest of the site, which means making more money. It's also the same for rapid-fire sales sites like eBay or [inaudible]: they will have very big amounts of traffic during part of the day, and the rest of the day those servers will do nothing.
The last one is consistent traffic. This is the easiest traffic to scale, because you basically have to do almost nothing. It's the typical traffic that you get from 9 to 5, for example for [inaudible] applications or accounting applications, which are just used when the user sits in front of the computer and starts doing something with them. [inaudible] Normally you can forecast the pattern very easily and know when you need more and when you need less.
So let's talk a bit about which autoscaling capabilities there are. Basically, what do you do with all these chickens in the corral, right? They're just running around, and you need to make sure that they're doing something useful. The main ones are time-based, reactive and predictive autoscaling, and I'll talk a little bit about each one of those and give examples.
In time-based autoscaling, let's say that we have a couple of servers behind the load balancer, and you know your traffic very well, and you know that you will have twice the amount of traffic in the next hour, because it has been the same every single day of your life. So it's something that is not difficult to forecast. Let's say that it's 9 AM or, if it's traffic that happens during the month, it's November 1st, just when the Christmas buying spree happens, and then you start adding more servers to your platform. That's the easiest one.
What is time-based autoscaling good for? It's good for on-off applications and consistent applications. With these applications you don't have to have the servers up all day; you can just turn them on, make them run whatever you need to do, and then shut them off again. It's all good.
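As a minimal sketch of the time-based approach described above, the following Python snippet maps the hour of the day to a desired server count. The schedule windows, the server counts and the names `SCHEDULE`, `BASELINE_SERVERS` and `desired_servers` are illustrative assumptions, not something shown in the talk.

```python
from datetime import datetime

# Hypothetical schedule: (start_hour, end_hour, server_count).
# Hours and counts are assumptions chosen for illustration only.
SCHEDULE = [
    (0, 4, 10),   # nightly analytics batch (on-off traffic)
    (9, 17, 4),   # office-hours load (consistent traffic)
]
BASELINE_SERVERS = 1  # what stays up outside every window


def desired_servers(now: datetime) -> int:
    """Return how many servers should be running at the given time."""
    for start_hour, end_hour, count in SCHEDULE:
        # A window covers start_hour inclusive up to end_hour exclusive.
        if start_hour <= now.hour < end_hour:
            return count
    return BASELINE_SERVERS


if __name__ == "__main__":
    for hour in (2, 6, 12, 20):
        print(hour, desired_servers(datetime(2014, 2, 1, hour)))
```

A scheduler (cron, for instance) could call such a function periodically and ask the provider for the difference between the desired and the current server count.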
Then there's reactive autoscaling, in which we are actually doing something a bit smarter than that: we are measuring the amount of traffic that goes to the servers, and we get a number out of that. In this example we have a couple of servers at about 60% capacity already, but when they get more load and go up to 80% capacity, that generates a high watermark event, the kind of event that will trigger the creation of another server. When you create this new server, the load balancer will start sending traffic to it, so the amount of traffic on the other two servers will slowly go down to acceptable levels.
The other good thing about this kind of autoscaling is that you can also scale down. If these three servers, after the peak traffic, go down to 30%, that would generate a low watermark event, and you would remove one of them and spread the load across the remaining ones.
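The watermark logic described above can be sketched in a few lines of Python. This is only an illustration: the 30% low watermark comes from the example in the talk, while the 80% high watermark, the use of a simple average and the function name `reactive_decision` are assumptions.

```python
HIGH_WATERMARK = 0.80  # assumed threshold for a high watermark event
LOW_WATERMARK = 0.30   # low watermark event, as in the talk's example


def reactive_decision(utilisations):
    """Return +1 (add a server), -1 (remove one) or 0 (do nothing).

    `utilisations` holds one load figure per server, between 0.0 and 1.0.
    """
    average = sum(utilisations) / len(utilisations)
    if average >= HIGH_WATERMARK:
        return 1
    # Never remove the last remaining server.
    if average <= LOW_WATERMARK and len(utilisations) > 1:
        return -1
    return 0


if __name__ == "__main__":
    print(reactive_decision([0.60, 0.60]))        # steady state
    print(reactive_decision([0.85, 0.80]))        # high watermark event
    print(reactive_decision([0.30, 0.25, 0.30]))  # low watermark event
```

A rule this simple reacts to every sample, so on variable traffic it can add and remove servers in quick succession.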
This kind of autoscaling is very good for fast-growth applications, because scaling up is fairly easy to do, and for variable applications as well, which is a bit more tricky, because you can end up flapping depending on the amount of traffic that you get, so you need to be careful with that.
The last one, and this is the fanciest one, is predictive autoscaling. In predictive autoscaling, not only do you know what kind of traffic you're getting through all the metrics you're collecting, but you're also feeding all those metrics to an artificial intelligence engine that will predict traffic for you. In this case the AI engine forecasts that traffic will be plus 30% in the next 30 minutes, and that maybe has a probability of 80%, so it's almost certain that this will happen. So it scales automatically, then you have another server, and you're happily coping with the peak.
This kind of autoscaling is incredibly good for variable traffic, because it's the kind that can actually tolerate the unknown peaks, peaks that you can't forecast.
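To make the predictive decision above concrete, here is a small hedged sketch. It does not implement any forecasting engine; it only shows how a forecast (for example +30% traffic with 80% probability) might be turned into a server count. The function name, the `min_confidence` parameter and the assumption that capacity scales linearly with traffic are all illustrative.

```python
import math


def servers_for_forecast(current_servers, forecast_growth, confidence,
                         min_confidence=0.8):
    """Scale ahead of time only when the forecast is confident enough.

    forecast_growth: expected relative traffic change (0.30 means +30%).
    confidence: probability attached to the forecast (0.0 to 1.0).
    """
    if confidence < min_confidence or forecast_growth <= 0:
        return current_servers
    # Assume capacity needs grow linearly with traffic. Rounding first
    # avoids float noise such as 11.000000000000002 becoming 12.
    needed = round(current_servers * (1 + forecast_growth), 6)
    return math.ceil(needed)


if __name__ == "__main__":
    print(servers_for_forecast(2, 0.30, 0.80))  # confident forecast
    print(servers_for_forecast(2, 0.30, 0.50))  # forecast too uncertain
```

With two servers, a +30% forecast at 80% probability asks for a third server, while the same forecast at 50% probability leaves the count unchanged.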
[inaudible] So now that you know the traffic patterns and the kinds of autoscaling we have, I would like to talk a bit about the kinds of tools that you can use. This is heavily cloud-oriented, but right now in the cloud we do autoscaling a lot better than we do on physical servers, and there are more tools that will allow you, through API integration, to [inaudible] into real application servers. These would be your main players.
Of course Amazon, being the biggest cloud provider in the world, has a solution for that, and so does RightScale. OpenStack has one as well, and so do we at Rackspace, and there's another one from Netflix called Scryer, which is quite interesting to talk about at the end.
The first one is Amazon CloudFormation. This was created by Amazon in order to be able to rapidly deploy new servers and install applications on them, so they are ready for production as fast as possible. On top of that, they added autoscaling groups. When you do that, you create an autoscaling group, you start adding servers into it, and you connect that with a CloudFormation template. This CloudFormation template will define what the new servers will look like. But the normal thing with this is that it is completely reactive, so it will react to high and low watermark events. Whenever there is a high watermark event, Amazon will instantly start a new instance with, say, an AMI image, and will execute the commands that you defined in the CloudFormation template on top of it. As soon as that happens, the server will be added automatically to the load balancer and will be ready for production.
It also supports scheduled events, so a lot of companies use these for developer environments, in which the developers show up at work at 8 AM and leave at 6 PM, so you're able to shut down those machines for the rest of the day. All of this is done using base images, so AMIs: it could be an AMI created from a snapshot, or it could be a base AMI provided by Amazon or any other third party.
At Rackspace we just launched another autoscaling tool as well. This autoscaling tool is also all about scaling groups, because that's easy and that's very, I would say, logical. You create a new server, you add it to the autoscaling group, and then, when you define the high and low watermark events, it will use a snapshot of that server to create new servers. That snapshot can be fixed at a point in time, or it can be the latest snapshot of your server, so you can recover from the latest point in time of one of your servers in that autoscaling group.
The third one is RightScale. RightScale is a tool that was created in order to simplify deployments into the cloud. It was first supported on Amazon, but now it supports everything from Amazon to Rackspace to [inaudible], so it supports any kind of scenario. It's also based on autoscaling groups, because again, that is logical, and it has high and low watermarks. But this is the differentiator in RightScale: when you define the high and low watermark events, every single server in the autoscaling group votes in order to decide if they want another server in the group or not, and that is in order to avoid spikes and flappiness. When the majority of the servers vote that they need another one, the autoscaling trigger will happen and another server will be created. All these new servers are created using a base image that is provided by RightScale, because on top of that they have their own tools, like [inaudible] and the monitoring agent, and on top of that there are the RightScripts that you attach to the template to define the function of those servers. The bad thing about it is that it also costs money, so if your company is tight on resources and you need to put your money somewhere else, it might not be a solution for you.
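The voting idea described above can be illustrated with a short sketch. This is not RightScale's API or implementation; it is a hypothetical function showing why a majority vote dampens the effect of a single overloaded server. The 80% threshold and the name `majority_wants_scale_up` are assumptions.

```python
def majority_wants_scale_up(loads, high_watermark=0.80):
    """Each server votes 'yes' when its own load is over the watermark.

    Returns True only when more than half of the servers vote yes.
    """
    votes = sum(1 for load in loads if load >= high_watermark)
    return votes > len(loads) / 2


if __name__ == "__main__":
    # One hot server out of three: no majority, no new server.
    print(majority_wants_scale_up([0.95, 0.50, 0.40]))
    # Two hot servers out of three: the group scales up.
    print(majority_wants_scale_up([0.90, 0.85, 0.40]))
```

Compared with averaging, a vote ignores how far over the watermark a single server is, so one spiking machine cannot trigger a scale-up on its own.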
One of the last ones is Heat. Heat was created inside the OpenStack project as a clone of CloudFormation, and then it evolved a bit further than that. From Rackspace we also provided some compatibility with an internal product of ours that we have given to the community, which is called Checkmate. So it's compatible with both CloudFormation and Checkmate [inaudible]. It also gives you high and low watermark events, it gives you scheduled events as well, and it uses Heat scripts. These Heat scripts are basically templates defined in JSON, which are very similar to CloudFormation templates. You can define the kind of server you want, the base image you want to use, all the events, and anything else that you need to do in order to get the server into production.
And there's Scryer, which was created by Netflix and was just announced last month. This is very interesting because they are the first ones to actually use AI in production. They used something that is called [inaudible] regression in order to calculate the probability of needing a new server [inaudible]. I think it's based on a Kohonen engine, but I'm not certain, because they haven't published anything yet. What they do here is predict the probability of the traffic.
[inaudible] Again, it's kind of like meteorology: the further away you are in time, the more difficult it is to actually predict with accuracy what will happen, but the closer you are in time, let's say the last 30 minutes before the traffic happens, the probability from your model will go up, up to a certain point where it's almost certain. So you can say that there's a 70 to 80% chance that you will have 10% more traffic in the next 2 minutes. With this, what you can do is stay as close as possible to the traffic and only allocate the resources you need. All of this is fed into the Amazon API. Netflix will try to implement this for other APIs like OpenStack [inaudible].
And of course you can also make your own, because there are all these different tools, but they might not fit your business; they might not fit what you want to do. So what would be the best way to create that? The first step is just to collect your metrics: collect as much as you can, any kind of metrics.
Collect them in the simplest possible way, with collectd, Diamond collectors, whatever the tool of your choice, and have a good metrics database, something that can store metrics for a long time, like RRDtool or like Whisper, which is what Graphite uses, which can store information with a lot of accuracy and a lot of granularity for a long time.
Then of course you need to write your own autoscaling code. My recommendation is to use message queues, because this is the kind of software you want to use message queues for: [inaudible] in an asynchronous way. The other advantage is that this is very close to your business: if you write your own code, it does precisely what you actually need; it understands exactly what you want to do and what your business needs. So it's fairly easy to get the most out of this, but it costs: you need to invest time and money into developing it.
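The advice above about writing your own autoscaling code around message queues can be sketched as follows. This is a toy, single-process illustration: Python's in-memory `queue.Queue` stands in for a real message broker, the metric samples are plain utilisation numbers, and the names `autoscaler_worker` and `run_demo` as well as the thresholds are assumptions.

```python
import queue
import threading


def autoscaler_worker(metrics, decisions, high=0.80, low=0.30):
    """Consume utilisation samples asynchronously and record a decision."""
    while True:
        sample = metrics.get()
        if sample is None:  # sentinel: the producer is done
            break
        if sample >= high:
            decisions.append("scale_up")
        elif sample <= low:
            decisions.append("scale_down")
        else:
            decisions.append("hold")


def run_demo(samples):
    """Push samples onto the queue and return the worker's decisions."""
    metrics = queue.Queue()
    decisions = []
    worker = threading.Thread(target=autoscaler_worker,
                              args=(metrics, decisions))
    worker.start()
    for sample in samples:
        metrics.put(sample)  # the producer never waits on the decision
    metrics.put(None)
    worker.join()
    return decisions


if __name__ == "__main__":
    print(run_demo([0.50, 0.90, 0.20]))
```

The point of the queue is decoupling: metric collectors only publish samples, and the scaling logic can be slow, restarted or replaced without blocking them.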
So what do you do to get the most out of autoscaling? Autoscaling was not invented to make your life easier; it was invented to save the most money, so the less money you spend on servers doing nothing, the more money you can spend on something else, like [inaudible].
Of course autoscaling is dangerous as well; it's a very dangerous thing. [inaudible] In order to avoid that, my recommendation is: whenever you autoscale, please always have minimum and maximum allocation numbers. What you don't want is your autoscaling engine going all the way down to zero, because then you have no servers, but the autoscaler is happy, everything's good. At the same time, you don't want to have a bill of a million bucks from Amazon or from Rackspace, so what you want to do is have a maximum allocation. You want to have a human after that who goes: 'Yes, actually, we are having this amount of traffic, everything's going awesome, so yes, let's allocate more servers.' But be careful with letting the autoscaling engine take those decisions for you, because we have already had some customers at Rackspace who came back to us with a huge bill, crying for help.
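A minimal sketch of the minimum and maximum allocation rule described above: whatever the autoscaling engine asks for is clamped into a fixed range, and anything above the maximum is flagged for a human to approve. The limits of 2 and 20 servers and the name `clamp_allocation` are illustrative assumptions.

```python
def clamp_allocation(requested, minimum=2, maximum=20):
    """Clamp the requested server count into [minimum, maximum].

    Returns (allowed, needs_human_review). The flag is set when the
    engine wanted more than the maximum, so a person can decide whether
    the extra spend is justified.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    allowed = max(minimum, min(requested, maximum))
    needs_human_review = requested > maximum
    return allowed, needs_human_review


if __name__ == "__main__":
    print(clamp_allocation(0))     # never scale down to zero servers
    print(clamp_allocation(5))     # within limits, passes through
    print(clamp_allocation(1000))  # capped, flagged for review
```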
My recommendation is to stay with the basics. Autoscaling can get very complex very fast. If you start throwing in all kinds of business metrics, all kinds of [inaudible], your model will deviate and will do crazy things; it might be right, but most of the time it will be wrong. So stick to the basics: stick to CPU, memory, [inaudible] and network I/O, because those are the ones that will help you to scale the best. Then, on top of that, maybe what you want to do is add your business metrics, but in a manual way, so that you can review the autoscaling mechanism and see if that works for you or not.
As I said, it's very important to keep reviewing your autoscaling, because if you let it go, it will do awesome things for you, but it will also be a waste of money for you. So keep reviewing your autoscaling mechanisms, keep having meetings where you go through the metrics; sit down and check them to make sure that they fit your business and that the gap between your real traffic and the amount of servers you have allocated is as narrow as possible. That's the game: trying to get that gap as narrow as possible in order to use all your resources for what they are supposed to be used for.
That said, there are some recommendations from Netflix; I've seen the same kind of pattern with Rackspace customers as well. It's never dangerous to scale up, unless you scale up too much and you are concerned about the bill, of course; it always helps you.
The important thing there is that the phase-in time always plays a factor. And when you scale down, scale down slowly, because when your traffic picks up very rapidly, it's possible that it goes up, takes a break, and peaks again. So scale down slowly, so that you don't have to shut down and start servers again, because that always has a cost.
Also, don't apply the same kind of autoscaling to all your applications. If you have five different applications, you need to review each one of them. Don't use the same for everything, because it's not one-size-fits-all; you need to make sure that your autoscaling fits your application.
Phasing in and phasing out in autoscaling is very important and is one of the most dramatic things. Whenever you phase in, there is a certain amount of time that it takes from the moment you say 'I need a new server' to the moment you have a functional server, and that amount of time varies widely.
It's the amount of time that your provider or your platform takes in order to install the OS image and get the server up; it's the amount of time that it takes for that server to go from a plain OS image to being in the same state as all the rest of the servers in production; and it's the amount of time it takes for the load balancer to route traffic to that server. So that time is very important, very crucial for you. That's why a predictive model works so well: if you know that it will take you 5 minutes to get a server into production, and you can predict the probability of your traffic 30 minutes in advance, that gives you 25 minutes to account for.
You also need to have in mind the decommission time. Decommissioning a server is not always easy. You have lots of unique things on a server: you might have sessions, and you certainly will have lots of things that exist only on that server, like traffic logs and any other kind of logs, so you need to take them off the server before shutting it down. You might also have other kinds of things that are unique to that server and need to be taken out before it's destroyed. So have in mind the decommission time as well, because it plays a very important part, and it's actually what makes scaling down the most difficult thing.
Sometimes golden images can help, up to the point where you want faster deployment. The problem with golden images is that if you keep using a fixed snapshot, that snapshot, fixed in time, will drift from the current state of production, so that image will in time be slower than actually deploying from a base image. But if you keep taking snapshots and use the latest snapshot, you might also incorporate corruption into it. So it's a very tricky balance, and you need to make sure that whatever you do, it's something that will ensure that there's no corruption in your image and no corruption in your new servers, even if that means that you need to add another check before the server goes into production. It's always more important to be able to serve traffic right than to be overwhelmed by traffic.
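The phase-in arithmetic mentioned earlier (a 30-minute forecast horizon minus 5 minutes to get a server into production leaves 25 minutes) can be written down as a small helper. The split of the 5 minutes into provisioning, configuration and load balancer registration is an assumption made for illustration; only the totals come from the talk.

```python
def phase_in_minutes(provisioning, configuration, load_balancer):
    """Total time from 'I need a new server' to a server taking traffic."""
    return provisioning + configuration + load_balancer


def reaction_margin(forecast_horizon, phase_in):
    """Minutes left to decide after subtracting the phase-in time.

    A negative result means the forecast arrives too late: the new
    server would not be ready before the traffic does.
    """
    return forecast_horizon - phase_in


if __name__ == "__main__":
    # Assumed split of the talk's 5 minutes: 3 + 1 + 1.
    phase_in = phase_in_minutes(provisioning=3, configuration=1,
                                load_balancer=1)
    print(reaction_margin(forecast_horizon=30, phase_in=phase_in))
```

Measuring the three components separately for each application shows where the phase-in time is actually spent.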
And that's my presentation. Any questions? [inaudible]
So the question is [inaudible] autoscaling [inaudible]. So you have just one server in the autoscaling group, and if that server [inaudible], the autoscaler will create another one for you, so you always keep the group in shape with just one server. [inaudible] It's a good way to save money, by letting Amazon or Rackspace do the work for you [inaudible]. Also, if that happens, [inaudible] the server goes down and it takes some time to get a new server up [inaudible] for 5 minutes. Thank you. [inaudible]
[The remaining audience questions and answers are inaudible in the recording.]