Last- und Performancetests in der Cloud?! (Load and Performance Tests in the Cloud?!)

Video in TIB AV-Portal: Last- und Performancetests in der Cloud?!

Formal Metadata

Title
Last- und Performancetests in der Cloud?!
Title of Series
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
The Cloud™ is infinite and scalable. Period. So why is it still important to test the performance and scalability of cloud-based systems? Doesn't my provider scale my system for as long as I can afford it? Yes, but… Cloud providers primarily scale resources. They do not automatically ensure that applications are fast, stable and, much more importantly, scalable. Performance tests are an important instrument for understanding a system and its runtime environment.
Welcome. My name is [name unclear in the recording], and I'm going to talk about load and performance tests in the cloud. I would like to give a brief overview of what I actually mean by that, and why and how you can do performance tests there.

A bit about me [partly unintelligible]: for the last seven-something years I've done lots of consulting and development work with a strong focus on performance, software architecture and system architecture. This ultimately led to the founding of [company name unclear] a couple of years ago, where we build tools, a platform and services around load testing and performance testing of, basically, HTTP-based systems.

Before we actually dive into the topic of performance testing and load testing, I would like to give some definitions of the basic words that are involved in this topic. The first one is performance. It is quite interesting that performance is often understood in multiple ways and also used interchangeably with other terms that we will come to, and I just want to make sure that we are on the same page about what we are actually talking about.

So performance is the ability of a system to fulfill a task within well-defined dimensions, and this is basically efficiency. The task could be a transaction or a request or something like that. The dimension could be time, then we get something like response time, or it could be memory usage, disk usage, or even money: you could also define performance in terms of how much it costs you to serve a specific transaction. So a statement like "one server can do 250 transactions per second within defined quality criteria" would be a statement about the efficiency of the system, or
about the performance of the system. This is heavily simplified, of course.

The next term I would like to talk about, which is often used interchangeably with performance, is scalability. Actually, the two are not the same; they are not really comparable to one another. Where performance was the efficiency of the system in fulfilling a task within a defined dimension, scalability describes how effectively you can grow the capacity of the system by adding resources: the degree to which you are effective in translating resources into capacity. If you take the statement from before, "one server does 250 requests per second within a defined quality range", then the statement "ten servers can do tenfold that" would describe very good scaling: 100 percent of the additional resources are translated into additional capacity for the system, or throughput in this case. There are different mathematical models and categories to describe scalability, which I won't go into. This is just to give you a good distinction between what performance means and what scalability means, because that will become important later in this talk.

I would like to pose a question that was asked by Jonas Bonér (I'm not sure how to pronounce his name). He is the founder and CTO of Lightbend, the Akka framework; maybe that rings a bell for someone. He gave a presentation many years ago where he asked two really nice questions. The first one was: how do you know that you have a performance problem? And the answer is: if your system is slow for a single user, then you have a performance problem. And the next question he asked in this presentation was
obviously: how do you know if you have a scalability problem? And the answer is: if your system is fast for a single user but really slow under heavy load or high traffic. This is a really nice illustration of the core difference between performance and scalability.
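To make the distinction concrete, here is a minimal sketch of how the "ten servers can do tenfold" statement could be expressed as a number. The function name and the second measurement (1,600 requests per second on ten servers) are assumptions made up for illustration; only the figure of 250 requests per second for one server comes from the talk.

```python
def scaling_efficiency(base_resources, base_capacity,
                       scaled_resources, scaled_capacity):
    """Return the fraction of the added resources that became added capacity.

    1.0 means every additional resource contributed as much capacity as
    the baseline resources did (linear scaling); lower values mean that
    part of the added resources was lost to overhead or bottlenecks.
    """
    added_resources = scaled_resources - base_resources
    if added_resources <= 0:
        raise ValueError("scaled_resources must exceed base_resources")
    capacity_per_resource = base_capacity / base_resources
    ideal_gain = added_resources * capacity_per_resource
    actual_gain = scaled_capacity - base_capacity
    return actual_gain / ideal_gain


if __name__ == "__main__":
    # From the talk: 1 server handles 250 requests/s, 10 servers handle 2,500.
    print(scaling_efficiency(1, 250, 10, 2500))
    # Hypothetical measurement: 10 servers only reach 1,600 requests/s.
    print(scaling_efficiency(1, 250, 10, 1600))
```

Performance work changes the baseline figure (the 250 requests per second), whereas scalability work changes this ratio.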
Robert Johnson, who was the Director of Software Engineering at Facebook, wrote an interesting article on the Facebook engineering blog, in 2010 I think, when Facebook was really small, like 500 million users; they had just reached 500 million users. He talked about how they do performance optimization projects and scalability improvement projects at Facebook, and he made a couple of really interesting statements. The first was that scaling usually hurts performance, so the two contradict each other. He also said that efficiency projects (efficiency means performance) rarely give you enough improvement to have a big effect on scale. So they are reaching the area where it is more effective overall to have a better-scaling system than a better-performing system; they are sacrificing efficiency for scalability. Another quote was that efficiency is important too, but they think of it as a separate project from scaling, so they separated this completely.

Next: performance testing. We now know what performance is and what it is not, so let's take a look at what performance testing actually means. This slide is completely in English because the English Wikipedia article is much better than the German one. The article on software performance testing, I think it is, has a really nice description: performance testing is a testing practice in general where you determine how a system performs under a particular workload, and you take a look at the responsiveness and stability of the system that you are testing. To put it in other terms: this is a category of testing methods
and testing practices, and they all have in common that they induce a well-defined workload on a system that is under test, the SUT or system under test. You do that in order to observe the system's behavior and to verify its performance-related characteristics. For example, if you need to guarantee certain service levels, you can use performance tests to verify those characteristics. You also want to do performance tests simply to understand the internal behavior of the system that you are testing.

There are lots of categories or testing methods that could be summarized as performance tests, and they are not all very well defined. Often it is not as simple as it sounds to say "OK, this is a stress test, this is a spike test". It is more about the goal that you want to achieve when selecting the right testing method in your case. We will go into some of those testing techniques later in this talk.

Now we have one final piece to make the puzzle complete: we have to talk about this cloud thing. I'm not here to sell you the cloud; I just want to describe what we have. From all those cloud vendors that are available to us we basically get infrastructure as a service, so we get networks, compute, storage and all those things. We get platform as a service sometimes. But the most important thing is that we get APIs and automation across all those services and components, and we get this on demand, which makes it really easy to achieve a cost-effective and scalable system.

So now to the actual topic: what about performance tests in this cloud? You could now ask the question: why is this now
relevant, because I just told you that the cloud is scaling for you: you just buy more stuff, swipe your credit card once more and get more servers, whatever.

The obvious problem with that is that scaling resources doesn't necessarily mean that your system scales. This is only true if you have a very well-designed system architecture that powers your entire application. To give you a really stupid and simple example: maybe you are running in the AWS cloud, if you're familiar with it, but it doesn't really matter. Maybe you have a load balancer that scales automatically for you; there are pitfalls there as well, but in general you're good. Then you have your application servers, whatever tier that is, provisioned in an autoscaling group, which means that it will automatically add more resources if you have higher throughput or higher load on your system. And then of course you have some sort of persistence layer behind that. If this part is scaling well and that part is scaling well, and you have only one master server, for example, then things will break eventually. It is maybe too simple an example, but it is just to give you an idea that you can't just ramp up your resources and automatically get more capacity out of the system.

But if you are a bit familiar with the cloud services that are available today, then you might think: what about all those fully managed services, where the provider actually cares about everything? You just say what you need and how much you need, you pay for it, and they manage basically all the provisioning of resources below that higher level. An example for that would be AWS Lambda or its counterpart elsewhere, I think it's called Functions, but I'm not really sure; there's
DynamoDB as a database and data store, and there are event streams and queues and everything. Those services aren't really resources; they are high-level services that are managed by the cloud provider for you.

So what is the problem in this case? Basically it boils down to complexity. If the parts of a system interact with each other to a high degree, then the system is complex; this is something that I've taken from the field of physics. So the problem is complexity, and the complexity hasn't simply vanished. It's not that you magically move from your on-premise data center to the cloud and everything is simple and easy. It might be easier to get started and to build more sophisticated systems, but in the end either you have some additional complexity there, or your provider is managing the complexity for you. So some of the complexity is hidden, and other complexity is shifted from your side, from your operations team for example, to the teams at AWS or wherever. And it turns out that this complexity often has a non-trivial impact on the performance characteristics of your system. This is especially true for all those fully managed services that are available.

The only thing that you can do to deal with this complexity is to build up a good understanding of what you are actually dealing with. First of all, you have to know your own application, your own software architecture and system architecture that you are responsible for and that you are designing, in terms of its performance characteristics. You also need to know your runtime environment, and
in this example that would be the cloud runtime environment that you're using. This also extends to all those services that you're utilizing, for example from the AWS cloud platform or any other cloud vendor, and to all those other third-party vendors that you're using: logging as a service, databases as a service, there are so many as-a-service things that you could possibly use. You need to have at least a basic understanding of what's happening there when you have a certain traffic scenario or load scenario on your system.

So, to summarize, it is quite obvious that you should of course also conduct performance tests and load tests and all those kinds of testing in the cloud. I would now like to go briefly over some of the more important testing methods and take a look at what they actually mean, what you can achieve with them, and what is particularly important when it comes to doing load tests in the cloud.

[Audience question, unintelligible in the recording.]

Let me quickly repeat the question and then move the discussion to later. The question was how you can build up an understanding of a system that is not really visible to you. OK, but let's postpone this discussion until later.

First off we have load testing, which is maybe the
simplest form of a performance test. You induce a normal or expected workload on the system, and you take a look at the latency, the throughput, or whatever quality criteria are important to you. You usually do that in order to verify non-functional requirements or to see if you are able to hold your service-level agreements. That's basically what I wanted to say about load testing; the other testing methods are a bit more interesting.

Stress testing, for example, is basically a load test, but you are now explicitly going beyond the normal or expected workload that you expect to see on your system. You do that to see how the system behaves at its design limits, for example, and to understand how the system behaves at those limits. You can also utilize stress testing, or a series of stress tests, to figure out what the capacity of the system actually is: you increase the traffic over time and then see the point where you violate your quality criteria, or maybe where your system eventually doesn't serve any requests at all because the server died. As I just told you, you have to define your quality criteria, that's basically the same as with a load test. Then you steadily increase the traffic in multiple phases, for example, and take a look at when you are violating the quality criteria that you defined.

When you conduct a stress test, it is really important that you have the ability to take a deeper look at your system. You should have all your monitoring tools and profiling tools available so that you can actually learn something when you are inducing the traffic into the system. And you can use
a stress test not only to see what the capacity is, obviously, but you can also start to identify the next bottlenecks, for example if you want to push the boundaries even further; then you need to have some data, an idea of where to look next in order to improve the performance of your system. It is also a good tool to determine the capacity per resource: you can do a stress test and just use one application server, for example, and then see how many users or how many requests or how much X you can handle using that particular resource.

With that idea in mind you can basically do scalability testing. You now change the perspective to: how effectively can you translate more resources into more capacity for your system, more requests per second, more users and so on? This is basically the foundation for capacity planning and cost estimation. Maybe you are a startup and you haven't even launched a product, but you expect hockey-stick growth and you know that you will have tenfold the users per month or so; then you need to know what it will cost to handle these growth scenarios.

What you do for scalability testing is basically a series of stress tests. You say: OK, you have maybe five resources, like five servers, and you measure when you begin to violate your quality criteria. That point is the capacity, maybe in requests per second or concurrent users or connections, it doesn't really matter. Then you add more resources, maybe more application servers, and you get maybe roughly double the throughput. You do it again and add more resources, and over time it will most certainly flatten out. This is now a good basis to see: OK, maybe we just need to go into that area, and we are basically good because it works as
we expected it to work: we have almost linear growth, perfectly fine. But if we need to go here, then we might have a problem, and we might actually need to act immediately to fix this scenario.

To give a comparison: I initially talked about performance versus scalability, and that Facebook actually separates this into two different projects. So what would happen if you increase the performance by 10 percent? Then you get basically the same curve, but 10 percent higher. What happens if you fix the scalability problem? Then in the beginning it may be more or less the same, but the more resources you add, the more effect the scalability project has on capacity. I'm not saying that you should only focus on scalability, because most of us don't really run such a big system that this would be a problem. But it is important to see the difference so that you actually know what you're in for when you are working on performance or on scalability. That's about it.

Next up we have spike testing. Spike testing tries to answer the question: how does your system behave under extreme load spikes? You want to know whether you can utilize the elasticity of the cloud well enough, and whether you can react fast enough to sudden changes in the traffic pattern that you're seeing on your system. And there are
several occasions where you can actually plan for those scenarios. For example, the marketing division has the crazy idea to send a push notification to half a million users at the same time, and maybe you want to prepare for such a scenario, or talk them out of it and distribute it more over the day. Or there is an e-mail campaign, or advertisements bought on TV, or stuff like that. Or maybe you are about to release a big feature and the marketing division says "this will go viral, 100 percent", and then you need to be prepared for those scenarios.

Basically, what you're doing is running a load test, but you compress the traffic into a very sharp spike. You then take a look at how well you can absorb this spike, or how the system fails, which is really good information, especially so that when you are in this situation your ops people actually know what to do and how to mitigate such a sudden increase in traffic.

Then we have soak testing, sometimes also called endurance testing, which is kind of the opposite of a spike test. You basically want to know how the system behaves under maybe a normal load situation, but for a very long time. This is basically a long load test, and the definition of "long" is up to you, but normally that means many, many hours, maybe even days. It kind of depends on what application you're looking at. If it's an application that you are only deploying once per quarter, then maybe you want to run longer tests, because you know that your systems are running for longer periods of time. But if you are deploying like crazy, like 20 times a day or so, then maybe it is not that important to ensure that you can run the system for many days without any memory leaks, or disks filling up, or whatever. And it is also quite
nice for performance troubleshooting in that area. I worked on an ad server some time ago, and we had really strange situations where, about a day or so after a deployment, we suddenly saw strange CPU spikes on those ad servers. We then basically did an artificial test that was aimed at a specific code path, and we just ran it for like 10 hours, making many, many billions of requests, and finally saw what the problem was. This is what I meant by those testing methods not being really clearly distinguished: it's kind of a soak test, but also kind of a performance troubleshooting test. I think you get the idea.

OK, next up is, I think, the most important testing method when it comes to the cloud, and this is configuration testing. Configuration testing changes the perspective to: what kind of changes do you see in the observable behavior of the system when you are changing the environment? You are not changing the test; you are now actively changing the system that you are testing, and you run the same test over and over again to get a comparison between multiple sets of configurations. Can we do this all the time? I will come to that, give me a minute. This implies that you have to do a series of tests, obviously, because with one test you can't compare anything to anything. It is a really nice technique to learn about the environment that you are running in.

To give you some examples of what I'm actually talking about when I say "configuration", let's go from top to bottom in the cloud. This would be starting with instance types, for example: you have compute optimized, memory optimized, I/O optimized and whatever, in a variety of
different sizes and performance classes. This is, I think, the most practical example of what you want to do when you're running in the cloud and want to do configuration tests. Maybe what you want to do is increase the throughput, but maybe you also want to do this in order to get roughly the same throughput at a much lower cost, which would be optimizing the cost efficiency in that regard.

Then we have many other services; I just took a couple of examples here. Take the autoscaling configuration, for example. You basically define a group of servers, and then you define scaling policies: when to add more resources, how long to wait before adding even further resources, when to scale down, how long it takes to boot up those new instances before they become available. There are many, many parameters that go into how to configure an autoscaling group, so maybe you want to know how this behaves when you are actually using it, beyond reading the documentation and clicking buttons.

Then there is throughput provisioning. What I mean by throughput provisioning are those managed services that I talked about earlier, where you basically say: OK, I want to have this event stream and I need 20 megabits of throughput there. You can roughly model that against your business logic, but sometimes you forget to model bugs, for example, and then you see that you are rejecting requests.
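The instance-type comparison described above can be sketched as a small calculation over the results of a test series. This is only an illustration under stated assumptions: the configuration names, hourly prices and throughput figures are hypothetical, and `max_throughput` is assumed to be the highest request rate each configuration sustained in the same test while staying within the quality criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigResult:
    """Outcome of running the same test against one configuration."""
    name: str
    hourly_cost: float      # hypothetical price per hour
    max_throughput: float   # requests/s sustained within quality criteria


def cost_per_million_requests(result):
    """Cost efficiency of a configuration when running at its capacity."""
    requests_per_hour = result.max_throughput * 3600
    return result.hourly_cost / requests_per_hour * 1_000_000


def cheapest_meeting_target(results, target_throughput):
    """Pick the lowest-cost configuration that still reaches the target.

    Returns None when no tested configuration reaches the target.
    """
    candidates = [r for r in results if r.max_throughput >= target_throughput]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.hourly_cost)


if __name__ == "__main__":
    # Hypothetical results of one configuration test series.
    series = [
        ConfigResult("compute-optimized", hourly_cost=0.20, max_throughput=400),
        ConfigResult("memory-optimized", hourly_cost=0.25, max_throughput=410),
        ConfigResult("general-purpose", hourly_cost=0.10, max_throughput=380),
    ]
    for r in series:
        print(r.name, round(cost_per_million_requests(r), 4))
    best = cheapest_meeting_target(series, target_throughput=350)
    print("cheapest for 350 requests/s:", best.name if best else None)
```

The point of the sketch is that the comparison only makes sense because every configuration was measured with the same test; the selection logic itself is deliberately trivial.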
The main problem is that you forget to model things, or that you model issues that are not there by design. You should always go ahead and run a dynamic test to figure out whether you are actually right about your own assumptions. The next point would be: are you using those services that are offered by your cloud environment, or maybe by other vendors, in the right or optimal way? Not only in the cloud but basically everywhere, there are many pitfalls that you can run into when you start to look at how those services and systems behave under load.

And the list goes on. Obviously this is not so much cloud-specific, but you have the hypervisor most of the time when you run in a virtualized environment, and on AWS you have some little knobs when it comes to the hypervisor that you can decide on. Then you have the operating system level, obviously: network tuning, kernel tuning, all the settings that are available to us. You have your web server and application server stack, where there are many configuration options, versions and dependencies that you can compare to one another, but also software configuration like database connection pools, timeouts and so on and so forth. Maybe even software dependencies, even things that you just use but don't manage yourself, can have a significant impact on the performance of the system. With configuration testing you can simply compare one version to another version, or one TLS library to another TLS library, and stuff like that.

Configuration testing is actually something that we do 100 percent of the time when we do consulting work for our customers. It is the most important technique to help them improve their performance characteristics.

Next up is something for which I haven't actually found an established term; I'm
not sure if I missed something. I like to call it availability or resilience testing, and it is a little bit inspired by the principles of chaos engineering, if you have heard about them. That's from Netflix; they published a manifesto, I think, where they talk about how to really make sure that your system is as resilient as possible.

Anyhow, all those things I talked about before are things that you have directly under control. Maybe you also have under control when and how you deploy, but most of the time you forget that sometimes you have to deploy even when the system is under heavy load, for example if you need to roll out a fix because something is broken. Then maybe you want to do that without downtime, and you have to answer the question: are you really sure that you can run a zero-downtime deployment under heavy load? This is something you should at least think about testing, and about verifying those processes as well.

The next thing is that when you're running in a cloud environment, you are basically confronted with constant changes to your infrastructure. There are automated tools that spin up new servers and shut them down again, so you need something like a service discovery tool, and you want to be reassured that you see the changes fast enough so that you don't run into any problems there. These are all scenarios that happen all the time, but most of the time it is simply forgotten that you should not only try this out on your dev environment and see, OK, the new server is going up, the service is available, and everything's good. Some things suddenly change when you are seeing a lot of load on your system.

This goes on for failure scenarios and failover: verifying that your failover mechanisms are actually working. What's
happening when, I don't know, the connection to your caching server gets slow or drops packets, or one of the database read slaves suddenly dies? Are you able to cope with those scenarios when you are confronted with a high-traffic scenario?

OK, then I would like to raise a question: is this any different from what we actually did, or had to do, before we were able to run those in the cloud and no longer had to care about the servers ourselves? I would clearly answer this question with a yes and no. I think the necessity and the requirements of the testing methods haven't really changed. Basically this has been around for decades; everyone knew what performance tests and load tests and such things were a long time ago. What has really changed is the ability, or the possibility, to run those tests in the cloud context.

The most important thing to keep in mind here is that test environments are really interesting when you take a look at the cloud and what it actually provides you with. If you really utilize all the APIs and automation possibilities, then it suddenly becomes really easy to provision test environments: not just QA environments, but really production-grade performance test environments, or maybe even environments beyond the production environment, to see if you can handle larger and larger amounts of traffic.

By test environment I mean everything, from infrastructure to services, configuration, and code deployments. If you are able to automate this, then it might be really easy to spin up the performance test environment in the morning, and afterwards maybe wait an hour or so to load all the data into the environment, but then you have an environment that you can actually work and play with. And if you do that
correctly, then you can do this a lot more cost-effectively and flexibly than we used to.

I have one more quick question: is anyone here working in an environment where you have one perf test environment, something like a QA environment for performance? One hand, two hands, three. OK. Does anyone have three such environments, or more than three? OK, nice. So, yeah, good.

The problem usually is that performance environments that are capable of handling this traffic and are comparable to a production environment are really expensive, because you have to buy lots of resources to mirror the production environment. And if you have to do this two, three, four, or more times, it gets even more expensive. This is quite a mismatch, because you won't have, say, ten environments of this size, and then you have to wait in line before you can actually go and do the performance test of your feature, or your product, or the service that you are launching.

I know it's really hard, and maybe it's just a vision, but let's imagine that you can provision such an environment on
demand, for the period of time that you need it, and then shut it down afterwards, and save the lots of money and time it takes to manage the systems and keep them up and running 24/7, even if you're not using them 24/7.

OK, but there is one remaining problem, or one challenge, which is the ability to reproduce your test environment perfectly. Even if you can automate all of those provisioning steps, like creating the infrastructure, networks, firewall rules, servers, and services, you have this nasty little thing called state in your system. You have product databases, for example, but you also have state in terms of caches: file system caches, application caches. There are many areas where you have state in your system, and for me it's pretty much an unanswered question how you can deal with it in an elegant way, so that you can put your system into a state that is a good starting point for comparable performance tests.

And how do you manage that test data? It's quite a challenging task. From our experience it is often unclear whether you can use production data or have to use fake data, and whether fake data is comparable to production data or is too optimistic or too pessimistic. There are many challenges in how to create or deal with test data and how to manage it. Maybe the data schema changes for one feature, and then you have to have a good mechanism to identify those changes so that the data stays comparable from one environment to another. And how do you automate all that state handling and data handling? I don't have a good answer for that, and I'd be happy to hear what you think about it.

I'm almost done, so a quick recap. I talked about how scaling resources is not the same as scaling applications, and I think the most important reason why this is
still the case is simply complexity. It's still there; it may be hidden from us, and the only way out of this problem is building up a better understanding of how our systems behave. I would like to encourage you, if you are already running stuff in the cloud, to think about taking the same work that you do in provisioning your production environments and just try to apply it to performance testing environments as well.

In the end you want to have this cycle, right? It's not only for software engineering or development; it should also apply to all those non-functional criteria that we have just talked about. You want to design something, you want to implement it, maybe in code, maybe in infrastructure, you want to measure it, you want to validate it, and you want to do it over and over again. So this would be the end. Thank you.

[Audience question, largely unintelligible]

So the question is that hosting providers are switching from servers to services, and you actually see a variation in the performance of this. OK. How can I answer that? What was the actual question, sorry?

[Audience member repeats the question]

OK, so you are now going from renting a server to renting a managed database, and you now see that the quality or the performance of the system isn't as stable as it used to be when you were renting a server and managing the database yourself, right? OK. I think it doesn't really matter what kinds of tools you're using. There isn't any difference between testing your own server with a database that you manage yourself and testing a database
that you basically rent, because in the end the basic way you talk to it is just the same, so you can basically use the same tools to do that.

And in the end, your statement that you are seeing different performance characteristics in those systems is a good argument for running these performance tests in the first place, because you want to see these performance differences. For one, you could tell your database provider that every night, in this time range, the performance degrades, and ask what is happening on their end, and you can prove it because you have a series of tests. Or maybe you are having some configuration problems, or you can do some configuration tuning, which would be the case for a configuration test where you compare multiple settings. Or maybe you can provision more resources for your managed database. So that would be the ground from which you build up a better understanding of how your managed database behaves.

[Audience question, unintelligible]

So the question is how to actually deal with traffic spikes in a cloud environment: whether you should go for scaling versus, or maybe even combined with, over-provisioning so that you have enough capacity. And the other comment was that most of the time you can't really predict a traffic spike. This is true, sometimes. But especially when we are asked to test things because our customers plan a big marketing campaign or a coordinated cross-media campaign, then you can sort of plan and guess what the traffic spike
will look like if it really happens. But otherwise, no, you can't really fully prepare for that, because it's a big unknown.

If you are expecting such a traffic spike, then in most cases solely relying on scaling won't really work, in my experience. First, it's really hard to get all those scaling policies right and to optimize your server images in a way that they actually boot up really fast and are ready to serve traffic fast enough, and you rarely see that this is the case. And you see that all those services at AWS are designed more to run at a bigger scale: you have not only one instance per availability zone but multiple ones, and you have to have some
degree of free capacity, or headroom, in order to absorb the incoming wave and to give you more time to spin up more instances. If you are expecting such a spike, then you can basically prepare for it and just ramp up your desired capacity so that you can absorb the traffic better. Otherwise, yeah, there's no perfect general solution to that particular problem.

[Audience question, unintelligible]

Let me check if I can summarize the question. You are having trouble with the argument that you should run performance tests against the cloud if you have an environment that is changing, not only because you are changing it but because the provider itself, or other customers that are using this infrastructure, are having an influence on your system as well, right? So the noisy neighbor problem, for example, and stuff like that. OK.

This is quite interesting, and I actually get this question a lot. So, some ground truth: I'm particularly familiar with AWS, so I'm not sure about all the other cloud environments out there, but for AWS I can say this. If you avoid some very basic stupid ideas, like using those micro instances for your production traffic, which have what I think are called CPU credits or so, so that for a short amount of time they can do a lot better in terms of CPU performance and then for the rest of
the hour you're throttled to some low level, whatever it is (in general a really bad idea), then most of the time the noisy neighbor problem is actually not such a big thing, as long as you are using instance sizes that are typical for a production environment.

For example, when it comes to network performance, we utilize the AWS network quite a lot, because we are running tests from a couple of megabits per second up to the double-digit gigabit-per-second range, and we rarely see a big variation in the network performance or packet performance, actually. So if you are using HVM virtualization, for example, and use the correct settings on your own servers, we rarely see that big of a change in the general performance.

If you are running, I don't know, a mission-critical stock exchange application, then maybe you don't want to run it in the cloud in the first place. But for most applications the performance is quite predictable for a cloud environment, at least for those mature providers. This is at least our observation and the observation of our customers. So that would make it quite comparable, at least within a specific margin of error.

One more thing: normally the biggest performance issues are problems that are configuration-based, or just development stuff, engineering stuff, bugs that were introduced into the systems. So normally you have to work through a lot of stuff before you actually reach the area where you need to look at the performance of the underlying platform.

[Audience question, unintelligible]

So the question is about the order in which to perform those
methods. The order is not strict. Normally we just do a quick load test to start, just to see roughly whether you reach the area that you are actually aiming for. Then it highly depends: if you are good, then we directly run a scalability test, or a stress test, to determine the capacity and to see if your system scales beyond your limit. This is what customers request most of the time when they are running in the cloud.

Configuration testing is then used as a troubleshooting or debugging tool, or to quickly verify a change you are making in order to optimize the performance. You saw in the load test that you have a big problem here, you have an idea, and you apply a fix; then you basically want to do a configuration test with that one version, maybe the feature branch that you're testing. Usually, if you are using a tool like the one we build, you just hit the play button, run the test again, and compare these things to one another. But even though I said that configuration testing is the most important and interesting thing, most of the time it isn't really what we do first, because you first have to see whether you are reaching your goal, at least roughly.

[Audience question, unintelligible]

Obviously I use my own tool most of the time, and I'm pretty familiar with that, which is a cloud-based performance testing tool, and I have to look into JMeter from time to time because customers are using it. There are lots of other tools out there. In the end, and I said the same kind of thing before, it doesn't really matter what tool you're using, at least not at first, because it's much harder to get an idea of what you want to learn, what you want to test, how you want to test, how you organize those tests, who is responsible for testing, and who should be present when you are doing a larger test. Those are all things that are much harder to start with, compared to picking the ideal tool for your solution. So I would say it is more important to get started quickly and to have the first results quickly, so that you can iterate on that. And if you need to switch tools, that is something that, in my experience, comes a little bit later on. Was there something in your question that wasn't answered? OK.
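
As an illustration of the configuration testing described in the talk (run the same load against two versions or settings, then compare the results within a margin of error), here is a minimal sketch. It is not the speaker's tooling: the function names `percentile` and `compare_runs`, the 95th-percentile choice, the 5% tolerance, and the sample latencies are all assumptions made up for this example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of latencies (pct in 0..100)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_runs(baseline_ms, candidate_ms, pct=95, tolerance=0.05):
    """Compare two test runs at one latency percentile.

    tolerance is an assumed margin of error: relative changes smaller
    than this are treated as noise. Latencies are assumed to be > 0.
    """
    base = percentile(baseline_ms, pct)
    cand = percentile(candidate_ms, pct)
    change = (cand - base) / base
    if change > tolerance:
        verdict = "slower"
    elif change < -tolerance:
        verdict = "faster"
    else:
        verdict = "comparable"
    return {"baseline": base, "candidate": cand,
            "change": change, "verdict": verdict}


# Hypothetical response times in milliseconds for two configurations.
run_a = [100, 110, 120, 130, 400]
run_b = [90, 95, 100, 105, 200]
print(compare_runs(run_a, run_b)["verdict"])
```

In practice the two sample lists would come from two runs of the same load test against environments that differ in exactly one setting, so that any difference beyond the tolerance can be attributed to that setting.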