The Art of Performance Evaluation

Ottawa, Canada

"Contrary to common belief, performance evaluation is an art." (Raj Jain, 1991) Successful performance evaluation cannot be achieved by merely executing common benchmarking tools. This talk presents fundamental principles of performance evaluation and how you can put them into practice. Do you understand what exactly "pgbench" does? Is it an appropriate workload for your performance evaluation goal? Common benchmarking tools like "pgbench" are handy for just comparing system A and system B, but if you intend to deeply understand the performance of your system, answers to these questions are critical. In order to conduct a meaningful performance evaluation, the methodology should be elegantly designed to meet the goal of the evaluation: choose metrics for the goal, and choose observation techniques for the metrics. Each step requires careful consideration and deep knowledge of the target system. It cannot be done mechanically. This is why performance evaluation is an art. This talk presents principles of designing performance evaluations and shows how you can put them into practice by introducing the speaker's experiences of performance evaluations with PostgreSQL.
So today I'm going to talk about the art of performance evaluation. I'm a researcher at the University of Tokyo, and what I'm doing there is database systems research. This is going to be a somewhat conceptual and opinionated talk, so I hope you enjoy it, and I'm happy to hear your opinions on this topic.
Let me start with this question: what is the goal of information technology? What is information technology?
My answer is that information technology is transforming computing power into business or social values, or you can just say user value. This is a tabulating machine, the first automatic data processing machine in the world. It was built for the national population census in the United States. Population is very fundamental information for a nation, and this tabulating machine transformed machinery power into valuable information.
Performance is the key criterion of this transformation. Computing power is not necessarily a machine: the national population census was historically done by humans, and the tabulating machine greatly improved the performance of the processing. So the evolution of information technology is actually the history of the improvement of this transformation.
Today's technology trend is still basically about the transformation of computing power into user value. Big data is a very good example of that: it is an attempt to extract valuable information from massive amounts of data. The Internet of Things, or cyber-physical systems, is an attempt to capture the real world with many sensors and do something interesting with it. So today data is the source of social value, and performance is becoming more and more important.
From a technical perspective, the database system is a very important foundation for transforming processing power into user value. Database engineers are very important today because they know how this works, so very deep knowledge of the database system is directly connected to performance.
Performance study of a database system is really interdisciplinary. You have to know how hard disks work, how SSDs work, how processors work, how memory works, how networks work, and of course you have to know how applications work and even how the whole business works. So it really ranges from the physics of hardware to the logic of applications. Every component technology is changing and growing very rapidly in this field of IT, and sometimes the architecture of computing systems also changes drastically. So learning every component technology is necessary but not sufficient.
The important point is to have a way of learning how this transformation works: how the components of a system can work together and achieve performance.
That is performance evaluation. Performance evaluation is the key process to understand the performance of systems, and the important point is that performance evaluation must be user-value oriented. So every performance evaluation starts with setting a goal which contributes to user value, and then the evaluation procedure is designed toward that goal.
When talking about performance evaluation, maybe many people think: "Performance evaluation? Oh, that's benchmarking. I can do it with pgbench, and that's all." Or you may be reminded of blog posts with flashy graphs showing that one system is faster than another. Frankly speaking, that is not performance evaluation yet.
Benchmarking is an important skill and is actually a part of performance evaluation, but it is just a means of performance evaluation.
There are many individual skills which are necessary for performance evaluation, but they are not the core part of performance evaluation. Performance evaluation is a goal-oriented process.
So the philosophy of orchestrating individual skills is the core part of performance evaluation. Let's take an example.
For example, if your boss wants faster storage for a PostgreSQL database, your goal might be to find the SSD with the best price-performance. One naive approach is this: list all SSDs available on the market, buy all SSDs available on the market, try all of them with a benchmark, and then you find the best SSD. This brute-force approach makes sense if you have a lot of money and a lot of time. But if you have a philosophy of performance evaluation, you can do it differently.
First characterize the workload and model the performance with some performance metrics, and then estimate the performance. If you have a model, you can estimate performance from the specifications on data sheets and pick candidates.
Then validate your model with measurements on a few SSDs. At this point you actually measure the performance with only a few SSDs and confirm the model, so now you have a model with good precision and a short list of candidate SSDs. Finally, confirm the detailed performance of the candidate SSDs and you get the final result. The good point of this approach is that you do not have to try all SSDs on the market, and the model you acquired can be used in future evaluations as well.
Another important aspect of performance evaluation is that the goal can change. Because performance evaluation is the process of understanding performance, it may deepen your insight into the target system, and sometimes it identifies a more important goal. So keep asking what the most important goal is for user value, and when the goal changes, orchestrate the individual skills toward the new goal. This is the philosophy of performance evaluation, and it enables you to do meaningful performance evaluation.
This philosophy can be said to be the art of performance evaluation. This is a very famous book about performance evaluation, and it says that performance evaluation is an art: it cannot be produced mechanically, and each evaluation requires intimate knowledge of the target system and careful selection of methods and tools.
So how can we develop our own philosophy? This is not so easy. Generally speaking, the cornerstone of philosophy is a lot of experience, but the important thing is the theory behind the experiences. Theory forms the foundation of our experiences, and experiences help deepen the understanding of the theory. Turning this cycle is the way to develop your own philosophy.
I thought my experience might be useful here, so let me introduce some examples. This is a measurement of the latency of the processor cache and memory system in a NUMA architecture. Each processor has local memory, and the processors are connected with an interconnect, so access to remote memory takes extra cycles.
A modern processor has a multi-level cache, and smaller caches are faster, like this. It is really difficult to measure this, but it is possible. Maybe this is an interesting graph for you.
Another example is the case of a traditional disk array. This disk array is a mid-range enterprise storage system with 160 hard disk drives. I did many measurements on this machine, for example of response time.
A more interesting example is the performance of PCI Express flash storage. Its I/O latency is very low relative to existing storage devices, so in this case the latency of the processor interconnect is a non-negligible portion of the I/O latency.
Another of our studies is power consumption and energy saving in database systems. Modern processors have the functionality of changing the operating frequency at run time, so by injecting application performance information into the frequency controller, energy saving can be achieved while still meeting some service level.
Measurement sometimes requires a lot of care in how you work. This is a graph of power consumption fluctuation. Even if the system is idle, power consumption fluctuates due to many factors, like temperature or the total power load in the building. To measure power consumption very precisely, I had to do the experiments during a period when the power was very stable. This graph shows that power consumption is relatively stable during the night. So when I was doing this experiment, I shut off the laboratory every midnight, started the experiment, finished it at 6 a.m., and went to bed when everyone else was waking up.
That is just a side story. The remaining part of this talk covers the fundamentals of performance evaluation: principles and basic techniques. This is just a basic introduction because I do not have much time, so if you want a lot more detail you can check this book. I'm going to explain the theory with my own practices and experiences.
Let's move on to principles. The principle is very simple: define the goal first. Performance evaluation must be user-value oriented, so define a goal which contributes to user value. This is usually very difficult. The initial question of a performance problem may be very vague and usually subjective, so your job is to translate it into a clear and objective goal. If you have settled on a goal, maybe you have already finished 50 percent of your job, or 60 percent. The remaining principle is to stay goal oriented and design the procedure toward the goal using appropriate techniques.
Let's move on to the basic techniques of performance evaluation. There are three basic techniques, and the first one is modeling.
Modeling is expressing the performance in a quantitative form, usually a mathematical form.
The real system is too complex to understand as it is, so formulation is the key: understand the mechanism of the target system, find the important variables, and formulate the performance in terms of them. This is modeling, and it is the core of the scientific approach.
Let me give you an example. Maybe you know how a hard disk works: a hard disk consists of rotating platters and a head with a magnet that reads data from the platter. The throughput of a hard disk drive can be modeled by this easy mathematical form: it is how many bits are swept by the head in a second. From this model you can tell that throughput is proportional to the density of the cylinder. The density of the outermost cylinder is the largest, and the inner cylinders have lower density.
You can confirm that with measurements. The measurement clearly depicts that behavior: the outermost cylinder has the largest throughput and the innermost cylinder has the lowest throughput.
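The formula on the slide is not captured in this transcript. As a minimal sketch of one common form of such a model, assume the head sweeps exactly one track per rotation, so throughput is the number of bits on a track times the number of rotations per second. The function name and all numbers below are hypothetical and are not taken from the talk.

```python
def hdd_sequential_throughput(bits_per_track: float, rpm: float) -> float:
    """Estimated sequential throughput in bytes per second.

    Assumption (not from the talk): the head sweeps one full track per
    rotation, so throughput = bits per track * rotations per second.
    """
    rotations_per_second = rpm / 60.0
    return bits_per_track * rotations_per_second / 8.0


# Hypothetical 7,200 rpm drive whose outer tracks hold twice as many bits
# as its inner tracks.
outer = hdd_sequential_throughput(bits_per_track=8_000_000, rpm=7200)
inner = hdd_sequential_throughput(bits_per_track=4_000_000, rpm=7200)
print(f"outer: {outer / 1e6:.0f} MB/s, inner: {inner / 1e6:.0f} MB/s")
```

Under this assumption the predicted throughput halves when the bits per track halve, which is the trend the measurement is compared against.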
Another example is the latency of a hard disk drive. The latency of a hard disk drive consists of seek latency and rotational latency. If you are lucky enough, the rotational latency of a data access is almost zero, but in the most unlucky case you have to wait for the platter to rotate all the way around. So the latency can be expressed in this form.
You can confirm the model by measurement. In this graph, the x axis shows the seek distance and the y axis shows the latency. As you can see, latency increases as the seek distance increases, and the width of the band is exactly equal to the rotational latency of this hard disk drive.
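A minimal sketch of this latency model, assuming latency is the seek time for a given distance plus a rotational wait between zero and one full rotation. The linear seek-time function and its constants are hypothetical placeholders; the real seek curve of a drive has to be measured.

```python
from typing import Callable, Tuple


def rotation_period_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def latency_bounds_ms(
    seek_distance: float,
    rpm: float,
    seek_time_ms: Callable[[float], float],
) -> Tuple[float, float]:
    """Best-case and worst-case access latency for one seek distance.

    Best case: the target sector arrives right after the seek finishes.
    Worst case: the platter has to make one full rotation first.
    """
    seek = seek_time_ms(seek_distance)
    return seek, seek + rotation_period_ms(rpm)


def linear_seek_ms(distance: float) -> float:
    """Hypothetical seek curve: fixed settle time plus a per-cylinder cost."""
    return 1.0 + 0.0001 * distance


best, worst = latency_bounds_ms(50_000, 7200, linear_seek_ms)
print(f"best {best:.2f} ms, worst {worst:.2f} ms, band {worst - best:.2f} ms")
```

The gap between the two bounds depends only on the rotation speed, which is why the measured points form a band of constant width above the seek curve.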
Another popular modeling technique is the queueing model. A queueing model consists of servers and queues, and its behavior is described by the arrival of customers and their service time. When you consider CPU performance, you can model the CPU as a queueing system: the customers are CPU instructions and the servers are the execution units in the CPU. When you consider a storage system, the customers are I/O requests and the servers are the storage controller or the individual disks. When you consider a database system, the customers are queries and the servers are the executor processes.
The simplest and most useful queueing model is the M/M/1 model. I will not explain it in much detail. It consists of one server and one infinite queue, with some conditions on the arrival rate and the service time distribution. If you use the M/M/1 model, the average response time can be computed in this simple form.
It is not so difficult to extend this to multiple parallel servers, and using this model we can do some interesting analysis. Here I have written a performance model of a database system with the processor frequency as a variable. If the frequency goes down, the throughput decreases and the response time increases. You get a model of the response time from the multiple-server queueing model, and by parameterizing the processor frequency I can formulate the response time like this. This graph shows the computed model, with the average response time on the y axis and the throughput of the system on the x axis, together with the measurement results. The model matches the triangles, which are the actual measurements, pretty well. So with only this simple model you can evaluate the performance effect of the frequency. This model is the basis of the application-performance-aware energy saving I mentioned before.
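The formulas on the slides are not in this transcript. The sketch below uses the standard textbook results instead: the M/M/1 mean response time 1 / (mu - lambda) and the Erlang C formula for multiple parallel servers. The way frequency enters the model, with the service rate scaling linearly with the processor frequency, is an assumption made only for illustration and may differ from the speaker's actual model; all names and numbers are hypothetical.

```python
import math


def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def erlang_c(servers: int, arrival_rate: float, service_rate: float) -> float:
    """Probability that an arriving customer has to wait in an M/M/c queue."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / servers
    if utilization >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")
    tail = offered_load**servers / (math.factorial(servers) * (1.0 - utilization))
    head = sum(offered_load**k / math.factorial(k) for k in range(servers))
    return tail / (head + tail)


def mmc_response_time(servers: int, arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/c queue: mean wait plus mean service time."""
    p_wait = erlang_c(servers, arrival_rate, service_rate)
    return p_wait / (servers * service_rate - arrival_rate) + 1.0 / service_rate


def response_time_at_frequency(
    servers: int,
    arrival_rate: float,
    service_rate_at_max: float,
    freq_ghz: float,
    max_freq_ghz: float,
) -> float:
    """Assumption: the service rate scales linearly with processor frequency."""
    service_rate = service_rate_at_max * freq_ghz / max_freq_ghz
    return mmc_response_time(servers, arrival_rate, service_rate)


# Hypothetical system: 4 servers, 300 queries/s, 100 queries/s per server at 2.0 GHz.
for freq in (2.0, 1.8, 1.6):
    t = response_time_at_frequency(4, 300.0, 100.0, freq, 2.0)
    print(f"{freq:.1f} GHz -> mean response time {t * 1000:.1f} ms")
```

With one server the multi-server formula reduces to the M/M/1 result, which is a quick sanity check on the implementation.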
Modeling is a very important step to understand systems. A good model enables you to predict performance and scaling trends, and it also helps you notice incorrect measurements or bugs caused by implementation flaws, because you know the trend of the performance from the model. A real system is generally very complex, and sometimes modeling ends up with a bad model. But this is not bad news. A bad model is also informative, because it indicates that you are missing something important in the modeling or in the measurement, so you can try to improve your model. It may also indicate that the system is too complex. That is fine too, but in that case you should change your approach toward the goal.
The next technique is measurement.
When doing performance evaluation, many people tend to do measurement first, but measurement without modeling is totally pointless, because you cannot understand the results. If you have a model, you can validate the results against the model and understand how the system is working. And if there is a difference between the measurement and the model, you can notice that something is wrong in the model or in the measurement. So do not start with measurement; start with modeling.
This is the guide for meaningful measurement. Measurement is not just running benchmarking tools. There are several things to be considered: the workload and metrics, the measurement environment, and the measurement methods and tools. Then you conduct the measurement and analyze the results.
This is a very simple example story. Here, assume that I want a storage system that can support a certain transaction throughput in TPS, with random I/O from many clients. In this case the metric is the IOPS of the storage system.
Before measurement, model the performance. In this case the block size and the concurrency of I/O might be important for performance, and the I/O queue depth in the various layers of the system might also be important.
Next, design the measurement environment. Think carefully about the measurement environment before conducting the measurement. If you want to measure the IOPS of a storage system, ensure that the I/O requests are really reaching the storage system and are not just served by main memory. Performance measurement is the monitoring of the flow of resources in the system, so designing the environment means designing the flow of resources in the system.
Then be careful about the appropriateness of tools. Performance evaluation is a very goal-oriented process; therefore do not start with the tools you are familiar with, but with the tools appropriate for the goal, and think about what metric is really measured. For example, iostat is a very famous tool for measuring I/O performance. It can measure the number of I/O requests issued from the operating system, but it cannot measure the number of I/O requests actually issued to the individual disks behind the storage controller, or the number of read and write system calls from the application. So be careful about what you are measuring with which tool.
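To make the point about what a tool really measures concrete: on Linux, iostat derives its numbers from the kernel's block-layer counters in /proc/diskstats. The sketch below computes IOPS from two snapshots of that file, so it counts requests completed by a block device as the operating system sees it, not application system calls and not requests to the physical disks behind a storage controller. The snapshot text, device name and interval are made-up sample data.

```python
from typing import Dict, Tuple


def parse_diskstats(text: str) -> Dict[str, Tuple[int, int]]:
    """Map device name -> (reads completed, writes completed).

    Each /proc/diskstats line is: major, minor, device name, then counters.
    The 1st counter is reads completed and the 5th is writes completed.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        counters[fields[2]] = (int(fields[3]), int(fields[7]))
    return counters


def iops(before: str, after: str, interval_s: float, device: str) -> float:
    """Completed read and write requests per second between two snapshots."""
    reads0, writes0 = parse_diskstats(before)[device]
    reads1, writes1 = parse_diskstats(after)[device]
    return ((reads1 - reads0) + (writes1 - writes0)) / interval_s


# Made-up snapshots taken 2 seconds apart for a device named "sda".
BEFORE = "   8       0 sda 1000 10 80000 500 2000 20 160000 900 0 700 1400\n"
AFTER = "   8       0 sda 1600 10 128000 800 2400 20 192000 1100 0 1000 1900\n"
print(iops(BEFORE, AFTER, 2.0, "sda"))
```

On a real Linux machine the two snapshots would come from reading /proc/diskstats twice with a sleep in between.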
Then run the measurement. In a simple case you can run the measurement with exhaustive parameter values, but when you are measuring with multiple variables that may be unrealistic. So start the experiments with minimal steps; if you have a good model, the model will tell you where you should measure next.
After the measurements, do not forget to analyze and validate the results. If the results match the model, it means that you have a correct understanding of the system. But if the results do not match the model, something is wrong, so you should try to improve the model or the measurement.
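A minimal sketch of this validation step: compare each measured point with the model's prediction and flag the points whose relative error exceeds a tolerance. The model, the measured values and the 10 percent tolerance are hypothetical and only illustrate the idea.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    model: Callable[[float], float],
    measurements: Iterable[Tuple[float, float]],
    tolerance: float = 0.10,
) -> List[Tuple[float, float, float]]:
    """Return (x, predicted, measured) for points that disagree with the model.

    A point disagrees when |measured - predicted| / predicted > tolerance.
    """
    suspicious = []
    for x, measured in measurements:
        predicted = model(x)
        relative_error = abs(measured - predicted) / predicted
        if relative_error > tolerance:
            suspicious.append((x, predicted, measured))
    return suspicious


def model_response_time(arrival_rate: float) -> float:
    """Hypothetical model: M/M/1 queue with a service rate of 100 requests/s."""
    return 1.0 / (100.0 - arrival_rate)


# Made-up measurements: (arrival rate in requests/s, response time in seconds).
measured_points = [(20.0, 0.0126), (50.0, 0.0205), (80.0, 0.0900)]
for x, predicted, measured in find_mismatches(model_response_time, measured_points):
    print(f"at {x:.0f} req/s: model {predicted:.4f} s, measured {measured:.4f} s")
```

A flagged point does not say which side is wrong; it only tells you where to look at the model or the measurement setup next.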
Let's move on to simulation.
If you do not have the target system, or if it does not exist in the real world yet, simulation is a very effective way to evaluate performance. Simulation is also used for detailed behavior analysis. Modeling everything in the system in detail is usually difficult, so modeling is done with some approximation, but simulation can go into much more detail.
In terms of database systems, one of the simplest simulation techniques is I/O replay. In I/O replay, you first trace the I/O pattern during the actual execution of a workload, and then you replay the I/O pattern on hypothetical devices. In this way you can evaluate the performance of various devices with a realistic workload.
This graph shows the results of I/O replay on various devices. We used a TPC benchmark in this case and replayed the traced I/O pattern on various hypothetical systems. From this you can tell how the workload would be handled on each system.
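A minimal sketch of the I/O replay idea, under strong simplifying assumptions: each hypothetical device is described only by a fixed access latency and a transfer bandwidth, requests are replayed one at a time, and the request offset is recorded but ignored. The trace and the device parameters are made up; a real replay would use a trace captured from the actual workload.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class IORequest:
    offset: int  # byte offset on the device (unused by this simple model)
    size: int    # request size in bytes


@dataclass(frozen=True)
class DeviceModel:
    name: str
    access_latency_s: float       # fixed cost per request
    bandwidth_bytes_per_s: float  # transfer rate

    def service_time(self, request: IORequest) -> float:
        """Time to serve one request: fixed latency plus transfer time."""
        return self.access_latency_s + request.size / self.bandwidth_bytes_per_s


def replay(trace: Iterable[IORequest], device: DeviceModel) -> float:
    """Total time to serve the trace, one request at a time."""
    return sum(device.service_time(request) for request in trace)


# Made-up trace: 1,000 requests of 8 KiB each.
trace = [IORequest(offset=i * 8192, size=8192) for i in range(1000)]
disk_like = DeviceModel("disk-like", access_latency_s=0.008, bandwidth_bytes_per_s=100e6)
flash_like = DeviceModel("flash-like", access_latency_s=0.0001, bandwidth_bytes_per_s=1e9)

for device in (disk_like, flash_like):
    print(f"{device.name}: {replay(trace, device):.3f} s")
```

Everything this sketch reports follows from the assumed device parameters, which is exactly why the assumptions behind a simulation have to be checked by modeling or measurement.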
Simulation is a kind of what-if analysis based on some assumptions, so unrealistic assumptions can easily lead to nonsense results. Be careful about those assumptions. Relying on simulation results without other techniques is a bit dangerous, so keep in mind that the assumptions must be validated by the other techniques, modeling or measurement.
I have covered the three basic techniques: modeling, measurement and simulation. None of them is sufficient by itself, so in order to deeply understand a system's performance, using multiple techniques and validating them against each other is the key.
