
Deep Dive Into Query Performance


Formal Metadata

Title: Deep Dive Into Query Performance
Number of Parts: 542
License: CC Attribution 2.0 Belgium. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose, as long as the work is attributed to the author in the manner specified by the author or licensor.
Abstract
If you look at the data store as just another service, the things the application cares about are successfully establishing a connection and getting results to its queries promptly and correctly. In this presentation, we will explore this seemingly simple aspect of working with PostgreSQL in detail. We will talk about why you want to go beyond the averages, and how to group queries together in a meaningful way so you are not overwhelmed with the amount of detail but find the right queries to focus on. We will answer the question of when you should focus on tuning specific queries, and when it is better to focus on tuning the database (or just getting a bigger box). We will also look at other ways to minimize user-facing response time, such as parallel queries, asynchronous queries, and queueing complex work, as well as often misunderstood response-time killers such as overloaded networks, stolen CPU, and even the limits imposed by that pesky speed of light.
Transcript: English (auto-generated)
Now, we already spoke here a little bit about developers, and especially front-end developers, and one purpose of this talk for me is to bridge the gap which I often see between the people who really have databases at the center of their professional life, and the people who are writing the application, for whom the database is just a thing. It's like a toilet, right? You do your business and you move on with your life. For those people, the database is typically a black box: I connect to the endpoint which is provided to me, I connect quickly, I run my queries, and that's all I care about. And all that change buffer stuff? Never heard of it.

What about queries, then? What would you as a developer want from queries? Well, these are actually pretty simple things. When you connect to that endpoint, you run queries. You want them to complete with no errors. You want them to give you the correct result set, because if you didn't care about correctness, we could alter all the MySQL tables to BLACKHOLE and get fantastic performance, with no errors, too. And you also want to make sure they complete within the response time your application expects. And I think that is a very important thing to understand: from the individual developer's standpoint, writing an application, the response time of my queries is all I care about, and how that internal database kitchen works is somebody else's problem.
Now, if you think about response time from the database point of view, it is often seen as the response time of a query: an average, or whatever distribution; we'll talk about that later. But that is different from what the business cares about. From the business point of view, the question is: do my users have an outstanding experience, in terms of performance, with all of the application interactions? That means search should work, place-an-order should work, and so on. The database is an important part of that, of course, but not the complete picture. What is interesting is that as database engineers we often treat performance and downtime as different kinds of events and say: no, no, we weren't down, it just took 50 minutes to run my very basic query. Well, from the user's standpoint, very bad performance is indistinguishable from downtime, because, A, users don't have unlimited patience, and even if people are very patient, the browser or some other timeout will kick in, and then nobody cares about that query, which may still continue to run.

Another thing to understand about query performance is that you do not want to focus on the averages. I like the saying about the statistician who tried to cross a river that was on average one meter deep, and drowned. The same applies to queries: if your average query time is x, that means pretty much nothing; you need to understand more than that. I like, in this case, to look at percentiles, and even better, to look at the actual distribution of a query's response time, because that gives you a lot more insight.
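As a toy illustration (all latency numbers here are invented), here is how a skewed sample makes the mean describe neither the typical nor the bad case, plus a quick back-of-the-envelope on how per-query percentiles compound across many queries:

```python
# Toy sketch: mean vs percentiles on a skewed latency sample.
import random
import statistics

random.seed(7)

# 980 fast queries around 5 ms, 20 slow outliers around 2 s (a 2% tail)
latencies = [random.uniform(2.0, 8.0) for _ in range(980)]
latencies += [random.uniform(1500.0, 2500.0) for _ in range(20)]

def percentile(sample, pct):
    """Nearest-rank percentile; good enough for a monitoring sketch."""
    ordered = sorted(sample)
    return ordered[min(len(ordered) - 1, int(len(ordered) * pct / 100))]

mean = statistics.mean(latencies)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"mean={mean:.0f}ms p50={p50:.0f}ms p99={p99:.0f}ms")
# The mean lands far above the median and far below p99:
# the average describes neither the typical nor the bad case.

# Percentiles also compound: if one page view issues 70 queries, each
# independently under the p99 threshold with probability 0.99, then...
p_session_ok = 0.99 ** 70
print(f"P(all 70 queries fast) = {p_session_ok:.2f}")  # roughly 0.5
```

The exact numbers do not matter; any latency sample with a heavy tail shows the same gap between mean, median, and p99.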
Now, one thing to understand about percentiles: you may be looking and saying, well, great, my queries have a decent 99th percentile. But that does not mean that on the business side 99 percent of your users have a good, or even acceptable, experience. Why is that? Because a single user interaction can correspond to a lot of queries executed sequentially, which all add up, and typically through their journey a user has a number of those interactions. So even a 99th percentile may, depending on your application, correspond to only 50 percent of user sessions. That is why in complicated, large environments you see people focused either on relatively short SLAs or on rather high percentiles.

Another thing I would encourage you to pay attention to is errors, and make sure you are measuring response time for those as well, because errors can be of two kinds: fast errors and slow errors. In certain cases, say if your table doesn't exist, you get the error response straight away, and if you put all your error queries and your normal queries in the same bucket, you may say, oh my gosh, my response times are doing so well. On the other hand, if your error is, for example, a lock wait timeout, then that is a slow error: it will actually have a higher response time than the normal case. That is why I always suggest measuring response time for normal queries and for queries with problems separately.
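A minimal sketch of that idea, with an invented event shape, is to bucket by outcome before aggregating:

```python
# Sketch: bucket query outcomes before aggregating response times.
# The event shape (status, latency_ms, error) is invented for illustration.
import statistics

events = [
    {"status": "ok",    "latency_ms": 4.0},
    {"status": "ok",    "latency_ms": 6.0},
    {"status": "error", "latency_ms": 0.2,     "error": "table missing"},     # fast error
    {"status": "error", "latency_ms": 50000.0, "error": "lock wait timeout"}, # slow error
]

buckets = {}
for e in events:
    buckets.setdefault(e["status"], []).append(e["latency_ms"])

for status in sorted(buckets):
    lat = buckets[status]
    print(f"{status}: mean={statistics.mean(lat):.1f}ms  n={len(lat)}")

# Mixed together, the overall mean is ~12502 ms, dominated by one slow error.
# Split by status, the healthy queries report an honest ~5 ms.
```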
Another thing which is very important is looking at response time over time, because traffic changes and a lot of things are going on in the system. Just saying, hey, I have a response time of x over some long period of time, is not very helpful.
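One simple way to sketch this (timestamps and latencies invented for illustration) is to aggregate into time buckets instead of one global number:

```python
# Sketch: aggregate latencies into one-minute buckets so trends stand out.
from collections import defaultdict

# (timestamp_seconds, latency_ms) samples, invented for illustration
samples = [
    (0,   5.0), (20,  6.0), (40,  5.5),   # minute 0: healthy
    (65,  7.0), (80,  9.0),               # minute 1: creeping up
    (125, 30.0), (150, 45.0),             # minute 2: clearly degrading
]

by_minute = defaultdict(list)
for ts_seconds, latency_ms in samples:
    by_minute[ts_seconds // 60].append(latency_ms)

for minute in sorted(by_minute):
    lat = by_minute[minute]
    print(f"minute {minute}: avg={sum(lat) / len(lat):.1f}ms")
# minute 0 averages 5.5 ms, minute 1 is 8.0 ms, minute 2 is 37.5 ms:
# a trend that one whole-day average would completely hide.
```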
Also, what you see in many cases is that things start as small performance problems, maybe SLA violations, which, if left unfixed, convert into downtime. For example, in the MySQL world you may have a forgotten runaway query, and as the history list accumulates it will slowly increase your response time. If you are measuring that over time and notice something is not trending in the right direction, you can probably fix it before it is seen as downtime by your users; if you are not measuring it, then not so much. And here is an example of what you often see: all the queries show a spike in response time, which often corresponds to something external happening in the environment. What is very interesting, especially for those of us running in the cloud, is that we have only limited observability into that environment. If something goes wrong on the Amazon back end, they are not going to tell us: oh, you know what, some hard drives backing your EBS volumes failed and we had to do some rebalancing, yada yada.

Okay, the next question I would ask is where we want to measure query response time. In my opinion, the application view and the database view in combination are both very helpful. The application sees the real thing: if your network, for example, is adding latency, you will see that from the application but not so much from the database, which only sees the interval from when it received the query to when it sent the data back. The database view, on the other hand, often gives you a lot more insight into what was going on inside, while from the application side we can often capture only the query time and maybe some very basic additional parameters.

So, from the business view, we already said that we want all users to have an outstanding performance experience across all application interactions. Let's now break down what that may mean. Here I want to plug a little project: the SQLCommenter project by Google, which is pretty cool in that it allows you to pass the metadata which you understand as a developer all the way down to the SQL query. They implemented that support in a number of frameworks, and it is also supported in the Google Cloud monitoring environment. I would very much like to see that developed further, so that we at least come to some sort of shared standard across databases for how we can augment query information with the tags and values users care about.

So what are the possibilities which can be quite helpful in this regard? Well, finding, for example, which actual user or tenant a query corresponds to, because we often have per-tenant performance issues; finding the application, or some subset of application functionality, when many of them may be hitting the database; version information; maybe information about the engineer or team who is responsible. I often see DBA or SRE teams having a problem like: I see this nasty query which was shipped recently, I know because it wasn't there yesterday, but now finding out who on earth introduced that query may be problematic in a large environment.

Now, a lot of the focus, and I think the core of query-based observability, is the query itself, and by query I mean the normalized query: statements which are the same except for different parameters. That is very helpful because different queries obviously have different complexity, different expected SLAs, and so on and so forth.

The next way to break things down, for me, would be by schema or database. Why is that interesting? Oh, I was just told my slides are being cut a bit. Yeah, well, life is life; I'm just not going to be lucky in this room. But we can blame Windows, right? At this conference we can and should blame Windows. Okay. Schema and database are good dimensions because we often separate different tenants of a multi-tenant application by schema, and in that case this gives us good per-tenant performance profiling, as we can see here in an example with the PMM tool.

Another thing I have found very helpful is a way to separate the data by table. In many cases you want to ask how the queries hitting a given table are affected, especially if you made a change related to that table: hey, I changed the indexing on this table, let me see how all the queries hitting it are impacted. Very helpful, because there may be some surprising differences.

Database users are another helpful dimension, because they often allow us to identify the service or application. If you are following good security practices, you do not let all your applications share one username; that is not a good idea. They also help find the human troublemakers with direct access. So many times you find somebody running a query and saying, okay, yeah, it's slow, but whatever, I'll go for lunch, I have time. Well, you may have time, but your database may not. Here is an example of how we provide that.

I also mentioned the database host or instance: in many cases that is very helpful, because even if you think your different database instances should perform the same, the world is a messy place, and the world in the cloud is an even messier place. They may not have exactly the same performance, due to some strange configuration difference, having a bad day, or even carrying a different load, and it is good to be able to break that down when you see some of your queries not performing well.

I would also look at the same data by web server or application server instance, because, again, if you have maybe 100 nodes deploying the same application, you may think they are all going to hit the database the same way. That is not always the case. I have seen situations where one of them is misconfigured for some reason, cannot connect to the cache, and so is hitting the database with ten times more queries than it should; or an application rollout didn't go well, and you eliminated a nasty query on 99 percent of application instances but not the others. It is very good to be able to validate that, because, from a DBA standpoint, developers, sysadmins, and storage people are going to tell you things, and they are going to lie, maybe not intentionally, maybe out of ignorance or the limitations of their tools. As a DBA or SRE you want to be able to point at the problem and say: look, please, I have evidence. Evidence is good.

Client hosts and custom tags are very helpful if you can extend things; that is what we discussed with SQLCommenter. Something else which I find very helpful, and which we cannot easily get with MySQL, is being able to separate queries by query plan. You may often have a query which looks the same but takes different execution plans, and that is often correlated with its performance.
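MySQL does not expose a plan identifier directly; one workaround people use is to fingerprint the plan themselves. A rough sketch of that idea, using short hand-written stand-ins for real `EXPLAIN FORMAT=JSON` output:

```python
# Sketch: fingerprint a query plan so plan changes can be grouped and detected.
# The plan strings below are simplified stand-ins for EXPLAIN FORMAT=JSON output.
import hashlib
import json

def plan_fingerprint(plan_json: str) -> str:
    """Hash the plan's structural fields, dropping volatile ones such as row
    estimates, so the same plan shape always yields the same fingerprint."""
    def strip(node):
        if isinstance(node, dict):
            return {k: strip(v) for k, v in sorted(node.items())
                    if k not in ("rows_examined_per_scan", "cost_info")}
        if isinstance(node, list):
            return [strip(v) for v in node]
        return node
    canonical = json.dumps(strip(json.loads(plan_json)), sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

plan_index = '{"table": "orders", "access_type": "ref", "key": "idx_user", "rows_examined_per_scan": 10}'
plan_scan  = '{"table": "orders", "access_type": "ALL", "key": null, "rows_examined_per_scan": 500000}'

print(plan_fingerprint(plan_index))
print(plan_fingerprint(plan_scan))
# Same query, two fingerprints: a drift from "ref" to "ALL" shows up
# immediately once you group response time by plan fingerprint.
```

The field names to keep or drop are a judgment call; the point is only that a stable hash of the plan's shape gives you the "plan ID" dimension the talk asks for.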
In certain cases that is totally fine: genuinely different situations call for different plans. But sometimes the MySQL optimizer may get a little crazy and have plan drift for no good reason, which may not be easy to catch, and this would help to do so.

What I also want to highlight: when you find a specific query and say, hey, this query has nasty performance, we often want to understand where that query's response time comes from. These are some of the places it can come from; certain of them are relatively easy to find out, certain are not. For example, whether a query waited on available CPU because the system was oversaturated: you can't really see that on a per-query basis, you can only see that the CPU was packed over some period of time.

Okay, here are a couple of other things to consider when you are looking at the queries. One, you want to look separately at the bad queries versus the victims, because sometimes you will see queries getting slower, but not because of themselves: it's some other nasty query, maybe your Java developer who thought, well, to solve my problem I will just launch this thing with 200 threads and make sure I am fine, while everything else slows down. That can be tricky.

Another thing: you should not forget currently running queries. If you look at the performance_schema statement digests, that gives you what happened in the past, but believe me, if you started 50 instances of some very bad query which continues to run, that may be the cause of your problem, not the past. Connected to that, I think this is less of a problem in MySQL now if you are using query timeouts, which is a very good practice: if you say, for all my interactive queries I set a timeout of, let's say, 15 seconds by default, then you should not worry too much about long-running past queries, because everything gets killed after 15 seconds.

Also, you should not ignore work which is invisible from a query standpoint. Databases do a lot of things in the background, and your operations team may too: backups, provisioning another node through cloning, or whatever your cloud or VM system may need to do. It may not be directly visible, but it can impact query performance, so sometimes, when you observe a query impact and cannot see what is causing it, that is a possibility.

I would also encourage you to avoid what I would call biased sampling. I sometimes see people say, hey, we will set long_query_time to one second and only look at the queries which take more than one second. Well, then you may be focusing only on the outliers, and missing the opportunity to optimize the queries which are actually responsible for the experience of your users.

Okay, in the last minute or so I have, I wanted to say what I would like to see MySQL do better. Where is Kenny? No Kenny? He's always hiding; he probably went to get another sandwich, damn it. Okay, so here are some things I would like to see. One is better support for prepared statements: right now they are not handled in the same way, which I think is a problem. I would also say consider grouping data by time in certain cases: right now you get all the statements in one table, and if you have a lot of statement variety, that table tends to overflow, which is not really helpful, and having to reset your query statistics all the time is not a very good practice in my opinion. Provide the list of tables a query touches: that is very helpful, because the MySQL parser already knows which tables a query touches, but it is very hard to parse them out of the query text, especially if you consider views; by looking at a query alone, I don't know whether something is a table or a view. Information about the plan: I would like to see, for a query, some sort of plan hash, so I know when a query starts using a different plan. And what I would call a top waits summary: right now we have information about waits in the MySQL performance schema, and about queries, but I cannot connect them and say, oh, that query was slow because it spent this much wait time on such-and-such, at least for some class of queries. I don't think that's convenient.

Well, with that, that's all I had to say. I hope it will help you avoid tuning your indexes by credit card.
And, yes, I have time for questions; you told me, Peter, five minutes. Any questions? No? Oh, yeah.
What's the difference, or the advantage, of this SQLCommenter thing compared to what the OpenTracing standard does, since that traces the whole request? Well, what I would say in this case: yes, there is obviously the OpenTracing framework, which you can use. This applies specifically to the database, and specifically to every query. If you tried to capture every query through the OpenTracing framework, that may be a lot of volume. And again, I also think the good thing about SQLCommenter is that it does this automatically, if you will; it does not require you to build an extra integration.
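For reference, the SQLCommenter idea boils down to appending a comment of key='value' pairs to the SQL text. A minimal hand-rolled sketch (the tag names are examples, and this simplifies the real spec's escaping rules):

```python
# Minimal sketch of SQLCommenter-style annotation: append application
# metadata as a trailing SQL comment so it surfaces in slow logs and
# process lists. Tag names here are invented examples.
from urllib.parse import quote

def annotate(sql: str, tags: dict) -> str:
    """Serialize tags as sorted key='value' pairs, URL-encoding values,
    and append them as a comment, roughly following the sqlcommenter format."""
    serialized = ",".join(
        f"{key}='{quote(str(value))}'" for key, value in sorted(tags.items())
    )
    return f"{sql} /*{serialized}*/"

query = annotate(
    "SELECT * FROM orders WHERE user_id = %s",
    {"application": "checkout", "controller": "cart", "team": "payments"},
)
print(query)
# SELECT * FROM orders WHERE user_id = %s /*application='checkout',controller='cart',team='payments'*/
```

In practice the framework integrations do this for you on every statement, which is exactly the "no extra integration" point above.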
Okay, anybody else? Oh, yeah, I mean, it works with MariaDB as well. Yes. Well, there are practices, but no good practices. There are a lot of optimizer hints you can use, so you can actually force the query to use a particular plan. But that also prevents the optimizer from choosing a different plan if a better one becomes available. Yeah. Okay, well, then thank you, folks.