Faster Analytics with MariaDB 10.2

Formal Metadata

Title: Faster Analytics with MariaDB 10.2
Subtitle: Advanced SQL features - CTEs and Window Functions
Number of Parts: 95
License: CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract: MariaDB 10.2 brought two important new querying features: Common Table Expressions (CTEs) and Window Functions. Both provide greater expressive power to queries, opening up opportunities for the optimiser to deliver speedups not previously achievable. With a focus on analytical queries, we will see how to improve query performance, sometimes by an order of magnitude compared to regular SQL.
Transcript: English (auto-generated)
Hi, my name is Vicentiu, thank you for coming. It's my first time at FrOSCon, so I'm not quite familiar with the audience and the technical level of everybody, so if there are any questions during my talk feel free
to interrupt me, there should be plenty of time for discussion. So first of all, just to get a general idea, how many of you here know of MariaDB? Okay, how many of you actually use SQL on a day-to-day basis, or...
Okay, so people are familiar, that's good, but feel free to ask me any questions if there's something which is unclear. So the purpose of this talk is to show you how you can make queries run faster,
especially analytical ones, using the new features which are now in our latest stable release, which is MariaDB 10.2. So we're going to talk about common table expressions, window functions, how these two work together, and I'm going to go through a few practical use cases.
All of these are actually problems that I had to solve at one point, so the examples are from production, and I'm also going to go into a bit of details about what the development status is for these features, what's coming up in future releases.
So let's start with common table expressions. So common table expressions are a new way to express what MySQL and MariaDB used to call derived tables. So whenever you wanted to do a SELECT in a FROM clause, you would have to create a derived table.
Well here we can define this derived table outside of the current SELECT. So here in red I've highlighted what is the new syntax for CTEs. We start with a keyword, WITH, then there's the common table expression name, in this
case, engineers, and then we have what's going to be inside this table. So after this definition you can use the engineers keyword, as if it were a table in your database.
And then you can SELECT FROM engineers and potentially filter out using other conditions. Here this is where you use it, so in the FROM clause you have engineers. Here's the identical query that you had to write before. So here we have the FROM clause which has a SUBSELECT inside.
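The equivalence described above can be sketched in runnable form. This uses Python's sqlite3 rather than MariaDB, since SQLite shares the WITH syntax; the employees table and its data are hypothetical, not from the talk.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
con.executemany("INSERT INTO employees VALUES (?, ?)",
                [("Ana", "Engineering"), ("Bob", "Sales"), ("Cara", "Engineering")])

# CTE form: define "engineers" once, then use it like a table.
cte = """
WITH engineers AS (
    SELECT name FROM employees WHERE dept = 'Engineering'
)
SELECT name FROM engineers ORDER BY name
"""

# Equivalent derived-table form: the SELECT moves into the FROM clause.
derived = """
SELECT name
FROM (SELECT name FROM employees WHERE dept = 'Engineering') AS engineers
ORDER BY name
"""

assert con.execute(cte).fetchall() == con.execute(derived).fetchall()
print(con.execute(cte).fetchall())  # [('Ana',), ('Cara',)]
```

Both queries return the same rows; the CTE form simply lifts the subquery out of the FROM clause.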
The nice part is that you can use multiple CTEs and multiple CTEs can reference each other. So here we start with engineers and then we go to Europe only engineers by filtering
the engineers previously defined table. If you want to do this with derived tables you would have to write a query within a query and it gets pretty hard to track because you have this nested versus the linear view of CTEs. So from a readability standpoint CTEs are obviously a win, but there's more to this.
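The chained-CTE pattern just described might look like the following sketch (again via sqlite3 for runnability; the schema and the "EU" filter are hypothetical stand-ins for the talk's engineers / Europe-only engineers example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, dept TEXT, country TEXT)")
con.executemany("INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Engineering", "DE"), ("Bob", "Sales", "US"),
     ("Cara", "Engineering", "FR"), ("Dan", "Engineering", "US")])

# Two CTEs, the second referencing the first: a linear pipeline of named
# steps instead of one subquery nested inside another.
query = """
WITH engineers AS (
    SELECT name, country FROM employees WHERE dept = 'Engineering'
),
eu_engineers AS (
    SELECT name FROM engineers WHERE country IN ('DE', 'FR')
)
SELECT name FROM eu_engineers ORDER BY name
"""
rows = [r[0] for r in con.execute(query)]
print(rows)  # ['Ana', 'Cara']
```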
Due to how CTEs work there's actually room for optimisations that the query optimiser can do to give you faster running queries. Here's another example.
Here we're trying to select, to compare how products did in the current year versus the previous year. And here we are using the same table twice. So first we define all the sales for all products in a given year.
And then we just join them together to see which products have performed better this year as opposed to last year. So so far. You can identify CTEs using the WITH keyword. They are similar to derived tables in that you can actually write most of the time identical
queries with derived tables. They can provide you more efficient code and I'm going to go into details on why this is so in a few of the optimisations that we support. So how do we execute CTEs? So one idea is to just execute the query which defines the CTE, store it into a temporary
table and whenever you need it you do it again and again. And this is the basic algorithm. It always works but if you have multiple references like here you have to create two temporary tables which is not optimal.
However when you do find that you are reusing a CTE twice it is a good idea to create only one temporary table per distinct CTE expression. This is one optimisation and it helps by saving space and execution time.
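The year-over-year comparison with a twice-referenced CTE might be written like this sketch (sqlite3 for runnability; the sales table and numbers are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (product TEXT, year INT, amount INT)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)",
    [("widget", 2016, 100), ("widget", 2017, 150),
     ("gadget", 2016, 200), ("gadget", 2017, 180)])

# One CTE, two references: the yearly totals joined against themselves
# to compare each product's current year to its previous year.
query = """
WITH totals AS (
    SELECT product, year, SUM(amount) AS total
    FROM sales GROUP BY product, year
)
SELECT cur.product, cur.total - prev.total AS delta
FROM totals AS cur
JOIN totals AS prev
  ON prev.product = cur.product AND prev.year = cur.year - 1
ORDER BY cur.product
"""
print(con.execute(query).fetchall())  # [('gadget', -20), ('widget', 50)]
```

An engine that materialises one temporary table per distinct CTE computes `totals` once and scans it for both join sides.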
However when you do this you cannot perform other optimisations. So it's either this one or something which MariaDB has done with views before and it's
something we call CTE merging. So here we have a select, we're selecting engineers from a table that we have already and you can think of this like a view. The trick is that since the CTE is used in a join but there is no group by or distinct
clause inside the CTE you can actually merge these two queries together to perform just one simple select. So you can transform from the CTE expression to a single select query and in this case
the query optimiser has a vision across all the tables involved and it can perform all the possible optimisations that it can. So by rewriting the query you give the optimiser a chance to do as much as possible.
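The merging transformation can be illustrated by writing both forms by hand and checking they agree; in MariaDB the optimiser performs the rewrite automatically when the CTE has no GROUP BY or DISTINCT. The table below is hypothetical and sqlite3 is used only to run the SQL:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INT)")
con.executemany("INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Engineering", 70), ("Bob", "Sales", 60), ("Cara", "Engineering", 90)])

# CTE form: a simple filtering CTE used in the outer query.
cte_form = """
WITH engineers AS (SELECT * FROM employees WHERE dept = 'Engineering')
SELECT name FROM engineers WHERE salary > 80
"""

# Merged form: both conditions folded into one SELECT, which is what a
# merging optimiser can reduce the CTE form to.
merged_form = """
SELECT name FROM employees WHERE dept = 'Engineering' AND salary > 80
"""

assert con.execute(cte_form).fetchall() == con.execute(merged_form).fetchall()
print(con.execute(cte_form).fetchall())  # [('Cara',)]
```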
This is similar to how views work already, this is not something new, it's just that due to how we implemented CTEs this came natural. Now here is a different optimisation and this is an alternative to when you don't have
the conditions supplied by merging CTEs. So here we have two problems, we have a group by in the CTE, so if you were to merge this in the parent subquery you would change the results completely and you also have some
conditions in the where clause in the outer select. We obviously cannot merge, however maybe it makes sense to push the condition from the outer query into the CTE because you're still going to filter out things and if you filter
out them before computing a temporary table this will save on disk space and execution time. So in this case you get smaller temporary tables and this will potentially speed up your
queries when you cannot do CTE merging in the first place. This feature was actually implemented by a student, and that is something we are very proud of: we get to work with the community a lot. All this was contributed in a three-month period; if you've heard of Google Summer of Code, it's a very interesting program, and if you are in charge of an open source
project, I do suggest you try and apply, it's something that's very worthwhile. Comparing MariaDB to MySQL and also Postgres and SQL Server, so when we try to implement
CTEs as well as window functions coming later, we tried to do as similar of a job as other database vendors, so both us and SQL Server support CTE merging and we do condition
push down but we don't do reuse and the reason why we don't do reuse is because whenever you choose to reuse a CTE you invalidate the other possible optimizations. So that's why we currently do not support reuse but we are thinking of adding methods
to decide whether to reuse CTEs or not. So this data is from 8.0, this is from I think April this year.
I haven't looked if there have been differences, but from my understanding condition push down was not in the MySQL plans when I asked the developers, so it's probably still not there.
For Postgres they don't do either merging or condition push down, maybe if somebody is more familiar they can explain the reason for this. The only information that we are able to gather is that Postgres considers CTEs as optimization barriers so the optimizer can only work within the CTE or after the CTE but not
mixing them together. Another interesting feature is that CTEs can be accessed in a recursive manner, I'm not going to go into details on how you can do this but the interesting part is that this actually makes the SQL language Turing complete, so you can write any program which would create
in a different language with recursive CTEs. An interesting example is that somebody has actually written a chess library with recursive CTEs, so you can actually play chess with I think Postgres server in this case.
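A minimal runnable sketch of a recursive CTE, of the hierarchical kind discussed next (the employees table with a manager_id column is hypothetical; sqlite3 is used here, and MariaDB 10.2 accepts the same WITH RECURSIVE syntax):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (id INT, name TEXT, manager_id INT)")
con.executemany("INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Root", None), (2, "Ana", 1), (3, "Bob", 2), (4, "Cara", 1)])

# The anchor part selects the direct reports of employee 1; the
# recursive part repeatedly joins back onto the CTE's previous results,
# walking down the tree to find indirect reports as well.
query = """
WITH RECURSIVE reports AS (
    SELECT id, name FROM employees WHERE manager_id = 1
    UNION ALL
    SELECT e.id, e.name
    FROM employees AS e JOIN reports AS r ON e.manager_id = r.id
)
SELECT name FROM reports ORDER BY name
"""
print([r[0] for r in con.execute(query)])  # ['Ana', 'Bob', 'Cara']
```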
Recursive CTEs are more sanely used for hierarchical queries, so if you have a tree structure you can actually query within the tree and this makes your life a lot easier when you have to get a list of say parts for an object or a list of employees which report
directly or indirectly to somebody. Okay, so enough about CTEs. The main takeaway from here is that they are essentially views but they are local to the query, you don't have to create them separately. You will get better performance if you use CTEs wisely and with recursive CTEs you have
a lot of power to express complicated queries. Any questions so far? Okay, either nobody understood anything or everything is crystal clear and this is trivial
information. Okay, so let's move on to window functions. This is a project I personally worked on and it's something that can be very difficult to explain to somebody but it's something that can really speed up queries when used properly.
So window functions act like regular functions as well as aggregate functions. The trick is that they can access rows other than the current row they are computing their value for, which means that with this you can actually try to eliminate self-joins
so whenever you are joining the same table to itself you can potentially get rid of that kind of query, and this can sometimes turn queries that take ten hours on large datasets into fast one-minute queries, and I have examples of this.
So like I've said, window functions are computed over a sequence of rows, like aggregate functions such as SUM, but the result is output once per row in the result
set not once per group. You can find them using the over clause and let's just go through a few examples to make my point on this. So here we have a simple table with users, they have an email, a first name, a last name and they're split up by either regular or admin users.
If we forgot to add a primary key to this table we can generate one with the row number window function. So row number here takes no parameters but it has to have this over keyword right here
with empty brackets afterwards, there is a reason for that and it will generate an increasing sequence of numbers from one to five. This is very similar to a function, a regular function but the trick is that it has to
have an order. So if you look at this numbering here, this order is not guaranteed; we may make an exception for this kind of query in a future release, but as it stands this order is not deterministic. To show this, we've scrambled the rows.
Yes, I have an order by email and I'm ordering the rows by email, they are in alphabetical order but row number doesn't know which order the rows actually should be looked at.
In order to fix this we have to specify an ORDER BY for ROW_NUMBER. So we're adding an order for the rows for that window function. Now what happens if we want two sequences of numbers, one for admins and one for regular
users? Well you can split this up with another syntax, partition by. It's very similar to group by. All the rows are split up by the account type so you can look at it as if these were
separate tables and we're computing row number across admins and regular users separately. Make sense so far? I have another example and I think that one will clear up how partition by and order by work.
Go ahead. So regular users, there are three regular users, so there's one, two, three regular users and then one, two admin users. Yes, it starts a different sequence based on account type.
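The ROW_NUMBER example with PARTITION BY can be sketched like this (sqlite3 for runnability, requiring SQLite ≥ 3.25 for window functions; the users table mirrors the one described, with invented emails):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (email TEXT, account_type TEXT)")
con.executemany("INSERT INTO users VALUES (?, ?)",
    [("a@x.org", "regular"), ("b@x.org", "admin"),
     ("c@x.org", "regular"), ("d@x.org", "admin"), ("e@x.org", "regular")])

# PARTITION BY restarts the numbering per account type; ORDER BY email
# makes the numbering within each partition deterministic.
query = """
SELECT email, account_type,
       ROW_NUMBER() OVER (PARTITION BY account_type ORDER BY email) AS rn
FROM users
ORDER BY account_type, rn
"""
rows = con.execute(query).fetchall()
for r in rows:
    print(r)
```

Admins get the sequence 1, 2 and regular users independently get 1, 2, 3.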
Let's look at how this behaves like an aggregate function and I think this will make sense how order by works. So say we have this sensor data here. We have something which records a value and it puts a timestamp and the value in
the database. We have time and value. If you were to plot this data, this is how it would look like. Now what if we want to smooth this data out? So right now it's noisy? Well, we can use average as a window function to perform a moving average over the data.
So here we do average of the values and now we have to order the values by time so that average knows which part of the data to smooth out. And it will smooth out three rows before the current row and three rows after.
But these rows are ordered by time. So for each point here, the red line, the value is computed by three data points before, three data points after and one data point, the current row itself. We can increase these by making a larger moving window and any questions about this?
Let's look at how this moving window works in a numerical example.
So it's the same example as before, but instead of average I have sum here, just so that I don't have to divide by the number of elements in the frame. In blue we have all the rows which take part in the computation. Red is the row which we are computing the value for and in parentheses I have the expression
which is actually run behind the scenes. So first there we sum up between one preceding and one following and then two preceding and two following rows. So the first row doesn't have any preceding rows, so you just add the current row and
the row right after, going one step further. We now have one preceding row, so we are adding the previous one, but we also have an extra new row, so now we are adding both the previous, the first, the next one
and the current row itself. If we move one step further we start losing rows from the first example there. And now instead of adding the first row itself we have to remove it and then it just keeps on going like this. One more step you can see that even in this case we start losing rows from the top.
And here's an interesting observation. Whenever you advance the window to the next row you have to do one removal from the previous computation and add one extra row to get the new value.
Which means that we can actually compute this by just in constant time. So whenever we need to go to the next row we just add one value and remove one value. And this is constant time, which is what actually makes window functions very fast.
This is the full example, so we can see here that it's pretty symmetrical. You get to add values and remove values as each row comes around.
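The sliding frame walked through above can be reproduced in a small runnable sketch (sqlite3, SQLite ≥ 3.25; the readings and values are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (t INT, v INT)")
con.executemany("INSERT INTO readings VALUES (?, ?)",
                [(1, 10), (2, 20), (3, 30), (4, 40)])

# A frame of one row before and one row after the current row, ordered
# by time. Edge rows simply have a smaller frame.
query = """
SELECT t,
       SUM(v) OVER (ORDER BY t ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS s
FROM readings ORDER BY t
"""
print(con.execute(query).fetchall())  # [(1, 30), (2, 60), (3, 90), (4, 70)]
```

Swapping SUM for AVG gives the moving average from the sensor-data example.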
Okay, now I'm going to go into the three practical examples which I have. How much time do I have? I think it should be... Yeah, it should be plenty of time. Okay, so I'm going to have some benchmarks for these results. They are run on my computer so they're not real benchmarks
as in very strict, but this should give you a rough idea of how performance improves with window functions. So what we have here is a bank example. We have a set of transactions for various customers and we want to get their rolling account balance.
So basically how much money you have in your account after each transaction. And we want to do this for each individual customer. We have a customer ID, a transaction ID and a timestamp for the transaction. And then there's the amount either when you pay or when you get money deposited into the account. Either positive or negative value.
The way you would do this with a regular SQL query without window functions you would use a subquery where you would sum up all the transactions from the beginning of time until the current row. Okay, so here we find all the transactions for the current customer
and then we get all the transactions which are previous or current one. And all these are summed up. With window functions it's a lot cleaner to express this. So what do we do? We have a sum over the amount.
We split the data by customer. We order it by timestamp. And we want to add all the rows from the beginning of time, UNBOUNDED PRECEDING, up until the current row. All these are ways to specify how the frame moves when you advance the row.
And it gives you the same result. Make sense? Performance. With regular SQL no indexes. It's basically all that the database can do without using any additional data structures.
This basically increases in an n-squared fashion. 10 times more rows implies 100 times more computation time. With SQL but with an index this gets improved. But this will impact insert performance because you have to update the index as well.
With window functions this becomes a linear query. With a slight offset here because the data set here turns from memory to disk. And we have a slight penalty for disk access here as well. From something that takes 4000 seconds to 500 seconds it's a 10x improvement.
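The rolling-balance query described above might look like the following sketch (sqlite3 for runnability; the transactions table and amounts are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE transactions (customer_id INT, t INT, amount INT)")
con.executemany("INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, -30), (1, 3, 50), (2, 1, 200), (2, 2, -200)])

# Running balance per customer: within each customer's partition,
# ordered by time, sum everything from the start up to the current row.
query = """
SELECT customer_id, t,
       SUM(amount) OVER (PARTITION BY customer_id ORDER BY t
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS balance
FROM transactions ORDER BY customer_id, t
"""
print(con.execute(query).fetchall())
# [(1, 1, 100), (1, 2, 70), (1, 3, 120), (2, 1, 200), (2, 2, 0)]
```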
Another very frequent query is to get the top n values of a certain data set. So in this case we're going to get the top 5 earners by department.
So what are the top 5 salaries and who gets them? So here's the data. We have two departments, sales and engineers, each with their own salary and employee name. If you want to get the top 5 earners you need to count how many people are before you that earn more than you.
And you only want to select the values which have at most 5 people that earn more than you. So how do we do this? We have to count how many people are not myself, it must not match my name.
It must be in the same department as me and it must have a salary greater than mine. And whoever has less than 5 people that match this criteria then they are one of the top 5 earners in the department. Now the trick is what if I also want to rank people?
So I want them to have a rank, either 1, 2, 3 or so. Well, if you want to do this you have to basically repeat the query but have it in the select list as well. So it's the same thing, it's just now I'm counting them and adding plus 1 so that I get the ranking correct.
Notice that there is a rank here that matches. So we have 2 people that earn the same amount. This causes problems because if you want a dense rank then it's a bit complicated. But let's see how we do this with window functions instead.
We have the rank function which only works as a window function. You can partition and we just express what we want to compute. So we partition by department and then we order by salary. And we order it descendingly because we want the top 5 greatest values.
Implicitly it starts from the lowest value to the highest value. And now you would be tempted to filter out these so that you get a top 5. You would want to do where ranking is less than 5. But this does not work.
You are not allowed to use window functions in the where clause. The reason for this is that window functions are computed after everything else in the select is computed. So everything gets computed as if there is no window function and then we add the column afterwards.
This has to happen because otherwise you would get ambiguous results. And this is where CTEs come in. We just wrap that query in a CTE and then filter out the CTE expression.
So this is how window functions work with CTEs quite nicely. You can also use a derived table so you can have a select inside the from clause if you want. But with CTEs it's a lot more readable. And let's look at performance. This one is one of the big ones.
So here with regular SQL once we get up to 20 million rows this never finishes. Not in the lifespan of the battery of this laptop anyway. With regular SQL and indexes it gets better but it still takes a long time after a certain point.
With window functions this actually completes almost instantly. Because we're just going through the table once after we've sorted it. So it's just one sort and then one go through the table.
Sorry, I did not understand the question. Yes, we do have...
Maybe I don't understand your...
So how would you use having in this... I'm guessing you're referring to this query, right?
Well having is computed... Rank is computed after having. You can only use... You cannot reference it here. You can only use window functions in the order by clause and in the select clause there.
Because it doesn't exist when HAVING is getting executed. Which is why you need to wrap this query into a CTE and then filter the results.
Okay, and my final example. So this example is again from a similar use case scenario. It's a bit filtered out but the idea is that we have a certain number of machines. And whenever a technician goes and fixes this machine
we put a timestamp in the database that service has happened for this machine at this particular time. And you're interested to see what is the average time between services. So let's try and find out which machines break more often.
Here's the data. Timestamp, machine ID. Simple. Now what we want to do in order to get the average time between services is to get the difference between consecutive services for each machine.
Doing this with regular SQL is a bit of a pain. The idea is to join the table with itself and match the same machine IDs and then look for the transactions that have a time before the current position.
And from those transactions get the maximum one so that you actually get the most recent one that is before the current one. And then just do a difference between the current time and the previous time. This is one way to do it. There is probably multiple ways to do it
but I think this was the simplest one that I could come up with that could actually fit into one slide. And we have to group by machine ID so that we get this value for each machine ID for each transaction. Now we can do this a lot easier with window functions.
We have a nice function called lag which is able to access previous rows from the table. So here what do we do? We get the machine ID, we get a time and then we want the value that was before the current row.
So lag just looks one row before. It can take multiple parameters. It can also look at ten rows before. It's a function that takes either one parameter or two parameters. And lag is getting computed by machine ID and we order it by time
because that's how we want the values to be sorted. And then you can just do an average of the time difference between the previous time and the current time. Make sense? I think this is way simpler to comprehend than the previous huge blob of SELECTs.
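The LAG approach might look like this runnable sketch (sqlite3, SQLite ≥ 3.25; the services table with integer timestamps is invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE services (machine_id INT, t INT)")
con.executemany("INSERT INTO services VALUES (?, ?)",
    [(1, 10), (1, 25), (1, 45), (2, 5), (2, 15)])

# LAG(t) looks one row back within each machine's partition, ordered by
# time; the first service per machine has no predecessor, so its gap is
# NULL and AVG simply ignores it.
query = """
WITH gaps AS (
    SELECT machine_id,
           t - LAG(t) OVER (PARTITION BY machine_id ORDER BY t) AS gap
    FROM services
)
SELECT machine_id, AVG(gap) FROM gaps GROUP BY machine_id ORDER BY machine_id
"""
print(con.execute(query).fetchall())  # [(1, 17.5), (2, 10.0)]
```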
Time-wise this is also faster. With no index it's unfortunately slow. With an index it's not so bad, but still, for a 10x increase in data size you get a 10x increase in computation time.
However, window functions are ten times faster, because you only have to do one scan instead of a join.
So instead of having to do a join and then use an index lookup to find which value is the maximum one, you're just able to scan for the table once after you've sorted it. You can see again the penalty when we're dropping from in-memory database to disk access.
This is normal. And one thing I didn't point out... Let me go back to the first example here. You can see that there is a penalty with small datasets. There is a small overhead when you start computing. This quickly becomes negligible once you get to a certain point in data size.
This is an extra step that the query executor has to perform. But if this step speeds up the previous query quite a lot, then this will give you big improvements in performance.
Well, that's all I have. A summary is that you can use window functions to eliminate self-joins and this will potentially make your queries more readable,
like we've seen with lag or with the average example. And in a few cases you can get faster queries. This is not something which is free, so you don't get this speed-up without rewriting your queries, but it is something that can drastically improve if you're willing to rewrite your application
to make use of window functions. Thank you very much. Here's a few examples of more window functions which we support. This is a full list. We don't support group concat yet, but this is something that we are planning to do.
It's just been a technical difficulty in getting it to work correctly. And this is a few things that we don't support yet. These are a few features which are planned for the next release, 10.3. So probably all of these will be implemented in the next stable release.
Thank you very much. If you have any questions... Which MariaDB version?
This is MariaDB 10.2. It's not the default version which you find in Debian, so you have to get the custom package. We do have custom repositories that make installation of 10.2 in Linux distributions simple. If there are no questions, you can still find me around.
Feel free to ping me. Thank you very much for coming.