Learning to Hack on Postgres Planner
Formal Metadata

Title: Learning to Hack on Postgres Planner
Series: PGCon 2019 (part 14 of 35)
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/48285 (DOI)
Transcript: English (auto-generated)
00:05
Okay, so I work at Pivotal on Greenplum. I wanted to give a talk about the Postgres planner, and the goals of it are to provide a tangible, trivial example — so this is not going to be a super fancy optimization — and to actually start a discussion on what is involved in
00:23
adding an optimization. I have a table of contents, which is mainly useful so that as we come back to each part, we know what we're doing. I'm going to talk about planning basics, and then I'm going to go over the guidelines that I usually think of when I'm
00:40
thinking about an optimization for the planner. Take into account that I've mostly worked on an old-timey planner — the newer versions of the planner I have not worked on. The patches you'll see are on current Postgres, but most of the planner code that I've worked on is older, so I don't talk about partitioning, or
01:02
planning for parallel query, or JIT — none of that is taken into account. And there's a case study, and then some discussion. Okay — if you want to, at the beginning, go look up the patches that I'm going to talk about; they're on my fork of Postgres, and there's also the slides, this PDF, and the
01:21
glossary of terms is there as well. So, query planning: you go from a query to somehow magically get results, and we're going to talk about the arrow in the middle. A SQL query is parsed, then query planning happens, and then the resulting plan tree is executed. In Postgres, there are a couple of data structures that we're going to talk about, so I just wanted to put them up here
01:46
so you can see them. This is the query tree. So a query like this, you can see it's color-coded, the constants one, two, and four are there in the query tree, and I'm going to be using plan tree visualization diagrams like this for the rest of the presentation.
02:03
So the sort of first phase of planning, I'm going to call it semantic optimization. So these are transformations that you can make to the query tree that are logical, based only on the SQL query itself. And
02:21
the transformation you do has to result in a semantically equivalent tree. Constant folding is one classic example of this: you can evaluate a function — or an expression, rather — and replace it with the result of that evaluation. And so this is the same query, the query tree before and after this optimization.
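(A minimal, runnable illustration of constant folding — table t is hypothetical, and the EXPLAIN output shape can vary by version:)

```sql
CREATE TABLE t (a int);

EXPLAIN (COSTS OFF) SELECT a FROM t WHERE a = 2 + 2;
--  Seq Scan on t
--    Filter: (a = 4)    -- 2 + 2 was evaluated once at plan time, not per row
```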
02:43
After doing some pre-processing and semantic optimization, we do cost-based optimization. This is related to the actual data in the underlying relations: different statistics about the data inform decisions during planning, and
03:03
join order is the most common one that people talk about — there are others. So for example, you might find that if you're doing a series of joins, given whatever predicates are on the joins, it might be better to order them one way versus another, based on the selectivity —
03:23
that is, how many rows are filtered out by a predicate. After that, considering those different join orders, you'll add access paths and do costing so you can actually decide on a given join order. There's a lot more there, but I'm gonna move on for now.
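(A hedged illustration of cost-based join ordering — t1, t2, t3 are hypothetical, and the chosen order depends entirely on statistics:)

```sql
CREATE TABLE t1 (id int PRIMARY KEY);
CREATE TABLE t2 (id int PRIMARY KEY, t1_id int);
CREATE TABLE t3 (id int PRIMARY KEY, t2_id int, flag boolean);

-- The textual FROM order is not binding: the planner considers alternative
-- join orders and picks the cheapest, so a highly selective predicate on t3
-- may lead it to join t3 first to shrink the intermediate result.
EXPLAIN
SELECT *
FROM t1
JOIN t2 ON t2.t1_id = t1.id
JOIN t3 ON t3.t2_id = t2.id
WHERE t3.flag;
```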
03:42
There are other good presentations on this topic. So, the plan tree: this is the PlannedStmt data structure, which I'll also be using throughout the presentation — the plan tree member of it is the one we care most about. And this is a visualization of that. Oh, also, please interrupt me with corrections, questions, comments. I might
04:01
respond to you by saying I'm gonna address that later — but yes, please interrupt me. I like being interrupted. I want this to be a discussion. Okay, so my guidelines. There are sort of four questions that I think about or ask myself, a lot of times when I'm reviewing PRs. I also worked on Orca, which is
04:22
the other query optimizer that Pivotal created, so some of this crosses over regardless of which query optimizer you're working on. The first one is: does your optimization always retain semantic correctness? This is an example from the optimizer README, but basically, if you have
04:43
an outer join, you can't move inner joins to the nullable side of it. This is just one of the classic examples of the type of optimization you can't do, because it's not semantically equivalent — you could get wrong results.
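(A hedged sketch of why, runnable in psql — the table names are mine, not from the slide:)

```sql
CREATE TABLE a (id int);
CREATE TABLE b (id int, ref int);
CREATE TABLE c (id int);
INSERT INTO a VALUES (1);
INSERT INTO c VALUES (42);
-- b stays empty

-- Inner join applied after the left join: b's columns come out NULL, so the
-- join clause b.ref = c.id can never pass, and we get zero rows.
SELECT * FROM (a LEFT JOIN b ON a.id = b.id) JOIN c ON b.ref = c.id;

-- "Moving" the inner join to the nullable side changes the answer: b JOIN c
-- is empty, and the left join preserves a's row, so we get one row of NULLs.
SELECT * FROM a LEFT JOIN (b JOIN c ON b.ref = c.id) ON a.id = b.id;
```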
05:02
So the second rule is the one that I think is the most fun, which is: does it inhibit downstream optimizations? For me this breaks up into three parts, or sort of different components, and the first is that optimization order matters. There are a bunch of comments in the code in planner.c, a lot of them
05:24
that talk about different pre-processing steps and why it matters if you do one before another. So, this query — we're going to come back to this rule, so I'll just kind of gloss over the example a little bit — but basically, in this example you have a
05:42
subquery, and then you have two quals here, a = c and c = 7; in the initial query tree, those are the quals on the query, and then you have the b = 5 in the subquery. But we do subquery pull-up before we try to do any sort of pre-processing or contradiction detection on the actual quals, and the reason is that
06:06
in this case specifically, if we do that, we can determine more: by transitivity we have a = 7, so down here, if we do the pull-up and then the pre-processing, we can tell that 7 and 5
06:22
are actually never going to be equal, so we can turn this entire query into a no-op, basically. Does that make sense? I haven't done this slide for anyone yet, so I don't know if it makes sense. Okay.
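(A hedged reconstruction of that example — the column and table names are my guesses at the slide's:)

```sql
-- Before pull-up, the outer level only sees a = c and c = 7;
-- the b = 5 qual is hidden inside the subquery.
SELECT a
FROM foo, (SELECT b AS c FROM bar WHERE b = 5) sub
WHERE a = c AND c = 7;
-- After pull-up, c = 7 and the pulled-up b = 5 constrain the same column,
-- so the planner can prove 7 = 5 is impossible and emit a no-op plan.
```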
06:48
So, the next part of "does it inhibit downstream optimizations": this one is sort of the classic — well, this optimization is great for that class of query, but here's an example of another type of query it causes a regression for. And we'll get into that a little more. And the last one is — I'm not going to talk a lot about this,
07:05
but I included it because I've been reviewing old rejected planner patches. I've been looking back to 2014 and since, and what I've noticed is that one of the most common reasons for returning or rejecting patches has to do with
07:23
expectations that different parts of the code have. It's also a big source of bugs: the number of columns — a column was supposed to be there and it was removed — or the GROUP BY was supposed to look like this, or the range table was supposed to be aligned with the target list in this or that way. A lot of these expectations are implementation-specific details of the query tree, and then eventually the plan tree, that
07:46
are important to think about, but are less cool, so I'm not going to talk about them right now. Okay, so number three is: is the improvement in execution time worth the cost in planning time? There are a lot of examples of this.
08:04
People talk about, you know, catalog lookups, things like that, but I think join order is the most classic sort of database-y problem like this: given perfect statistics and a fully exhaustive search, you can get the best join order, but that's usually not worthwhile and would cause planning time to be quite long.
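(Postgres itself bounds this trade-off with planner GUCs; the defaults shown are for recent releases:)

```sql
SHOW from_collapse_limit;  -- 8: cap on flattening subqueries into one join search space
SHOW join_collapse_limit;  -- 8: beyond this many items, explicit JOIN syntax fixes the order
SHOW geqo_threshold;       -- 12: with this many FROM items, switch to genetic (inexact) search
```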
08:20
And number four — this one is basically all of programming or software development — is: is the complexity cost
08:42
of the code that you added worth the performance benefit? Again, looking at patches that had been returned, a lot of times it was: well, you added a whole new API and I don't really know if this case is very common, or this is a very obscure feature that no one cares about and I don't want all this code I have to maintain, or
09:04
this API is interesting, but I don't see any reuse potential. Okay, so we're gonna do a case study: two tables, each with one column, and this is the query. I did this same query in a talk at Postgres Open last year, so some of this will be review.
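(The setup and the case-study query, reconstructed from the talk:)

```sql
CREATE TABLE foo (a int);
CREATE TABLE bar (b int);

-- For each a in foo: does there exist a b in bar which is equal to null?
SELECT a FROM foo WHERE null = ANY (SELECT b FROM bar);
```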
09:24
The semantics of this query are: for every tuple in foo — for each a — you ask, does there exist a b in bar which is equal to null? So, a quick review of null semantics, so we're all on the same page:
09:45
ternary logic with null. The part that's most important here is the equality column of the truth table: if either p or q is null, then the equality is going to evaluate to null. That's important for the semantics of our query.
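(The relevant rows of the truth table, directly in psql:)

```sql
SELECT 7 = 7;               -- true
SELECT null = 7;            -- null: equality with a null operand is unknown
SELECT (null = 7) IS NULL;  -- true: the result really is null, not false
-- A WHERE clause keeps only rows whose qual is true, so null filters out.
```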
10:02
So I thought that, because the qual would never be true, we would have this plan: a Result node with a one-time filter of false. I don't know if anyone really knows what a Result node is, so I'm not going to talk about it, but basically, in this case, it's letting us project even though we're not scanning anything — it's letting us return a column.
10:21
But this is not the plan that I got. I got this plan, which requires at least a scan of foo and a scan of bar — at least one, depending on a few things. So I was like, well, is that really necessary? It doesn't seem necessary for correctness, because the qual will never be true.
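(Roughly, in EXPLAIN terms — the plan shapes here are illustrative and vary by version and settings:)

```sql
EXPLAIN (COSTS OFF) SELECT a FROM foo WHERE null = ANY (SELECT b FROM bar);
-- The plan I expected:
--   Result
--     One-Time Filter: false
-- The plan I got, roughly — a scan of foo, with a subplan scanning bar:
--   Seq Scan on foo
--     Filter: (hashed SubPlan 1)
--     SubPlan 1
--       ->  Seq Scan on bar
```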
10:45
And so now I'm going to talk a little bit about how I decided that it wasn't necessary and that I wanted to change it. So, how do you go about doing that? There's just a series of steps — I don't know if it's what everyone does, it's just what I kind of think of — and the first thing I do is
11:02
characterize the query. This is just getting a minimal repro, that's all it is — basically: what part of the query most impacts the thing we don't want? What about the query actually matters, that we want to change? That's important because query trees get big.
11:22
So you want to focus on just the part you care about. We don't care if the select list had a bunch of aggregates in it — reduce all the noise, and then think about the part you care about. So in this case it's: when you have a provably untrue qual, what other queries that have a provably untrue qual do get optimized into
11:45
what I'm going to call a no-op plan — a Result node with a one-time filter of false? So what are some queries that get that? That's my next step: analogs, finding an analogous query.
12:02
This is the most basic example: select a from foo where false — and that does not scan anything, which is great. And then the next level up is: okay, well, is it about null? Is the reason that this is not getting optimized about null?
12:22
That's the next basic question. And no — null = 7 is sort of the most basic thing that constant folding can evaluate, and that gives us our no-op plan. Just going forward, as a note on my notation: I'm going to elide the parts of the query tree that are not relevant to what we're talking about, so you'll see sort of a simplified version.
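(Both analogs, for reference — output shape hedged:)

```sql
EXPLAIN (COSTS OFF) SELECT a FROM foo WHERE false;
--  Result
--    One-Time Filter: false

EXPLAIN (COSTS OFF) SELECT a FROM foo WHERE null = 7;
--  Result
--    One-Time Filter: false   -- null = 7 folds to null, which is never true
```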
12:43
So then the next thing that I do is identify the transformation that's happening to the query tree — the one that turns it into a query tree that can become that plan. Because during semantic optimization,
13:06
everything that we care about is going to happen to the query tree itself. So in this case, for null = 7, during constant folding we're going to take the qual, evaluate it, and it gets turned into a null Const. The next
13:26
analog that I used was: is it about a subquery? My question was, is it that when we have an expression where one argument is a subquery, it can't be optimized for some reason?
13:41
And that's actually not true: if you have this type of query — note that there's no ANY sublink, it's just an expression sublink — this does get optimized. The transformation that happens is to the entire qual; it's a different expression and looks a little bit different, but it gets turned into a constant null.
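(The expression-sublink analog, for reference — `=` is strict, so a constant null argument lets the whole qual fold to null at plan time; output shape hedged:)

```sql
EXPLAIN (COSTS OFF) SELECT a FROM foo WHERE null = (SELECT b FROM bar);
--  Result
--    One-Time Filter: false
```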
14:01
Now, this is our original query, null = ANY (select b from bar), and it does have constant folding done to it — we go and attempt to do the best that we can. You can see over here this test expression: the sublink's test expression gets folded into a constant null. So we do do something, and that's a transformation
14:25
that's happening. So now I'm thinking, you know, where can I inject something? If you compare the three queries, these are their original query trees, and this is after we've done constant folding on the query tree: null = 7 and null = (select b from bar) have the same post-
14:43
processed, or pre-processed, query tree, while null = ANY has its qual rooted at a SubLink node. And then in the end you get these plans: null = 7 and null = (select b from bar) have the same plan at the end. So then I had two ideas for
15:00
places to add this optimization. The first was in constant folding. As we talked about in the beginning with my guidelines, first you do the pull-up of the ANY sublink into a join, and then you do constant folding. So I thought, let's just add a SubLink case to the constant-folding mutator. And so I did that, and
15:26
basically, this is what's happening now. My idea was just to add a case that said: if it's a SubLink and it has a test expression, evaluate that expression; if it returns a constant null, then just return that from the SubLink case instead.
15:45
And I guess what I thought was that if I could just eliminate that pesky sublink, I would be able to keep from having a sub plan in the final plan. And then that would let me get rid of the rest of the plan.
16:00
So I wanted this transformation to happen. And I did it and I got the plan I wanted, but it turns out that that actually violates my first guideline, which is that it needs to be semantically correct in all cases. So, yes, I
16:33
knew that we would never return data — so we didn't need to scan the table.
16:47
Yeah — and then, where can I add code that will make that not happen? So, null semantics we talked about before; now we're going to talk about ANY semantics, capital A-N-Y.
17:01
So, back to the question this is asking: does there exist a b in bar which is equal to unknown? The really cool thing is that we can actually just ask Postgres the answer to this by moving it into the select list — because when it's in the where clause, it's filtering our results; when it's in the select list, we can actually see the results, right?
17:22
So in this case, we get a row back, it's null. Postgres is like, well, I don't know if null equals any because that's unknown. Now if we truncate bar, we get false and that's because if you ask
17:40
is there a b in bar which is equal to null — and there are no b's in bar at all — the answer can definitively be false. Now, when it's in the where clause like it was originally, we get the same answer regardless: we get nothing, zero rows are returned.
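(The same three experiments, runnable given the earlier setup plus at least one row in each table:)

```sql
-- With a row in foo and rows in bar: one null per row of foo — unknown.
SELECT null = ANY (SELECT b FROM bar) FROM foo;

TRUNCATE bar;
-- With no b's in bar at all, the answer is definitively false.
SELECT null = ANY (SELECT b FROM bar) FROM foo;

-- In the where clause, null and false filter the same way: zero rows.
SELECT a FROM foo WHERE null = ANY (SELECT b FROM bar);
```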
18:01
So this is just a side-by-side of the query tree for when the null = ANY expression is in the select list versus when it's in the where clause. It's pretty similar, just in different parts of the query. So this transformation doesn't work, because if it's in the select list, you can get wrong results. What could we do instead?
18:21
So I scoped it to the qual only. There's a really convenient function called preprocess_qual_conditions, and I just added the same logic there: if, after constant folding, the test expression is a constant null, replace the sublink with it. And that worked.
18:41
And I got the plan I wanted. But I guess I'm arguing that it kind of breaks my fourth rule, which is this is a really specific query. It only works if the argument, the constant, is a null and I don't know how common of a use case it is.
19:01
Maybe there was a reason there wasn't a SubLink case in the constant-folding mutator. So alternatively, I thought about what if we added it to the ANY-sublink pull-up. Going back to the original query, null = ANY (select b from bar), and finding analogs: the first analog I tried was replacing
19:24
the null with a — a is from foo. I don't know why I made it yellow, because that makes it less clear, but yeah, it's the same a from foo. This gets pulled up into a join, because in this case we are semantically saying this is basically the same thing as evaluating the equality of
19:47
a and b. As long as we've deduplicated b, we can do this as a hash join. So now with null, I thought maybe we can just pretend that null is part of foo and then do this as a join also.
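(For reference, the a = ANY analog after pull-up — hedged, since the exact shape depends on stats and table sizes:)

```sql
EXPLAIN (COSTS OFF) SELECT a FROM foo WHERE a = ANY (SELECT b FROM bar);
--  Hash Semi Join
--    Hash Cond: (foo.a = bar.b)
--    ->  Seq Scan on foo
--    ->  Hash
--          ->  Seq Scan on bar
```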
20:04
So, convert_ANY_sublink_to_join is the function where this is happening, and I just added to it there — if you go back, the original slide that had the code on it has these three patches. But yeah, you can see that the query trees are pretty different at this stage.
20:24
This is for a = ANY: the original query tree is here, and then we pull up the sublink into a join — this is after convert_ANY_sublink_to_join, with bar on the left side. Okay, so then these are the two query trees, a = ANY and null = ANY, and
20:45
they're basically the same, except here's the Var and there's a Const over there. So it seemed fine, seemed like a good idea. And this is after — again, they're exactly the same except that under the OpExpr you have a Var versus a Const.
21:05
So with my patch I was able to get that type of shape, and then I could get the no-op plan. I basically turned the query into select a from foo joined to bar, where null equals a deduplicated b.
21:20
So that was cool, I thought. And the cooler part that I originally thought of was contradiction detection for a case like this: what if, instead of just doing it for constant null, I did it for all constants? Then I could detect the kind of case where you have 7 = ANY (select b from bar where b = 5) — well, 5 and 7 are never gonna be equal. This seemed really cool.
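(The contradiction case from the talk — on stock Postgres this still plans scans; with the experimental patch it became the no-op plan:)

```sql
-- 7 can never equal a b that is constrained to 5, so no row can qualify.
SELECT a FROM foo WHERE 7 = ANY (SELECT b FROM bar WHERE b = 5);
```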
21:44
So I get this no-op plan, but then I was like: yeah, but what about when there's not a contradiction? Then you get potentially worse plans. I actually don't know, because I haven't timed it, but I imagine that if you make a hashed subplan
22:01
for a case that's pretty simple, it would be faster than doing it as a nested-loop semi-join. Again, I didn't time it, but it seems like, if you did this, you'd have the potential for generating much worse plans in lots of cases. Just going back one slide, because I might have done that too fast.
22:21
You get why you're not going to produce results here, right? But I didn't think really hard about it, because I was mostly doing this for this presentation. So I don't know, but I think in a lot of cases you'd get worse plans if you made this optimization. It also seems kind of silly if you think about it,
22:42
because 7 is not part of foo. But I do think there's potential to do something fancier than what we currently do — like maybe you could somehow move the 7 down and make it a filter on the scan, and then you could do the contradiction detection.
23:05
I think there's something we can do to do better than this. But the other really important thing to note is that it's not just that you could get a worse plan in some cases — it's that during this stage of planning, during semantic optimization,
23:24
you don't produce alternatives and then pick based on cost. Once you have decided, hey, let's do a join instead of a subplan, you don't get to go back and redo it later when you have access to stats and things like that. So once you make a choice during semantic optimization, it has to be basically always better — and this isn't an example of that.
24:19
You're saying because you have three kinds of ways to join versus just a hash sub plan?
24:24
Yeah, yeah, yeah. Yeah, I haven't thought about that, but I guess increased flexibility is good. Yeah, one of the things from working on this presentation that I thought a lot about was cases when hash sub plans were good
24:43
because I think traditionally, especially working on Greenplum, I've learned to hate subplans and to look for subquery pull-up at all costs. But I think when it's done well and the hash table you're making is quite small, it can be pretty fast.
25:03
But the other thing is, I'm pretty sure that the hash table for a hashed subplan like that can't spill, right? If you make it too big — no, not the hash join case; for a hashed subplan, the hash table that you actually build.
25:25
Yeah. So that would actually be a problem for us in Greenplum. I think sometimes there's a good reason to make everything into a join, because there's been a lot of work around optimizing joins and making them work well, I'd say.
25:43
Okay, so I think this breaks rules two and four. I would argue that in at least one case you'll produce worse plans, when the join isn't eliminated. And it's a pretty narrow case again — we're favoring contradiction detection when you have a constant left argument, which is kind of specific. Going back to the guidelines
26:09
again, just to review: we talked about one, semantic correctness, and two, downstream optimizations and potential regressions. I would say
26:24
we didn't talk as much about three, planning time, or four, complexity cost, but there are lots of good examples out there — in the commitfest app you can look at past patches and see some of the arguments. I think these are some of the most common reasons optimization patches don't get in.
26:44
Some other ideas that were brought up: using the stats. This of course does not work, because you don't have perfect stats — your stats can get stale, and they're usually stale and wrong. So you can't use them to conclude that the subquery's table is empty in a way that would let you do the
27:05
constant folding everywhere — select list, where clause, whatever. You can't rely on stats. Another suggestion was to actually execute the subquery to see if the table is empty. I think that would probably violate rule four by adding code complexity, and it potentially has a significant performance impact as well.
27:27
So I wanted to have a discussion in the last, whatever, ten minutes. We can do questions first, but what I've started doing is looking back, as I said before, at planner patches from
27:41
all the way back to 2014, and what I kind of want to start to understand is: let's say that you're a Postgres user, and you get a plan that's not a good plan, and you think, this optimization is semantically correct and would make this plan better.
28:01
So what is the gap between that moment and your patch actually being committed? I'm glossing over the first part of it, where you're writing the code — there are other presentations that talk about getting a text editor set up and all the sort of procedural steps —
28:20
but for the code itself: let's say you wrote a patch that works for your query. What's the gap between there and something that's committable to Postgres? It's kind of a difficult question to answer, but I think the more data points we have, the closer we can get to making it easier for people out there that see bad plans to help us improve the planner — not just from a theoretical sort of database perspective
28:47
but actually from a practical user perspective, and then we'll have more people hacking on planner. So that's kind of what I wanted to talk about. There are questions like: when is it okay during planning to do things like go get something from pg_constraint?
29:05
When is it okay to mutate the plan tree? When is it okay to use path keys for something that they're not already being used for? So I wanted to talk about that and I also wanted other people, if you have guidelines that you use when you're thinking about query
29:22
optimizations, I wanted to see what they were. So, mm-hmm.
30:33
Mm-hmm. That's an interesting idea. For Orca,
30:54
we use a lot of GUCs to enable and disable certain types of optimizations, and for the ones that are exposed to the user, they can enable the ones that are more relevant to them or not —
31:08
which I understand has its own issues that people have thoughts about. But one of the things I heard, talking to them last week, is that they've been discussing having, as you said, kind of a two-stage optimization process, where there is a base set of
31:26
optimizations or transformations that might happen, and then you continue to do planning. So you produce a plan, and after that you continue to run the other transformations —
31:43
and Orca produces like ten alternatives from each transformation, or it's configurable — full alternative plans, which is different — but you only consider and continue to produce alternatives if they cost less than the plan that you chose with your initial pass of optimization.
32:02
So there's lots of little tricks like that.
32:59
Other thoughts or questions?
33:22
Uh-huh. I've never thought about that. So it would have to be something like a user-defined function or something, yeah. I've never seen a part of planner that would
33:42
require that — well, if a subquery gets pulled up into a join, I can't think of any place where we say you have to keep this because it has a function in it. Well, the function would have had to be volatile. Yeah, if it's volatile then you have to execute it.
34:01
Yeah, and I don't think — could you make a user-defined function that wasn't volatile that had a side effect like that?
34:36
So, role-reversal hash joins specifically, we do not do — unless,
34:41
yeah, we don't do that as far as I can recall. But what would you consider in that class of optimizations? I can't think what else would be analogous to that. That was my whole goal the whole time. Also, just for the sake of time, I wanted to
35:10
put up two presentations here — there are probably more online — Robert Haas has a couple on the planner, and Tom Lane has one that goes through
35:21
some of the names of the major functions and things like that. The optimizer README is also really good. These are good sources for getting more basic information — I wanted to touch specifically on thinking about optimizations — and of course old planner patches are a good source as well.
35:45
And then this is the link to the code and the slides again, patches, and for reference, well, I guess no one has anything else to say.
36:03
Okay, I won't get you in trouble Peter.
36:37
I did not — yeah, so Orca is a top-down planner. One of the things that Peter and I were talking about yesterday
36:42
that I think is interesting is that I actually don't know enough about query optimizers to be able to come up with an example where the thing that makes it easier to add new optimizations to Orca has anything to do with the theoretical difference of being top-down versus bottom-up. But the architecture itself, the way that it is — one thing we talked about is that once you have an optimization in mind,
37:06
the act of actually adding it — usually you can get to the point of doing that in a month. Whereas, I think, the architecture of planner makes it a little bit more time-consuming to get to the point where you can add code
37:21
that is architecturally consistent. Yes, the ordering, well, other than, and conceptually, but the literal ordering, yeah.
38:06
Yeah, I've totally added stuff without looking at any of the other code when adding it, which is not a luxury I'm sure you have with planner. It also kind of can lead to bit-rot, though — no one ever has to look at any of the rest of the code.
38:23
All right, let's get one person who hasn't asked a question or brought up a point to do it and I'm gonna call someone out by name and embarrass them if no one volunteers. Okay, three, two, one.
38:44
Jimmy, do you have any questions or comments or thoughts?
39:21
I planted him in the audience, actually.
39:46
So — Greenplum has its own window implementation, which has now become a little more similar to upstream Postgres. I've never looked at it, it's scary, and I've never had to, so that's the answer to that.
40:04
Over there. So the question was, what kind of work is being done on optimizing self joins? I'll defer this to Tom.
40:42
I've looked at some past rejected planner patches on that, but I don't know what the latest and greatest is. Removing self-joins, I think, is what you're saying.
42:26
Yeah, one thing I didn't talk about was how to figure out whether the optimization you would like to add is going to benefit a large enough class of queries. That I don't have an answer for.
42:41
Okay, lastly I just have to acknowledge: the diagrams were all made in TikZ, and I copy-pasted this from someone else. Kai Ting made these diagrams — initially made a couple of trees for me — and I copy-pasted them and replaced everything. But we're working on trying to make a little library that will let you give some parameters to generate
43:03
plan trees. And then thanks to Jesse for going over a lot of my examples with me — and that's it.