
What's in a Plan?


Formal Metadata

Title
What's in a Plan?
Subtitle
And how did it get there, anyway?
Number of Parts
37
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Genre
Lecture/Conference
Abstract
PostgreSQL's EXPLAIN command is a powerful tool, allowing users to inspect and understand query plans in detail. However, the output produced by EXPLAIN is not a direct representation of the plan; it omits some information and displays other information in modified form. In this talk, I'll discuss the differences between the plan as displayed by EXPLAIN and what is actually stored in the server's internal data structures. I'll also discuss why the plan includes the information it does, how that information is computed, and how that information is used at execution time. In my 2010 talk "The PostgreSQL Query Planner", I focused primarily on the types of plans that the planner was capable of generating and the reasons why it was necessary to compare various alternatives. Tom Lane's 2011 talk "Hacking the Query Planner" discussed the overall order of operations in the planner and explained some of the motivation behind it. In this talk, I will focus less on the overall operation of the planner and more on the data structure which it produces as output, namely, a tree of Plan nodes. I hope that this will be helpful in understanding PostgreSQL's execution-time behavior as well as in learning to write patches that touch the planner and executor.
Transcript: English (auto-generated)
I'm going to be talking about something that does exist today, and about some of the things about it that I find interesting. The title of the talk is "What's in a Plan? And how did it get there, anyway?" — and by a plan I actually mean the data structure whose name is capital P, small l, small a, small n. Many of you are probably thinking: well, I know how to see the information that's in a plan — you just run EXPLAIN on the query and it prints everything out. And you're kind of right, but it turns out that EXPLAIN might be slightly lying to you. That's one of the things you'll see as a little bit of a recurring theme as we go through these slides. Since I'm talking about the Plan data structure, in one sense the answer to the question "what's in a plan?" is just the fields of that struct. I'm going to group these fields into a couple of categories, and I'm going to go through each category individually.

I want to mention that on many of the slides throughout this talk I've got
either EXPLAIN output or C code taken from the PostgreSQL source code. In many of those cases I have made cosmetic changes to make things fit on the slide better. I don't think those changes are of a nature that will confuse anybody — and certainly not mislead — but if the output that you see here doesn't exactly match what you see in the source tree, that's the reason.

So, actually, I'm not quite correct when I say that this is the stuff in the plan. This is the typedef struct Plan definition from plannodes.h, but it's only the common information that appears in every kind of plan. This Plan structure is embedded in some larger structure that has additional fields which depend on what kind of plan node it is — it might be a Sort or a sequential scan or an index scan or a Gather node or whatever.
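For reference, here is a simplified sketch of that common part of the struct. This is abridged and lightly paraphrased, not a verbatim copy of plannodes.h — List and Bitmapset are reduced to opaque stand-ins, and different server versions have slightly different fields — so check the header in your own source tree for the authoritative definition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL's real supporting types */
typedef int NodeTag;                 /* really a giant enum of node types */
typedef double Cost;                 /* abstract cost units */
typedef struct List List;            /* generic list of nodes */
typedef struct Bitmapset Bitmapset;  /* set of integers (parameter IDs) */

/* Abridged sketch of "typedef struct Plan" from plannodes.h */
typedef struct Plan
{
    NodeTag     type;            /* tag saying which kind of plan node */

    /* costing information */
    Cost        startup_cost;    /* cost before the first tuple comes out */
    Cost        total_cost;      /* cost to return all tuples */
    double      plan_rows;       /* estimated row count (a double!) */
    int         plan_width;      /* estimated average row width, in bytes */

    /* parallel query support */
    bool        parallel_aware;  /* does this node itself know about parallelism? */
    bool        parallel_safe;   /* OK to run this in a worker process? */
    int         plan_node_id;    /* unique ID within the whole plan tree */

    /* the "meat": output columns, filter, and child subtrees */
    List       *targetlist;      /* what this node produces */
    List       *qual;            /* filter conditions applied to the output */
    struct Plan *lefttree;       /* outer subtree, or NULL */
    struct Plan *righttree;      /* inner subtree, or NULL */

    List       *initPlan;        /* init plans attached to this node */
    Bitmapset  *extParam;        /* params this node (or children) depend on */
    Bitmapset  *allParam;        /* extParam plus params set by our init plans */
} Plan;
```

For leaf nodes like a sequential scan, lefttree and righttree are simply NULL; the node-type-specific fields live in a larger struct whose first member is this Plan.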
And so there are going to be additional fields, effectively beyond the end of this structure, that pertain to whatever specific kind of plan you've got. The way that you know what kind of plan you've got is that the very first field of the Plan is a node tag. I'm not going to talk further about the node tag, except to say that it's a giant enum that tells you what kind of plan you've actually got, and I'm also basically not going to talk very much at all about any of the node-type-specific data. I'm just going to focus on the common structural information that's part of every plan node.

We can break that up into a couple of categories. There's the node tag, which I just mentioned. There's some costing information, and some things related to parallel query support. There's a target list, something called a qual, and left and right subtrees — and I think the target list, the qual, and the left and right subtrees are really the meat of this. Then there's a list of init plans, and there are two fields called extParam and allParam — and if you have no idea what those fields store, you're not the only one. And then there's the type-specific information.

So, if you think about it, in one sense it's a little strange that we store
costing information in the plan because actually the way that planning happens in PostgreSQL is we first start by generating paths which are in a sense candidate plans and then we pick the cheapest one and we convert that into a final plan so why do we need the costing information we already know that
this one was the cheapest one otherwise we would have thrown it away so that's kind of an interesting question I'll talk the big answer of course is explain right we like to be able to run explain and actually see the costing information and if we throw it away and we wouldn't have it it's
actually used in a few other places as well and I'll show that on the next slide but there are four fields that pertain to this in the plan data structure there's the startup cost the total cost and the startup cost means the amount of effort in abstract cost units that will be required to
generate the first tuple and then the total cost is the effort to generate all the tuples then we have an estimate of the number of rows that this plan node will produce which as you can see is actually a double value not an integer and then we have the plan width in bytes per row again estimated because we do a lot of floating-point math when we're
estimating things and we don't want to throw away the precision actually we do throw away the precision in a lot of places but but not everywhere also the other advantage of the double is you can represent numbers which are really
really big using the double like bigger than 4 billion so how does this get used explain is a big one if you run explain you see all that stuff there's a few other places where this gets used in the executor if we're building some kind of a hash table specifically if we're doing a hash join or a hash sub plan the row count and the width are used to set the initial size
of the hash table, which is not totally critical, but getting it approximately right at the beginning makes things slightly more efficient. For a hash join, we use this information to decide whether we should fetch the first outer tuple before building the hash table or after building the hash table. The reason that's important is that it might turn out that one of the inputs to the hash join is empty. If we try to pull the first tuple that's going to be probed into the hash table and we don't get one, then we can completely skip building the hash table. On the other hand, if we try to pull the first row that goes into the hash table and we don't get one, then we can also skip building the hash table, and we can skip generating any row from the other side. So we actually use the cost to decide which of those two optimizations to try.

There's a plan shape called AlternativeSubPlan, where we actually generate two different plans for a subplan. How many people have actually ever seen an AlternativeSubPlan? Is that a question, or you've seen one? You've seen one — okay, yeah, a few people. There are a couple of examples in the regression tests; that's the only place where I've ever seen an AlternativeSubPlan, but apparently they happen. And we also use this costing information to decide between custom plans and generic plans.

Okay, I've got to speed up. So, parallel query: there are basically
three fields here. There's a field called parallel_aware, which answers the question: does this plan node know anything specifically about parallelism? There's a parallel_safe flag, meaning: could this part of the plan potentially be executed in a worker, or can it only be safely executed in the leader process? And there's a plan node ID.

To understand why a couple of these fields are here, I've got some very simple examples of plans. In this first plan here on the slide, we've got a Gather node on top of a merge join, and under the merge join are a parallel index scan on a and an index scan on b. Both of those are index scans, but the parallel_aware flag is set on the first one, and what that means is that the rows in a are going to be divided between all of the participants in the parallel phase of the operation, whereas for b, since the parallel_aware flag is not set on the plan, every participant is going to see every row. If you think about it, that is a hundred percent necessary for correctness: if you have some workers compare a subset of the rows in a to everything in b, and another worker compares a different subset of a to everything in b, and so on, you will get the right answer; if you take random subsets of a and compare them with random subsets of b, that won't work. So we have to know which one of those things we're doing, and that's the need for the parallel_aware flag.

The plan node ID is necessary because of plans like this one: here we have a Gather on top of an Append with three parallel seq scans underneath. In each of these parallel seq scans we need to divide the rows between all of the workers, and for that to happen, the workers have to talk to each other. But about what? They need some way to identify which plan node they are talking about, and the plan node ID serves that function. You could say, well, why not do it by table OID? But it turns out you can reference the same table multiple times in the same query, so the same OID can be there more than once; the plan node ID is unique. This could maybe be
used for things besides parallel query, but right now, not so much.

Okay, so now we get to the heart of this — or the start of the heart of it. We've got the target list, which is the set of things that this node is going to produce. PostgreSQL has a kind of Volcano-style executor, where tuples start at the bottom of the tree, percolate up the tree, and then explode out the top. There's actually a system called Volcano, and I think it's called Volcano because its executor works that way as well. Then there's a qual: the qual is one or more filter conditions that apply to this node, so it's a list of expressions. Any tuple that would be produced by this node but which fails to match those expressions gets thrown out. And then we have a left and a right subtree for the plan.

Here's an example of what this looks like in the EXPLAIN output. You can see here that the merge left join, which is at the top of this plan tree, has an output list in the EXPLAIN VERBOSE output — it's shown in gold here — and that node is producing a.q2 and b.q1. That's the format of the tuples it's producing; that's the target list. The filter condition which you see there is the qual; that's what's being tested at the very end, before we emit the tuple. You'll notice this node also has a type-specific expression, a merge condition, and that's something that's part of the join processing itself — part of the way a merge join works — but whatever the merge join actually produces, we apply this filter to it afterward and throw away anything that doesn't match. And then this plan has a left tree, which is shown here in blue, and a right tree, which is shown here in purple.

Not all of these fields are actually used for every node. For example, the sequential scans that appear in this plan have neither a left tree nor a right tree, because they have no descendant nodes. The sort has a left tree but no right tree, because it only has one child. So what happens here? Here I have an Append node with four sequential scans underneath — left tree, right tree, center-left tree, center-right tree? So what are the left tree and the right tree in this example; anybody know? They're NIL, right. There are a number of nodes that can have an essentially unlimited number of children, and those nodes keep the children in a flexible-sized list that is part of the node-type-specific data, and they leave the left and right trees completely empty. And EXPLAIN lies to you: it makes it look exactly the same as the other case, but it's not.

When you use a subquery in your query and you don't put it in the FROM clause,
then you get these things called init plans and subplans. I've written an example here: in the WHERE clause I have WHERE f1 = (SELECT min(abs(f1)) FROM int4_tbl), and that turns into an init plan — InitPlan 2 here — with a plan associated with it that computes the value. And then I also have another subquery in the target list, where I say (SELECT odd FROM tenk1 WHERE unique1 = f1), and that turns into a subplan. The difference — to condense a very complicated topic down to a few words, in a slightly misleading way that omits some important details — is that the init plan only needs to be computed once, whereas the subplan needs to be re-executed for every row.
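As a toy illustration of that once-versus-per-row difference — this is invented simulation code, not PostgreSQL source; the function names and values are made up — the init plan's result is computed a single time and reused as a parameter, while the correlated subplan runs again for each outer row:

```c
#include <assert.h>

/* Hypothetical toy model: an "init plan" result is computed once and stashed
 * as a parameter; a "subplan" is re-executed for every outer row because its
 * result depends on that row. */

static int initplan_evals = 0;
static int subplan_evals = 0;

/* stands in for e.g. (SELECT min(abs(f1)) FROM int4_tbl) */
static int run_init_plan(void)
{
    initplan_evals++;
    return 42;                      /* made-up value */
}

/* stands in for a correlated subquery referencing the outer row */
static int run_sub_plan(int outer_f1)
{
    subplan_evals++;
    return outer_f1 * 2;            /* made-up computation */
}

static int execute_outer_scan(const int *rows, int nrows)
{
    int matched = 0;
    int param0 = run_init_plan();   /* evaluated once */

    for (int i = 0; i < nrows; i++)
    {
        (void) run_sub_plan(rows[i]);   /* re-evaluated per row */
        if (rows[i] == param0)
            matched++;
    }
    return matched;
}
```

Running the outer scan over three rows evaluates the init plan once but the subplan three times.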
So naturally, if we're going to have init plans and subplans attached to plan nodes, there must be a list of each of these things associated with the plan node, right? No — wrong. There's a list of init plans associated with any plan node, but there's no list of subplans. The SubPlan nodes just appear in the expressions that are attached to the node — the filter condition, the merge condition, whatever it is — and the executor, when it starts up, builds a list of all of the ones that it finds while it's walking down the execution state tree. EXPLAIN shows that list to you, and it also shows you the init plan list, which actually does exist in the plan.

[Audience question] Yeah — if it referred to the outer copy of the table, you'd end up with a subplan rather than an init plan, which is why it couldn't be an init plan; it would have to be a subplan. But here it only gets executed once, and we end up with an init plan.

Now, you may have noticed also on this slide we have this thing where it says InitPlan 2 (returns $1). What's $1? Well, $1 is a parameter.
So basically there are these two fields, extParam and allParam, which are associated with every plan node, and they're basically the same thing — like 95% the same thing. They tell you which parameters that plan node cares about. The only difference is that if a plan node has an init plan attached, the output parameters — the parameters that the init plan sets — get included in allParam but are excluded from extParam. I have yet to figure out exactly why we do it this way.

So here's an example of a case where extParam is empty and allParam is not. This is a really useful query from the regression tests: SELECT 1 = ALL (SELECT (SELECT 1)). It ends up producing this plan, and as you can see, the Materialize node here has an init plan attached. That init plan returns $0, so the allParam set of the Materialize node includes parameter 0, but the extParam set is empty, because the parameters set by init plans are excluded as a special case.

allParam is used at execution time to decide which nodes to reset when we need to rescan. So, for example, suppose we sorted some data and we looked at the output of the sort,
and then we need to look at the output of the sort again. There are two things we can do. One is, if we still have the results of the previous sort, we can just return those results over again — like, if we sorted in memory and the sorted data is still in memory, we can just back up to the beginning and return all the same results over again. The other thing we could do is throw away the sorted data that we already have, re-run the portion of the plan that generated that data, sort everything again, and then return those rows. Obviously, the first option only makes sense if the input to the sort hasn't changed, because if some parameter value has changed, and that changes the input to the sort, you can't just reuse the results — now everything's different.

That was probably totally confusing, so look at this sequential scan here. If somehow — and we haven't gotten to how this can happen yet — the value of $1 changed, then the results of this sequential scan would also change, because the sequential scan has a filter condition that says that f1 has to be equal to $1. So if you changed $1 somehow — don't worry about how that could happen; in this example it can't — then the output of the sequential scan would change. And if there were a Sort node on top of it, every time $1 changed, the output of the sequential scan would change, the output of the sort would change, and so you'd have to redo the sort with the new set of input rows.
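A minimal sketch of that rescan decision, with parameter sets modeled as plain 64-bit masks rather than the server's Bitmapset type (the names and representation here are mine, not PostgreSQL's):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: param sets as 64-bit masks; bit N stands for parameter $N. */
typedef uint64_t ParamSet;

#define PARAM(n) ((ParamSet) 1 << (n))

/* If any parameter this node depends on (allParam) is among the parameters
 * that changed since the last execution (chg_params), any cached results are
 * stale and the node must be re-executed; otherwise a simple rewind-and-
 * replay of the old output is good enough. */
static int must_reexecute(ParamSet all_param, ParamSet chg_params)
{
    return (all_param & chg_params) != 0;
}
```

If the changed-parameter set overlaps the node's allParam set, the node has to be re-run; otherwise rewinding is enough.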
Okay. So the executor turns out to rely on extParam as well as allParam, but only barely. The only thing that the executor does at runtime with extParam is test whether it's empty, and it only does that in like one corner case. The only query in the entire regression test suite where the fact that we have both extParam and allParam in the finished plan makes any difference to what happens at execution time is the one I showed you on the previous slide. That's the only case where it matters whether we test extParam or allParam, and I'm pretty sure it would work if we tested allParam — it would just use a few extra CPU cycles. So actually, having both extParam and allParam in the final plan isn't really pulling its weight, as far as I can tell. I think the main reason we actually have both of these is that they're used while we're assembling the final plan: we incrementally convert the paths that we generate into plans, and in the process of building up the final plan we end up using both of these sets in the calculations. But in the final plan, the distinction between the two almost doesn't matter.

So, parameters are pretty important.
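Here's a rough sketch of the sideways data flow that parameters enable — again a toy model with invented names, not executor code — in which a nested loop writes each outer join key into a shared parameter slot before driving the inner scan:

```c
#include <assert.h>

/* Toy model: the nested loop stuffs the current outer value into a shared
 * parameter slot ("$0" here), then runs the inner scan, which reads the slot
 * to know which rows to fetch. */

static int param0;                  /* the slot behind "$0" */

/* stands in for an inner scan whose filter is driven by the parameter */
static int inner_scan_count_matches(const int *inner, int n)
{
    int hits = 0;
    for (int i = 0; i < n; i++)
        if (inner[i] == param0)
            hits++;
    return hits;
}

static int nested_loop_join_count(const int *outer, int nouter,
                                  const int *inner, int ninner)
{
    int total = 0;
    for (int i = 0; i < nouter; i++)
    {
        param0 = outer[i];          /* stuff the outer value into the param */
        total += inner_scan_count_matches(inner, ninner);
    }
    return total;
}
```

In the real executor the slot is an entry in an array of ParamExecData, and the inner side is typically an index scan keyed on the parameter; here the inner scan is just a filtered array walk.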
Everybody see the parameter in this plan? Anybody see the parameter in this plan? Yeah, where is it? Yeah — it's this int4_tbl.f1; that's actually a parameter. So how many nodes in this plan have a non-empty allParam set? Which nodes? Somebody said two. Which two? The two index scans — close, but no cigar. Also the Append. The Append depends on $0, because if you change the value, clearly the index scans are going to produce different results — you've changed which value you're looking for — but that also means the Append is going to produce a different result.

In this case it's the nested loop that's generating the $0 parameter, and that $0 parameter is being used to communicate between the sequential scan on int4_tbl, which is on one side of the nested loop, and the index scans on t3i, which are on the other side. As we read each row from int4_tbl, the nested loop stuffs that value into a parameter called $0, and then when we do the index scans, we use that value of $0 to drive the index scan so that we get the right rows. [Audience question] It's just f1, yeah — I mean, I think you can construct a query where you end up stuffing a whole tuple into a parameter, but in the normal case you just put in the column value that you need.

Okay. So now I'd like to revisit the topic of EXPLAIN versus reality, up to this point in the talk. parallel_safe is not
displayed. The plan node ID is not displayed. Init plans and subplans are displayed in the same way, but only init plans are really attached in the way that they're displayed. extParam and allParam are not displayed, although the init plan display lets you infer something about them, especially if you also know how nested loops and other parts of the planner work — but it's tricky; you can't entirely see what's going on there. Those are the small lies — the small things that EXPLAIN is hiding; it's not telling the whole truth. Where we really get into large lies, in my opinion, is when we come to expressions.
The way that expressions are represented here is, in some of these cases, totally unlike what is actually stored in the plan, and there are hints of that in this display. If you look at this nested loop join here, you see that the second output column is the constant 666. If you go up one level, it's now parenthesized: (666). Where do the parentheses come from? What's going on here, right? And there are other things about it that will start to bother you if you spend as much time looking at plans as I do — which maybe you shouldn't, because then you might end up giving a talk like this. The sequential scan on int4_tbl i1 produces as output i1.f1, and I think we can all feel confident saying that that sequential scan really has access to the value of i1.f1 — it's scanning that table. But this nested loop here: how does it have access to i1.f1? It's not touching the table; it's just a nested loop. Of course, the answer is that things bubble up, right? It's a Volcano-style executor; things bubble up. So 666 bubbles up, and i1.f1 bubbles up, and all of the things that start at the inner layers of the plan tree actually move outward and eventually get to the top of the plan tree — or else they get filtered out, and they don't.

So here's a more intellectually honest representation of this plan. Down here at the bottom is the thing we talked about before with nested loops: we're not really comparing i2.unique1 to i1.f1 directly; we're really comparing it to this $0 parameter, which got populated over here. That makes sense: this is a scan on i2, not a scan on i1, so it has to get the value from i1 from someplace, and it gets it from the parameter. But also in this display you can see how the information is bubbling up to the higher levels of the plan. We can see here that it's actually this nested loop left join that first produces the value 666 — that's where the 666 really comes from. The higher levels of the plan aren't producing 666; they're just passing through what they got from the lower levels of the plan.
So you can see that the Materialize node's output is just the two columns that it gets from its outer subplan, which is the nested loop left join. The nested loop left join produces two columns, one of which it takes from its own outer subplan and the other of which is a constant that it generates. Then those two columns, once they've been produced, get passed up to the Materialize node, and it just passes through those same two columns: it takes the first and second column of its outer input. And then when you go up and look at the very top of the plan — which I can't reach, because I'm too short — you have the join filter, which is comparing the first column of the outer tuple to the first column of the inner tuple. It doesn't have any direct access to the underlying tables; it doesn't really know what underlying tables produced those values. It just knows that the Values scan and the Materialize node handed it these tuples, and it looks at the first column of each one and compares them to see whether that join filter condition passes. If it does — does someone have a laser pointer? I don't know how to use it — if it does, then it assembles a new output tuple: it takes the first column from its outer tuple, which corresponds to what was originally values.column1, and then it takes the first and second columns from the plan on the inner side, which is this Materialize node.

So you can see how the data is flowing up the plan tree through what I've labeled here as inner and outer references, and when the data flows in some direction other than straight up, it goes by way of these parameters. The parameters allow us to pass data sideways, or in strange ways, through the plan tree, and everything else just bubbles up. The EXPLAIN output turns around and tells you where the values came from originally: in the actual EXPLAIN output we have values.column1, because that value originally came from column1 of the Values scan; and i1.f1, which originally came from the i1 table — it was produced by the sequential scan and bubbled up to here; and this 666 here originally came from this 666 down here. But when we're executing this, we're not really doing it that way; EXPLAIN has kind of masked the real data flow here a little bit.
And one thing that I think is kind of interesting about this is that when we initially generate paths, the references to the table columns — which internally are called Var nodes — and also the references to expressions in the target list still, in the initial phases of planning, refer to the table that will really provide the value. So when we're initially doing the planning, the data structures that we're generating actually look very much like the final EXPLAIN output. But at the very end of the planning process, after we've figured out how we're finally going to execute this thing, we no longer care about the original origin of the value. We need to know where the plan node that's running is going to get the value at execution time: is it coming from the outer plan, is it coming from the inner plan, is it coming from a parameter — where is it coming from? So one of the last stages of planning is to run through the plan and replace Vars and expressions with new Vars that refer to the outer or inner plan, and that produces this kind of thing, which is what the data structure really looks like internally.

I'm a little ahead now, so that's actually all I've got on what's in a plan. We have a few minutes, so I can take a few questions if anyone wants to ask one. [Audience question] Yeah — I'm going to go with no; the question is, can I get a summary
of this non-technically? No, sorry. [Audience question] Uh, yes — that query, I don't think I have it handy, but you can find it in the regression tests, because this came straight out of the regression tests. Yeah, Greg. [Audience question] Okay, so you have encountered something called the use-physical-tlist optimization — or sometimes the use-physical-tlist de-optimization. What happens is that the way the data is actually passed up the plan tree is in something called a tuple table slot, and a tuple table slot can hold the data for a tuple in one of several different formats. If we're doing something like a sequential scan, the easiest thing to do is actually to cram the entire tuple, with all of its columns, directly into the tuple table slot, and let the next node up get all of the columns and then pull out only the ones that it needs. So if we determine that that optimization makes sense in a particular case, then the sequential scan will be seen to output every single column in the table, and at the next level those columns will be projected away. If we determine that the optimization isn't a good idea in a certain case, then we'll make the sequential scan do the extra work of pulling the tuple apart, pulling out the stuff we need, and passing up only the things that we need. For example, if there's a Sort directly on top of the sequential scan, we don't want the sort to get all the columns, because then we'd be sorting really wide data instead of hopefully narrow data, and that would stink. So in that case we'll tell the sequential scan: sorry, you have to give us exactly the columns that we really need. But when we don't care, we'll say to the sequential scan: hey, if you want to just produce everything, that'll be fine with us; we'll deal with it at the next level. [Audience question] Right — projection is constructing a new tuple, yep; projection can involve computing expressions and throwing away columns.
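That trade-off can be sketched like this — a toy model, not executor code, with tuples reduced to small integer arrays and invented function names:

```c
#include <assert.h>

/* Toy sketch of the use-physical-tlist trade-off: a "physical" target list
 * hands the whole stored tuple up unchanged, while projection builds a new,
 * narrower tuple containing just the needed columns. */

#define MAX_COLS 8

typedef struct Tuple
{
    int ncols;
    int values[MAX_COLS];
} Tuple;

/* Physical tlist: no per-row work, but output is as wide as the table. */
static Tuple pass_through(const Tuple *stored)
{
    return *stored;
}

/* Projection: copy out only the requested columns (0-based positions). */
static Tuple project(const Tuple *stored, const int *cols, int ncols)
{
    Tuple out = { .ncols = ncols };
    for (int i = 0; i < ncols; i++)
        out.values[i] = stored->values[cols[i]];
    return out;
}
```

With pass_through, the node above pays for the full width (bad under a Sort); with project, the scan pays per-row work to emit only what's needed.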
Peter. [Audience question] Yeah, I had thought of the same thing myself. I think having something like a RAW option for EXPLAIN — we can bikeshed about the name — but having a mode that displays some of these internal details would be really helpful, at least for developers, because it took me a long time to actually understand some of what was going on here, and there are bits and pieces of it that I'm still not sure I fully understand. I have definitely had cases, when I was working on parallel query, where I was like: wait, which parts of this plan does the planner think are parallel-safe? And you can't tell. You can tell what it actually chose to put below the Gather node, but that can be a costing decision. So yes, I think there would be value in exposing some more of these bits. Well, no, I'm not talking about that — that's showing completely alternative plans, which would be a whole different thing. But if you're just trying to judge which things could have been put under a Gather, knowing whether those plan nodes were marked parallel-safe would be enough to tell you whether that was hypothetically possible. Knowing the plan node IDs is probably mostly useful for debugging, but we do do debugging sometimes. extParam and allParam are definitely interesting, again at least if you're doing development. So I think it would be cool to have an extra mode that basically says: don't deparse my expressions so cleverly, just tell me what it really looks like — and show some of these other details that are probably mostly boring for users, although I think it's kind of useful to have some idea how this works so that you don't get confused by the output. But certainly for developers, yeah, I think that would be handy.

It's definitely the case that if you do something like this, even on a plan tree this size, but certainly on a really long plan tree, if you have OUTER and INNER references all over, every instance of the word OUTER refers to something different and every instance of the word INNER refers to something different, unless they're on the same line. So yeah, there's definitely a possibility of confusion there. I'm not sure that would be an improvement,
but we can talk about it. Yeah — an AlternativeSubPlan: we pick two different plans for executing a subquery. I think usually one of them is a hashed subplan, which will be useful if we're going to execute it a whole bunch of times, and the other one is a non-hashed subplan, which will be better if we're only going to execute the subplan once or twice. And instead of figuring that out at plan time, we cheat and leave it to the executor to decide which one it's going to do. It's kind of a hack — and I say that having asked Tom Lane about it yesterday, who said: that's kind of a hack. [Audience question] Right, so there's debug_print_plan, and it dumps the raw node tree. That is sometimes useful, but it is really verbose.
Right — no, that's true: if you turn on debug_print_plan, you get something that will absolutely tell you exactly what's in the plan tree. But my problem with that is that for something like this, it's like 20 pages of output. So I'm no longer confused because EXPLAIN is hiding stuff from me; I'm confused because I can't navigate through this enormous node tree. Just the target list for a typical query is two or three pages long, and it gets hard to figure out: wait, am I still in the same list of Var nodes, or am I now in the next list of Var nodes? [Audience question] Right — well, that's another way you could do it, yeah. I mean, we could certainly do nothing; we've managed to make PostgreSQL as good as it is today without doing anything here. And, you know, you have an idea of how to make it better, and that's cool, and I have a slightly different idea, and that's also cool. I'm happy to hear anybody's ideas — maybe not get into a long discussion right now — but I'm happy to hear anybody's ideas on how to improve this. I just think that this isn't the kind of output that you probably want to use for everyday debugging, but I think it's useful to know that this stuff is happening, even during ordinary debugging, because it can make you less confused about what's really going on. And certainly for development there are times when you'd like a little bit more — but maybe not something quite so "throw pages and pages of ASCII at me" as debug_print_plan, which I have used, but only under duress. Yeah, we have done some things over
the years to make it more possible to hook into EXPLAIN output — for example, we have custom scans, and custom scan providers have some methods that they can call to show things in EXPLAIN output. Doing this kind of thing, though, is definitely beyond the extension capabilities that we have today. Part of it is that this goes beyond EXPLAIN: in order to generate the output that I'm showing you here, I had to hack ruleutils.c, not explain.c, and so it wouldn't be as simple as just providing an API into EXPLAIN. If you really want this, you actually need to pass some flags down to something much, much lower level, or rewrite all of ruleutils.c in your own extension — which, I mean, you could do. So I'm not actually sure there's a very convenient way to turn this into an extension. It could certainly be done: you could write a function, and you pass it a thing, and it generates a plan, and then you copy all of ruleutils.c and all of explain.c and change everything so that it produces this output. You could do it in a couple of days; it would just be hard to maintain, because as the upstream code changed, things would diverge. If you have an idea of how to refactor it so that we can plug into it, I'm game for that too. I don't immediately see a way to do it, but there are lots of things that are good ideas that I don't know how to do. I think we have time for maybe one more question, if anybody has one. Good — everything is perfectly clear. Okay, thanks, everyone.