The lost art of plpgsql
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 35 | |
Author | ||
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/48271 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
PGCon 201928 / 35
1
3
4
10
12
13
16
17
19
20
24
28
29
33
34
35
00:00
Computer animation
01:04
Computer animation
02:31
Computer animation
08:20
Computer animation
09:43
Computer animation
10:35
Computer animation
11:17
Computer animation
14:02
Computer animation
19:49
Computer animation
20:41
Computer animation
23:35
Computer animation
24:20
Computer animation
26:11
Computer animation
27:20
Computer animation
28:02
Computer animation
28:51
Computer animation
30:11
Computer animation
31:19
Computer animation
34:37
Computer animation
37:35
Computer animation
38:28
Computer animation
45:21
Computer animation
47:00
Computer animation
Transcript: English(auto-generated)
00:07
Welcome to the lost art of PLPGSQL. As I said a few minutes ago, for those who didn't hear me, there is code in this presentation. I've tried to make it legible, but you know, it's code on slides.
00:21
So feel free to move down or zoom in with your phone or whatever. We'll get into all that. All right, so quick little introduction. Who am I? This is Lost Art of PLPGSQL, I'm Robert Treat, your humble speaker. I'm an occasional dev in ops and DBA
00:41
when I get the opportunity. By day, I actually lead US operations for Creditive, and we do open source support, a fair amount of it centered around PestGress, because why wouldn't you? So I've been involved with PestGress for a very long time as many people can attribute,
01:00
and occasionally do conference presentations. So there you go. If you're looking for a copy of the slides, if you're on Twitter, I'm at RobTreat2. I will announce once I get the slides online, and I would also say, you know, if you have questions, feel free to shoot me questions there. I'm happy to answer after the talk if you don't get a chance to catch up with me
01:21
here at the conference. Also, if you like to talk and you think it's awesome, you know, praise is always good, so feel free to tweet me at RobTreat2. If you hate the talk and think it's awful, I welcome that feedback. I would prefer it in long form, at least three pages of email, and send that to robert.treat at Creditive.us,
01:40
and I will handle it. It doesn't have to be triplet, but it doesn't hurt. So feel free to send that to the email, and I will make sure to address that in the most appropriate manner. And eventually the slides, as I said, I will announce when they get posted. I'll probably put them in a few places, but one place I should definitely get them up is on our company LinkedIn page,
02:02
which is Creditive LLC, so if you follow that, you'll see the announcement when it goes up. Otherwise, I guess feel free to take pictures or whatever people like to do when it comes to slides, but I will make a copy available at some point in the future after this talk. So, and more nagging will probably help with that.
02:20
So now, having done conference talks before, I know one of the key things in having a good conference talk is to have a little bit of interactivity with the crowd, so I thought I'd start out with a question, which is, how many of you have used a Postgres function before? Okay, I'm a little bit concerned that there are not more hands going up. This may be too technical
02:40
if you've never used a Postgres function before. I believe all of you probably have used a Postgres function before. Maybe you didn't notice that. What we're gonna focus on here, though, obviously is user-defined functions and not just functions in general. There's a ton of functions within Postgres. I would say probably thousands, I didn't count. But we wanna talk about PLPGSQL functions, user-defined functions, and that particular piece of it.
03:03
So there's a couple of reasons to get into this. I am actually a little bit curious, how many of you write significant amounts of PLPGSQL, or have, okay, so a lot. This may not be super deep then for you folks, but it depends on what you do with PLPGSQL. If you aren't using them,
03:22
I think there's a few reasons just to sort of give a high level of why to look into it. Obviously, like performance-wise, using user-defined functions on the server side, you can save yourself round trips with work that needs to get done if you need to do multiple queries to do something. I think that's an argument just for SQL in general when it comes to application code versus database code.
03:41
There's oftentimes you can do stuff within a SQL statement that would take several dozens, maybe hundreds of lines of code to do in a programming language. And so the argument to have that stuff in SQL and put it into the server, I think, applies just for SQL, but it certainly applies even further for things like PLPGSQL.
04:02
The other thing I also see in this sort of day and age where people are doing a lot of app rewriting and everyone is like hot on microservices and all that jazz, I see this sort of trend of people to write apps and not want to write them in ways that interact very heavily with the database. So a very common pattern seems to be
04:21
like you pick a data store and then you pick some kind of API framework which talks to your data store and then you build your app against the API framework. And that's not necessarily a bad pattern. And one of the big arguments for doing that is then you have a stable API because you can use, whether it's a rest thing or whatever, you have this API that's built against the database
04:40
so apps can deal with that and you can do underlying changes to the database and you just hide all that stuff in your API and the application doesn't necessarily have to deal with that. Well you can do the same thing with stored procedures and certainly back in the day when people weren't so sort of microservices oriented, people did do that. There's a lot of applications, if you go back five or 10 years,
05:00
that were built around databases that use stored procedures as sort of an API on top of the underlying schema. And so again, PLPGSQL can be a sort of a pathway towards that. And then I think the last thing, and this is a little bit of a trade off, simplifying portability. What I mean by that in this particular case
05:20
is when you write things in PLPGSQL, I think the track record is really pretty good. There have only been a few releases that have caused problems when you upgrade from release to release within Postgres, going from major version to major version. Very few things have broken over time and they certainly work across operating systems, generally speaking,
05:40
if you're going from one operating system to another, if you're upgrading servers and doing that kind of thing, like all of that just kind of, it's a part of Postgres that's baked in, so it just kind of goes with it, right? And you get that. So that sort of portability you gain, and it makes it easier. I did put a little star on that because there is an argument to be made that when you write this code in PLPGSQL,
06:02
you know, that does make it harder to move outside of Postgres. Now, I don't know why anyone would actually do that, but I hear that people do that. And I will say like, you know, being in the sort of Postgres support business, we certainly are happy to help people move from like Oracle to Postgres, and everyone who has done that kind of a project knows, one of the things that will make that project hard is
06:22
if you have a thousand PL-SQL functions in Oracle and you have to rewrite them all into Postgres, you know, or figure something out. Most companies, if they've done that, they don't like the lock-in of having the stored procedure, so they will say, let's rewrite this into application code, right? And they'll put their lock-in in the application code. Whether that's actually any better, I guess is debatable.
06:41
I mean, me personally, I would rather have it in Postgres, but obviously I'm at PGCon, so I'm probably slightly biased there. You know, just in the sense that like, I don't want to switch databases all that often, so I'd rather sort of centralize it within Postgres and have it that way. But again, some companies, you know, there's a big story, Uber,
07:01
I think like two years ago, decided to move from Postgres to MySQL. The technical merits, I don't know. Their management has a history of being abusive towards their, you know, employees and their customers, so, you know, maybe that was a factor. I don't know. But again, so this could simplify portability, at least in some scenarios. And so I think there are enough reasons to look into it,
07:22
even if you're not gonna model your whole operation around PLPGSQL and stored procedures and all that. You know, one of the things that I noticed and one of the reasons why I actually submitted this talk was that in Postgres 11, which is now, you know, not quite a year old, but certainly like that was the new hotness last year at PGCon,
07:42
we did announce the implementation of stored procedures, right, and I'm gonna talk about that later in the thing, but it was odd to me that it really didn't get a lot of coverage outside of, you know, it's in the release notes or whatever, but like, I expect like Pacquier or Depeze to do a, you know, a blog post on like every feature and you go back and look and none of them wrote about stored procedures
08:00
and I thought, that's interesting. I mean, I know like stored procedures are not like the new hotness in the rest of the world but I think let's not lose sight of the fact it's a useful tool, so. So again, I think there's enough reasons to look into it and just at least understand what are the options available and what can we do. A brief look at Postgres functions.
08:20
It's actually, I mean, you could do volumes. I think at some point somebody will write a book on Postgres functions and the different ways they work. There's actually four different types of Postgres functions in and of themselves, right. There's internal functions, so most of you probably use some of those. There are just plain SQL-based functions that you can write that are just essentially,
08:41
it's a function but only pure SQL that's in there. Then there are the procedural language functions, right. So that's something that's PLPGSQL, that's what we're sort of focused on here. If you're not aware, there are other procedural languages that you can use built into the server, PLTCL, the hot favorite but also like PLPerl, PLPython
09:00
and there's a ton of other ones that are like additional modules that you can download if you're interested in that. So we'll focus on PLPGSQL. Most of what I'll talk about probably does apply to other languages but it just kind of depends. And then there's C language functions. If you're either hardcore or need performance or whatever, you can also write functions directly in C.
09:21
And all of these work similarly, right. They use a very similar syntax when you're actually creating them from the sort of maybe a DBA's point of view. But we're only gonna focus on PLPGSQL for the moment. And so we'll look at that. If you wanna look at some of those other pieces,
09:41
we have documentation, you can go look at it. There is a create function command that's in the docs. Feel free to go look it up. And one of the things that is a little bit of an issue is when you look at that, understand that it's trying to explain all four of those types of functions and how that works together, right. So there are certain parts of that command that will apply if you're doing like a C code function
10:02
versus if you're doing like PLPGSQL. Or like if, I think there's a talk, I think it's later this afternoon, Jeff Davis is doing on doing like Rust based functions. So if you're gonna create a function and use a Rust library, like it's still the create function command at the top layer that you have to use, but how you actually wire it all together underneath
10:21
is a little bit different. But it all flows through that command. So you can look at that. I didn't really wanna get into a super deep dive on the syntax of PLPGSQL. I would say, again, you could write a book. We haven't written a book, but there is really extensive documentation. I mean, it's significant to go through and look at all the different aspects
10:40
of the programming language. It is chapter 42. I don't know if that's a coincidence or not, but I think it's a subtle nod to the importance of PLPGSQL. I do like that it's called the SQL procedural language because I think there's always been a confusion in Postgres about stored procedures and functions and how we talk about it. You can write stored procedures in SQL,
11:03
but this is not that. That's not what that chapter is about. It is actually about PLPGSQL and I guess what you could call a SQL procedural language, depending on how you're looking at it. So as a brief history in the pre-Cambrian era of Postgres, PLPGSQL was actually originally added in 6.4.
11:23
It was not the first language to actually be added. The first was actually PLTCL, which was added back in 6.3. That was such a good idea that we decided, I guess we need a PLPGSQL. The other, I guess, most significant milestone in the history, and there's been a lot and probably people have different opinions on that,
11:41
but I thought the other one that was probably most significant was in 9.0, we decided to install it by default, which meant you could just sort of spin up a server and you'd have access to this language. Most of the other ones, like I said, there's PLperl and PLTCL, which is still there. You actually have to run a command to install that language before you can use it
12:01
or you have to go download something externally if you're gonna use the Rust one or whatever. So now that it's in there by default, though, applications can rely on it at least a little bit more. It still uses the same facilities as far as being created. It's just created by default. If you're in a sort of highly paranoid,
12:20
security-conscious environment, you can actually remove the language from your database and not have it there. Whether that's a good idea or not, I guess is up to your particular business needs. But that is still possible. The basic idea behind the language was really just, we like SQL and it's great, but if we could add some basic
12:41
programming control structures, we could actually do something that was interesting and a little more programmatic and do a little more heavy lifting on the server side, be able to do something, but let's try to keep it simple. And as with all programming languages, I think it starts that way and then it grows over time as people are like,
13:01
can I just add this one more little thing? Another thing I guess worth mentioning also, it's meant to be a trusted language. So in the context of user-defined functions, there's a concept of trusted and untrusted languages, which mostly is a security angle to it. A trusted language, the idea is, you could run stuff in here
13:20
and it won't really affect the file system or the server on that side of it. It can only touch things within Postgres. Whereas you can have an untrusted language, something like TCL or Perl, you can run in a trusted or untrusted context that may actually have file system level permissions. And so that's a thing you have to think about. In this case, we wanted something that
13:40
people could use to do this type of work and not have to deal with the security related issues there. So this gives you a trusted sort of version of that. And I think my theory was that at this point of the talk, you're like, well, I need some kind of hacker tip that will help me in my life. And this may or may not help you, but I'm gonna point this one out
14:00
just to pop it in there. If you're looking around for like, hey, I'm gonna get into doing PLPGSQL and I'm going to start trying to write more applications with function and complex logic, the one tip that I usually give people early on is, go look at the section on get diagnostics. And the reason why is if you're doing any kind of dynamic query or computation within the system,
14:22
you probably need to know about this and need to understand how this works as far as retrieving information and being able to see what your queries are doing within a function. And the name of it, like that section, get diagnostics, doesn't actually sound like something that would be useful necessarily, unless you're maybe doing development. It's actually useful for everybody.
14:41
So it's 42.5.5, obtaining the result status. Again, not exactly a highlighted headline there. But I will point that one out, that if you're getting into this, go read that section, because there are pieces of that, like the found variable and all that, that are really useful for doing more complex logic.
15:01
I thought I hit you with, there's a hot tip right there, you can put that one in, so you got one. Then we'll go back to, all right, so let's look at what does a function look like that's in PLPGSQL, and again, this is code. Can people read that relatively well? I'll sort of walk through it. I think the syntax is very simple for writing functions, right, and if most of you have done this,
15:21
this shouldn't really be anything different. Most of these blocks, if you look at, so like at the very start, right, we're just doing create or replace function, you name the function, whatever your parameters are gonna be, right, would be put in there. Most of these blocks as you walk through this are actually interchangeable in the order, so you may see them in a different order.
15:43
Like here, I'm putting my returns Boolean at the top. I think most people tend to write those things in the beginning. The language clause, right, if I was doing it in a different language, I would put something else in there, like PLTCL or whatever. You'll often see that at the end of a function definition. It doesn't actually affect it.
16:00
Like I said, you can do those in any kind of order, so don't be totally tossed off by that. I don't think there's a particular coding style that has won the war of like, this is how you should write them. I will say, I'm pretty sure this is formatted based on what pgdump would output, so maybe that makes it more authoritative in some way.
16:22
When you're writing these functions, right, basically, you know, you have like an as block, and then this is like the actual PLPGSQL language in here. If you were doing something like a SQL one, all these parts at the top would be the same, right? You'd have language SQL, and then in here would just be a SQL statement of some type, right?
16:43
You have a declare block, where you declare integers, or sorry, in here we're doing integers, but you declare your variables, what you want them to be. There's all different types, any type of variable you can do. You can do record types, which are sort of undefined types until you run a query in order to find them, and there's different ways to do that.
17:03
A begin block, which where it says, hey, here's where we're gonna do stuff, right? It's a real programming language, so you have things like comments, not every, you won't actually see comments in most code, so I thought it highlights the fact that you can actually comment your code. Especially PLPGSQL. I mean, one of the things, well, so as we're going through this,
17:21
I mean, this looks, I think, fairly clean. It's pretty easy to read, but I think one of the problems is that there's, you know, not like a, the tooling around writing PLPGSQL is not exactly phenomenal, right? If you're used to writing pretty much any other language, you probably have better tooling for that than what you'll have for doing this, so that leads into this case where you need to do things
17:43
like try to comment this code and try to keep it, you know, easy enough to read and whatnot, because it will be difficult to go back and either modify it or do research on it, right? If you are looking at the function definition through something like PSQL, right, if you've written it as spaghetti code, like, it's gonna be even worse spaghetti code, right?
18:01
It's whatever. So, all right, moving on real quick. So here we're just doing, like, some SQL. I'm putting it into a variable, and there's multiple ways you could do this, because there's multiple ways you can do just about anything. We're just doing a quick count to see, you know, whether that's there, and then we evaluate it.
18:21
I'm gonna return here if it evaluates in the way that I want, right? So you're trying to return a value back. The basic concept here is we have a function, you pass in a parameter, it returns something back, right? And in here, it's simple, because we're just doing a boolean, but you can do, you know, result sets and multiple rows and columns and all that jazz. So it's all in there.
18:40
But we can keep going, right? Remember I said, if you wanna do multiple queries, if I'm calling this from the application side, I'm just calling the function. Technically, I'm selecting the function. I guess I should be careful about that word nowadays. I would select that function, and it's one call, which will do, you know, multiple queries behind the scenes, if necessary, and then return the output.
19:02
And then again, a little more logic here. Basically, the idea is I need to return something by the end of this function, and I don't know, this might be considered bad form, because I have nested this within a logical condition, but you should always get something back. There is an argument that I could have another return outside of this block, just to be safe, but yeah, why be safe?
19:22
And then of course, I'm ending it down here. There's everything between the end and the begin, right? And actually, this was, the dollar quoting here was another sort of hot feature that got added. It used to be that you had to do this all with quotes, like ticks and single ticks and escaped back ticks and all that stuff, which if you've ever done that in any language, you know that's horrible.
19:42
Most people would tell you, please use the dollar quoting, because that makes most of that go away. If you were to call this function, right, this is not mind-blowing in any way. I just select the function, I pass in my parameter, 42, and of course, it's in stock, because it's 42. So okay, hey, that's great, right?
20:02
We've all seen a function. Like I said, obviously it's pretty straightforward to write these things if you've written any kind of programming language. I think even if you've written, you know, if you've done a recursive query with a window function, like that's probably more complicated than writing PLPGSQL stored procedures or functions. So what else can we do with PLPGSQL?
20:22
There's actually three different ways you can actually access PLPGSQL. One is user-defined functions, that's the thing I just kind of walked through. The next one is do scripts, which I'll show you an example in a second, and then there are stored procedures, which are slightly different than user-defined functions. We'll talk about that in a second. I guess spend a lot of time on do scripts either,
20:40
because there's not a lot to it. The basic idea, and I think this comes from SQL Server, because I know they have the ability to do this, but other databases do too. But it's basically like we wanted a way to run sort of ad hoc, you know, PLPGSQL, to be able to do ad hoc, you know, work within the server, and combine that
21:01
and sort of make that go. So a do script basically does that, and if you look at, the syntax is basically like, you just, you know, it's do, instead of like a select, it's a do, right? And then you've got the dollar quotes here and down here, and then everything else is just PLPGSQL, you know, pretty much just like if you're writing a function.
21:21
It's basically the same kind of thing. And in this case, right, so I wrote it sort of similar. Again, I set a record type here, right? So vrow is a record type. That type will be defined by this SQL query that I run, right, so until that time, it doesn't actually have a structure.
21:41
And then once I do the SQL query, it gets the structure. And then I do this little thing down here, raise notice, and then I'm sort of emitting out to standard out, right, like here's a notice that comes from the procedure. So this is just one way, it's sort of a different example to see, here's another way you could write PLPGSQL.
22:00
But from a basic sort of flow of the language and looking at it, right, it looks basically like a function. And if you run a do script, it'll run, you get notices here. I don't get any actual results at back because I'm not returning anything, right? This is just some ad hoc PLPGSQL running. But it will give me like the do back if you're on like a PSQL prompt.
22:20
And there's a page on this. There's not too much to do. Most of what you need to know about do is, you know, the language that it is itself. You can actually use different languages with do. So you could write just like pure SQL and do that as well. But other than that, you know, it's a useful tool. I think it gets a little bit,
22:42
there's a debate on the usage of it because there's a pretty good argument that you shouldn't just run ad hoc computation. And especially if there's manipulation involved, you know, queries on your server, ideally you'd have that checked in someplace and then, you know, it would be run that way. So that if it changes over time, right,
23:01
you have like version control and history of that type of thing. So I don't know that this gets used a whole lot or if it does, it's probably in ways that maybe are questionable operationally. But there is nothing that would prevent you from putting that into a file, right, and then just piping that through like PSQL or something else and you could do that. You know, or doing a do statement.
23:20
If you're doing like migrations in your application code and you need to do schema changes, right, part of that could be a do statement to do updates to data and that kind of thing. So you can use it with any standard, you know, programming language. So now, one of the things that people often, yes sir.
23:45
Ah, oh, so in the, so for those that are unaware, I maintain this thing called Pajila, the name is up there. It's a Postgres sample database. It's based on a DVD store, which is like Netflix, but like through the mail. And so like it's renting out titles and movies and whatever.
24:02
So within that schema, there's a custom type that's been built, which is the Motion Picture Association rating. So G is a particular one, so I'm just casting it to MPA rating. So you can use custom types if you have them right within it. So yep, that's all that is. All right, so one of the things people often complain about
24:23
when talking about creating functions or even do scripts is this idea that they'd love to be able to run vacuums from within a function, right? To be able to put that in a function and be able to call that at some point in time through a cron job or whatever so that they can do, maybe it's a little more heavyweight.
24:42
Turns out AutoVacuum doesn't always handle its business correctly. I know that's hard to believe, but it's true. So sometimes we want a more heavyweight answer, which is how do I sort of automate that process in a way, how do I script that and make that happen? And so that's always been a problem. I think people thought maybe with do scripts we would get that ability.
25:00
It turns out you don't get that ability. If you were to try to do that, you'll get an error like this, right? Vacuum can't be executed in a function. And these are just two tables within there, right? And so you can't do that. So this went on for a number of years because do I think was like 9.2, 9.3, something like that a while back.
25:22
Maybe even a 9.0 actually, wouldn't have made it default. I don't know. It's definitely been in there. It was in there longer than any version you should be running, let's put it that way. So if you don't have do in your database, we have a different problem. So anyways, so one of the things that has pushed people forward is this talk of
25:41
how do we solve that particular problem? And there was a theory that the way that you should solve that it was stored procedures. But that comes from basically the aspect of in Postgres, a lot of Postgres users are really new to Postgres and they come from other systems and those systems work differently, right? And so they say like, well if we had real stored procedures,
26:02
we could actually do things like doing the vacuum stuff in there, right? And so we can say, okay, let's talk about real stored procedures, right? To talk about real stored procedures, you have to understand why it's a hard conversation to have and it's mostly just around sort of language and people's inability to learn and change their ways.
26:22
So other databases would tell you that the basic sort of difference historically between what Postgres does and other systems are if you had functions in your system, like a user-defined function, right? It's user-defined code that will execute some arbitrary set of commands, do its business,
26:42
and then it will return a result set. Whereas a stored procedure would be a user-defined code that executes some set of commands but doesn't actually have to return a result, right? So we can do things that are not about querying data or updating and returning result sets, right? And so that, I mean, that's a really hand-wavy thing to say, and that's how people sort of looked at that.
27:02
There were other sort of more important pieces to that, and we'll talk about that in a second. But the way that Postgres sort of answered that, and so this is, I mean, going back 10 years or more, the way that Postgres sort of answered that, because obviously we weren't gonna let the other databases tell us we were doing it wrong, as the history of all databases are, right?
27:20
So what we said was, well, what we could do, we have this facility in here, because we can use PLPGSQL to write functions that do triggers, and there's some magic in there. We could do something like this, which would be, we'll create a function, we'll call it a smord-per-major, and it'll just return void, right? And then you could add your begin and your end block, right, here's my PLPGSQL, begin and end,
27:42
and the only code I'm gonna throw in here is just a return, right? Because I'm returning void, I don't actually really need to return any values. And then if you were to call that, well, geez, it's like I called a stored procedure, right? There's no return that came through there, so it's basically the same thing, right? And so when people would say, like, well, of course Postgres supports stored procedures,
28:03
right, this is the hand-wavy, like, lots of hands, lots of wavy, right? In Postgres, this is what they tell people, like, functions are equivalent to stored procedures, and you can use them interchangeably. And that's true if you're coming from MySQL, right? It is not true if you're coming from Oracle.
28:22
Never was true, we like to pretend it was true, and I will say that you could, you know, there's some X percentage of Oracle PL-SQL packages and functions and whatnot, I mean, I've spent my life porting some number of that from Oracle into Postgres functions, right, prior to having stored procedures, so a lot of it actually can be duplicated
28:42
and can be put into functions within Postgres, but that doesn't really actually make it a real stored procedure, even if we try to claim it's more premature. Okay, so then, what's the deal with real stored procedures, right? Well, there are sort of two things that are probably important to talk about
29:01
with real stored procedures. One is that there's actually a SQL standard, you know, that's out there that talks about, here's how you should implement it, at least from, like, a language perspective, right? And so, if you're gonna do real stored procedures, you probably should follow the SQL standard syntax, or at least, like, be in the ballpark, or at least the same sport.
29:22
So that's one aspect, and then the other aspect of it, which actually is important, is being able to allow transaction control, right? That is sort of the key that makes most of those other database languages, their implementations, more interesting, is that within their functions, quote unquote, functions, they allow you to do sort of
29:40
heavy-handed transaction control. And Postgres, again, it's more premature, like, we tried to sort of implement that, because you could do save points within a function, but it wasn't quite the same, right? It wasn't like you could commit certain things and not commit other things, or whatever, like, you could use save points to work your way through, and have some of it work, but it didn't quite give you
30:01
the flexibility you really wanted. And so we said, okay, in 11, we're gonna address these things, and we'll make real stored procedures. And so, create procedure was born. This is what you get if you do it from PSQL, right? This is the help syntax. Command is called create procedure.
30:22
It looks very much like doing the same thing with a function, other than it's procedure instead of function, right? Create or replace procedure, you have a name, right? Arguments, if you remember earlier, I passed in, right, I was passing in an integer. Language, because you need to do a language name,
30:41
you can do procedures in multiple languages, just needs to have the right support built in for that. And then depending on what you're doing, because you can also do like this last one here, as object file link symbol, if you're doing like a C-based stored procedure, like that's what you would do in there. So when I said like, when you look in the docs and you'll see things, certain things are talking about C-based functions
31:01
versus like PLPG SQL and whatever, right? The same works for create procedure. Most of the infrastructure that's used for doing functions is reused for doing procedures, so it's very similar. I think the other sort of big user-facing difference is that instead of selecting a function, you actually call a stored procedure, right?
31:21
And that's the SQL standard way to do it. If you do in PSQL, call, right, invoke a procedure. This syntax, much simpler, which is basically just call the name of the procedure and then the arguments that you're gonna have. I won't walk you through all this, but so this is me changing that from a function into a procedure, right?
31:41
And at first glance, you're like, this looks basically the same and the reason is because it's basically the same. The difference is, or basically here, I change it to a procedure, I slightly change the name. Instead of inventory and stock, it is not inventory and sprock because that's what old people call stored procedures.
32:04
Now I can't return a value, so if you remember in the other code, I had returns here. I changed these to raise notices just because I have to do something and that was the simplest thing I could do to move on to the next slide. So that's what I did and I did it down here as well because I needed to handle it there. And so if I call this, right,
32:21
otherwise everything else is the same, the exact same code as the other stuff, right? It's the same language, it should be the same code. It's just a matter of the interface in and out that you have to deal with. And if you call this, so this is inventory and sprock. I've called it, that's in 42. I get a notice back that it's true, right? If I were to do that with the function, right, I select here, I actually get the row back
32:42
which is the Boolean true, right? So kind of depends on how you want to deal with that. The interesting thing, I guess, or uninteresting side effect of the way that we've implemented stored procedures is while you could differentiate on this idea of whether it returns a value or doesn't return a value,
33:01
in Postgres that doesn't really hold up because as we saw with a function, you can write a function that returns a value or doesn't return a value. And as it turns out with stored procedures, you can write them by default, they won't return anything, but you can actually write stored procedures that do return values. So if you wanted to see that, say slightly different, maybe not shocking,
33:22
but change my procedure to inventory and shock, oh my god, I added this parameter which is an in out parameter, right? Which in the context of PLPG SQL is basically a parameter that you could either pass in and then it will automatically be passed out, presuming you set it to something or don't. But it's a way to sort of force the return
33:42
to go back out without actually returning anything. And then in this case, I'm sort of just building on the previous stuff. It's in stock is my parameter name, it's a Boolean, I have to set a default in here because I need to have a value in the beginning to make this work. But other than that, once I have that put in place,
34:01
instead of the raise notice or instead of setting it equal and trying to return true, I just set that parameter equal to true and this little dot equals is assignment equal. Don't let anyone tell you different. This is equality equal, this one up here or that one there. This is assignment equal and that's the only way you should ever write that.
34:20
That is actually probably the second most important tip. If you accidentally write it some other way, change it to do that. Okay, so I'm assigning this variable to equal true or if I continue to process down here, it's either false or true. And then if I call that, this is the original stored procedure. So I get the raise notice,
34:41
this is the function. So I get the true back. This is calling the stored procedure that returns the value of true. So you can do that either way. So that ability to have a return and not have a return, that sounded like that was gonna be important through the years as we built up to this but it turns out you can do it both ways.
35:02
So I don't know, it's not quite as important. Yeah, yeah, yeah.
35:27
So for those who couldn't hear or for the live stream at home or whatever, so he's mentioning is the in out parameter up here, that actually comes from Oracle, Oracle's PLSQL language. And so he's saying is prior to having the ability
35:42
to do in out, that was definitely not in the original implementation. I think that was in one of the later 9 series. I don't know, do you remember? Somewhere in there. Yeah, so once that came in, before you had that ability, when you're reporting Oracle to PestGrest stuff, it was, I guess, a challenge.
36:00
Like he was saying, you definitely get up your bill on how long it's gonna take or how much work is gonna be involved in order to make that happen. Once we had that, then it became much easier to port that kind of code over. So, yep.
36:21
If you don't, the biggest problem is if you don't and you never end up setting the value, then you'll definitely get an error back, right? Because it will actually try to put the parameter out at the end, so you need something. Could it be null? No, yes, I think you can.
36:40
If Jan says it can't, then I'm gonna change my mind, but. I think you can. Yeah, I think you can. You have to set it to something, but null is something, right? Yes, you could, I think you do that too. So there's also, so this is an in parameter, this is an out, there's also out parameters which don't require the in piece of it. If it's just an out parameter,
37:00
I think you don't have to put it in. Oh, you need the default in there because it's an out and I didn't wanna pass it in when I called it. That's why you need the default. So you have to set it there. If I pass it in when I call it, so you'll notice there's two parameters here, right? But when I call it, I actually am only passing in the one. Sorry, that's why I need the default. If I just made that an out parameter,
37:21
then I think I don't have to set it and it should work, okay. But you can set it to null if you want to and that counts as setting it to something even though null may or may not be something, so. Yeah, so that, like this part here,
37:42
yeah, that's just like if you call it a regular function. So it's just like a result set. Like if you do a select statement and you get data back, like it's the exact same thing. So you can treat it just like any other data, yep. The this thing, it does work.
38:15
Yeah, that's the problem. So this should in, I'm gonna say at least in 95% of cases,
38:21
if you just do p in stock equals true, it will actually still do assignment. But it's just bad style, right? There is a possibility, I'm gonna say there might be some scenarios, I'm gonna leave the door open, where it will actually test that as equality and then move on with its life, which will create the most subtle of bugs
38:41
that you'll never see. And so that's why, like I always say, for assignment, do that, and for equality, do that, so that you can actually tell the difference. Because otherwise, and one of the problems is if you write pure SQL within a PLPG SQL statement, like it operates slightly different because it gets parsed slightly different. So you can, problems abound, so don't ever do that.
39:03
Okay, so now, let's talk a little bit about transaction control, because that's the other part of this that was important. I'll walk you through this. It's the most contrived of examples, but hopefully gets the point across. So I created a table up here called xx, right? And then I just make a little procedure, also called xx.
39:24
And then what I'm gonna do is insert into xx the table, the value of one, and I'm gonna roll that back, and then I'm gonna do select two, and I can commit that one, and I'm gonna do three, and I'm gonna roll that back, all within the context of a single procedure. It's PLPG SQL, it's a begin and end as,
39:43
like y'all know how this works, right? It's good. If I call xx, right, I get my call back, notice there was no return here, there were no out parameters, so it just gives me a call, and then I move on to my next line. And then when I go look at the table, right, I get this sort of marshmallow-looking icon, which is the xx table with the number two in it.
40:04
So you can see that only the number two went in there. I'll let you extrapolate, like, the logic that you can build around that, right? Which is, you could do all kinds of work up here, have some kind of conditional that's gonna look and say, yes, that work is good, I will commit it, or I will roll it back. You could break this, maybe these are like three different pieces of ETL,
40:22
and you'd like to know whether or not these things are working or not working, right? You could track some of that activity within it. So there's different things you can do, and make that sort of as complex as you need, but you now actually have the ability to get some of the stuff committed and not others, and do that in kind of any order that you want, right? And that was sort of the big thing about adding transaction control to stored procedures.
40:41
For a function, it's all basically one long, you know, one long transaction. Starts at the select bit, theoretically, and then there's like save points within if you're doing that, but otherwise, right, this kind of thing would be impossible. And so that was the whole point of, you know, sort of round one of doing stored procedures, and getting that in there.
41:01
Hopefully, that's pretty straightforward. Makes sense to everybody, yep.
41:28
So, I think it depends on the order in which you do stuff within the stored procedure. And I will say that when you start to nest stored procedures within themselves, there are definitely issues with that, right?
41:41
So it's really, and this is one of the things where I think, I'm not exactly sure how to make the better tooling, but better tooling would certainly be useful to understand the transaction context that's in there. In theory, like in your example, I think if you do the first bit as a rollback, the stuff that is in your transaction
42:01
previous to the trigger, okay. Yeah, at that point, I don't think you know
42:26
until you get through the procedure and the other side of it, and then you commit or rollback that. I think that will actually make a difference, yeah. Yeah, right.
42:42
So if you, like in this, I guess in this particular, let's go with this scenario, because this is what we have. So because we do the rollback here first, that will actually bubble up to the previous thing. But I believe this too should actually still get committed as it works its way through, which is really weird, right. But also kind of the point, right, that you could actually do that.
43:01
So if you wanted to have some kind of reasoning for why it got rolled back in the trigger logic and put that someplace else, like you could actually do that, right. And then maybe that doesn't actually go in, but you can have like a log table of like, here's the stuff that's failed because I'm doing some kind of logical check in here, you know, that makes me say like, oh no, don't actually insert that one, right. But I'd like to have a message someplace
43:21
that's not just a raise notice in the logs, it's stored in the table and keeps track of that. So it does get confusing and it is complicated the more you nest that stuff, right. And as I say, like within, it's also different if you nest a call to a stored procedure inside of a function, right. It's different than if you nest a stored procedure inside of another stored procedure, right.
43:41
Because all those transaction contexts are slightly different. So test stuff, please.
44:00
Yeah, at some point you'll just run into, it'll error out, right, and give you like a, you can't call within this transaction context type of thing. So yeah, if you do enough nesting, you'll sort of break the system, right, in that way. I mean, it's not broken, but it'll error back.
44:50
I think it might depend on how you call it from the function. Like if you do like a perform or like, you know, execute, or just put in the dblink and make another connection. Whatever, there's always tricks.
45:01
All right, so a few more minutes left. I got a couple more things to mention. So I talked about transaction control. So maybe some of you are wondering, because I mentioned this earlier, the whole thing about vacuums and being able to do stored procedures in vacuums. And good news, I lied, it's bad news.
45:22
You can't actually do vacuums within stored procedures still. So even though that was the thing that I think spurred a lot of the ideas behind let's do stored procedures so we can do this, it still doesn't work. If you tried to do it, right, you get, so here's just a, again, trivial function. This is not writing it cleanly. So if you wanted to know an example of that, here is one.
45:42
So I just wrote it out, right. I make a procedure, I do a begin on the table actor, right, or I do a begin and an end block for PLPGSQL, and then my statement is just vacuum the actor. And when I run that, I get vacuum can't be executed from a function, which looks like a wrong error message because you're like, but I made a procedure.
46:04
And yes, thank you, that's the heavy lifting, right, like, because we built procedures on top of functions, sometimes you get error messages that look like that, and you're like, am I doing the right thing? And like, you're doing it right, it just doesn't support it yet. So you can't actually, still, we haven't solved that problem.
46:20
How do I make scriptable vacuums? Certainly if you want to do like a select from a system table, you know, where, you know, I don't know, the age of the table is over a certain amount and put that into a function and call that, nope, you still can't do it. Maybe it's good news, you can do that with analyze, because analyze can run within the transaction context.
46:41
So, but with vacuum, nope, still can't do it. I apologize. The upside of that is, I mean, you're all here at PGCon, so you're all heavily invested in Postgres, and I'm sure after Robert's talk, you're like, I'd really love to hack on some stuff in the internals of Postgres, and this means there's still things to do, right?
47:00
So if you want to hack on stored procedures and the support, there are certain DDL commands that will not work. Create index concurrently is a good example of one. Most of the current stuff that Robert mentioned in his talk, if you saw it, most of that will not actually work in stored procedures, because there's issues around, well, all kinds of things, if you saw his talk, or there's issues around transaction context
47:22
and just how that stuff works within stored procedures. Vacuum support, again, doesn't actually work. If anyone actually were to solve that problem, they'd be a hero for the rest of their lives. You know, beers at the Dublin Air for Life, something like that. And then another one that people have talked about,
47:41
because this is in other systems, when you implement stored procedures, you get the ability to return multiple result sets. Right now, we don't really have a way to do that. I didn't check this. I know if you were trying to do, so two sort of workarounds, just so you know them. If you're trying to do multiple result sets,
48:00
one thing you can do is return a record set of cursors from a function, and then you can call each cursor individually and get rows that way. I'm not saying that's a good idea. I'm just saying if you need to do it, it's a way to do it. Same thing with vacuum support. So a trick that you can play on people is while you can't call it directly
48:20
within the stored procedure or within a function, you can call something like DBLink within a stored procedure or a function, which will create an external connection back into your database, and you can run a vacuum within that transaction. So you can kind of make that work. Just realize that some people will say
48:40
that that's abusing the system, and some people will say that's getting stuff done. So two, that's how I do it. Question. I'm not saying it's easy. I'm not saying any of this is easy. It's all not easy, or I'm sure it would be done.
49:01
Yes, sir. Inside this, you can, I would be very careful with that, because you likely could break stuff. Certainly, it's all the same problems that you have within any transaction, right? So it would depend on what's the level
49:21
that you want to change, right? So I mean, I usually find any time you go down to like serial level isolation, like it blows everything up. So, oh, for the purposes of getting vacuum to work, will still not work, no. But yeah, it's more problematic than that.
49:41
Yes, sir. That's a good question. I'll have to look that one up. I don't know that, I don't know there's any,
50:02
how'd you look into it, or you may know more than I. Nope, okay. Yeah, I mean, I would think you would probably just hard code the variables into the PLPGSQL, right? Because when you declare them, you can give them a value.
50:21
Are you thinking you want to be able to update those without changing the do block? Yeah, I got you, okay. Yep, yep, yep. Yeah, so he's asking about like, how do I sort of get dynamic variables within a do block? So that, yeah, I mean, that's the first way that comes to mind, like you said.
50:41
If you had a table where you would store those values, and then the do block could just read from the table and then use whatever's in there, and then you could update the table and run the same code, right? And it would theoretically do the right thing. So that's probably one way.
51:16
Yeah, the problem that I've seen with that, I mean, I guess it depends on how you are managing
51:21
your code and whatnot. It does actually get formatted through pgdump. So I forget what it calls within there, but it's, you know, there is a bit of formatting that'll happen, but you can make that somewhat messy. It's not a hard formatter.
51:51
Yeah, yeah, yeah, yeah, yeah, yeah, yeah, yeah, yeah. Right, the code in the middle where it's like, this is all black box stuff to us. It typically doesn't, doesn't much with, you know.
52:14
Sure. Yeah, I've used like just straight SQL formatters that are out there.
52:21
They get you kind of the way there, but obviously it's not built for the language, so.
52:49
Right, yeah. I mean, essentially what you're looking for is like PLPGSQL tidy, right? You know, which doesn't exist, to my knowledge,
53:02
which would then go through and reformat PLPGSQL, right, in a way that was nice, so, but. Yes, sir. Yeah, that was the really, like even this basically.
53:22
So, but in the sense that like, I wrote the code itself cleanly, quote unquote cleanly, right, but the stuff, all of this stuff up here, like create or replace, the declare, no, not the declare. Everything like the create or replace with the parameters and the as, that stuff is in the order that pgdump puts it, right?
53:40
So, that's why I say, you can put these in any, right? Like, you can put, because there's other stuff, there's parameters you can pass in for like performance reasons and that kind of thing, like is strict and you know, things like that, immutable. Like, all of those parameters can go either at the end or at the beginning, right, around the as sort of dollar quoting block, right, of the PLPG SQLPs.
54:01
Yeah, but this part, right, to Postgres, this is like, could be whatever, it doesn't know, right? Because if this was Perl code, if I just change this to PL Perl, I mean, it might toss an error when it tries to compile it but it might not because it's like, it's just a big string of stuff, right? And it's not create functions job necessarily to say, oh, this is about Perl. When you actually run it at runtime
54:21
is when it really gets evaluated, right? Which is sort of another thing, just to be aware as you're writing this stuff, you know, you need to actually be able to execute it to really know because there are going to be things that you cannot really see ahead of time. Certainly, the more you do like dynamic execution of like SQL and that kind of thing, you won't really know until runtime, you know, when things are actually get bound
54:42
and try to execute that it's actually going to work. So that's just the thing, yeah. I mean, that is the problem of the tooling around it is still a little bit difficult to work with. But, you know, don't do 100,000 lines of code in a single store procedure, I guess. It's like the only answer we have at the moment.
55:01
I mean, that's probably good advice anyways, even with good tooling, I wouldn't want to do that but that's mostly it. I see somebody's walking with food so I'm going to pop over here to the thank you slide. Thanks everybody for coming. I believe lunch must be available if food is coming through the door so I will not hold you up any longer. Again, if you have questions, feel free to find me. I'll be at the conference today and tomorrow
55:21
and probably at the bar later and all that good stuff. So thanks everybody. Thank you.