So You Want To Make An Extension? - TIB AV-Portal

So You Want To Make An Extension?

00:00

1

Related Material

PGCon - PostgreSQL Conference for Users and Developers

Formal Metadata

Title

So You Want To Make An Extension?

Title of Series

Number of Parts

19

Author

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/48963 (DOI)

Publisher

PGCon - PostgreSQL Conference for Users and Developers

Release Date

Language

Content Metadata

Subject Area

Information Science

Genre

Conference/Talk

Abstract

This talk will discuss the extensions development process from the ground up and expand upon that using my experience developing several widely used projects (pgpartman, mimeo, pgjobmon, etc). The talk will mainly focus on best practices, lessons learned and tips for creating your own software projects in PostgreSQL. Back in the old days, aka before 9.1, if you wanted to add some custom code to PostgreSQL, it was certainly possible and not very hard. But keeping track of the status of that code, mainly which version the database was running, was a bit of a chore. With the extension system introduced in 9.1, that chore became much less of a hassle. Versioned software packages within the database itself are now possible and installation & upgrades are tremendously easier. In turn, communities like PGXN have grown to house many extensions to make DBA lives easier. This talk will discuss the extensions development process from the ground up and expand upon that using my experience developing several widely used projects (pgpartman, mimeo, pgjobmon, etc). The talk will mainly focus on best practices, lessons learned and tips for creating your own software projects in PostgreSQL.

PGCon 201717 / 19

1

49:46

Corruption War Stories

2

27:01

Designing your Multi-tenant Database for Scale with Postgres

3

49:33

Lies, damned lies, and statistics

4

50:01

Logical Replication in PostgreSQL 10

5

39:34

PostgreSQL Entangled in Locks: Attempts to free it

6

49:10

Partition and conquer large data with PostgreSQL 10

7

34:21

Pgcrypto avast!

8

44:22

OUZO for indexing sets

9

51:06

JIT-Compiling SQL Queries in PostgreSQL Using LLVM

10

55:08

Oh, that ubiquitous JSON!

11

47:56

Codified PostgreSQL Schema

12

32:36

Don't stop the world

13

38:21

SCRAM authentication in PostgreSQL

14

30:42

Towards Comprehensive Testing Tools

15

42:32

Faster And Durable Hash Indexes

16

49:41

Hash Joins: Past, Present and Future

17

57:37

So You Want To Make An Extension?

18

40:24

Provisioning and Automating High Availability Postgres on AWS EC2

19

47:23

Extending View Updatability by a Novel Theory

Automatic playback

Speech

Text

Image

00:00

Computer animation

02:26

Computer animation

03:52

Computer animation

05:10

Computer animation

14:55

Computer animation

16:42

Computer animation

17:46

Computer animation

18:40

Computer animation

20:01

Computer animation

25:40

Computer animation

31:12

Computer animation

36:44

Computer animation

37:40

Computer animation

38:47

Computer animation

45:12

Computer animation

46:09

Computer animation

46:54

Computer animation

50:27

Computer animation

53:28

Computer animation

Transcript: English(auto-generated)

00:05

I wrote PG Department for Automatically Managing Time, an ID series partitioning. I also wrote Mimeo, which is a logical replication tool. It has some unique features to it that some others don't. So I work for OmniTI. We're a full stack IT consulting company.

00:24

We generally deal with web applications that deal with very large scale, millions of users, terabytes of data, and helping people that are starting off small and they're starting to scale out. And we try to help people with that. And we are hiring right now. We're looking for system admins and web developers.

00:43

So if you have any questions about that, please come talk to me. Or Robert Treat is also here. He's our CEO. So why was the whole extension system even made in the first place? Before this, you could actually write custom code and,

01:01

of course, put it in the database. But it was essentially just a flat SQL file that you would feed into PSQL or whatever, however, or just run the SQL yourself. And that would throw your object into the database. But nothing in the database outside of a schema would really distinguish those objects from anything else in the database.

01:20

So unless you had some kind of external version control, even if you had that, it's still hard to know what version of the code actually exists in the database at that time. And other than removing the schema, there's no way to remove or easily update and make sure your code is up to date in the database.

01:41

And also, there's nothing that if you have a dependency on some other code that you need in the database, there's no way to say, I need this other code in the database for my code to run. So that's why this whole system was developed. So the install is very simple. I'll get into what goes into it before you get to the

02:00

create extension part. But once the code files and stuff are in place, all you do is run one command. And all your functions and everything are all installed. It's also versioned. So you give it a version number, a version name, whatever kind of version you want to call it. So you know exactly what version's installed. It's very easy to update it, either update it or downgrade it, either way.

02:22

You can easily see which objects are part of an extension. So I can show you that here in PSQL here. So if I want to see what extensions I have installed, so I have DBLink, Mimeo, PgChron I've been playing with, PgPartman, PgTap. So if you want to see what objects are in your extension,

02:44

you can do DX+. So there's all the functions, a few tables, and a view that are all part of PgPartman. And you can also have, or it also prevents, if an object is part of an extension, it can't be dropped unless you

03:02

actually drop the extension. So it prevents accidental removal of critical pieces of code. And you can also have dependencies. Like I have Mimeo is dependent on DBLink. You cannot install Mimeo unless DBLink is already installed. So you can set those as part of the creation.

03:22

So getting started. These are all links to the PostgreSQL documentation. It's mostly centered around chapter 36. It's quite extensive documentation, both for how to package things together and actually building it to install onto a system.

03:42

And there's pretty much three commands that surround managing the extension. It's create them, alter them, and drop them. It's very, very simple. The alter extension link in the docs actually has a really good list of, these are all the objects that you can make part of an extension in the database. So in the directory for where you're having your extension

04:14

files, there's a format to those files that has to be followed.

04:21

Most kind of like when you're developing a package for Debian or any other operating system, Postgres packages have critical files and formats that have to be followed. The first one is, it's whatever your name of your extension is, .control is the name of the file that exists in the top level directory of your extension.

04:41

The split between the two that I have between the two here is because the ones here at the top are the ones that I use most often. These other ones are options that are there that are available that I really don't use. So default version is when you say the create or alter extension command, and you don't give a

05:00

version to that command. Whatever is set in the default version of the control file will be the version that's installed into the database. You can set a comment, and if you saw when I did the DX, you can see the description of your extension. That's what gets put there.

05:23

Module path name is mostly for if you actually do C code. PgPartman is the only extension I've written that actually has C code in it. A lot of my extensions are just plain PLPG SQL. So whether you need that or not is whether you're actually compiling code. Requires is the flag to say whether you need another extension.

05:41

It's just a comma separated list of other. The other thing has to be another extension. It can't just be some arbitrary thing. Like DBLink is an extension. PgCrypto is now an extension. So it lists other extensions that are required for your extension. And relocatable is just a Boolean that sets whether after you install the extension, whether it can be

06:01

moved around to different schemas. I typically don't allow that because it just makes the coding of your extension on your end a lot more complicated. But you can let people choose their own schema when they install it. So I'll just skip down to this last one. There is an option in the control file to set which

06:22

schema it's installed to. I see a lot of extensions do that. And honestly, I think it actually makes it more difficult to manage because if the other person creates a schema first, and then you try to tell it to install to a schema, it won't let you install. You have to not set the schema when you call the

06:42

create extension command if the schema is set in the control file. And if people don't know that, it really confuses them. And if I just confuse you saying that, you understand what I mean. So I would advise not setting the schema in the control file, let people set it to what they want it to be. And then there's, I'll get into other things later, that

07:02

you can dynamically just use whatever schema they picked in your code. Directory is actually, normally, like when you call make install in the extension, it puts the files in a specific location. And that's usually fine.

07:20

But if you have a custom place where your extension code exists, you can set that in the control file. If you need to set the encoding, that's there. I've never set this one, because I usually, so far, the extensions I've written usually have to be run by a super user. But if you want to have an extension that doesn't require

07:40

super user for anything, even the installation, you can set that to false. But that means that whoever runs create extension has to have permission to do everything that's in your install file. So all of the object creation and everything. It can make it more difficult for you, but it also can make it easier for your users if you want to have it completely

08:01

super user free. Any questions about the control file? You can actually do a secondary control file, which I didn't even know about until I was going to write this talk. If you just do create extension, I've never actually given a version number to the create extension command.

08:22

I usually just let whatever the default version in the control file control things. But you can actually give a version number to the create extension command, and that will allow you to use these secondary control files. So if you want to do specific things for specific versions on install, you can put them in this other

08:40

secondary control file. Then if somebody gives the version to the create extension file, it will use this secondary control file. It uses all those same options as before, except obviously you can't set the directory and default version for a specific version.

09:00

The actual code that gets installed, and this is for a code that gets installed into the database, not compiled code, so like PLPG SQL code, plain SQL code, Java, PL Java, PL Perl, whatever code you're getting installed into the database, you create a file with this pattern to the name. So the name of the extension, which matches what was in the

09:23

control file, which matches what the name of the control file was, and then whatever the version that's being installed is. So for the latest version of PGPartMan that's out now is actually 3.0.1. I'm working on getting that 2 out very soon. So this is the latest SQL file for PGPartMan.

09:41

So all of your code goes into that file, and that's what's required to actually install the extension, is all of that script code has to be in that one file. I'll get into some things later that you can manage your code a little bit easier by putting them into different files and then pull them all back together. But for when it actually gets installed, all of that code has to be in that one file.

10:02

So this is the macro I was talking about. So if you want to call one of your own functions that you want to ensure you're calling your own function, because maybe you made another function called dblink for some reason, and you don't want the other one to be called, you want to call your own, you can put this at schema

10:24

at, put that anywhere in your script code. And then when the create extension command is called, it automatically replaces that macro with whatever schema the extension is actually installed in. So that lets you let users install the extension to

10:43

whatever schema they want to install to. And then once it's installed, if you actually go and look in the code in the database, you'll see all of those macros have been replaced with the schema that the extension exists in. So I find doing that makes it a lot easier for users than trying to set the schema in the control file.

11:02

It's a lot more user friendly. And this is actually where the big part that makes extensions great is being able to do controlled updates, upgrades, and downgrades. So it's a similar file format.

11:24

So it's the name of your extension, the old version, and the new version. That doesn't necessarily mean old as in previous and new as in new. That means if you can do downgrades, too, so your old version is a new one and you're downgrading to something else. So it goes either way.

11:40

So this is saying I want to go from version 3.01 to that version 3.02. So the code within the extension and the versions and the file names are actually arbitrary. I do semantic versioning for my thing. You can actually do whatever. You can do names for your versions. You can do alpha, beta, Charlie, delta, whatever for

12:02

your versions. It's kind of arbitrary. And Postgres isn't aware that 3.01 is less than 3.02. All it knows is you're going from 3.01 to 3.02. So if you want to do a downgrade script, you do pgpartman 3.02-3.01. And that would mean you can go down or up.

12:23

And the code that goes into that is relevant to whether you're doing the upgrade or the downgrade. So it's all up to you what goes in that function and how you want to follow those code paths. A big thing to be aware of is everything that runs in that update is done in a single transaction.

12:40

So if you have things that have to be done in multiple transactions, you have to do multiple updates. And you have to do multiple versions. And you have to somehow make that clear to your user that when they do these version upgrades, they have to do them in one at a time. So say I had existing version as 1.2.3, and I wanted to

13:00

upgrade to version 2.0, and there's 50 versions in between that. If you do a single command from 1.2.3 to 2.0, it runs all of those updates individually in one giant transaction. So if you need to do those upgrades distinctly, you do the

13:20

alter extension update command. You can give it a version, and it will do that. Whatever version you have installed now will update to that new one. So if you have multiple steps you have to do, you can do that. I ran into that when I had to change the values of a constraint at the same time that I was changing the constraint, which can give a weird error if you do that

13:42

in the same transaction. So I forget the exact name of the error. It's actually in one of the more recent updates of PG Partman that I actually had to do this with, and I became aware of it.

14:05

Yeah, something about triggers can't run in a certain way. But if you didn't know that in the first place, you still have to fix it somehow. And eventually you'd have to drop it and rename it. But there's some things that can't be done in a single transaction that would just never be able to be done.

14:22

So this made me aware of that. And so I usually have a change log file in my extensions that have important information. Like the Postgres change log has important information about upgrading. So I try to do that.

14:41

Any questions about doing updates? Yes, yes. There's a distinct. So I can go over here to show you. Here, let me show you.

15:02

So I have an updates folder. I'll go over the structure that I use for mine later. So you can see I have all of these updates. So if somebody had 1.6.0 installed and they wanted to

15:21

install 3.0.2, it would run every single one of these. It doesn't just run that one and go straight and run the 3.0.2 update. It runs all of them in between there. So it keeps things up to date. What's that?

15:47

Like I said, the versions here are completely arbitrary. You can name them whatever you want to fit your architecture. Like I said, I use semantic versioning. So I do the three dot thing. So whenever I had a new major feature, I would change

16:01

the second number. You would have to combine all of those updates together. And some of them, you could do that if you wanted to.

16:26

I haven't needed to, because if you have all those files there already, the extension management system does it all for you. So yeah, you could do that. And I'll actually get to something important about that

16:40

in a second here. So it's going off the screen there. I can't get down to it a little bit better.

17:04

So like I said, when you give no to up clause to the alter extension, it will. So if you install a new version of the extension files into the file system, it has a new control file there. The new control file has a new default version in there. So if you give altered extension update no version, it

17:22

will update your extension to whatever is in the control file on the files that are installed. But you can also actually give it specific versions. I usually try to do that just for my own sanity. And like I said, any update scripts between the installed version and target version always run in the order there. So like you saw that I had, let's see here.

17:48

I have all of these files here. So I have 5.1 to 6.0, then 6.0 to 6.1. So it follows those paths every time, all the time, because I have them in that order. So it will follow that order all the time of those files.

18:02

Yes. Yes. Yes. So if you wanted to go from 3 to do a reverse, if you want

18:22

to do a 3.0 to 3.0.2, you'd write a file there that says pgpartman 3.0.2 dash dash 3.0.1. That would be your downgrade from 3.0.2 to 3.0.1. Right. Yes.

18:42

What's that? The contents that are in the file Postgres never looks at. It looks at the names of those files to determine the path that's being upgraded. The contents of the files are completely up to you.

19:15

If you're missing a file in here, it will say there's no upgrade paths.

19:21

Yeah. And you can actually do that. There's this really handy function. I haven't really used this all that much, because I don't do downgrades and upgrades. I pretty much just only do upgrades and keep things going consistently. But you can use this function called pgExtensionUpdatePaths, and it returns the source.

19:41

So there's the source version, the current version, the target version, and the path that it will take to upgrade. So going from 2.3.4 to 3.0, it will run 3.4, 2.4, 2.4.1, 2.4.2. It shows you the path that it's going to take to upgrade. So if you give no argument to, like you can see, I gave a

20:08

where condition, I'm sorry. If you give no conditions to that function, like I said, where source and target equal that, it shows you every possible path on all of the files that you have, or all the versions that you have.

20:20

Even if the path actually exists, it still returns, but it says there's a null path. So there's no possible way to go from 2.3.4 to 1.0, but it still has an entry there and just shows a null. So you can find out what upgrade path your code is going to take based on the version files that you have.

20:42

And that lets you do something very handy, which is if you make a mistake and you need to skip a version, which I did in Mimeo. I did an up, so Mimeo's supposed to be 9.1 compatible, so the first version that had extensions available. But I added a feature that was only in 9.2 by accident.

21:02

So that was in version 0.10.0. I added a 9.2 feature, but I didn't want to do that. So if anybody that was on 9.1 tried to install 10.0, it would fail, and there's no way for them to get around that without upgrading the database. So what I did was I removed that 9.2 feature.

21:24

And so before it was going 0.9.3.0.10.0, I made it so it skips 0.10.0, so it goes from 9.3 straight to 10.1. And the way Postgres does the upgrade extension paths is it

21:42

always takes the shortest path that's available. So you can see here, and you can see it'll still give you all the paths. So if you do the PG extension upgrade update path, where source is this, and the destination is anything, 0.0.10, the upgrade path from 9.3.0.10 is still there.

22:05

But then there's a path here that's shorter. So it goes from 0.9.3 straight to 0.10.1. So it always takes the shorter path. And this is in the documentation, too. If you do downgrades, always run this to make sure your

22:21

downgrade doesn't cause a shorter path, because you can severely break your code, because you'll uninstall it and reinstall it and uninstall it and reinstall it, and who knows what state it'll be in after that, if it even works. So if you do downgrade scripts, always run this function to see what path your

22:40

extension updates will take. But that also lets people that were on 9.2, they install this version without any problem. And it will continue to work without any problem. But they also still want to get the new version. So this allows people that did have 0.10.0 installed to still be able to upgrade past it, because the

23:02

path still exists. So I didn't completely remove the 0.10.0 update, because somebody may have installed it, and I would have broken it for them. So does all of this kind of make sense? It took a while for me to wrap my head around it, too, when I had to actually do this. Yes?

23:42

I guess you could. The important thing is if you make another code, make another update that would short circuit another update to keep it separate, you can do that. You just have to make sure that new update incorporates everything that was in all those other paths before.

24:03

Because it's never going to run those in between updates again. So I have to make sure. I had other updates that were in 0.10.0, besides the 9.2 things I did. I had to make sure that those were in the 0.10.1 update as well.

24:20

Yes? Yes. Yes.

24:40

I honestly don't know. I haven't looked at the internals of this at all. But when I found out about this, it's really complicated. But when I ran into this situation, this whole short circuit path feature made management of this code extremely easy. If you're trying to manage this without the extension

25:00

system, it'd be even harder. So it actually makes managing version code in the database very, very easy. So I really like it a lot. That's why I've been trying to, if my code isn't all common enough and that's used enough that I want to make sure I

25:21

have a certain version of it in the database, I always just turn it into an extension. Even if it's not something I release publicly, like internal code will do that and stuff too, just to make sure our internal code is consistent across clients. Or just the schema in general is consistent across clients.

25:42

You can also, if you have code that's in your database already and you want to turn it into an extension, you can do that. And that's actually the last two bullets here. But everything that was a contrib module before 9.1 is now an extension. And say you have DBLink installed already, like I

26:06

keep saying, these versions that are in these extension update files are kind of arbitrary. You can make them whatever you want and you use that arbitrary nature of it to do these kinds of things. So there actually is a from clause to the author extension command. So if you specifically know you want to install, from this

26:26

point forward, you've been assuming that you're just updating the version you have installed now to the new version. So if you don't have the from command, it's assuming you're just talking about the version that's installed now. But if you give the from clause, it looks for the

26:42

update path file that has that listed as the old version. So it does a update from that to the new version. So basically, the contrib system took advantage of that. And instead of a version name, they just call it unpackaged. So if you look in the DBLink source, there's a file called

27:03

DBLink dash dash unpackage dash dash 1.0.1 or something like that. So what that will do is take the code that's in the database now. And if you look in the contents of it, all it's doing, there's a clause to the alter extension command to add

27:21

objects to an extension. So you can say alter extension add table whatever to extension PG part man. So that's what this update is doing. So you really don't give the to option to this, because when you do this, it's assuming you're taking the code as it exists in the database now.

27:44

And if somebody went and did that same thing, like the 1.0.0 version of DBLink is pretty much what it assumes is installed when you run this. And then if you want to update to a new version, you then do the alter extension to update it. I know it's kind of confusing.

28:01

Does that make sense? Or does anybody not understand what I meant there? Then you would just do create extension. You'd install the contrib package or the extension files, and you just run create extension. This is if you had DBLink installed before. Or if you have a set of objects that you want to turn

28:21

into an extension, you can do this. Yeah, and that used to be the way you probably had to do it before, because there was no system in place to remove

28:43

things and replace them if they were broken. Now the extension system kind of has that, because you can jump from different versions to make sure your code is in a consistent place. So if this is making your head hurt, don't worry. It made my head hurt the first time I really started

29:02

looking at this, trying to understand it. But the point is to just make you aware that there is a way that if you have code in your database now that you want to make an extension as it is now, you can do that. And then from that point forward, you can make new update files to update your extension code as if it was an

29:21

extension. So basically what you'd do is if you wanted to reinstall that new code into another database someplace else, you'd have to pull that code out somehow and make a base version of that code as an extension that you could reinstall somewhere else. But this at least lets you do it with the code that already exists in the database.

29:44

Any questions? Or need for clarity? I can try better. Sure. I'm sorry, I couldn't hear that. OK.

30:10

Right. I'll get to that actually in a second with some PG dump considerations. So basically it's around where it really comes into

30:23

play is if you have configuration tables as part of your extension. So extension objects, even like functions, tables, sequences, anything that's part of the extension is not actually dumped out in a PG dump. The only thing that happens when you run restore is it

30:43

runs create extension. That's it. So your extension objects do not exist in any PG dump anywhere. Unless, but so that also means that if you're reinitializing a database or something or restoring a backup, you have to have the extension files installed first

31:00

before you run PG restore. Otherwise PG restore will fail because the create extension command will fail. So this means that if you had a config table, like PG apartment has a config table to say what all the partition sets and stuff are, it won't dump out that config data. So if you go to restore a backup, all your PG apartment

31:20

configuration is lost. So there's a flag you can set in the catalogs called PG extension config dump. And it's mostly used for tables and sequences. So you give it the name of the table. And what that does is it tells PG dump to

31:42

dump that data out. It doesn't dump the object, like the definition of the object that's part of the schema. It just dumps the data out. So if you want all of the data in the config table every time, you can leave the second parameter blank.

32:02

And then every time you run PG dump, it'll dump all the data out of the config table. That's what PG apartment has. There's nothing pre-installed as part of any config table. But if you have some data that's part of your extension in the config tables, like I do with Mimeo, I'm sorry, that's PG jobmon is another extension I wrote.

32:22

There are some alert codes that I have in there as pre-configured data. So what you do is you use the second parameter as basically a where condition to say what data you want dumped out. So I don't want the alert codes one, two, and three dumped out as part of the PG jobmon config, because the

32:46

extension itself installs those. So this is irrelevant for sequences. So this is just for tables. So basically, you're telling it what data you want it to dump out of the config tables, so it will be

33:01

restored in a PG restore. The thing that, and I don't know if this is fixed. I didn't get a chance to test this, but I know it was like this in 9.1 and 9.2, because it's when I ran into it, this data is always dumped out. Even if you do a schema-only dump, because this is schema,

33:21

it's not data. What that means, though, is what PG jobmon is, it's a system for logging the status of jobs that you have scheduled to run and make sure they run. That table can grow to gigabytes, but it's part of the extension. So I don't have that data being dumped out.

33:42

There's really no way to preserve. It's either it dumps the data out every time, all the time, or it never dumps it out. So if you needed to back up that data that's part of your config, you have to make it not part of the extension, dump it out, and make it part of the extension again. This would be something I wish they would fix somehow.

34:03

I'm not exactly sure how it would fix it. Maybe it flagged a PG dump to say whether to dump out extension data or not. But it is an issue. Because you can make any table part of the extension. It doesn't have to be a config table. You can make your orders table and your shopping cart

34:22

thing part of your extension. And then when you dump your schema out, you dump out a one terabyte table. So it can be a problem. So it's something just to be aware of. Yes?

34:44

Because the create extension command is supposed to create them. The scheme.

35:03

It only changed it. I know the reason they did it this way is because the schema of the table is part of the extension. And if you restore the database later, you may not be

35:22

restoring the same version of it. You can actually install a new version of the extension later, and when you do a restore, it installs that new version of the extension. Because all it does is create the extension. The definition of the schema for that extension is contained in the extension itself.

35:40

It's not part of something PG dump does. Because if you have PG dump restoring it and then you do create extension, it fails out. You've actually had people do that because they had a, I'm trying to remember exactly, I can't remember how exactly they got into that situation where they were, it's because they were editing the, it was an SQL dump, just like a plain text SQL dump, that they were editing manually later

36:02

to change data. But they added the creation of a sequence back into that that was part of the extension. So when they tried to run a PG restore, it failed out when it tried, because it ran create extension, and then it tried to create the sequence again. And that would fail because the sequence already existed.

36:21

So yes. In the SQL script file, so wherever you're creating your table, you do create table part config, and then after that, you'd do select PG catalog, or you could set the property of that table.

36:42

Actually, I can show you here. So there's the part config table. And then at the end of that, I just, right there, it's

37:02

part of the definition of the table. In this one, I don't. But in some other ones, I do.

37:22

If you set a primary key? Right.

37:41

Yeah, make it a serial. Yeah. Serial primary key or something. Yeah. I mean, when you reinstall it, it would reinitialize the sequence when you go restore it later. There's no real way to preserve the sequence data that I'm aware of. This is actually not required, but this is kind of how I

38:00

organized code in my extensions. So this is at the level of where the control file and everything is. Any kind of, any script file, like I have some Python scripts that I have along with my extensions. I put them in a bin directory. I have a doc directory for documentation. There's an SQL folder that I made that has all of the SQL scripts in it.

38:22

Do a source directory for C code. I have a unit testing for a test folder. And then all the updates, I actually put in their own distinct folder. A lot of people with their extension, they kind of just put all that in the top level. You can. And that's actually where it kind of has to be in the end when you go, or however you have it defined in your make

38:40

file, that's how it goes. But then also in the SQL folder, I have another folder called functions, tables, types. So you can see here in PG department, there's the bin, doc, so if you look in the bin directory, there's a bunch of Python scripts there. If you go to the SQL directory, I have the

39:01

functions, tables, types. We'll get to how this file is made in a second. And then you already saw the updates folder. I have all the updates in that folder there. So it makes managing a lot easier, because when you actually install the extension, all of the script

39:21

code has to be in that one file in the end for the extension to install. But when you're maintaining it, it's easier if you haven't separated it out. And then I'll show you how to pull it all back in together in a second. And the thing that's really important, though, since your update files are their own file and your SQL file is in its own file, when you make an update, you have to be

39:43

really, really, really sure that the code in your individual files matches the code that's in your update files. So the code that's in your latest update file should match everything that's in your SQL directory hierarchy. Does that make sense? Yeah, so there's this tool called meld that I use.

40:02

It's basically a side-by-side diff tool that I use to open the update file in one folder and then open all the other individual files on the other side and just visually make sure they match. It's tedious, but it's because all of the SQL

40:27

functions, they're in individual folders and individual files. They're all separate. So you just have to make sure you grab everything that's on those other ones. And the thing is, everything that's in those other files is not in the update file, either. So you have to make sure things match.

40:42

So I don't have a huge extension that I maintain. So this method is tedious, but it's trivial for my extensions. If you have an extension with thousands of functions and tables, that would be ungodly tedious. But if you separate things out like this, you really have to make sure the code matches. Otherwise, you can end up in a weird state.

41:12

OK. On mass. If there's a difference, blow up and yell. Yeah. I'm actually running a little bit short on time, so I can

41:21

save questions. No, it's fine. I just noticed. Save until the end for right now. So this actually gets into actually how you build the extension now. So this is pretty much the contents of your makefile. A lot of this is really, I can't get into a whole lot of detail here because it's really arbitrary to what your extension does. You can have your makefile do whatever you want, but these

41:42

are the key parts of the makefile. So pgconfig, it's a path to the pgconfig binary of the installed version of Postgres. When you do a make of an extension, it needs to know where pgconfig is so it knows the version of Postgres you're building against. It uses the installed version.

42:00

You don't need the source installed of Postgres to build extensions. You just need to know where pgconfig is. So you can use the installed version of Postgres to build your extensions. If you have C code, that's where you define your modules. Extension is a thing that matches. So whatever you name, like pgpartman.control.

42:22

So the extension variable means the extension is named pgpartman. Where things get installed to, the prefix is the binaries where Postgres is installed to. If you ever go and look in there, or wherever your operating system may throw these files, there's usually a

42:40

folder called share, and it has all of the contrib modules and stuff underneath it. By default, there's usually a folder called extension. And that's where, when you do make install, it puts everything. If you have documentation files, this will put them into the doc folder that exists under the shared directory. And there's this system called PGXS, which makes the

43:03

infrastructure for building extensions not require you to have source installed. So you can just use the base install of Postgres without the source. And you put these lines in the makefile. I'll show you a whole makefile later.

43:20

And it lets you build the extension without having to have the source of Postgres available. The critical part of the extension is actually this data one. It's actually the files that are getting installed to the share extension directory.

43:42

So basically what I did is I figured my one venture into learning how make works was to see how to take a bunch of files that exist in a folder and pull them all together into one file. So that's what this line does right here. Actually, both of these lines here. So this is like a macro for the extension name, a macro for

44:05

the extension version. So this would be like pgpartband 3.0.2. And then it basically goes through. It goes first to the types folder. Then it goes to the tables folder. Then it goes through the functions folder and grabs

44:21

everything. And then this make macro says, take everything that exists after this colon, which is this dollar quote there, and put it into this file.

44:42

I'll get to that a little bit later. If you have something you need to run at the beginning of your extension, and you need to have something run at the end, if you sort the things, you know. I know for now that the types are always going to be first, and that type file will always run first.

45:01

And then the last function alphabetically will always be the last thing that runs. So that's why I do the sort there. It's kind of arbitrary for me. And then so that makes the one when you saw the SQL directory there. I had run make already. Well, that's why that file was there. So if I run make clean, now that file's gone.

45:25

And if I run make, now that file's there. So that takes everything that was in the SQL file, throws it all into that one file. Because that one file is what the create extension requires to exist.

45:41

And then extra clean is just what I used to clean up that dynamically generated file when somebody runs make clean. That's all. Yes. It's worked on Solaris, Debian, Red Hat, and a couple

46:02

of FreeBSD. So it's worked on a couple so far. So it may not work at all. And if you could wait till questions at the end for right now. And then this is also you can do to actually check and

46:20

make sure. So pg apartment requires 9.4.0 at a minimum. So I have the make file do this for me. If somebody tries to run make install with 9.2, it will fail out. It's not very elegant. I actually got this from, later I got this from the PGXN website.

46:41

It's just basically a grep looking through the shell of what PG version you have installed. And then if it's 9.4, bail out. Or less than 9.4, bail out. So this is my full make file here. So you can see I have a, this is actually, it

47:01

dynamically figures out what the extension version is based on the control file. What's the default version in the control file. So there's all the 9.4 stuff. If it exists, do all this stuff.

47:23

So from here on out, this is just my personal tips from my personal experience. This isn't anything that's required for doing extensions. But whenever you create an object, always explicitly name it, even if Postgres can dynamically name it for you, for example, for indexes and for foreign keys.

47:41

Because if you have to go back and change those things later in an update, if you didn't name it, you have no idea what it got named on the person's target directory. Because if you do a functional index on a column, it just calls it index1, index2, index3, index4, index5. You have no idea what it's going to be called. So just an example for the part config table.

48:02

Even the primary key, you want to name the primary key. Don't do like id default primary key when you define the column. Define the primary key later using the constraint option. The same with foreign keys. For indexes, give the index name. That makes things a lot easier if you have to do

48:20

updates later. Otherwise, you kind of have to assume what people are doing, and then when they run the install, their update breaks, and then you have to explain to them why, which I did have to do before I started doing this. Preserving privileges. So at some point, you will have to drop and recreate a function, because you have to redefine the arguments or

48:41

the return type of that function. Or any object, really. You may have to redefine it somehow. But functions is where I run into this most often. So I kind of use a little bit of the SQL script files can contain any SQL you want. So I create a temp table. What I do to do this is I create a temp table, and I

49:03

know for functions, they're going to have an execute permission. But you can also look up what permissions an object has and reuse those. But I'm always going to want to grant execute on that function to whoever had execute on it before. So that's what I'm doing here is I'm looking up the information schema.

49:21

So what I do is I look that up first, I store that in the temp table, and then I drop the function. But you can see in here, when I grant the execute, it's only got two arguments here. So this is the new definition of the function, and then I drop the old definition of a function here.

49:42

And there's more in the next slide. So I preserve the functions in a temp table, preserve the privileges in a temp table first. And then at the end of the script file, I go through and just do a little bit of anonymous function here with a do statement, and rerun all the statements that are in that temp table.

50:01

And that restores the privileges that were there before, because I dropped the function, but then I recreated it later in my extension update code. So that's something to be aware of, especially if you have a publicly available extension. You have no idea what grants somebody may have given your objects, so you kind of want to preserve

50:21

those whenever you can. I don't know if this is true or not, and I've not been able to recreate it, but I've had several people tell me this, that if, like you can see here at the bottom, I dropped the temp table I created.

50:41

So depending on what your update is doing, when the transaction ends, I've had people tell me this happens, and I've never been able to recreate it, but I'm just putting it out there as a warning. But when the transaction ends, and it drops the temp tables, if that cascade is somehow related to anything you did, it may drop the entire extension, because that

51:01

extension object, I know, I don't know. I've had people tell me this happens, and I've never been able to recreate it. Yes. If you have something in, if you create a foreign key in

51:21

the temp table, maybe that might do it, but yeah, so. So referencing the temp table, that's a good reason. So that may, whatever happens there, yeah. In an upstream update script, anything you create is automatically attached to the extension. Yes. I need to make that part of my thing, too. Yeah, anything you create, anything you create in an

51:41

extension update automatically becomes part of the extension. I need to add that to my notes. So if you create, like for the partition function, I never recreate partition, the trigger partition functions as part of the extension update. I make it a note, because if I did that, then the trigger

52:02

function would become part of the partman extension. Does that make sense? So yeah, I need to add that to my talk. Version checking. If you need to make sure there's a minimum version of Postgres installed, actually, this is a recent thing that

52:22

somebody not even realized you could do, is just do current version, cast it to int, and then compare it to a number. That's a very, very easy way to do it. I actually had this really over-complicated version check version function before. But doing it this way, it lets you also account for, I basically split it at the dots and checked each number.

52:42

But that also lets you deal with betas, alphas, rcs in a custom way, if you needed to. But if you just need to make sure there's a minimum version for some kind of feature, you can just do the if condition using that. And in the C code, there's actually an if macro. So in 10, they change the way the background worker

53:03

struct is arranged. So before 10, it was just a simple bg worker, bgw main. Now it was split out into these two things. So if you have code that only needs to compile in certain versions, this is very, very handy.

53:21

I found that by looking in deviant code and stuff and seeing how they deal with it. Another thing to be aware of is object names have a maximum length, which is by default 63 characters. So if you have the partition extension, it adds a suffix

53:41

onto whatever table name somebody has. But that suffix is very, very important. It tells you what the contents of that data is. But if somebody's table is already 61 characters long, then try to add the suffix onto it. Then I'll run the naming collisions, because it will have the same name. Postgres will automatically truncate things for you.

54:00

So I just basically did this check name length function. Basically takes anything you give it, truncates it down to at least 61. It gets the length of the suffix you're adding onto it, truncates the original name down. So it's always at maximum 63 characters. So there's something to be aware of. That's in the pgpartman source, if you wanted to

54:20

take a look at that. Avoid enums. Mostly because they're very, very hard to edit. Not because they're bad. Enums are fine. They're just bad in extensions. Because they've gotten a little bit better in 10. You can actually add things to them easier. But you still can't delete them.

54:41

It's not trivial to delete them. You kind of have to recreate the enum. And if you made a type or something for somebody as part of your extension, they would have to completely drop every table that had that type or do something to change the value of the enum. So my advice is you just avoid them in extensions altogether.

55:13

Especially for something that's like a default value of a column. Because if you have to change that default value later, you have to edit the enum.

55:22

And if you need to remove a default value, that's not trivial. This is a thing, if you look in the DBLink code, if you want to prevent people from running PSQL and just installing your extension, it'll stop them from doing that. Because it'll actually break. If you use that extension macro, you can't run PSQL to

55:42

install the extension. So this just gives them a nice error message instead of letting them try to do that. Almost done here. I think that's, yeah. So that's pretty much, so PGXN is a network where you can download a bunch of extensions.

56:01

It's where I upload all mine to. They have a command line tool to make it easier to manage. And if you have a complicated code, go look at PGTap. It's a unit testing framework for any SQL code. I use it in two ways. So you can have each file be its own uncommitted transaction. So when you run the tap test, it just automatically

56:21

undoes everything it did. Or if you look at the Mimeo extension, I actually have one tap test depend on the one before it. So I just number them. So they run in order. And then clean up things at the end. But for PG apartment, I just counted yesterday, I have like over 7,000 tap tests for just making sure all of the different time and ID series types and stuff work.

56:42

So it's the only reason my code is somewhat stable when I release it, because I run all my code through this to make sure it's at least predictable for the things that I know that I want to account for. So if you do any extension development at all, I highly, highly, highly recommend you go take a look at PGTap and

57:03

make it part of your extension code to make sure everything is working. So that's it. Mitri Fontaine is actually the developer that made the extension system in Postgres. So if you ever see him at a conference, you can thank him for that.

57:22

And there's a whole bunch of other links for things on extensions and Postgres. Thank you.