Upgrade your Postgres and PostGIS will thank you
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 266 | |
Author | ||
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/66586 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
00:00
DatabaseMultiplication signBitLecture/Conference
00:36
Computing platformMereologyService (economics)DatabaseLecture/Conference
01:01
DatabaseSystem administratorMathematical optimizationGene clusterFlow separation
01:28
Standard deviationService (economics)Open sourceMultiplicationGame controllerDatabaseHigh availabilityOperator (mathematics)Computer fileLecture/Conference
02:10
DatabaseBackupRevision controlComputer animationLecture/Conference
02:36
Game theoryProbability density functionVideo gameSoftwareSoftware developerSinc functionOpen sourceLecture/Conference
03:11
Price indexFunction (mathematics)GeometryType theoryComputer configurationJust-in-Time-CompilerDefault (computer science)Query languageMathematical optimizationCodeDatabaseCartesian coordinate systemMathematicsMultiplication signCuboidRevision controlSerial portQuery languageControl flowSubject indexingMiniDiscSign (mathematics)GeometrySlide ruleLecture/Conference
05:02
Price indexSubject indexingVacuumMultiplicationSubject indexingDatabaseNetwork topologySoftware developerLecture/Conference
05:29
MultiplicationSubject indexingPrice indexVacuumSubject indexingSoftware developerPlanningData storage deviceAdditionProduct (business)Hand fanCore dumpView (database)Right angleData recoveryHigh availabilityVacuumCASE <Informatik>Parallel portInstance (computer science)Information overloadPoint (geometry)Multiplication signMultiplicationBuildingAreaQuicksortLecture/Conference
07:25
Data compressionServer (computing)BackupFunction (mathematics)Query languageBuildingWindow functionData compressionWindowAliasingScripting languageBackupSurfaceRevision controlFunctional (mathematics)Block (periodic table)Replication (computing)LogicMatrix (mathematics)Structural loadHuman migrationInstallation artWordBookmark (World Wide Web)Home pageLecture/ConferenceComputer animation
10:23
Matrix (mathematics)Revision controlFunctional (mathematics)Lecture/ConferenceComputer animation
10:53
WindowFunctional (mathematics)Lecture/ConferenceComputer animation
11:12
Data storage deviceFile formatReplication (computing)Server (computing)Computer fileLecture/Conference
11:33
File formatData storage deviceCore dumpDatabaseBitServer (computing)Revision controlError messageCore dumpData storage deviceArrow of timeFile formatLecture/Conference
12:01
File formatData storage deviceFile formatData storage deviceLecture/Conference
12:24
Core dumpLogicReplication (computing)Data storage deviceFile formatComputer file2 (number)Server (computing)Computer animationLecture/Conference
12:42
Data storage deviceFile formatCore dumpTerm (mathematics)Constraint (mathematics)Replication (computing)LogicCartesian coordinate systemSequenceLecture/Conference
13:08
Field extensionRevision controlLink (knot theory)Addressing modeRevision controlDirectory serviceLink (knot theory)Computer configurationLecture/Conference
13:26
Field extensionLink (knot theory)Revision controlDirectory serviceBinary fileField extensionDatabaseExtension (kinesiology)Installation artLecture/Conference
13:46
Revision controlField extensionLink (knot theory)Installation artDatabaseExtension (kinesiology)Revision controlMereologyLecture/ConferenceComputer animation
14:05
Scripting languageInstallation artDistribution (mathematics)MereologyRevision controlScripting languageLecture/Conference
14:28
Scripting languageInstallation artLibrary (computing)Link (knot theory)Revision controlBitField extensionDrop (liquid)Lecture/Conference
14:58
Table (information)Object (grammar)Numbering schemeLink (knot theory)Directory serviceLink (knot theory)Computer configurationDirectory serviceMiniDiscSpacetimeFreewareDatabaseLecture/Conference
15:23
Table (information)Object (grammar)Numbering schemeDirectory serviceLink (knot theory)Type theoryExtension (kinesiology)SpacetimeMiniDiscFunctional (mathematics)DatabaseTable (information)Type theoryFreewareNumbering schemeNumberLink (knot theory)Object (grammar)Lecture/Conference
16:15
Extension (kinesiology)Type theoryProcess (computing)Electronic mailing listScripting languageDefault (computer science)Server (computing)Ocean currentConfiguration spaceMultiplication signDirectory serviceBuffer solutionLecture/Conference
16:58
Type theoryExtension (kinesiology)Directory serviceExtension (kinesiology)Message passingRevision controlScaling (geometry)Point (geometry)Field extensionMultiplication signLecture/Conference
17:49
Link (knot theory)Server (computing)Object (grammar)Scripting languageServer (computing)Web pageFunctional (mathematics)Link (knot theory)Point (geometry)CodeLecture/Conference
18:42
Hill differential equationLink (knot theory)Server (computing)Object (grammar)Revision controlServer (computing)Functional (mathematics)BackupLecture/Conference
19:10
BackupObject (grammar)Directory serviceSynchronizationLibrary (computing)Right angleComputer fileDirectory serviceControl flowServer (computing)2 (number)MathematicsScripting languageCASE <Informatik>Lecture/Conference
19:57
Link (knot theory)StatisticsTable (information)VacuumDirectory serviceStatisticsLecture/Conference
20:16
Link (knot theory)BackupBackupDescriptive statisticsGene clusterInternetworkingComputer animationLecture/Conference
20:36
Revision controlMaxima and minimaScripting languageOperator (mathematics)MetadataACIDVolumeDefault (computer science)NamespaceGene clusterComputer fileNumberLecture/Conference
20:55
MetadataACIDDefault (computer science)NamespaceVolumeRevision controlMaxima and minimaScripting languageOperator (mathematics)Line (geometry)DatabaseField (computer science)Revision controlOperator (mathematics)Scripting languageProduct (business)Lecture/Conference
21:32
Software testingRevision controlService (economics)Slide ruleDatabaseMatching (graph theory)Multiplication signLecture/Conference
22:03
Software testingDatabaseSoftware testingSoftware bugMathematical optimizationProduct (business)Integrated development environmentProcess (computing)Revision controlLecture/Conference
23:15
Software testingSoftware bugProduct (business)Integrated development environmentDatabaseSoftware testingLecture/Conference
Transcript: English(auto-generated)
00:08
Hi everybody I was going Yeah, I'm here to talk about upgrading your postgres and Not only postgres will thank you But also post JS so I hope that all of you here are supposed just users so that
00:25
The stuff that I'm talking about is a bit technical at times so that you can follow Yeah So yeah, I work as a database engineer at Solando It's Europe's I would say a leading online fashion e-commerce platform
00:43
We are in 25 markets in Europe not yet in Kosovo But maybe one day it will come so we already reach like Croatia Slovenia and Romania so and At Solando. I'm part of the database team the postgres as a service team and
01:01
we manage around like 3,000 database clusters on Top of kubernetes so two years ago at phosphor Yeah I've also given the talk about like how we do this and today this will mostly focus about upgrading postgres because as you can imagine with like 3,000 separate database clusters It would be a nightmare as a database administrator if you would not have like great optimization in place
01:27
So my team came up with a tool called patroni. I don't know if you've heard about patroni It's an open-source solution for postgres to provide you with high availability and it's yeah becoming the now the Sort of like de facto standard in the industry
01:41
so if you want to go high available high available with postgres, you will probably do this with patroni and We also came up with this postgres operator tool, which is like a small controller in kubernetes Which allows you to provide like databases a service tooling for multiple teams?
02:01
So for example At salando if you're an engineer, and you want to create a new database you don't have to talk to me You just write a short manifest file submit this to kubernetes And then everything spins up automatically for you in one a minute you have the database Backups and everything and you can change it later on
02:21
You can also remove it on your own and you can also upgrade it on your own by just changing the major version and Yeah, this talk will know Tell you why you should do this every once in a while. I would say maybe at least once a year and So I was thinking about doing a little short question game
02:40
But I don't want to dispose anybody here if you're still using postgres 10, but just be alarmed I was unfortunately this is not like it's a PDF so the siren is not spinning but Luckily we have some wind so it's it's a bit dramatic at least so Yeah, postgres 10 is end of life since last november and postgres 11
03:01
If you're still using it will be end of life this november, and what does it mean is it's yeah? It's open source software But it doesn't get any more updates the developers are concentrating on the new releases and creating and developing the the future releases And so yeah this so this is not a supported anymore and in november postgres 11 will also not be supported
03:23
so If you are using post js, so then I hope at least you will be on postgres 12 So I think the stretched text so looks okay I will quickly go over the last releases over the last years so that Convinced to convince you more to do an upgrade so I hope that you are at least using
03:42
postgres 12 because this was the version that Made it possible for post js to use parallel queries out of the box so when you install post js And you write some queries they can be parallelized by the planner of post SQL with postgres 12 also the deserialization of geometries got a lot more efficient with postgres 12 because there was an API to allow
04:05
to do a quick about a bounding box check before doing the serialization from the disk and What another big feature was reindex concurrently you will see on the next slides where this is an important feature also Down here you see these nice little signs here of course
04:23
With every new release there can also be some breaking changes and before you just run blindly into an upgrade you should first check if there are some breaking changes if they affect your code or your application code and You have to first find some detours for this
04:41
So that's also something you have to make sure every time you do an upgrade Usually there are not many breaking changes But it can be like there's this one feature that you have used before and now they dropped it in postgres So you have to be aware Okay, so postgres 13. This was the big release for the b-tree indexes So of course we are here at the special conference
05:01
We have a lot of spatial data in our postgres database But I would still think that you still have more b-tree indexes to do search on like a numeric and text columns Yeah, b-tree indexes got smaller and faster with this release So that means after the upgrade you have to reindex all your b-trees or like all your database would also be good
05:22
So this is also something that you would have to do sometimes with the spatial indexes That you have just have to reindex all your spatial indexes because the postgres developers thought about okay now we have came up with a new plan for the gist index for example a new storage method and Yeah, so that's why reindex concurrently from postgres 12. It was such an important addition
05:45
So of course with every postgres release. I mean they always tell you like it's faster now. It's Improved in so many ways there are usually some core features which got like here parallel support in this case it was vacuum for indexes and In case you're using multi column indexes
06:03
OSM people looking at you, maybe This also now has incremental sort support in postgres 13, okay coming to 14 14 was quite important for us because we have like in production. We automatically always have a high available setup
06:22
with one primary instance and one replica at least one replica and In the past it could happen that if you overload your primary with a lot of rights that your replicas start lagging behind and behind more and more and With this release now that the recovery speed was greatly improved
06:42
So this was not more likely to happen in the future That's an area where your primary gets overloaded and your replicas cannot follow if you are a fan of using materialized views 14 also had a good improvement here that you can refresh them in parallel and also
07:00
The postgres people in created a new Method for just indexes to build them faster So and of course postgres developers jumped straight at it and said okay now Let's that because it's usually a big pain point for postgres that Building spatial indexes takes so long and with this new method it became like I don't know
07:23
eight times faster or something But unfortunately the query speed was worse worse worse after that so building was faster Query speed was slower unfortunately So that was this was fixed with postgres 15
07:40
So if you're now install postgres on postgres 15 Just the Nexus will be built. I will be able to build faster Not as fast as with 14, but yeah, they had to fix some things that the query speed remains the same Yeah, there was also some improvements for window functions if you're
08:02
Looking at the new features that will come in the next post js version it has some very cool neat features for coverages Which are window functions, so I hope this will also benefit from this improvement for sorting speed of window functions and Yeah, and if you have backups there's also now a server-side compression for backups which makes this faster
08:25
Yeah, and I think I should also note this one here So I know that when you're new to postgres and post js you happily Start your server and start creating stuff in the public schema and postgres
08:41
This will not be possible any longer in postgres 15 So there is now like they have removed this privilege from the public role So you cannot go there and just create stuff in in the public schema you have to rent the privilege to do so So be aware if you have some migration scripts that are relying on this functionality that it was possible
09:04
Okay, so a quick glimpse into the into the future Postgres 16 will probably released this autumn so every autumn. There is usually the postgres release So one of the key features that I picked out was like some great improvements for logical replication
09:23
But I don't know if you are using logical replication probably not and But the the second item here that the copy command got around like 300% faster And I think this is like a great news for everybody who is like usually doing bulk imports of spatial data into postgres
09:42
But yeah, my favorite feature if you ask me is The first one if you have like your SQL where your you have a subquery in your from block You don't need an alias anymore. I Really love it, but I don't know about you, but it's my favorite feature
10:00
okay, so This was of course just I was only scratching the surface So I was picking out stuff that I thought you find interesting for the cup For the last releases. There's a great feature matrix on the postgres home page where you can yeah It's if you scroll down there are loads of features that they list here and then they show you like which words
10:22
Which version supports which feature and I think this if you scroll and look at this It makes it even more convincing for you to run an upgrade So there's here also the post JS support matrix to see is
10:41
Like the new postgres version supporting which postgres version and you see here like the news postures will not support postgres 11 So if you're still on 11 want to use the new functions and postures think about upgrading you see it's a sliding window, so it's Yeah, probably the next post.js will not support 12 anymore
11:04
Okay, and now to the question. How have we automized the Postgres upgrades how to do this? What are this the single steps that you can you have to do? They are basically three ways to run to do a postgres upgrade
11:21
So one is the hard upgrade the soft upgrade and you can also do this with logical replication So hard upgrade means you dump all your data into a file, and then you restore it in a separate server Like if you think about like database downtime server downtime that might sound a bit stupid
11:42
But if downtime isn't an issue then it's still a viable approach I mean I did major version upgrades ten years ago like this. I just Did not want to fiddle around with like creating siblings and running into arrows with the soft upgrade I just dumped everything with dump all and then restored it into a new server
12:00
So you can still do this and whenever maybe the internal storage format of post.js will change Maybe then that this is the only way you could do this so between post quiz post.js One and two you had to do this because they changed the internal format They did it to between two and three, but it was still possible to do the soft upgrade so fortunately
12:24
So in this talk I will focus on the soft update soft upgrade Because there you just install everything like on one server And then you have just one tool to upgrade everything upgrade all the files And it usually runs in like a few seconds Maybe one or two minutes, and then everything is upgraded
12:44
So that's of course in terms of downtime way better than doing the hard upgrade Yeah, replication is a probably provides to the least downtime, but it also has some constraints When it's yeah when it comes to the setup of logical application like sequences
13:00
I think are not supported you have to copy them manually But maybe this will also become more convenient in the future Okay, so for the soft update The quick summary looks like this so you install all the like the new postgres version the new post.js version and
13:21
Then you run this PG upgrade command with the link option you can specify here the old data directory your new data directory old bin directory a new binary directory and When this is done in your database you run this command alt extension post.js
13:41
you can use select post.js extensions upgrade and You have to do this now for then for every database where you have post.js installed So it's not like when you run PG upgrade automatically you are using the new post.js So it's still an extra step you have to do for your extensions. Not only post.js also other extensions
14:02
So for the installation part you install the new postgres version The new post.js which is built against that postgres version and then also this scripts package at least for the Linux distribution and Then your setup is fine
14:20
If you're still on post.js 2 I don't know is there anybody still using post.js 2 here in this audience No, okay, then we can skip that part because there you had to do some tricks with like siblings and so on But fortunately this is not Necessary anymore because now the new post.js 3 is only having the major version in the library
14:41
Names, so not the minor version. Yeah, previously it was always usually a bit tricky to do these upgrades and Yeah, and if you're using sfseagull and you're still on Versions 3 or older you also had to like drop that extension because the library was separated from the main post.js
15:00
Okay, so When you run PG upgrade with this link option why it's so great because it's Creating only hard links between your PG data directories There is no copying of data So if you not would not use link then all the data would be copied from one data directory to another and this is
15:21
What of course require that you have the same amount of free disk space that your database is big So if you're using link only links are used and also one thing that is really great. Is that the downtime? With what cures during the upgrade? Does not depend on the database size
15:41
So if your database cluster is one gigabyte big or one terabyte big the downtime would can be the same It only depends on the number of objects in your database not like the data but tables types Functions and whatsoever So if your database has 1 million schemas then the upgrade will take longer
16:02
But it doesn't depending on the data size Okay, I have to speak a bit louder. So What do you have to do when you want to automize this PG upgrade? Process so here that like I'm listing now all the steps that our script is doing
16:23
So first, of course, you could only run the upgrade If your cluster is healthy if your replicas you have to check if they are streaming and not lagging behind Are there not any bulk imports running at the time you want to do the upgrade? I mean this sounds obvious but still and
16:41
Yeah, then on the same server you create a new PG data directory within a DB make sure that you are using the current configuration like don't just install the new postgres with Default shared buffers and workmen and so ever so you you could just copy these values from your current Configuration, of course, you still have to use some like some paths that are pointing to the new directory data directory
17:04
you cannot use the same pass there and You also have to check for incompatible extensions So don't just like when postgres is released. Yeah, I can go do the upgrade you have to of course make sure that the new post just is already there to support this new postgres version or maybe if you
17:23
Have another extension like timescale. Sometimes you have to wait a bit longer until this extension is also supporting the new postgres version and The last point here is you can also run PG upgrade check. This is just a dry run It will only tell you okay upgrade is possible or not
17:42
That's it so let's you can run this and it will not change anything it would just tell you if upgrade would be possible or not Okay, then for the upgrade There could still be you could you should check in your code if you have some functions that are using maybe Stuff and postgres which has changed
18:00
So then if you have if you put this all into a script You should remove this stuff and then add this after the upgrade and Yeah, when you go into the upgrade you have to of course stop your postgres servers you call a PG upgrade link And if this one if this fails you have of course To undo all this stuff that you have maybe done in point one on the previous page
18:25
If you have dropped already some things you have to re-add them after a failed PG upgrade link command if PG upgrade link succeeds Then it's the point of no return so
18:40
if you then have upgraded and you realize oh shit the performance is pretty bad and Things fall apart, and I want to go back to the previous major version. It's not possible anymore with your server It's only possible if we restore from a backup So if you want to be really secure you should have done a base backup before you run the upgrade and then you can still
19:00
Restore from this backup, and then you're back on your old major version But you there is no functionality in postgres that say PG downgrade or something. It's not there Okay, so when the PG upgrade is finished you can rename your data directory because you remember there are two data directories and this on your server
19:20
You have to adjust the postgres config so that all the paths are correct. I'm pointing to the right libraries and In case you have replicas you can also upgrade your replica servers just by using our sync Maybe there are different ways, but we found our thing to be the most convenient for us to upgrade replicas And it's also like a matter of seconds
19:43
Because it's just the files that have to be updated and Yeah, last but not least I mentioned all the stuff that you might have to drop because of some breaking changes this you have to read in your script Okay, and last but not least the cleanup so everything is done upgraded your cluster is upgraded yay
20:06
You should run you should update the table statistics, so there's this vacuum DB tool that you can run and You can remove your old PG data directory and initialize a new base backup, so this was no very very short and
20:21
You can find here behind these links or if you just search in the internet, and you will also find greatly detailed Descriptions of how to upgrade postgres also like a huge fleet of postgres clusters okay, so How is the user experience at Solando? How do teams upgrade their postgres clusters so as I said they only have to
20:47
update one number in Manifest file so here you can see there is this this is a this is what our teams have to do when they want to Create a new database cluster just a few lines and then here in the version field they just changed 14 to 15 and then
21:04
Yeah, we have this Kubernetes operator, which then rotates all the pots and kicks in this Upgrade script and upgrades the cluster so we can also configure this globally we could say Now all the databases that are still on
21:20
12 or older should be upgraded to version 15 for example That's something we could also do we're not doing this in production, but something we could configure with our tool Okay, and the last slide so the conclusion We found so you know before we had this automatic major version upgrade we had to do this manually
21:44
we had to agree with the teams for matching time slot to do the upgrade and We had to like stopping the service running the upgrade skip. This is was all manual stuff that we had to do and This led to like teams delaying the major version upgrade sometimes when you know when teams approach us and
22:03
Thinking like oh my database performance is really bad And we are looking like oh you're still using postgres 9 6 or 10 Then we usually recommend them to upgrade, but that's if it's not very optimized It's a topic that teams like to delay in their backlog, so Yeah, and but after this optimization now. We see like teams are really adopting it more embracing it more and upgrading now
22:27
More regularly more frequently like once a year they run the upgrade because they want to use the new features and Of course When there is a new postgres version, it's not that like my team makes it available for everybody
22:44
Immediately we usually wait one or two minor releases before we say to the teams, okay? You cannot go with this in production But yeah when there is a new release we usually Into an our test environment. It's already there immediately, but yeah in production
23:02
We usually wait a few minor releases, and this is also something that if you have maybe more critical databases also something that you should do Because still I mean postgres is pretty robust But it can be that maybe yeah, there are some bugs and some performance degradations that are found and then yeah
23:21
You don't want to run into this, but you will you could run into this if you test so Everything that I have explained here You should always test this so when you run when you upgrade your your database cluster first in a test environment Make this one run with PG upgrade see if everything is compatible every the performance is good after that
23:44
And then you can apply this for your production database