We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Upgrade your Postgres and PostGIS will thank you

00:00

Formal Metadata

Title
Upgrade your Postgres and PostGIS will thank you
Title of Series
Number of Parts
266
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Every year, there's a new Postgres major release that improves on performance in certain areas and could provide new hooks for extensions like PostGIS to take advantage from them. If not planned well, upgrading your production databases can become a pain. Sooner than you think you'll be running on EOL (End-of-Life) versions because the upgrade has been postponed too many times. Don't! Did you know Postgres upgrades can be greatly automatized these days with downtimes of only a few seconds? This talk will show you how and will also present some essential features from recent Postgres and PostGIS versions to get you excited for the new upgrade.
DatabaseMultiplication signBitLecture/Conference
Computing platformMereologyService (economics)DatabaseLecture/Conference
DatabaseSystem administratorMathematical optimizationGene clusterFlow separation
Standard deviationService (economics)Open sourceMultiplicationGame controllerDatabaseHigh availabilityOperator (mathematics)Computer fileLecture/Conference
DatabaseBackupRevision controlComputer animationLecture/Conference
Game theoryProbability density functionVideo gameSoftwareSoftware developerSinc functionOpen sourceLecture/Conference
Price indexFunction (mathematics)GeometryType theoryComputer configurationJust-in-Time-CompilerDefault (computer science)Query languageMathematical optimizationCodeDatabaseCartesian coordinate systemMathematicsMultiplication signCuboidRevision controlSerial portQuery languageControl flowSubject indexingMiniDiscSign (mathematics)GeometrySlide ruleLecture/Conference
Price indexSubject indexingVacuumMultiplicationSubject indexingDatabaseNetwork topologySoftware developerLecture/Conference
MultiplicationSubject indexingPrice indexVacuumSubject indexingSoftware developerPlanningData storage deviceAdditionProduct (business)Hand fanCore dumpView (database)Right angleData recoveryHigh availabilityVacuumCASE <Informatik>Parallel portInstance (computer science)Information overloadPoint (geometry)Multiplication signMultiplicationBuildingAreaQuicksortLecture/Conference
Data compressionServer (computing)BackupFunction (mathematics)Query languageBuildingWindow functionData compressionWindowAliasingScripting languageBackupSurfaceRevision controlFunctional (mathematics)Block (periodic table)Replication (computing)LogicMatrix (mathematics)Structural loadHuman migrationInstallation artWordBookmark (World Wide Web)Home pageLecture/ConferenceComputer animation
Matrix (mathematics)Revision controlFunctional (mathematics)Lecture/ConferenceComputer animation
WindowFunctional (mathematics)Lecture/ConferenceComputer animation
Data storage deviceFile formatReplication (computing)Server (computing)Computer fileLecture/Conference
File formatData storage deviceCore dumpDatabaseBitServer (computing)Revision controlError messageCore dumpData storage deviceArrow of timeFile formatLecture/Conference
File formatData storage deviceFile formatData storage deviceLecture/Conference
Core dumpLogicReplication (computing)Data storage deviceFile formatComputer file2 (number)Server (computing)Computer animationLecture/Conference
Data storage deviceFile formatCore dumpTerm (mathematics)Constraint (mathematics)Replication (computing)LogicCartesian coordinate systemSequenceLecture/Conference
Field extensionRevision controlLink (knot theory)Addressing modeRevision controlDirectory serviceLink (knot theory)Computer configurationLecture/Conference
Field extensionLink (knot theory)Revision controlDirectory serviceBinary fileField extensionDatabaseExtension (kinesiology)Installation artLecture/Conference
Revision controlField extensionLink (knot theory)Installation artDatabaseExtension (kinesiology)Revision controlMereologyLecture/ConferenceComputer animation
Scripting languageInstallation artDistribution (mathematics)MereologyRevision controlScripting languageLecture/Conference
Scripting languageInstallation artLibrary (computing)Link (knot theory)Revision controlBitField extensionDrop (liquid)Lecture/Conference
Table (information)Object (grammar)Numbering schemeLink (knot theory)Directory serviceLink (knot theory)Computer configurationDirectory serviceMiniDiscSpacetimeFreewareDatabaseLecture/Conference
Table (information)Object (grammar)Numbering schemeDirectory serviceLink (knot theory)Type theoryExtension (kinesiology)SpacetimeMiniDiscFunctional (mathematics)DatabaseTable (information)Type theoryFreewareNumbering schemeNumberLink (knot theory)Object (grammar)Lecture/Conference
Extension (kinesiology)Type theoryProcess (computing)Electronic mailing listScripting languageDefault (computer science)Server (computing)Ocean currentConfiguration spaceMultiplication signDirectory serviceBuffer solutionLecture/Conference
Type theoryExtension (kinesiology)Directory serviceExtension (kinesiology)Message passingRevision controlScaling (geometry)Point (geometry)Field extensionMultiplication signLecture/Conference
Link (knot theory)Server (computing)Object (grammar)Scripting languageServer (computing)Web pageFunctional (mathematics)Link (knot theory)Point (geometry)CodeLecture/Conference
Hill differential equationLink (knot theory)Server (computing)Object (grammar)Revision controlServer (computing)Functional (mathematics)BackupLecture/Conference
BackupObject (grammar)Directory serviceSynchronizationLibrary (computing)Right angleComputer fileDirectory serviceControl flowServer (computing)2 (number)MathematicsScripting languageCASE <Informatik>Lecture/Conference
Link (knot theory)StatisticsTable (information)VacuumDirectory serviceStatisticsLecture/Conference
Link (knot theory)BackupBackupDescriptive statisticsGene clusterInternetworkingComputer animationLecture/Conference
Revision controlMaxima and minimaScripting languageOperator (mathematics)MetadataACIDVolumeDefault (computer science)NamespaceGene clusterComputer fileNumberLecture/Conference
MetadataACIDDefault (computer science)NamespaceVolumeRevision controlMaxima and minimaScripting languageOperator (mathematics)Line (geometry)DatabaseField (computer science)Revision controlOperator (mathematics)Scripting languageProduct (business)Lecture/Conference
Software testingRevision controlService (economics)Slide ruleDatabaseMatching (graph theory)Multiplication signLecture/Conference
Software testingDatabaseSoftware testingSoftware bugMathematical optimizationProduct (business)Integrated development environmentProcess (computing)Revision controlLecture/Conference
Software testingSoftware bugProduct (business)Integrated development environmentDatabaseSoftware testingLecture/Conference
Transcript: English(auto-generated)
Hi everybody I was going Yeah, I'm here to talk about upgrading your postgres and Not only postgres will thank you But also post JS so I hope that all of you here are supposed just users so that
The stuff that I'm talking about is a bit technical at times so that you can follow Yeah So yeah, I work as a database engineer at Solando It's Europe's I would say a leading online fashion e-commerce platform
We are in 25 markets in Europe not yet in Kosovo But maybe one day it will come so we already reach like Croatia Slovenia and Romania so and At Solando. I'm part of the database team the postgres as a service team and
we manage around like 3,000 database clusters on Top of kubernetes so two years ago at phosphor Yeah I've also given the talk about like how we do this and today this will mostly focus about upgrading postgres because as you can imagine with like 3,000 separate database clusters It would be a nightmare as a database administrator if you would not have like great optimization in place
So my team came up with a tool called patroni. I don't know if you've heard about patroni It's an open-source solution for postgres to provide you with high availability and it's yeah becoming the now the Sort of like de facto standard in the industry
so if you want to go high available high available with postgres, you will probably do this with patroni and We also came up with this postgres operator tool, which is like a small controller in kubernetes Which allows you to provide like databases a service tooling for multiple teams?
So for example At salando if you're an engineer, and you want to create a new database you don't have to talk to me You just write a short manifest file submit this to kubernetes And then everything spins up automatically for you in one a minute you have the database Backups and everything and you can change it later on
You can also remove it on your own and you can also upgrade it on your own by just changing the major version and Yeah, this talk will know Tell you why you should do this every once in a while. I would say maybe at least once a year and So I was thinking about doing a little short question game
But I don't want to dispose anybody here if you're still using postgres 10, but just be alarmed I was unfortunately this is not like it's a PDF so the siren is not spinning but Luckily we have some wind so it's it's a bit dramatic at least so Yeah, postgres 10 is end of life since last november and postgres 11
If you're still using it will be end of life this november, and what does it mean is it's yeah? It's open source software But it doesn't get any more updates the developers are concentrating on the new releases and creating and developing the the future releases And so yeah this so this is not a supported anymore and in november postgres 11 will also not be supported
so If you are using post js, so then I hope at least you will be on postgres 12 So I think the stretched text so looks okay I will quickly go over the last releases over the last years so that Convinced to convince you more to do an upgrade so I hope that you are at least using
postgres 12 because this was the version that Made it possible for post js to use parallel queries out of the box so when you install post js And you write some queries they can be parallelized by the planner of post SQL with postgres 12 also the deserialization of geometries got a lot more efficient with postgres 12 because there was an API to allow
to do a quick about a bounding box check before doing the serialization from the disk and What another big feature was reindex concurrently you will see on the next slides where this is an important feature also Down here you see these nice little signs here of course
With every new release there can also be some breaking changes and before you just run blindly into an upgrade you should first check if there are some breaking changes if they affect your code or your application code and You have to first find some detours for this
So that's also something you have to make sure every time you do an upgrade Usually there are not many breaking changes But it can be like there's this one feature that you have used before and now they dropped it in postgres So you have to be aware Okay, so postgres 13. This was the big release for the b-tree indexes So of course we are here at the special conference
We have a lot of spatial data in our postgres database But I would still think that you still have more b-tree indexes to do search on like a numeric and text columns Yeah, b-tree indexes got smaller and faster with this release So that means after the upgrade you have to reindex all your b-trees or like all your database would also be good
So this is also something that you would have to do sometimes with the spatial indexes That you have just have to reindex all your spatial indexes because the postgres developers thought about okay now we have came up with a new plan for the gist index for example a new storage method and Yeah, so that's why reindex concurrently from postgres 12. It was such an important addition
So of course with every postgres release. I mean they always tell you like it's faster now. It's Improved in so many ways there are usually some core features which got like here parallel support in this case it was vacuum for indexes and In case you're using multi column indexes
OSM people looking at you, maybe This also now has incremental sort support in postgres 13, okay coming to 14 14 was quite important for us because we have like in production. We automatically always have a high available setup
with one primary instance and one replica at least one replica and In the past it could happen that if you overload your primary with a lot of rights that your replicas start lagging behind and behind more and more and With this release now that the recovery speed was greatly improved
So this was not more likely to happen in the future That's an area where your primary gets overloaded and your replicas cannot follow if you are a fan of using materialized views 14 also had a good improvement here that you can refresh them in parallel and also
The postgres people in created a new Method for just indexes to build them faster So and of course postgres developers jumped straight at it and said okay now Let's that because it's usually a big pain point for postgres that Building spatial indexes takes so long and with this new method it became like I don't know
eight times faster or something But unfortunately the query speed was worse worse worse after that so building was faster Query speed was slower unfortunately So that was this was fixed with postgres 15
So if you're now install postgres on postgres 15 Just the Nexus will be built. I will be able to build faster Not as fast as with 14, but yeah, they had to fix some things that the query speed remains the same Yeah, there was also some improvements for window functions if you're
Looking at the new features that will come in the next post js version it has some very cool neat features for coverages Which are window functions, so I hope this will also benefit from this improvement for sorting speed of window functions and Yeah, and if you have backups there's also now a server-side compression for backups which makes this faster
Yeah, and I think I should also note this one here So I know that when you're new to postgres and post js you happily Start your server and start creating stuff in the public schema and postgres
This will not be possible any longer in postgres 15 So there is now like they have removed this privilege from the public role So you cannot go there and just create stuff in in the public schema you have to rent the privilege to do so So be aware if you have some migration scripts that are relying on this functionality that it was possible
Okay, so a quick glimpse into the into the future Postgres 16 will probably released this autumn so every autumn. There is usually the postgres release So one of the key features that I picked out was like some great improvements for logical replication
But I don't know if you are using logical replication probably not and But the the second item here that the copy command got around like 300% faster And I think this is like a great news for everybody who is like usually doing bulk imports of spatial data into postgres
But yeah, my favorite feature if you ask me is The first one if you have like your SQL where your you have a subquery in your from block You don't need an alias anymore. I Really love it, but I don't know about you, but it's my favorite feature
okay, so This was of course just I was only scratching the surface So I was picking out stuff that I thought you find interesting for the cup For the last releases. There's a great feature matrix on the postgres home page where you can yeah It's if you scroll down there are loads of features that they list here and then they show you like which words
Which version supports which feature and I think this if you scroll and look at this It makes it even more convincing for you to run an upgrade So there's here also the post JS support matrix to see is
Like the new postgres version supporting which postgres version and you see here like the news postures will not support postgres 11 So if you're still on 11 want to use the new functions and postures think about upgrading you see it's a sliding window, so it's Yeah, probably the next post.js will not support 12 anymore
Okay, and now to the question. How have we automized the Postgres upgrades how to do this? What are this the single steps that you can you have to do? They are basically three ways to run to do a postgres upgrade
So one is the hard upgrade the soft upgrade and you can also do this with logical replication So hard upgrade means you dump all your data into a file, and then you restore it in a separate server Like if you think about like database downtime server downtime that might sound a bit stupid
But if downtime isn't an issue then it's still a viable approach I mean I did major version upgrades ten years ago like this. I just Did not want to fiddle around with like creating siblings and running into arrows with the soft upgrade I just dumped everything with dump all and then restored it into a new server
So you can still do this and whenever maybe the internal storage format of post.js will change Maybe then that this is the only way you could do this so between post quiz post.js One and two you had to do this because they changed the internal format They did it to between two and three, but it was still possible to do the soft upgrade so fortunately
So in this talk I will focus on the soft update soft upgrade Because there you just install everything like on one server And then you have just one tool to upgrade everything upgrade all the files And it usually runs in like a few seconds Maybe one or two minutes, and then everything is upgraded
So that's of course in terms of downtime way better than doing the hard upgrade Yeah, replication is a probably provides to the least downtime, but it also has some constraints When it's yeah when it comes to the setup of logical application like sequences
I think are not supported you have to copy them manually But maybe this will also become more convenient in the future Okay, so for the soft update The quick summary looks like this so you install all the like the new postgres version the new post.js version and
Then you run this PG upgrade command with the link option you can specify here the old data directory your new data directory old bin directory a new binary directory and When this is done in your database you run this command alt extension post.js
you can use select post.js extensions upgrade and You have to do this now for then for every database where you have post.js installed So it's not like when you run PG upgrade automatically you are using the new post.js So it's still an extra step you have to do for your extensions. Not only post.js also other extensions
So for the installation part you install the new postgres version The new post.js which is built against that postgres version and then also this scripts package at least for the Linux distribution and Then your setup is fine
If you're still on post.js 2 I don't know is there anybody still using post.js 2 here in this audience No, okay, then we can skip that part because there you had to do some tricks with like siblings and so on But fortunately this is not Necessary anymore because now the new post.js 3 is only having the major version in the library
Names, so not the minor version. Yeah, previously it was always usually a bit tricky to do these upgrades and Yeah, and if you're using sfseagull and you're still on Versions 3 or older you also had to like drop that extension because the library was separated from the main post.js
Okay, so When you run PG upgrade with this link option why it's so great because it's Creating only hard links between your PG data directories There is no copying of data So if you not would not use link then all the data would be copied from one data directory to another and this is
What of course require that you have the same amount of free disk space that your database is big So if you're using link only links are used and also one thing that is really great. Is that the downtime? With what cures during the upgrade? Does not depend on the database size
So if your database cluster is one gigabyte big or one terabyte big the downtime would can be the same It only depends on the number of objects in your database not like the data but tables types Functions and whatsoever So if your database has 1 million schemas then the upgrade will take longer
But it doesn't depending on the data size Okay, I have to speak a bit louder. So What do you have to do when you want to automize this PG upgrade? Process so here that like I'm listing now all the steps that our script is doing
So first, of course, you could only run the upgrade If your cluster is healthy if your replicas you have to check if they are streaming and not lagging behind Are there not any bulk imports running at the time you want to do the upgrade? I mean this sounds obvious but still and
Yeah, then on the same server you create a new PG data directory within a DB make sure that you are using the current configuration like don't just install the new postgres with Default shared buffers and workmen and so ever so you you could just copy these values from your current Configuration, of course, you still have to use some like some paths that are pointing to the new directory data directory
you cannot use the same pass there and You also have to check for incompatible extensions So don't just like when postgres is released. Yeah, I can go do the upgrade you have to of course make sure that the new post just is already there to support this new postgres version or maybe if you
Have another extension like timescale. Sometimes you have to wait a bit longer until this extension is also supporting the new postgres version and The last point here is you can also run PG upgrade check. This is just a dry run It will only tell you okay upgrade is possible or not
That's it so let's you can run this and it will not change anything it would just tell you if upgrade would be possible or not Okay, then for the upgrade There could still be you could you should check in your code if you have some functions that are using maybe Stuff and postgres which has changed
So then if you have if you put this all into a script You should remove this stuff and then add this after the upgrade and Yeah, when you go into the upgrade you have to of course stop your postgres servers you call a PG upgrade link And if this one if this fails you have of course To undo all this stuff that you have maybe done in point one on the previous page
If you have dropped already some things you have to re-add them after a failed PG upgrade link command if PG upgrade link succeeds Then it's the point of no return so
if you then have upgraded and you realize oh shit the performance is pretty bad and Things fall apart, and I want to go back to the previous major version. It's not possible anymore with your server It's only possible if we restore from a backup So if you want to be really secure you should have done a base backup before you run the upgrade and then you can still
Restore from this backup, and then you're back on your old major version But you there is no functionality in postgres that say PG downgrade or something. It's not there Okay, so when the PG upgrade is finished you can rename your data directory because you remember there are two data directories and this on your server
You have to adjust the postgres config so that all the paths are correct. I'm pointing to the right libraries and In case you have replicas you can also upgrade your replica servers just by using our sync Maybe there are different ways, but we found our thing to be the most convenient for us to upgrade replicas And it's also like a matter of seconds
Because it's just the files that have to be updated and Yeah, last but not least I mentioned all the stuff that you might have to drop because of some breaking changes this you have to read in your script Okay, and last but not least the cleanup so everything is done upgraded your cluster is upgraded yay
You should run you should update the table statistics, so there's this vacuum DB tool that you can run and You can remove your old PG data directory and initialize a new base backup, so this was no very very short and
You can find here behind these links or if you just search in the internet, and you will also find greatly detailed Descriptions of how to upgrade postgres also like a huge fleet of postgres clusters okay, so How is the user experience at Solando? How do teams upgrade their postgres clusters so as I said they only have to
update one number in Manifest file so here you can see there is this this is a this is what our teams have to do when they want to Create a new database cluster just a few lines and then here in the version field they just changed 14 to 15 and then
Yeah, we have this Kubernetes operator, which then rotates all the pots and kicks in this Upgrade script and upgrades the cluster so we can also configure this globally we could say Now all the databases that are still on
12 or older should be upgraded to version 15 for example That's something we could also do we're not doing this in production, but something we could configure with our tool Okay, and the last slide so the conclusion We found so you know before we had this automatic major version upgrade we had to do this manually
we had to agree with the teams for matching time slot to do the upgrade and We had to like stopping the service running the upgrade skip. This is was all manual stuff that we had to do and This led to like teams delaying the major version upgrade sometimes when you know when teams approach us and
Thinking like oh my database performance is really bad And we are looking like oh you're still using postgres 9 6 or 10 Then we usually recommend them to upgrade, but that's if it's not very optimized It's a topic that teams like to delay in their backlog, so Yeah, and but after this optimization now. We see like teams are really adopting it more embracing it more and upgrading now
More regularly more frequently like once a year they run the upgrade because they want to use the new features and Of course When there is a new postgres version, it's not that like my team makes it available for everybody
Immediately we usually wait one or two minor releases before we say to the teams, okay? You cannot go with this in production But yeah when there is a new release we usually Into an our test environment. It's already there immediately, but yeah in production
We usually wait a few minor releases, and this is also something that if you have maybe more critical databases also something that you should do Because still I mean postgres is pretty robust But it can be that maybe yeah, there are some bugs and some performance degradations that are found and then yeah
You don't want to run into this, but you will you could run into this if you test so Everything that I have explained here You should always test this so when you run when you upgrade your your database cluster first in a test environment Make this one run with PG upgrade see if everything is compatible every the performance is good after that
And then you can apply this for your production database