Postgres at any scale
Formal Metadata
Title of Series: EuroPython 2018 (talk 31 of 132)
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported
DOI: 10.5446/44974
Transcript: English (auto-generated)
00:06
You'll have to bear with me a little bit. On the flight over, I managed to catch a cold, so powering through a little bit. Between the jet lag and cold, I'll do my best here. Really quick show of hands, does anyone here not use Postgres? A few hands.
00:20
Okay, hopefully I can convince you by the end. The rest of you, you're all my people, so I'll talk to the rest of you. So really quickly, who am I? I work at Citus Data. We turn Postgres into a distributed, horizontally scalable database. Think of it as sharding without all the work. Basically scalable like NoSQL, all the goodness of Postgres.
00:41
I curate Postgres Weekly, so if you've subscribed to Python Weekly or PyCoder's Weekly, Postgres Weekly is a similar newsletter. Really focused on more app dev than DBA, so hopefully you can find something you like in there. I blog a good bit about tech startups, Postgres as well.
01:01
Previously I was at Heroku and launched the Python support there. So Python is definitely my world when it's not Postgres. So today we're gonna walk through different stages of how to deal with data at any size and what things you should be doing and caring about. I don't know how many of you run databases that are like one terabyte, Postgres can
01:23
handle that pretty well. Most people are probably more in the small, medium category. But there are some things you can do at those stages that make life a little bit easier and are worth spending the time investing in. So first, starting a little bit small. But before I dig into that, why Postgres?
01:41
As a coworker said, it's the Emacs of databases. And I think this is a really telling comment. I'm a VI guy, but I think the sentiment makes a lot of sense. Emacs is as much a platform as it is a text editor. Postgres is as much a data platform as it is a relational database.
02:03
You don't have to think about it as being really, really rigid. It has a lot more in the box. If we look at it, there's a lot of really, really rich data types. It has probably more data types than any other database that I know of. Rich geospatial support. It's well regarded as the most advanced open source geospatial database.
02:21
Index types. We'll get to this a little bit later. But Postgres adds a new index type almost every year. These have a deep, kind of academic foundation, but they can have really, really practical use cases. Full text search. JSONB deserves its own kind of call out.
02:41
Postgres got JSON support in 9.2, which is now six years ago. But it was just text validation. JSONB is binary JSON in your database. So think Mongo-like binary JSON documents that you can index, search, and query on, so you don't have to have everything as normalized tables. And then extensions.
03:01
Extensions really are kind of where some of that "it's the Emacs of databases" idea comes in. Now Postgres can get new features without having to wait for a year-long release cycle. So a lot of this stuff, like GIS and full text search, comes in as part of that extension framework. So really low level APIs that let you extend things.
03:23
All right, so when you're small, what do you need to worry about? The first thing I would say is pay attention to data types. I talk to a lot of people that say I want to use the most generic thing possible in case I want to up and move from Postgres to MySQL or Postgres to Oracle or something else. The reality then is you're limiting yourself really,
03:42
really heavily to the worst set of all things that are shared across databases. SQLite is a great example of this, where it lists null as a data type. Not as a value, but as a data type. I'm not even sure what that means in a database sense. But if you look and say, hey, I want MySQL to treat text the same as
04:01
Postgres does: MySQL, if you insert something too long, just truncates it. It doesn't say that's invalid, it just truncates it. I care a little bit more about the quality and sanity of my data. So if I want something to be explicitly, you know, say 140 characters long, like a tweet (well, not so valid today), then I want to use a varchar(255) or varchar(140) there.
04:26
Timestamp versus timestamp with time zone. Dates are hard. There was a great blog post, I think a month ago, that hit Hacker News, by Zach Holman. It was a fun read. If you haven't seen it, go read it. Dates, times are very, very hard.
04:41
Time zones, date math, all that sort of thing. Postgres is really, really good. Timestamp with time zone is probably what you want by default. As data comes in, it can convert the time zone for you. It still stores it all under the covers as the equivalent of UTC. But it does that math for you, which is really nice.
05:00
JSON versus JSONB. Postgres got JSON support in 9.2. In 9.4, we got what I would call real JSON support. In 9.2, it was just text that was validated as JSON as it came in. But it was stored as text under the covers. JSONB is a binary representation of it, which is really fast. You can index it, all of that.
05:21
If you're using something like a logging service or recording just inputs from an API and you want to preserve white space, use JSON. There's a valid case for it still to exist. That's about the only case. So the first thing is, leverage your data types. If you have it in your application code as a certain data type, look at using the same in your database.
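A minimal sketch of what that looks like in practice; the table and column names here are made up for illustration:

    CREATE TABLE posts (
        id         bigserial PRIMARY KEY,
        body       varchar(140),              -- explicit length rather than a generic text or varchar(255)
        author_ip  inet,                       -- a real IP address type, not a string
        payload    jsonb,                      -- binary JSON: indexable and fast to query
        raw_input  json,                       -- plain json only if you must preserve whitespace/key order
        created_at timestamptz NOT NULL DEFAULT now()   -- timestamp with time zone, stored as UTC under the covers
    );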
05:43
All right, constraints. This is a thing I find most people don't intentionally put in, and just assume the application is going to take care of it. The reality is, without database constraints, you can definitely have invalid data within your database. Things like limiting the length of strings if you really want to.
06:03
You should have that same constraint within your database. Postgres has an IP address data type: is it actually a valid IP address or not? Foreign keys are great, so that if you delete a user, are you deleting all that user's data? That's more of a concern now with GDPR and those sorts of things.
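A sketch of those kinds of constraints, on hypothetical tables, all standard Postgres syntax:

    CREATE TABLE users (
        id    bigserial PRIMARY KEY,
        email text NOT NULL UNIQUE CHECK (length(email) <= 255),
        ip    inet                                    -- invalid IP addresses are rejected at insert time
    );

    CREATE TABLE orders (
        id      bigserial PRIMARY KEY,
        user_id bigint NOT NULL REFERENCES users (id) ON DELETE CASCADE,  -- deleting a user cascades to their orders
        total   numeric CHECK (total >= 0)
    );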
06:21
So making sure you actually have foreign keys, so that as you delete something the cascades happen, is really, really important. This is a lot easier to do when you first start. When you first start, you're probably thinking, I just want to build something fast. Putting this in place early makes all the difference. Coming back in and adding that years down the line, it's a lot of work, and you realize
06:40
you've actually got to go clean up data at that point. This is one that's really common in the Rails community. I don't see it as much in the Python community, but I think it's a really good pattern. In Rails, they don't actually delete data, which I know is probably a concern now for privacy and other things, but there's basically a flag on every model
07:03
they create, a deleted_at column that defaults to null, and they just set that value, and then they filter on it constantly. So where deleted_at is not null, you don't return that row in the query. This is really, really nice. If you've ever accidentally run a query, which we never have, delete from users,
07:21
and then you forget the where clause, and everyone remembers immediately after. It wasn't even a full second. As soon as you hit enter, you're trying to cancel it, and what just happened? It's still a pain to clean up with an update ... set deleted_at, but this is far, far, far better to clean up
07:42
than where you actually went and deleted it and have to do a point in time recovery against your database. So soft deletes are a pattern I really, really like and would like to see more of. I'd encourage you to consider if they can work for you. So the other thing is I would say spend some time mastering your tools. How many here have like a Bash RC or like a Vim RC set up?
08:04
Okay, most hands. How many have a .psqlrc set up? That's actually more than I expected, maybe 10 to 15 or so. psql is a really, really great Postgres editor. It's the CLI tool that ships with Postgres. Out of the box, it's okay,
08:21
but what you can do is customize it. It has a lot of built-in richness, so you can add things like tab completion. \x auto is an awesome feature that, based on the width of your screen and the width of the query output, will auto-format the output of the query so that it either word wraps or puts an entire record line by line.
08:42
One thing I like to do, if you search for psql on my blog: I like to save the history of every command I run against a database. So for every database, based on the name, I have a history file. If you've ever had someone come to you and say, hey, can you run this report for me? And you hop into the database, you run a query, send them the report.
09:02
Great, you're done. Three months later, they come back and say, hey, do you remember that report you ran for me? Can I get an updated copy of it? I'm like, I have no idea what you asked for, what I ran. I go recreate it, it's probably not the exact same thing. By saving the history of every query I run, it's great to come back through in some form of ad hoc analysis.
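A .psqlrc along those lines (per-database history plus the \x auto and \timing settings mentioned here) might look like this; these are psql meta-commands, not SQL:

    -- ~/.psqlrc
    -- one history file per database, so past queries are easy to find again
    \set HISTFILE ~/.psql_history- :DBNAME
    \set HISTSIZE 20000
    -- expanded output automatically when rows are wider than the screen
    \x auto
    -- show how long every query took
    \timing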
09:21
And I don't have to do anything extra beyond just setting that up in my psql history file. \timing is a nice setting that shows, for every query you run, how long it ran. All right, so backups, who here has backups on their database?
09:40
Who here tested them within the last month? All right, so backups do not exist unless you test them. Setting up backups from the start is good. If you haven't tested them, like I know you think they work. I don't know if people saw kind of the GitLab incident from maybe it was a year ago now, and they tried not just five different backups,
10:01
like five different methods of backups, and failure, failure, failure. Start a process of testing your backups. It doesn't have to be daily, weekly, monthly, it's probably a good interval. I would go and do that now. Try to restore your latest backup and let me know if it failed or succeeded.
10:20
If you're not testing them, there's no point in even putting them in place because they're probably not gonna work. All right, so early on I would say invest in those things. That's probably pre-ten gigs of data, right? A lot of small projects, a lot of things that aren't gonna have a lot of users or a lot of data. In terms of like medium-sized data,
10:40
now you need to start to learn a little bit more about Postgres. You don't have to become a DBA or an expert, but a few kind of key things will go a long way. Here we're looking at something probably like 10 gigs of data up to 100 gigs or so. A few basic things you're gonna wanna do. Set up logging, make sure you have log rotation. The number of times I've seen where people set up logs
11:01
and then don't set up log rotation, run out of the space on their database because of just logs, it happens. Make sure you do these sort of things. Configure your memory. Postgres isn't great by default at its memory settings, but it's not that hard to configure. Tweak vacuum, adjust your checkpoints. But the biggest thing here is don't try
11:22
to be an expert in all of those settings. There's a number of tools like PG Tune where you can go and put in some data about your database. Say, hey, I have this much data, this size instance. What should all of these settings be? This is one area where I say don't try to learn too much. There's like 200 different settings
11:40
and configs within Postgres. Just set them once and then you're pretty much good. And there's a few talks out there on this as well. A great one from PyCon, I think two years ago, was "PostgreSQL when it's not your day job." Christophe Pettus walks through a bunch of: set this config, don't worry about what it means; set this config if you're on AWS, don't worry about what it means, to get you a good solid setup.
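Roughly what that ends up looking like; the values below are purely illustrative, the sort of thing a tool like PG Tune suggests for a small instance, not recommendations:

    -- illustrative settings for a small (~4 GB RAM) instance; use PG Tune for real numbers
    ALTER SYSTEM SET shared_buffers = '1GB';               -- needs a restart to take effect
    ALTER SYSTEM SET effective_cache_size = '3GB';
    ALTER SYSTEM SET work_mem = '16MB';
    ALTER SYSTEM SET maintenance_work_mem = '256MB';
    ALTER SYSTEM SET checkpoint_completion_target = 0.9;
    SELECT pg_reload_conf();                               -- reloads the settings that don't need a restart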
12:03
All right, so the number one thing I pay attention to when I first look at a database is the cache hit ratio. Is anyone else excited that DuckTales is coming back? I'm way too excited for this with my kids. Anyways. All right, so running this simple, simple query
12:23
against your database will tell you the cache hit ratio. Postgres keeps a lot of data about the queries it runs, such as which things were served from in memory, which things went to disk. As we know, going to disk, much, much slower than in memory. You've gotta look at 100 to one,
12:41
1,000 to one difference in kind of performance time here. This super simple query is gonna give you back something that looks like this. And here, what I'm gonna look for in most applications, and this is for a transactional web app. If you're in data warehousing, it's completely different. But most web applications that I come across,
13:01
you wanna see a cache hit ratio of 99%. If this is too low, the easiest things to do to improve performance is go get a bigger instance. Upgrade your database, add more memory to the box, things will be happier. When you go from 99% to something like 98, 95%, performance doesn't go from one millisecond to two,
13:22
it goes from like one millisecond to like 100 milliseconds. It falls off a cliff really, really quickly. So, querying this on a weekly, monthly basis is a good thing to do. Set it up with some of your regular alerts, check it out, and see when you need to upgrade.
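A sketch of that cache hit ratio query, using the standard pg_statio_user_tables statistics view:

    SELECT sum(heap_blks_read) AS heap_read,
           sum(heap_blks_hit)  AS heap_hit,
           sum(heap_blks_hit)::float
             / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS cache_hit_ratio
    FROM pg_statio_user_tables;
    -- for a transactional web app you want cache_hit_ratio around 0.99 or higher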
13:41
The next thing I'm gonna look at is: am I properly using an index? So, just like how many things are served from cache versus how many things are served from disk, Postgres has an index hit rate. This query's gonna give us something that we have to do a little bit more thinking through. It's gonna show the percentage of time an index was used against each table on my queries,
14:01
and the number of rows in my table. As a rough rule of thumb, what I go for is, if I've got more than 10,000 rows and I'm doing a lot of really short lookups, I wanna have an index hit rate of 95% or higher. This is gonna vary based on application. Sometimes you may want it a little lower if you're doing a lot of reporting against one table.
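A sketch of that index hit rate query, built on pg_stat_user_tables:

    SELECT relname,
           n_live_tup AS rows_in_table,
           100 * idx_scan / nullif(idx_scan + seq_scan, 0) AS pct_index_used
    FROM pg_stat_user_tables
    ORDER BY n_live_tup DESC;
    -- for tables over ~10,000 rows with lots of short lookups, aim for 95% or higher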
14:22
But for single-row inserts, lookups, updates: because you're getting a single row, you wanna have it in cache or be using an index. So here you're looking for a higher index hit rate. And so here we can see, these top three tables,
14:40
we probably wanna come in and add some indexes on. All right, so, looking at it from the other perspective, if you've got a certain page in your application, you're running some, you wanna improve the performance, you know it's slow, you don't know why, you can see the queries it's running. Postgres has something called an explain plan.
15:00
If you put an explain in front of any query and run it, it's gonna tell you what it thinks it's going to do from a performance perspective. With explain analyze, it's going to tell you what it thinks it's gonna do, and it's also gonna run the query and give you what actually happened. If there is a really, really, really horribly performant query,
15:20
be careful with running explain analyze, because it's also going to run the query for you. I've worked with Postgres for a little over 10 years now. I still look at explain plans and have to like pause, get some coffee and like look at what's going on there. It's not, it shouldn't be as confusing as it is.
15:41
So, some rough guidelines first. For a web application, I'm thinking, okay, can I get really, really common page response times down under 100 milliseconds. Now with things like single page apps, that's more like one second. Common queries though, still in my application, I want 10 milliseconds or less. More rare queries under 100 milliseconds.
16:03
So coming back to this query, there's a couple of things here. If you look in red there, the actual time is two different values separated by two dots. I want to pay attention to the second one.
16:20
That is what it actually ran as. In that case, it's 295 milliseconds to complete this step of that operation. So, knowing that, hey, if I want things under one millisecond, I can see, okay, this is, total runtime is slow. And that one operation that's happening right there is a sequential scan, not using an index. So, what do I want to do?
16:42
I want to come in and add an index. In this case, I'm just going to add it on salary because I've got that filter there. Rerun this and that's going to bring it down to 1.7 milliseconds. There's a tool, if you search for "understanding explain", a tool by depesz,
17:00
I don't actually know how you say his name just from the internet. It's great, that kind of helps like take an explain plan, paste it in, it'll highlight the really bad steps for you. It calls out kind of basically where you need to optimize within the step. So, that helps if you know you've got a specific query.
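Pulling that explain example together, the rough pattern (hypothetical table and column) is:

    EXPLAIN ANALYZE
    SELECT * FROM employees WHERE salary >= 50000;
    -- if the plan shows a Seq Scan with a large actual time on that step,
    -- an index on the filtered column is the usual fix:
    CREATE INDEX idx_employees_salary ON employees (salary);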
17:24
But a lot of times, you have an application and you're like, things are slow, can you tell me what? Like, queries are all over, I have all sorts of things being executed from my ORM, I have no idea what the queries actually are. Postgres has this extension called pg_stat_statements.
17:41
It records every query that's run against your database and essentially parameterizes it, so the query that would have actually been run in this case would be select * from users where email = craig@citusdata.com. It strips out that craig@citusdata.com because it's different for every user in my application. It records all these internal things like how many blocks were dirtied,
18:01
how many buffers were dirtied, how many new blocks were written. I don't care about much of this, you can drill in if you want. What I really want to see though is how many times a query was ran, how long did it take on average, and how much time is it consuming an aggregate against my system. So this one query will give you
18:20
something that looks like this: select id from users was run a ton of times, it averaged 10 milliseconds, and in aggregate it's run for 295 seconds against my database since I've started recording the stats. I've got another query, this other select star from something, that on average is 80 milliseconds
18:42
and in aggregate it's consumed almost as much time as that other one. Coming back, doing some of that math, thinking okay, if I know I can typically get a query down to one millisecond, if I were to go and optimize that second one, I'm gonna get two orders of magnitude back in overall performance against my database.
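A sketch of that kind of pg_stat_statements query. The extension has to be in shared_preload_libraries and created once per database; the timing column is total_time on the Postgres versions current at the time of this talk (renamed to total_exec_time in later releases):

    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

    SELECT query,
           calls,
           total_time / calls AS avg_ms,
           total_time         AS total_ms
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 10;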
19:01
So really handy without having to dig into the application code, which queries are being run from where and being able to optimize them. All right, so I mentioned indexes as a reason that Postgres is great. Postgres has a lot of indexes. Most databases, if you have a CS degree,
19:20
you probably learned about a B-tree index in school. You probably didn't learn about the rest of these. These other indexes, GIN, GIS, SPGIS, BRIN, Postgres pretty much gets a new one every year. There's a group within the Postgres community known as the Russians. One of them is a professor of astrophysics at the University of Moscow.
19:42
And for fun, he hacks on Postgres. We have different definitions of fun. But they usually show up with some crazy academic paper saying hey, here's this new index type that I read a paper on, SP-GiST, space-partitioned GiST, where GiST is a generalized search tree.
20:02
They show up on the mailing list. They're like hey, we're gonna implement this. People are like, that sounds crazy. It's like they go into a cave and they come back three months later with code and everyone's like, okay, that looks good, let's commit it. But if you read through each of those, even in the Postgres docs and kind of look through them,
20:23
which do I use, I have no idea. I read through, I look at the papers, and I'm like, okay, I kind of understand what's going on here maybe a little bit, but how do I actually put this into practice? That's a completely different story. So this is slightly oversimplified, but generally will work for 99% of times.
20:43
B-tree, this is usually what you want. When you're doing a lookup based on email address, or a filter where salary is less than or greater than some value, this is usually what you want to use. GIN.
21:01
So GIN is useful when there are multiple values within a single column. So if you think of a single data type, a single column, something like an array, hstore, which is a key-value data type in Postgres, or JSONB, where you've got multiple different values inside that JSON document, this is usually what you want to use.
21:21
GiST, generalized search tree, this one's a little more confusing. You can think about it as when values overlap between rows. So if you think of shapes, you've got polygons, right? You've got a circle that's in one row, you've got another circle that is in another row. Some of the values within those polygons, within those circles, can overlap.
21:40
Some will not. Same thing with full text search. If you've got a full sentence, some of the words overlap with words in other sentences, some do not. So if you've got values that overlap between rows, this is probably what you want to use. The simplified version is if you're doing things with geospatial stuff or full text search, GiST is probably what you want.
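A quick sketch of what those look like in practice, on hypothetical tables:

    -- GIN: many values inside one column (jsonb, arrays, hstore)
    CREATE INDEX idx_events_payload ON events USING gin (payload);

    -- GiST: overlapping values, e.g. ranges or full text search
    CREATE INDEX idx_docs_fts ON docs USING gist (to_tsvector('english', body));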
22:00
gist is probably what you want. SPGist and BRIN are generally used for really, really large tables. SPGist, I've asked a lot of the Postgres core community, like can you simplify it? And the only thing I've taken away is if you're working with phone numbers, you want to use SPGist. That's all I've understood thus far.
22:23
BRIN is really, really good when you're scanning a lot of sequential data. And I say a lot like tables with billions and billions of records, often time series, that sort of thing. There is also another index type that's supposedly coming up.
22:41
So I mentioned the Russians are the ones that have contributed most of these. They felt that because we had GIN within Postgres, we also needed VODKA. I wish that was only a joke. That is the working name for the index type. So we are getting, apparently, VODKA within Postgres as well.
23:03
So a few other index tips. Be specific with your indexes. Don't just say create index. You can add composite indexes. If you always query on first name and last name when you're searching, add an index on both. Postgres will be pretty smart and use those composite indexes.
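A sketch of those variants: the composite case, plus the functional and conditional forms described next (hypothetical tables and columns):

    -- composite: both columns you always filter on together
    CREATE INDEX idx_people_name ON people (last_name, first_name);

    -- functional: index the expression you actually query on
    CREATE INDEX idx_people_lower_email ON people (lower(email));

    -- conditional (partial): only index the rows you usually search
    CREATE INDEX idx_addresses_current ON addresses (person_id)
        WHERE current_address = true;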
23:21
Functional ones. If you always query on lowercase of name, if you want to have a case insensitive search, make sure if you're sending in the lowercase version of it, you do the same thing within your database. Or conditional ones. If you have a lot of data,
23:42
but some of it you're not usually searching on. Like a great example is if you're a phone book and actually within the database you have every historical phone number and every place everyone has lived prior, you're probably not frequently searching on where I lived five apartments ago. So you could say where current address is equal to true and it's only gonna index
24:01
where current address is equal to true. Really nice if you want to have an index on a part of your data. And there I'd look as you know, do you have 10 times the amount of data but are you commonly only filtering on a subset of it? So within your database, you usually have a big trade off between faster reads and faster writes.
24:23
I know the answer is you want everything to be fast. Sorry, it doesn't work that way. So a few months ago I created this tweet. This will give you the output to create an index on every column of every table of your database.
24:43
So you'll have indexes on everything. Everything will be fast on the read side. This was mostly a joke but it's also not a completely crazy idea. So when you add an index for every write you make, it's gonna figure out the query plan,
25:01
it's gonna write to disk, it's gonna wait for it to acknowledge that write and it's gonna return. This is about a kind of one millisecond round trip on a reasonable disk. As you add an index, it's gonna also update that index and make sure there's no constraint violation or just make sure the index is up to date, then it's gonna return. So we can just roughly say hey,
25:22
I added a couple extra milliseconds down. And you add another index, it's gonna do the same thing. So it's gonna have to do both of these. And you add another one and it's gonna do the same thing. So I go and add an index on every single column on every single table. Now I have writes that take like two seconds or something like that. Not the best idea.
25:41
But as I mentioned, Postgres is really, really good about keeping stats. At that medium scale, it's not terrible to over index things. I wouldn't go all out and index every column of every table, but you can be pretty generous with your indexes. And then Postgres has this, another super, super simple query you can run
26:02
that will show all of the unused indexes on your database. So if you have an index that's sitting there just slowing down writes, but maybe your application code changed or the data distribution changed and it's never using this index, you can run this query, say, hey, I don't need this and then go come back and drop it later. So a good thing to do to come back and kind of start that process
26:21
of cleaning up unused indexes. And it'll give you something like this: how large is that index? Is it really, really large? Is it small? How many times has it been used? If you see a zero there on index scans, go ahead and drop it. You're not getting any sort of value from it anywhere in your application.
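A sketch of that unused-index query, using pg_stat_user_indexes:

    SELECT schemaname || '.' || relname AS table_name,
           indexrelname                 AS index_name,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
           idx_scan                     AS index_scans
    FROM pg_stat_user_indexes
    ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;
    -- before dropping anything, double-check the index isn't backing a unique or primary key constraint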
26:42
So on database migrations, not null is a good thing, right? Like if we wanna have some default value, we want it in our application, except when you're running a database migration. Postgres, if you add a new column and add a default value,
27:00
it's going to rewrite that entire table. It's gonna take it and make a copy under the covers of it, write that new value out. What that means is it's gonna hold a lock on all new incoming data while it's rewriting that table. And your staging, this is gonna be really fast. You're gonna write things out and it's gonna take a few seconds.
27:21
Even if your staging table has a gig of data. Production, when you run this on a hundred gig table, you're gonna let it start running for a few seconds, a few minutes, five to 10 minutes, and you're gonna start wondering when it's gonna complete. And an hour in, you're gonna have people kind of over at your desk saying what's going on, when are we back up, that sort of thing.
27:42
The better way to do migrations at any large scale is simply to break it up into three steps. First, you're gonna allow nulls and set a default value, at the database level and in the application, so all new data, all new updates are getting that default value.
28:01
You're gonna gradually come back in and backfill all the old data. So all of those hundred gigs of records that you have that don't have this value in there, you're gonna run a background job at night, update a thousand records at a time, that sort of thing. And then you're gonna come back in and add your constraint.
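A sketch of those three steps, on a hypothetical table and column. (As an aside, on Postgres 11 and later, adding a column with a constant default no longer rewrites the table, but the batched backfill is still the safe way to populate old rows.)

    -- 1. add the column nullable, with no rewrite; the default only applies to new rows
    ALTER TABLE users ADD COLUMN plan text;
    ALTER TABLE users ALTER COLUMN plan SET DEFAULT 'free';

    -- 2. backfill old rows in small batches from a background job
    UPDATE users SET plan = 'free'
    WHERE id IN (SELECT id FROM users WHERE plan IS NULL LIMIT 1000);

    -- 3. once the backfill is done, add the constraint
    --    (this validates the whole table, so expect a brief lock)
    ALTER TABLE users ALTER COLUMN plan SET NOT NULL;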
28:20
You can do this now on terabyte-sized tables with pretty much no downtime. Like, you'll have a few seconds maybe of locked writes, but overall it's much, much safer and you won't find yourself bringing down production. I think every person I've seen that's really used to working with small databases, and then it gets bigger and bigger, runs into this the first time they work with a larger sized database
28:42
of 100 gigs or so. And it's much, much more painful the larger you get, so be careful on this one. All right, so that was kind of medium scale and I would classify that somewhere from a database with 10 gigs of data to 100 gigs or so.
29:02
All right, the larger you get, you wanna be a little more careful in particular. The first thing you're gonna wanna start doing: where before you said create index, now you're gonna wanna say create index concurrently. This cannot be run inside a transaction, so inside like your Django transactions, you're gonna have to break this out. Sorry, it's worth it.
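The statement itself is just the usual create index with one extra keyword (hypothetical names):

    -- must run outside a transaction block
    CREATE INDEX CONCURRENTLY idx_users_email ON users (email);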
29:22
Create index concurrently does not hold a write lock while the index is being created. Technically it does at the very, very end, but that's for a few milliseconds. What this means is you can, instead of doing create index and everything kind of crawling to a halt while you add an index on a 100 gig table,
29:41
it's gonna build up that index in the background, wait for it to be almost done, then take that lock, then cut over to it. It's roughly two to three times slower, but doesn't lock writes. Really, really key for larger databases. Connection pooling. So SQLAlchemy has connection pooling built in.
30:01
Django has connection pooling now built in, or at least a persistent pool. I would encourage you also looking at running a Postgres-side connection pooler. So there's basically two ways of connection pooling. You can have your application having a bunch of open connections just waiting to grab one as a new request comes in.
30:21
So there's some latency there in grabbing an SSL connection to Postgres. There's also the fact that Postgres doesn't handle a lot of connections very, very well. At 1,000 connections, you have a lot of contention, it's consuming a lot of memory, that sort of thing. There's a few options here. PgBouncer, PgPool are the two big ones.
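The active versus idle connection check that comes up just below is roughly this query against pg_stat_activity:

    SELECT state, count(*)
    FROM pg_stat_activity
    GROUP BY state
    ORDER BY count(*) DESC;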
30:42
I highly, highly recommend using PgBouncer. PgBouncer is a connection pooler that as a new transaction comes in, it gives you kind of a new connection, it watches for that to complete. You don't have to worry about idle connections. There's a query you can run to see active versus idle connections against your database. I've talked to a lot of people that say,
31:01
hey, I have 2,000 active connections from my application. I really, really need a high connection count. They run this query to show how many are active, how many are idle, and active is like 10. That means we've got 1,900 plus connections doing absolutely nothing, just consuming resources,
31:20
not adding any sort of value. PgBouncer solves this for you. So scaling cache, I mentioned this early on, watching for that cache hit rate. If you see a drop below 99, go get a box with bigger memory. That's the easiest thing to do in terms of scaling. At some level, what you're gonna do
31:41
is run into a brick wall. Like AWS only has an instance so large. Okay, fine, you're gonna leave AWS run on your own hardware because you can get a really beefy box. Then you're gonna look at, hey, how long does it take to rack and stack that? You're still gonna run into a limit there. Maybe you can get a box with a terabyte of memory, but if you keep growing,
32:01
you're gonna run into the limits of single node. The first thing that most people do is offload some read traffic, right? Okay, let me have a read replica. Things are slightly stale by a few milliseconds or seconds, but that's okay for these different models. For replication, there's a few options.
32:22
I would generally recommend WAL-E, WAL-G, or Barman. WAL-E, a colleague authored back when we were at Heroku. We used it to handle replication for around two million Postgres databases when we were there. I'm pretty sure it works at some decent scale. We recently rewrote and released it as WAL-G
32:43
to be a bit faster, and it's in Go, while WAL-E is in Python. Give either a look. Barman's also a popular one. I recommend any of these if you're looking at setting up streaming replicas. The other option is sharding. Sharding is the idea of splitting things up
33:02
into smaller bits. If you have one really, really large table, I usually, in an application, if there's one table that is a crazy amount larger than all others, it's probably called events, messages, or logs. It may not belong in your database. You can probably move it off somewhere else.
33:22
There are certain other data models that actually fit really well. I see things like one database per customer or one schema per customer. Please don't go with those two paths. There you have other things. Now when you run a migration, you have to run it across a million databases. When you have 10 customers, this isn't a problem.
33:40
When you have 1,000, it is. At things like 1,000 schemas, Postgres backups sometimes just don't work. It's okay, we didn't test them anyways. The other option is sharding within your application. So sharding is definitely the really large scale option. It's worked for Instagram,
34:01
works for Facebook, Google, Salesforce. It used to be really, really hard. It's gotten a little bit easier. So one option, not gonna drill too deep into this, I work at Citus. We turn Postgres into a sharded database, but your application, it looks like a single node. So you're still talking to one node database. It's a pure extension, it's open source,
34:21
so you can download it, just use it. Under the covers, there's multiple physical instances there. So still all the benefits of Postgres, still your application thinks it's single node Postgres. Under the covers, it's sharded. A lot of the principles of sharding, the Instagram talk from a few years ago is a great one
34:42
that basically walks through the best practices, that sort of thing. If you're thinking about that and have questions, I'm happy to field them afterwards. Backups. So when you start out, you're probably using pg_dump. That's the one where you get kind of a SQL dump,
35:02
you upload it to S3 or keep it locally. It's human readable, it's portable, it works. The other option is what's known as a physical backup within Postgres. This is the actual bytes on disk. It's a bit operating-system dependent. You can't take a Linux physical backup
35:21
and restore it to Windows. So it's not necessarily great for going from your production environment down to your local laptop because you're probably running a different OS. There's some trade-offs between the two of these. Logical, that pgdump one, is really good across architectures. It's great for going from production down to a local environment.
35:41
It's good for portability. It does have some load on the database. So when you run this, you're gonna see things spike in terms of load and performance. It works really well at less than 50 gigabytes of data. When you get larger, at one terabyte, I don't think I've seen a pg_dump successfully work. You can kind of tweak it and maybe make it work.
36:00
It gets really, really hard. At some point, you're just gonna have to abandon pg_dump and go with physical backups. Physical backups have a lot less load on the database. There is some load, but it's far, far less. It scales pretty infinitely, and it's what you're gonna use for things like read replicas as well and that sort of thing. So over time, you're gonna wanna shift
36:20
from logical to physical backups. All right, that was a lot really fast. So a quick recap. Leverage your data types. I think DHH from the Rails community once said, the database is just a dumb hash in the sky.
36:42
I think very differently. The data is where a ton of the value is. If your data is wrong in your database, it doesn't matter that you put a pretty UI on it. There's a lot of value there, and you wanna make sure you're getting the most out of your data because it's one of your most precious things. So leverage your data types. Please test your backups.
37:01
For everyone that didn't raise your hand, go test them, see if they work. And then spend some time mastering your tools. Things like the data types, some of the extensions, and psql. There's a lot of value in your database. Having a good editor that you like. If you don't like psql, go find another one. But I would say psql's already there,
37:20
and it's really handy, so give it a try. As you get a little bit larger, make sure Postgres is well-tuned. You don't have to become a DBA. Find a consultant if you need to. Look at some of the blog posts or tools that are out there or other talks. Really don't invest time, I would say, in becoming a perfect expert in all of the different configs.
37:42
There's a lot of resources out there, use them. Watch your cache hit ratio above anything else. This is the number one thing that'll affect performance. And then go ahead and index things. There's a lot of people that when you're small, you don't index anything. As you get a little more medium-sized,
38:00
they still don't have indexes. Go ahead and just start adding them and see what happens; that's the worst case. If you spend a little bit more time saying, hey, I've got these slow queries, I can index this thing, that's definitely better. But the worst case is just index a bunch of stuff, delete it later, and see what happens. As you get to that large scale, move away from pg_dump, set up connection pooling.
38:23
How many people here run a large scale database? Okay, a few hands. How many of you run a connection pooler like pgBouncer? Actually more hands than large database, so that's really impressive. Well done. pgBouncer is the best tool there. It's not perfect, but it's the best that we have.
38:43
And hopefully in time and core, Postgres gets better connection pooling. And then if you need to shard, invest in the right way, look at the Instagram talk, use a tool to help. Please don't go down the one schema per customer path because you'll go down that path and lock yourself in and a couple years later, you'll have even more work
39:01
than you had to put in initially. So once you do go down that path, invest in doing it the right way. But try not to go down it at all if you can; try to just scale up and up and up as long as you can, because not sharding is a lot easier than sharding. All right, I think that's it. Any questions?
39:30
For questions, please line up here and here, and ask your question into the microphone.
39:49
Hi, so I have a question about indexes because this got me really interested. I have this problem that I didn't have to solve myself, but I heard about this many times about scheduling problem when you have like, let's say rooms and you have like very flexible times
40:04
and you don't want, of course, to schedule two meetings in the same room. So would you use Gist index for this to optimize this query? So a couple of things I would do there. So one, are you using range types? So okay, so within range types, now I'm slightly blanking.
40:21
We actually have a blog post on the Citus blog about constraints. What you can have is an exclusion constraint so that things don't overlap. I forget if it's related to Gist or not. I think it is. Search for exclusion constraints and range types. For calendaring scheduling, range types are a type that has a from and a to for people not familiar.
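A sketch of that pattern: an exclusion constraint over a range type; the btree_gist extension is needed to mix the plain equality column into the GiST constraint:

    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE room_bookings (
        room_id int NOT NULL,
        during  tstzrange NOT NULL,
        EXCLUDE USING gist (room_id WITH =, during WITH &&)   -- no two overlapping bookings for the same room
    );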
40:41
So if you have like a, like right now, you don't want two talks going on at the same time or people scheduling things for a registration class. You can have an exclusion constraint so that things don't overlap there. Thank you. Yep. Thanks for the talk. I took some notes to pass it on my DBA.
41:02
I have a question, it's more of an architectural one. There's been some debate in my team about using Postgres, but in Docker, containerized Postgres, Kubernetes, Docker, whatnot. What are your thoughts on that? It sounded a bit weird to me,
41:21
but I really haven't experienced it. It's a bit of a loaded question. So there are people that run Postgres in containers. We do for development, for testing, not for production personally. StatefulSets are getting better within Kubernetes.
41:40
It really kind of depends. There's a conference I chair out in San Francisco, Postgres Open. We actually have some talks coming up in September on that, from some places that run Postgres within Kubernetes, within Docker, StatefulSets, that sort of thing. I'd say it's still a little bit early days. People are doing it. So it kind of varies and depends
42:01
on your risk tolerance and expertise. You didn't mention the built-in replication of Postgres, so how do you feel about that? Yeah, so built-in replication still needs kind of a little bit of help to set up with something like Barman or Wally
42:22
because you need a base backup and then the write-ahead log to catch up. If you're using replication, I wouldn't recommend using something like Slony or Londiste, which are a replacement for the built-in. So WAL-E and Barman are more meant to supplement and help you use the built-in streaming replication. It's come a long way.
42:41
It's still got a ways to go. So the base one right now is like a binary format. Logical replication came in in 10, like fully baked. It kind of got there in 9.6 but not quite. I'm a fan of it. It's getting better and better. If you have specific questions, happy to drill in offline too on it.
43:04
At the beginning, using soft deletion, but I mean when using soft deletion, you kind of lose the benefit of on-delete constraints. Do you have any good recommendation besides creating triggers and everything manually? Not right off. That is probably the best one.
43:23
Yeah, that's probably your best case, just creating triggers and handling it in the application there, yep. Thank you for the talk. Quick question. Do you have any performance advice for running a large-scale database on RDS? Use Citus?
43:41
No. So it depends on which large scale. So like. 700 gigs. Yeah. Optimize a lot of things. Like you're getting to a scale where it depends on what size instance you're on. Like as you get larger and larger, like we run Citus Cloud on top of AWS
44:01
and see a lot of that scale. And we have customers up to like 900 petabytes or so. Or sorry, 900 terabytes, not quite petabytes. Is there any tuning difference between running Postgres on AWS instance and moving to RDS? You'll get some more flexibility. So like you get a little more control running it on AWS yourself than RDS.
44:23
I would say RDS can scale to a terabyte. I've seen up to four terabytes on RDS. I haven't seen really performant ones beyond that scale. At that point, your options really do look to be sharding of some kind. To start to split things up into smaller bits, be able to scale things out. Thank you. One more question.
44:42
More, sorry. Okay, we are running a complex recommendation system that it's running queries over very large tables. And we're having all the time issues with performance. And I guess that the recommendations are basically touching the index and to check with it because we have many indexes, you all have issues.
45:01
But also, don't you think that it's better to have the writes on one side and the reads on a different, read-only instance? Yes, but it depends. So having writes go to one side and then reads on the other can actually create issues around replication lag. So you can create latency and make that replica kind of not safe or not able to catch up.
45:22
So in theory, yes. In practice, some of the time. Some of the time, I've seen it create more issues. Running queries against a read replica can actually create issues on the primary as well. So it's not to say, oh, that you have a read replica. It's perfectly safe to touch. So there are kind of complexities there. Generally, that can be one option.
45:42
The other thing I would say is, if you're running a recommendation engine, you're probably scanning a lot of data, so it's probably going to disk, and the ordering of data on disk is really important. I would look into a couple of extensions, pg_repack or pg_reorg, which actually can reorganize data on disk.
46:01
They're really interesting, powerful extensions. They do some really awesome things. They can also do some really scary things like if you misuse them, they can just make the entire database unusable. So proceed with caution. But if you are doing a lot of sequential scans, they could be interesting to help too. Okay.