Logical Replication in PostgreSQL 10
Formal Metadata
Title: Logical Replication in PostgreSQL 10
Number of Parts: 19
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/48975 (DOI)
PGCon 2017, 4 / 19
Transcript: English(auto-generated)
00:05
So, if you've seen this talk before, it's the same talk, so you can go to another talk if you want. All right, I guess everyone who's here and wants to be here is here
00:23
So, welcome. This is about logical replication in Postgres 10. I'm Peter Eisentraut. I'm a longtime Postgres contributor, and I work for 2ndQuadrant: Postgres services and Postgres development. There's my email address and my Twitter, and obviously this is going to be linked from the
00:46
schedule, so you can find all that information later. So, what I want to do is just show a demo, and then we'll talk about some theory and have some slides. So, this is Postgres 10, right? We have Postgres
01:25
10 beta 1 out as of a week or so ago, so you can download that and test it out. Obviously it's still going to be a couple of months until that's final, so this is sort of a preview, but you're encouraged to
01:41
play with it, test it, and report back if you don't like how it works. So I'm just going to show a demo: I'm going to set up two instances and replicate some stuff between them. Since this is going to be on the same machine, they're going to have two different ports; other than that, it's going to be a real
02:03
setup. So, one data directory. Here I'm just going to use two different port numbers so we can keep the instances apart; in reality they would be on different hosts. And now, in this case, I have to give an option,
02:21
wal_level = logical, rather. You would put this in the configuration file normally; I'm just illustrating it here, and I'll explain the other way you need to do that. So now Postgres is running. This is also nice in Postgres 10: the default log output is a little bit more useful. There's a timestamp now, and this is all default settings, right? The timestamp and PID are there by default now, and you also get some kind of useful output back there
02:42
that's new: which port it's listening on and stuff like that. That's just an unrelated change. So that's one host. Going to make another one: data directory two, Postgres data directory two, port
03:02
two. So in this case I don't have to give any wal_level or anything else, so it's just running now. All right, I've got two instances now, right? I'm going to leave these running in the terminal here, and create a test database in the first one: createdb
03:21
test. So, connect to test, create a table with a couple of columns here. Okay, I put some data into it
03:49
This is kind of my standard test data, because it's easy to see what's going on. So now we've got a table with some data, and I want to replicate it
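The demo setup so far can be sketched roughly like this. This is a hedged sketch, not the speaker's exact commands: the directory names, table definition, and sample rows are made up for illustration, and in the talk wal_level is passed as a server option rather than written into the configuration file.

```shell
# Two PostgreSQL 10 instances on one machine, kept apart by port number.
initdb -D data1
echo "wal_level = logical" >> data1/postgresql.conf   # the publisher needs this
pg_ctl -D data1 -o "-p 5432" start

initdb -D data2                                       # subscriber: defaults suffice
pg_ctl -D data2 -o "-p 5433" start

# Test database and table with some data on the first instance.
createdb -p 5432 test
psql -p 5432 test -c "CREATE TABLE test1 (id int PRIMARY KEY, name text);"
psql -p 5432 test -c "INSERT INTO test1 VALUES (1, 'one'), (2, 'two');"
```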
04:01
So now comes a new thing: I'm going to create something new that's called a publication. There's a new command, CREATE PUBLICATION, and you just give it a name: mypub, FOR TABLE test1. Okay, so that
04:24
ran pretty quickly and didn't do much. What a publication does is basically just group tables that you want to replicate under some kind of name. In other systems you might call this a replication set; maybe you've seen that in existing systems. So this is just the same idea, a term for grouping some tables
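In SQL, the step just described looks like this (using the names from the demo; the catalog query is one way to inspect the result):

```sql
-- Group the table test1 under a publication named "mypub".
CREATE PUBLICATION mypub FOR TABLE test1;

-- The system catalog behind it:
SELECT * FROM pg_publication;
```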
04:45
It doesn't really allocate any resources or anything; it's just a way to refer to that set. There are system catalogs that belong to it, where you can see the name and IDs and some attributes that we'll come back to later
05:02
There's also a backslash command for this in psql, but all the good letters were already taken: \dp was already taken and \dr was already taken, so we got the big R for replication, perhaps. That's just the way it goes with these kinds of commands:
05:23
you run out of letters after a while, you know. And also, if you go look at the table, it tells you what publications it is part of, just as a sort of hint. So that's all set up on the sending end. Now we have to set up the receiving end. So
05:41
that's going to be... So, this is all per database, so you've got to set up the receiving database, and it could have any name, but we'll just keep the same name. That's on the second instance. Okay, here we go. So that's going to be the target database; it's currently empty
06:04
Yep. One thing this does not do yet is replicate your schema, so you have to set that up yourself. This might be a future feature, but right now you have to set up your
06:21
target schema yourself, and you could obviously use pg_dump to move that over: pg_dump, schema only, dump it out, move it back in, and then you have a copy. So you have to do that.
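One way to do the schema copy just described might look like this (ports and database name are illustrative, matching the two-instance demo setup):

```shell
# Dump only the schema from the publisher (port 5432) and restore it
# on the subscriber (port 5433); no data is copied here.
pg_dump --schema-only -p 5432 test | psql -p 5433 test
```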
06:40
So we have a table now; there's nothing in it, and we want to replicate some stuff into it. The opposite of a publication is a subscription, so you've got to create a subscription: mysub, maybe. The subscription needs mainly two pieces of information. One is where it connects to, so there's an
07:05
argument here, connection, which is going to be a connection string, a libpq kind of connection string. In this case it's going to be the port of the other instance over there, and the database name, which was test. And then the publication, so it knows what tables to pull
07:24
So we have publication mypub. So now, it told you it did something: it said it synchronized table state, which means it copied the initial data from the tables, and it created a replication slot,
07:41
which is an existing concept, if you've seen that. A replication slot basically just means it registered itself on the sending end, so the sending end knows to hold on to the information that needs to be replicated. All right, so now let's see. There you go. There it is
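The subscription side of the demo, in SQL, is essentially this (port and names as in the demo; in the same-machine setup only the port differs):

```sql
-- Run on the subscriber's test database: where to connect, and which
-- publication to pull. This copies the initial data and creates a
-- replication slot on the sending end.
CREATE SUBSCRIPTION mysub
    CONNECTION 'port=5432 dbname=test'
    PUBLICATION mypub;
```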
08:01
So that's replication. Okay, any questions? No, it doesn't do that; that's what I mentioned. No, but that could be a future feature; it doesn't now. Question, Joe?
08:22
Yeah, yeah, the connection string could be a URI, or obviously you can have usernames, passwords, and that kind of stuff if you need it. So let's just see that it's actually working; I'll do a couple more things here. See, it's actually updating. Okay. All right, so it works
08:48
So, here's... we'll get to that. The question was: what if you have more tables in the
09:03
subscription? The publication can have more than one table, and if you had more than one table, they would all be replicated. We'll get to some slides with some of these details; this is just a demo, right? So, the system catalog
09:22
\dRs shows that. Also, if you want to monitor this, for replication lag and that kind of stuff, it's very similar to the physical replication you might be used to already. On the sending end it's exactly the same stuff: pg_stat_replication. It's the same view, it's just a
09:47
different consumer on the other end, but it doesn't care. Yeah, all the LSNs are there, and the lag stuff that's new in Postgres 10 shows up there, and synchronous replication stuff is also there; we'll get back to that. That's the same stuff
10:01
as pg_replication_slots: it's the same thing if you were looking at that. On the receiving end there's something new: pg_stat_subscription. With physical replication you might use pg_stat_wal_receiver, and
10:22
then you compare pg_stat_replication to the WAL receiver, so you can see how far along the LSNs are on each side. So this is the equivalent of that. Yes, Chris? Can you see the structure? Yeah, it's a normal table; you created it yourself
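The monitoring views mentioned above can be queried like this (column choices here are just an illustrative subset):

```sql
-- On the sending end (same view as for physical replication):
SELECT application_name, sent_lsn, replay_lsn
FROM pg_stat_replication;

-- On the receiving end (new in Postgres 10), roughly the counterpart
-- of pg_stat_wal_receiver:
SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;
```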
10:52
Yes. So, for the audio: the question, or comment, was that systems like Slony do that too; they
11:00
protect the receiving tables from manual updates. This doesn't do that. You could do that yourself, but the intention is also that you can write to them anyway, if you want to. Yeah, okay. So, we'll get back to some of the details. This is how you would build up a monitoring system around it: you can compare the
11:22
LSNs on these different sides. It's built on the same infrastructure as physical replication, really, so all these concepts are the same, right? Yeah
11:48
So, the question: when you create a subscription you have to put in a connection string, and the comment, or question, was whether any thought was given to using the foreign server
12:01
infrastructure to store the connection information. There was a thought about that, but we decided to go ahead with this in the interest of simplicity. There are some ideas about maybe having a sort of node registry in the future. And how do you generalize that?
12:20
There are some open questions about that. So this is the simplest thing that will work, and then we'll build it out. So, that was the demo. As you can see, there are not too many moving pieces here: just publications and subscriptions, and then some monitoring views around them. The devil's in the details, as they say, right?
12:43
So, why even do this? I don't think it's going to take over from physical replication for your basic standby, because physical replication just works: you get a full copy, and if the old server goes up in flames,
13:03
you have a full copy and you use that one. This is more for more nuanced and specific use cases. One big one is anything where you want to replicate just part of a server: one database, a couple of tables, part of a table, certain actions on a table; or anywhere you want to archive things but not delete things, or you want to
13:25
aggregate multiple databases into one database, anything that is not an exact one-to-one copy. There are many use cases for that, in terms of archiving or creating a sort of data warehouse or anything like that. So that's one big use case
13:41
Cross-version upgrade is the second one; that's the other big use case we hope to achieve with this. Obviously that's not going to happen until we have Postgres 11, because we can't upgrade from an old version that doesn't have this. So that's going to be a future thing, but yes, that's definitely something we want to do: you set up a 10 and an 11, you just say
14:04
publication, subscription, and everything just ends up there. Could you build an extension that could be a provider, or publisher, for an older version?
14:29
That's an interesting idea, I guess. On the subscription side this would be kind of hard, because we fiddle with some internals to make it work, but on the publication side that might work
14:44
I think that might be a little hard. If you want to upgrade from 9.6 to 10 you can use pglogical; it gets you the same kind of functionality. I think that might be hard to do, because we had to fiddle with some stuff around replication slots, so you can't exactly backport this. Ideally you could, though; that might be hard
15:04
So basically, those are the two big ones. Or any other situation where physical replication doesn't work, and that could include things like wanting to partition stuff differently, or the collation changed, or moving between different platforms, or anything like that. So those would be the
15:21
use cases, and you can think of your own thing, basically, right? I have a slightly unrelated question: as far as I know, once a slot is created, if it stops being consumed, since it retains the subscription data,
15:41
it's going to grow and grow and grow, isn't it? Yeah, the replication slot behavior is the same as you have now. So this question is about replication slots: it's the same thing as with physical replication. If you have a replication slot but you don't consume anything from it anymore, the WAL on the sending side is going to be retained indefinitely, and then your disk will fill up
16:05
So there is an idea to limit replication slots in some way, based on time, maybe, or disk volume or something, but that's not implemented. That would be an independent feature that would be useful for other cases as well. So
16:21
I believe that's a big problem: you cannot control subscribers, and if they die, then your primary server falls over. Yeah, that's an underlying problem. Yes, that's something we're definitely thinking about. So, what about these things that you might have heard of? One of the things that I saw in the
16:42
Hacker News thread when beta 1 was announced was: oh great, they merged BDR into Postgres. That's not quite correct. If you don't know what these are: BDR is bi-directional replication, a fork of Postgres that was started around the 9.2/9.3 time frame, which
17:04
basically implemented a lot of the infrastructure that we now use for logical replication; that includes logical decoding, commit timestamps, origin tracking, event triggers, and a bunch of these things, to implement
17:20
a multi-master replication system. That's an open source project that you can download. All of these features were basically fed back into Postgres one by one, until basically in 9.6 all of them were merged back in. So the next version of BDR is just going to be an extension
17:42
So that was the origin of all this logical replication business that's been going on for many years now in the Postgres community. But because BDR is a kind of specialized product, it's fairly complicated to manage, because you have all these different nodes that all need to be aware of each other, and so on
18:00
So the next step there was pglogical, an extension that basically just does one-to-one replication. One of the original proposals around 9.6 was to just merge that into Postgres, but then people said: we don't want an extension, we want it built in. So we started again in Postgres 10 and
18:21
basically took the same tech that's in pglogical and put a different interface on it. So that's how it looks now. If you've seen these, or used these, or are wondering about them: these are basically different interfaces around the same technology, but they're not compatible with each other. So you can't use pglogical to
18:41
replicate into the in-core stuff in Postgres 10, or anything like that. But, as I mentioned earlier, if you're interested in doing something like this for upgrading from 9.6, you're invited to use pglogical; it does get you almost the same stuff. But those are separate products, or projects; they are not
19:02
this. Yes, question? Is the feature set of pglogical equivalent to the stuff that's in Postgres 10? No, pglogical has more sort of practical features; it can, for example,
19:23
internally run pg_dump to copy your schema, and stuff like that. It has some filtering functionality, and you can output JSON. These are all things that we could do based on what we know, but it's not done yet. So pglogical has a lot more stuff: you can filter, it has
19:45
origin tracking, and you can do a small amount of conflict handling, based on timestamps for example. So it has more functionality that way. But it's also worse in some other ways, because it's only an extension,
20:01
and we did a fair amount of internal fiddling in Postgres 10 to do certain things more efficiently. So there are trade-offs both ways. The long-term plan here is definitely to merge all of that back into Postgres, but it has to go through the community process and be cleaned up. So in the long term, a couple of releases from now, we'll hopefully have
20:23
most of the functionality in core; we don't want to have these separate projects around indefinitely. All right, good questions. So, this is basically another picture that summarizes what we've already alluded to earlier:
20:40
everything is per database. You can have multiple publications in one database, you can have multiple tables in a publication, they can overlap, and things like that. You can also have multiple subscriptions, and one subscription can subscribe to multiple publications, and so on. So it's all many-to-many mappings
21:08
Oh, yes. See, I already gave this presentation once, and I watched it, and I filled in all the gaps that people asked me questions about
21:24
So, how is everything matched up? Obviously these are distinct systems. Within one system everything is linked together by OIDs, but these are separate systems, so here everything is linked together by name: the tables have to have the same name, the columns have to have the same names
21:42
It's possible to have different column orders, and it's also possible for the receiving end to have more columns; those will just be filled in with nulls, if constraints, check constraints and such, allow it. Data is shipped in text format; that might also be one of those future improvements:
22:00
maybe you do some binary format or something, but right now it's text format. That's also good for upgrades and that kind of stuff, since you have the most compatible representation. And types are also matched by name. Since it's text format, the types don't exactly have to be the same,
22:20
in a way, because as long as whatever text is being sent fits, it will work; that also helps in case you do some kind of upgrade, or variants of that. And you are allowed to change the tables on the fly, and appropriate things will happen; I mean, it will work. So if you add a column:
22:44
let's say you add a column on the receiving end. That's fine as long as it allows null values; the rows will just keep filling in, and you will have null values in the extra column. Let's say you add a column on the publication end: as soon as
23:01
you put data into that column and it ships, and the column is not there yet on the receiving end, it'll just error until you create that column. And then, yeah. Yes, question? Is it possible to use this in conjunction with the new partitioning?
23:33
You want to replicate in two different directions? Yes, you can do that. All of this is just one-way, but what he's alluding to is having
23:46
partitions, so multiple partitions on both ends, where one partition replicates this way and the other partition replicates that way. So you can kind of build a multi-master that way, if you only ever write to the one partition; you can make that kind of stuff work. Yeah, so
24:04
the point here is that it's going to be much more robust against schema changes, compared to the sort of trigger-based systems that we had in the past. Lots of questions; let's go over there
24:27
Yeah, yes: you want to replicate all the tables in one schema? You would have to enumerate them. That's probably on the next
24:48
slide as well. There's FOR ALL TABLES; I'll get to that in a second. We'll go through the questions, yes
25:12
Yeah. Yes, it will queue up the inserts, or the change records, until you consume them, basically
25:25
Yes, Chris? Yes: it will not be consumed from the replication slot until it's acknowledged, right? So that kind of works. Chris, yes?
25:43
How do these things blow up if primary keys are missing? That's one of the technical details I was going to mention later. So yes, you do kind of need a primary key on these tables, in order to locate what you need to update. This is something called the replica identity, which you might have seen in the ALTER TABLE man page, but which otherwise is never used
26:07
If you don't have a primary key, you can still publish inserts, because they will just be appended. But if you try to do an update and you don't have a primary key, it will complain. You can then also do REPLICA IDENTITY FULL, which means you make the whole row
26:23
the key, and things like that. That also determines what key gets shipped with the change information, so you can make that as small as possible. All right, so we saw CREATE PUBLICATION earlier. These are some of the variants you can do: you can have multiple tables, or you can have all tables
26:41
That addresses your earlier use case: that is all the tables, now and in the future. So you can either have an explicit list, or you can have all the tables in the database. There's no "all tables in a schema"; that doesn't exist, if you were hoping for that. And you can also have options to publish only certain actions, like
27:06
only publishing inserts and updates and not publishing deletes, maybe. That could be useful for an archiving kind of application; you can find your own use case for that. Question? Yes
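The publication variants and the replica-identity setting described above look like this in SQL (table and publication names are illustrative):

```sql
CREATE PUBLICATION mypub FOR TABLE test1, test2;   -- explicit list of tables
CREATE PUBLICATION allpub FOR ALL TABLES;          -- everything, now and in the future
CREATE PUBLICATION archpub FOR TABLE test1
    WITH (publish = 'insert, update');             -- don't publish deletes

-- For a table without a primary key that needs updates/deletes replicated:
ALTER TABLE test1 REPLICA IDENTITY FULL;           -- use the whole row as the key
```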
27:21
Can you publish a subset of a table? No; you would have to partition it, then you can make that work. How would you... like, based on what criterion would you? No, I mean, that's the sort of thing I mentioned that pglogical might do, and that we might add in the future, but we don't have it right now
27:46
And you can change publications later on if you want to: add tables, maybe, or replace the whole set of tables, or remove some tables we don't like anymore. It's standard stuff, basically. And then you can drop them.
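The "standard stuff" for changing publications later can be sketched as:

```sql
ALTER PUBLICATION mypub ADD TABLE test2;         -- add a table
ALTER PUBLICATION mypub SET TABLE test1, test3;  -- replace the whole set
ALTER PUBLICATION mypub DROP TABLE test1;        -- remove a table
DROP PUBLICATION mypub;                          -- and drop it entirely
```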
28:12
No, no, again: you can already think of lots of ideas, like wanting to fill in a column, or map this column to that one because you renamed everything; all of these could be future features, right?
28:27
Yes, it's just matched column by column. Question here: can one subscription point to multiple publications? Yes
28:42
In the same database, right? Yes. Which is actually the same
29:04
Yes: if something gets applied on the subscribing end into a table, then triggers apply, constraints apply, foreign keys apply. So yes
29:21
Triggers? Yes. Yeah, sure, triggers would be good for that. All right. So, we saw that example already when we did the demo: the connection, so you know where to go, and then the publication, which is just which tables to grab. That's it. And then there are some options for special cases you can play with
29:42
You can have the subscription initially disabled, so it will sit there and not do anything. That's for if you need to do network maintenance or something; you just stop it, basically. And then there are some options for replication slot handling. That is mainly intended for pg_dump: if you dump out a database,
30:04
it dumps all these statements out, like CREATE SUBSCRIPTION, so it reproduces them when you restore. But you don't necessarily want the subscription to start up right away as soon as you restore it, because maybe you just want a copy for testing, right? You don't want that to connect out
30:22
back to where you originally subscribed. So what pg_dump does is create all subscriptions in a disabled state, and then you have to go in and enable them if that's what you want, depending on why you're restoring. So that's what some of these options are for. Or, if for some reason you want to name the
30:41
replication slot in a different way. And then the last option there is whether you want to do the initial data copy or not. As we saw, when you create a subscription, it internally runs a kind of COPY command to copy the initial data, and then it syncs that up with the change stream
31:00
And then it tells you you're good to go. But you can omit that, so you just get the new data and omit the old data if you want. So some of the options are for certain special use cases. And then there are ALTER commands to change all that: you can enable and disable, as I mentioned, or, maybe you moved the host or renamed something, you can change the connection. It's pretty straightforward
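A sketch of the subscription options just described (host, slot, and other values are illustrative):

```sql
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pub.example.com port=5432 dbname=test'
    PUBLICATION mypub
    WITH (enabled = false,        -- start out disabled
          slot_name = 'my_slot',  -- name the replication slot yourself
          copy_data = false);     -- skip the initial data copy

ALTER SUBSCRIPTION mysub ENABLE;  -- e.g. after restoring a pg_dump
ALTER SUBSCRIPTION mysub CONNECTION 'host=newhost port=5432 dbname=test';
```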
31:25
So this is kind of the fiddly part. You have your subscription here: when you create a subscription, it reaches out to the publication side and creates a slot there, so the publication side knows to hold on to the stuff to send; that's what we mentioned earlier. And
31:48
if you drop a subscription, it does the same thing: it goes out and drops the replication slot. So all of that is connected up; that's what we want. But if, maybe, the
32:00
publication side is gone, the host is no longer there, you can't reach it, then the DROP SUBSCRIPTION would fail, because it will try to reach out, it's not there, and it'll fail. So you have to use these kinds of things to tell it that the slot is no longer there, and then you can drop it. Those are kind of fiddly options that are necessary because you have these kinds of distributed failure cases. Question in the back? Yes
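The escape hatch for that distributed failure case, dropping a subscription whose publisher is unreachable, looks like this:

```sql
-- A plain DROP SUBSCRIPTION would try to drop the remote slot and fail.
-- First detach the slot (the subscription must be disabled to do this):
ALTER SUBSCRIPTION mysub DISABLE;
ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
DROP SUBSCRIPTION mysub;  -- now succeeds without contacting the publisher
```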
32:30
Well, I was pointing at you, actually; if you want to, go ahead. But okay, in the back, please. Yes
32:44
Yeah. If the slot already exists, the publisher will tell you it already exists, but you can also use that one. In certain cases you can create the slot manually and just use it: you create the subscription with create_slot = false and
33:04
give it a slot name, and it's going to use that one. If the slot already exists but you were thinking you were making a new one, then it's just a normal name conflict. The synchronization is independent of that. Yeah, question, you there in the middle? Yes
33:31
Synchronous commit works just the same way. Your question was about synchronous commit on
33:51
the publisher? Yes, synchronous commit works the same way here as it works for physical replication. So
34:03
what synchronous commit does is basically that the publisher, the sending side, waits with the commit until it's verified to be written on the receiving end, right? It works the same way here: there's feedback, and you will have the same behavior if you want to set that
34:26
So we have that. This is again kind of a fiddly part: you can change subscriptions, and what REFRESH PUBLICATION does is, if you add tables to the publication, you have to run that so it goes out and fetches the new tables
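The refresh step just described is a single command on the subscriber:

```sql
-- After tables were added to the publication on the sending side,
-- pick them up on the subscriber (including their initial data copy):
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
```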
34:42
So there are a lot of details; if you change things or move things around, you can update all that. Hopefully the documentation will cover it. And you can drop it, of course. So here are some configuration settings that are of interest. I showed wal_level = logical
35:02
That just means to include in the WAL the information needed for logical decoding. It's the same stuff you need for pglogical or BDR or contrib's test_decoding: the WAL contains additional information to make logical decoding possible. The next two there are
35:24
in parentheses because in Postgres 10 the defaults changed, so you don't actually have to set them, as you just saw in my demo. In previous releases, if you wanted to do any replication at all, you had to set max_wal_senders to something, and if you used replication slots,
35:40
which you probably should, you also had to set that to something. Now the defaults are, I think, 10 and 10, so for a small setup that should be enough. Then, obviously, if you want to connect to a remote host, you have to set listen_addresses and set up your pg_hba.conf so you can connect; that's just the normal stuff. So we tried in Postgres 10, not only because of this feature
36:05
but also because of pg_basebackup users and that kind of stuff, to make it a little bit simpler, so the defaults are better and it's easier to get started. Then, on the subscription end, as I showed, you don't actually have to set anything by default, but there are some options to,
36:24
basically, tell it how many of various things you want. Again, the replication slots you don't have to set by default. And now at least someone in this room is probably thinking: wait a second, why do we need replication slots on the subscriber? And you would be right, but the
36:44
replication slot setting also controls how many replication origin tracking slots you need; and if you don't know what that means, just forget about it and set it to the same value you have on the publisher. Worker processes are the same stuff as you use for parallel query:
37:02
it's just the same pool of extra processes. We need one process per subscription, plus initial table sync, plus one launcher process. And then you can also set how many you want specifically for logical replication
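The settings discussed in this section can be sketched as a configuration fragment. The values shown are the Postgres 10 defaults where the talk mentions them; the pg_hba.conf line (user and network) is made up for illustration:

```
# postgresql.conf on the publisher:
wal_level = logical          # must be raised; default is 'replica'
max_wal_senders = 10         # default in Postgres 10
max_replication_slots = 10   # default in Postgres 10
listen_addresses = '*'       # if subscribers connect from other hosts

# postgresql.conf on the subscriber:
max_replication_slots = 10               # also counts origin-tracking slots
max_worker_processes = 8                 # shared pool of background workers
max_logical_replication_workers = 4      # default; workers just for logical replication
max_sync_workers_per_subscription = 2    # default; parallel initial table sync

# pg_hba.conf on the publisher: a normal per-database entry
# (no 'replication' keyword needed for logical replication in 10):
# host  test  repuser  192.0.2.0/24  md5
```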
37:24
Parallel query has a separate pool with a similar setting, and so on. And then the last one: it's kind of nice that the initial table sync can run in parallel for each table. That's, for example, something that pglogical cannot do, because we had to do some internal fiddling with the
37:45
slot snapshots business. By default that's set to two, so you get two tables being copied in parallel; if you want to do it faster you can give it more, or less; that's your choice. Oh, security stuff, in terms of who is allowed to do this:
38:05
that's just the initial set that we came up with for Postgres 10. To create a publication you have to have CREATE privilege on the database, and in order to add a table you have to be the table owner; that hopefully makes sense. For a subscription you have to be a local superuser, and
38:23
the only reason for that is really that a subscription enables you to connect to a remote host, and you kind of want to restrict that somehow. It doesn't necessarily have to be superuser, but we haven't really agreed on a lower-level privilege bit that could represent that well. That's again room for future improvement. And
38:42
on the remote end, you have two users, basically, right: the user that creates the subscription, and then the user you connect as. They don't necessarily have to be the same, and the remote user has to be a replication user; that's the same thing as we had before. And then you need a pg_hba.conf entry; you know, you could use SCRAM, hopefully, in Postgres 10.
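To sketch the privilege rules just described — the database, table, role, and host names here are made up for illustration:

```sql
-- On the publisher: CREATE privilege on the database is needed
-- to create a publication ...
GRANT CREATE ON DATABASE srcdb TO pubowner;
CREATE PUBLICATION mypub;
-- ... and you must own a table to add it to the publication.
ALTER PUBLICATION mypub ADD TABLE mytable;

-- The user in the connection string needs the REPLICATION attribute
-- on the remote (publisher) side:
CREATE ROLE repuser LOGIN REPLICATION;

-- On the subscriber: only a superuser can create the subscription.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pubhost dbname=srcdb user=repuser'
    PUBLICATION mypub;
```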
39:04
Something there was changed. If you have never used logical replication before Postgres 10, then this will not affect you. But before Postgres 10, you had the replication keyword here. So if you use physical replication, you would have "host replication", and then a user and so on.
39:23
And that was the same for logical replication, and that was basically an error in the way it was designed. And so what we have changed is: now you have a per-database setting, as you would expect. (Question from the audience.) Yeah.
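To make that concrete: assuming a publishing database called srcdb and a replication user called repuser (hypothetical names), the pg_hba.conf difference looks roughly like this:

```
# Before Postgres 10: logical replication connections matched the
# special "replication" keyword, same as physical replication:
host  replication  repuser  192.0.2.0/24  md5

# Postgres 10: logical replication connections match the actual
# database name (physical replication still uses the keyword):
host  srcdb        repuser  192.0.2.0/24  scram-sha-256
```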
39:46
To create the subscription in that database you have to be a superuser, but then in the connection string you have a user, and that user on the remote end has to have the replication attribute,
40:00
which is kind of the same thing you have now if you want to do physical replication. (He's wondering about how this interrelates with row-level security.) I think that was looked into when Peter and Stephen talked about it, so we could check that again, but I
40:28
think that that was looked into. All right, so hopefully this is not too surprising. I showed these monitoring views earlier. Oh, we've got to hurry up a little bit. One thing
40:44
that's also new in Postgres 10, independent of this, is that background workers show up in pg_stat_activity. So you also see all that stuff happening there: you have one worker per subscription showing up in pg_stat_activity, and you can see what it's doing.
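For example, you might spot those workers like this — a sketch; the dedicated monitoring views shown earlier (pg_stat_subscription, pg_stat_replication) give more detail:

```sql
-- In Postgres 10, background workers appear in pg_stat_activity;
-- the backend_type column distinguishes them from client backends.
SELECT pid, backend_type, application_name, state
FROM pg_stat_activity
WHERE backend_type LIKE 'logical replication%';
```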
41:00
All right, so here is some other stuff that people already mentioned, which hopefully all just works. So synchronous replication works, as I mentioned, and basically we didn't have to do anything for that; all you have to do is send the feedback when you're done, and all the really interesting action happens on the sending end. Cascading is possible — not because it's a cascading system like Slony, but you can
41:25
have a publication on the same database as a subscription. It will just do the whole decoding again and keep sending it on; there's no reason why that wouldn't work. Triggers and constraints work, and writing to subscribed tables, as we discussed earlier, works if you want it to.
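A sketch of that cascading arrangement: node B subscribes to node A and republishes the same table onward to node C. All names here are hypothetical:

```sql
-- On node B (the middle of the cascade): subscribe to A ...
CREATE SUBSCRIPTION sub_from_a
    CONNECTION 'host=node_a dbname=mydb user=repuser'
    PUBLICATION pub_a;

-- ... and publish the same table onward. Changes applied by the
-- subscription are decoded again and sent on to node C.
CREATE PUBLICATION pub_b FOR TABLE mytable;

-- On node C:
CREATE SUBSCRIPTION sub_from_b
    CONNECTION 'host=node_b dbname=mydb user=repuser'
    PUBLICATION pub_b;
```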
41:44
If you don't want that, there are facilities to prevent it — you know, triggers, permissions, things like that. We didn't really want to go into that. So, things that don't exist yet — and we already talked about some of them, like filtering and column mapping; all that stuff could exist.
42:05
Basically, these are sort of the big ones that we know of right now. Schemas are not replicated. We have an infrastructure for that with event triggers; you just kind of have to write event triggers to package that all up, ship it over, unpack it and apply it there.
42:20
So that's something someone can work on. Sequences are not replicated, because the support for that in the logical decoding API doesn't exist. So, you know, if you have a serial column, the data in the table obviously ships around, but the sequence itself will just still sit at the start value.
42:41
So if you want to do sort of an upgrade or failover kind of scenario, you have to use maybe pg_dump to just ship your sequence data over. There's also better support for that in pg_dump in Postgres 10, so it's a bit easier to
43:01
copy just the sequence setval commands. But that's obviously something we want to fix eventually. So again, if you lived through the Slony evolution, you will also remember that Slony at one point did not replicate TRUNCATE, and we are in the same situation now, until we figure out how to do that. And
43:22
it's also not quite clear to me what that would actually mean. Because if you delete everything from a table, we will ship individual delete records over, so we will delete exactly the same rows on the receiving end; if they don't match, we don't delete them, and if there are additional rows, we just leave them there. So it's per record. With TRUNCATE, you know:
43:42
what happens if you truncate the table here? Do you want to truncate everything there, or just the matching rows? There could be multiple opinions on that, so it's not clear that there's a single answer. And right now we only replicate base table to base table, so that means you can't replicate from a view to a view, for example, or
44:03
from or into a foreign table; that doesn't work. Anything involving materialized views doesn't work. If you want to replicate between partitioned tables, you can only go from the same partition to the same partition, and we talked about that yesterday in the unconference; that's obviously something we want to fix. It was kind of interesting that partitioning and logical replication
44:24
kind of showed up in Postgres 10 almost around the same time, not really knowing about each other ahead of time. So we've got both of these complete features, but they didn't really work together all that well. That's just, you know, how it happens sometimes. So that's definitely something we want to address,
44:43
probably in Postgres 11. So if you have the same partition hierarchy, it'll work, but if you want to change the partition hierarchy, then this is not going to work. (Question in the back: is it possible to...)
45:01
Well, you can't create catalog objects on a read replica, so you can't create publications on a standby.
45:21
Yeah, and then there's a whole bunch of issues, which are all mentioned on this slide, that are addressed at exactly this point. Physical replication and logical replication don't work together all that well in certain cases, and
45:45
that's really unfortunate, but there are some really tricky issues there. Let's say you have a physical replication pair, right, just the normal stuff you have now, but you also have a logical replication connection hanging off the master. You fail over: you can't just move that connection over,
46:02
because the replication slots are not replicated, so there's no start point. There are fiddly ways to work around that, but it's weird, and that needs to be addressed; right now, that doesn't work. All right, so I think we've got to wrap up here.
46:21
So, as I mentioned at the beginning, physical replication is not going away, because it just works and it's much easier. I mentioned replica identity handling; that's, you know, the issue with the primary keys. There's an issue with long-running transactions; that's just the way logical decoding works: it doesn't decode your transaction until it's committed. So with a long-running transaction, there will be sort of an appearance of lag, because it doesn't
46:44
start replicating until you commit, whereas physical replication just ships the bits and bytes over and interprets them later; we can't do that here. I mentioned how pg_dump works: if you dump a subscription, it'll restore disabled, and then you can decide what to do with it.
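So after restoring a dump you decide yourself whether the subscription should start streaming again — a sketch, with hypothetical subscription and publication names:

```sql
-- A subscription restored by pg_dump comes back disabled.
-- Re-enable it once you've decided it should resume:
ALTER SUBSCRIPTION mysub ENABLE;

-- When instead recreating a subscription against a subscriber that
-- already has the data, you can skip the initial table copy:
-- CREATE SUBSCRIPTION mysub
--     CONNECTION 'host=pubhost dbname=srcdb user=repuser'
--     PUBLICATION mypub
--     WITH (copy_data = false);
```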
47:04
And you can't really do any failover/failback with this; that doesn't really make sense, so it's not really an HA solution. You can have a copy running and fail over to that, but then if you want to go back, there's no real way to define where to pick up again, right? So that doesn't work.
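One practical note on the replica identity point from the wrap-up: for a table without a primary key, updates and deletes can't be replicated until you tell Postgres what identifies a row. A sketch, with hypothetical table and index names:

```sql
-- By default the primary key is the replica identity.
-- Without one, you can point at a suitable unique index ...
ALTER TABLE events REPLICA IDENTITY USING INDEX events_uniq_idx;

-- ... or, as a last resort, log the whole old row (more WAL):
ALTER TABLE events REPLICA IDENTITY FULL;
```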
47:22
All right, so what are we going to do in the future? Sure, fix all that, in Postgres 11 and beyond. This is very similar to Robert's talk earlier: just fix all these things. I mentioned we want to get to the point of multi-master, obviously, and we don't want to keep BDR around indefinitely, so that will come — but not in 11.
47:44
And fixing the integration with physical replication that I alluded to: you know, when you fail over, the replication slot information should somehow be transported; we have to figure out how to do that. And some more fine-grained security might be nice, so you don't have to be superuser for everything.
48:01
We've just got to figure out, you know, what kind of attributes you want to expose to user commands, that kind of stuff. And all these good, fun ideas that everybody had here about filtering, transforming, expressions and so on — that's certainly doable, and could be done pretty much independently of anything else. You just have to have a way to kind of
48:25
put that into the code — say, you know, if this matches my filter, then do that — so that shouldn't be terribly hard; it's just got to be within reason and go through code review. All right, good. So we made it in time.
48:40
I'm just sort of the spokesperson here; I did some reviewing and committing of all this. These are some people who did a lot of the work. Petr Jelinek was the main author of this code; he also maintains pglogical, so he knows the most about this. Andres, of course, is well known in the community.
49:01
He did a lot of reviewing this time around, and also, you know, a lot of the prep work in all the previous releases. We had Erik Rijkers doing all the testing on the mailing list. Craig is the BDR author, and he had a lot of feedback. And
49:20
Masahiko — I just got to meet him yesterday here — did a lot of testing and then reviewing lately. And, you know, a lot of other people, obviously. Right now it's committed, it's out there, it's community property. Obviously, some of these people are associated with 2ndQuadrant, as am I, so we're going to keep working on that, because that's kind of our thing.
49:43
But other people have already started pitching in, so it looks pretty good. And, you know, beta 1 is out, so please try it out. I think we're not going to have questions, but you can grab me privately afterwards. So thanks for coming, and let's go to lightning talks.