Heavy Duty Backup with PgBackRest

Video in TIB AV-Portal: Heavy Duty Backup with PgBackRest

Formal Metadata

Heavy Duty Backup with PgBackRest
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Production Place
Ottawa, Canada

Content Metadata

PgBackRest is a backup system developed at Resonate and open sourced to address issues around the backup of databases that measure in the tens of terabytes. It supports per-file checksums, compression, partial/failed backup resume, high-performance parallel transfer, async archiving, tablespaces, expiration, full/differential/incremental backups, local/remote operation via SSH, hard-linking, and more. PgBackRest is written in Perl and does not depend on rsync or tar but instead performs its own deltas, which gives it maximum flexibility. This talk will introduce the features, give sample configurations, and talk about design philosophy.

PgBackRest aims to be a simple backup and restore system that can seamlessly scale up to the largest databases and workloads. Instead of relying on traditional backup tools like tar and rsync, PgBackRest implements all backup features internally and uses a custom protocol for communicating with remote systems. Removing the reliance on tar and rsync allows better solutions to database-specific backup issues. The custom remote protocol limits the types of connections that are required to perform a backup, which increases security. Each thread requires only one SSH connection for remote backups.

Primary PgBackRest features:

- Local or remote backup
- Multi-threaded backup/restore for performance
- Checksums
- Safe backups (checks that logs required for consistency are present before the backup completes)
- Full, differential, and incremental backups
- Backup rotation (and minimum retention rules with optional separate retention for archive)
- In-stream compression/decompression
- Archiving and retrieval of logs for replicas/restores built in
- Async archiving for very busy systems (including space limits)
- Backup directories are consistent Postgres clusters (when hardlinks are on and compression is off)
- Tablespace support
- Restore delta option
- Restore using timestamp/size or checksum
- Restore remapping base/tablespaces
Can everyone hear me? I'll take that as a yes. OK, my name is David Steele and our topic today is Heavy Duty Backup with pgBackRest. pgBackRest is a new piece of backup software that you have probably not heard of before, and we're going to run through the features and see how it works.

I'm a data architect at Crunchy Data Solutions; that's pretty recent, I've been working there since April, but I have been developing with Postgres since 1999.
This is our agenda. Obviously at some point I'll talk about pgBackRest, but the first thing we're going to cover is backup in general: why back up at all. I want to make the case for that first, then talk about how to back up once you've decided you definitely need to, and then we'll get into pgBackRest's design, performance, and philosophy, and finish with a demo.

So first, why back up? First and foremost, no amount of redundancy can protect you from everything, at least once your installation grows. If your master fails you are going to have some problems, and you want to be sure you can recover to a standby. You might have a multi-machine failure, or you might discover corruption later on; all these things can happen, so you need backups, and better yet continuous backups, and that complements replication. With replication, everyone wants to use streaming, but a replica can fall far enough behind that the master has recycled the WAL the replica needs, so the replica has to pull WAL segments from someplace, and that's where a WAL archive comes in handy. Also, if you're bringing up a new replica, rather than running pg_basebackup against the master you can restore the last backup from the backup server, bring it up, let it stream the WAL it needs, and it will sync up with the master and become a streaming replica, with as little load on the master as possible.

The next thing, of course, is corruption. Corruption can be caused by hardware or software, and the really hard part is actually figuring out that it happened. Backups will help you recover from corruption, but if you don't discover it until you're way down the road, you've got a bit of a problem. This is helped by page checksums, but there's still no system-wide way to walk the entire database and see whether you have corruption or not. Hopefully that's coming, and if not, it may be coming in pgBackRest. So those are the sorts of things, corruption, replication needs, day-to-day operations, where things can happen.

The next thing is accidents. You drop a table by accident, or an update script drops it, or someone was just messing around in production, but you ran the script and the table is gone. What do you do? Or somehow you delete your most important account, and you have lots of ON DELETE CASCADEs in there, so now all that state is gone and you need to be able to get it back. Backups can help you with this: you bring up the backup, export the data you need, load it back into your production database, and you're good to go. Replicas don't help you here, because if you drop a table on the master that gets replicated out quickly. You might have replication delay, and that may save you, or you may not discover until hours later that the table or the account is gone. Backups are great for that.

Another thing backups are good for is development. This may not apply in all environments because of privacy issues, if you're in government or health care or whatever, but copies of production databases can make great development or staging databases. You can bring a copy of production over to staging, make sure everything works, and then follow the same procedure on production later. Being able to get an exact copy of production onto another database server quickly is a good thing. Another is reporting. Reporting can obviously be done on streaming replicas, but sometimes you need access to temp tables and other things you can't use on a replica, so one thing you can do is a daily reporting server: you refresh the server at midnight every day and bring it up as a normal master, so that people can use temp tables, have write access, and things like that, and you refresh it every day. If you can stand doing reporting without the current day's data, and a lot of people can, that works well.

And the last use is forensics. Sometimes there's data that was actually deleted on purpose, but you want to go back and look at it. It could be something malicious you're trying to track down, or it could just be something interesting: we got rid of that on purpose, but now we want it back. You can use backups for that, as far back in time as your backups go. OK, so the next thing
is how to back up, of course. I'll talk about pgBackRest later, but in general, if you're in this room you're either doing backups or considering it, so the next question is how. The first answer, of course, is pg_dump, especially for small databases. It's what most people start with: it's very simple, and doing restores is very easy. But pg_dump has a couple of problems. One is that it doesn't scale well. If your database goes beyond even, say, a gigabyte, restores can be quite painful; beyond 100 gigabytes they become insanely painful. pg_dump doesn't actually dump index data, so indexes have to be rebuilt when you bring the database back. The other problem with pg_dump is that it really represents only a single point in time in your database. Say you're doing a daily pg_dump at midnight and your server crashes at 10 pm: everything since midnight is lost, so you restore that pg_dump and you've lost 22 hours of data. There's no way to play forward and capture the changes that happened during the day. And lastly, pg_dump takes a lot of locks. If you're doing partition creation, trigger rewriting, and things like that, a lot of it can't go on while pg_dump is running, because pg_dump takes locks on just about everything in the database, and then you're stuck: you'll be holding up your partition creation until the end of the pg_dump. On a big database, that can be a real problem.

The next, of course, is pg_basebackup, also built into Postgres. This is a pretty good tool, actually; it does a lot for you. It gets you a backup and copies all the WAL you need to make the backup consistent. But there are a couple of problems here too. pg_basebackup always takes a full backup, which can be a problem if you've got a very large database. The other thing is that it's still not an archive management solution: even though you get the WAL segments needed to make the data consistent, it does not give you the WAL segments you need to play forward from there, so you still have to have some kind of archive management in place.

You can even do this with manual backups: a cold backup where you copy the files across yourself, or pg_start_backup and pg_stop_backup around your own copy, or your own Perl scripts wrapped around that. Again, there's a lot more complexity here than you might think, and you still have the question of what to do with your WAL. Then of course you've got the various third-party tools, Barman, WAL-E, and so on. [Show of hands: a few people are using one of those, a few are using something they rolled themselves, and very little of the room is using pgBackRest at this point.] And that's the last option. [In response to a question:] Sorry, I was trying to list the tools I know are free and open source; if that one is part of the EDB package I can check up on it and I'm happy to add it to the list, but I didn't think it was a free open-source product, which is why I left it off. The last one is pgBackRest, a new way to back up, and we're going to talk about it and see why I think it's better than the alternatives that are out there currently. OK, so the first thing:
almost everyone uses rsync to do backups. rsync is great: it's so easy, so convenient, it's beguiling. You want to use rsync because it does 95% of the job for you; all you really have to do is call pg_start_backup, run rsync, call pg_stop_backup, and you're good to go. The problem is that rsync has a lot of limitations, and one of the original goals with pgBackRest was to get away from rsync, away from tar, away from all these tools that would limit what we could do in the future.

So let's look at some of the limitations. First of all, rsync is single-threaded. This is probably the biggest problem. You could of course multi-thread rsync by keeping track of everything yourself, but once you've done that you've done most of what pgBackRest does, so why keep rsync at that point? The other problem is rsync's one-second timestamp resolution. If you run two rsyncs in close proximity, or you get really unlucky, your incremental backup, the second rsync, may actually miss files. This happens when a file is modified in the same second in which rsync originally copied it. [Audience: can't you use checksums?] You can use checksums, and that will always work, but with your 15-terabyte database it's extremely painful. If your database is large enough you just have to trust the timestamps, or you might as well do a full backup; if you've got to checksum the entire database, incrementals become less attractive, except for the space savings of course.

The next thing about rsync is no destination compression. This is a big deal. When rsync gets the data to the destination it is not compressed. Furthermore, if you want to do incrementals with rsync using --link-dest, the previous backup also has to be uncompressed. You could obviously move the backup aside and compress everything afterwards, but the incremental requires the previous backup to be uncompressed, so that's two uncompressed copies of your 15-terabyte database lying around, which may well be unacceptable. One thing I started thinking about is that a lot of Postgres installations are small ones, running on just a VM someplace; not everyone is running big metal. I wanted a solution that works for everybody and scales all the way up. I have customers doing uncompressed backups to ZFS who can bring the backups up as clusters on ZFS without doing any kind of restore at all. So there are lots of options, but the idea was to work for the simplest setup and scale all the way up to the biggest installations.

So anyway, pgBackRest, in line with this philosophy, doesn't use rsync, tar, or any other tools of that type. It has its own protocol, which supports local and remote operation, and it solves the timestamp resolution issue. It turns out this is a fairly simple thing to do, and I've been recommending it to the rsync folks as well: basically, after you've built the manifest, you just wait for the remainder of the current second and then start copying. There are some other ways to handle this that have been suggested to me, but I think that one is the simplest, and I'm sure that it works.
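The wait-until-the-next-second trick is easy to sketch. Here is an illustrative Python version (pgBackRest itself is written in Perl; the function names here are mine): build the manifest, sleep for the remainder of the current second, and only then start copying, so no file copied can share its mtime second with a later modification that the manifest would miss.

```python
import time

def seconds_until_next_tick(now: float) -> float:
    """Remainder of the current second. Sleeping this long after the
    manifest is built guarantees that copying starts in a strictly
    later second than any mtime recorded in the manifest."""
    return 1.0 - (now % 1.0)

def backup_with_tick_wait(build_manifest, copy_files):
    manifest = build_manifest()                      # file list + mtimes
    time.sleep(seconds_until_next_tick(time.time())) # wait out this second
    copy_files(manifest)                             # a same-second change is
    return manifest                                  # now copied; a later one
                                                     # gets a newer mtime

# The arithmetic: manifest finished at t = 1000.25 -> wait 0.75 s.
assert abs(seconds_until_next_tick(1000.25) - 0.75) < 1e-9
```

With this in place, the next incremental can trust timestamps alone: any file whose mtime falls in or after the copy second is re-copied, closing the one-second race that rsync suffers from.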
So let's go through some of the features. Compression and checksums: compression is performed and checksums are calculated in stream. I try not to do anything in place, so if I'm archiving a WAL file or copying a file, I don't checksum it in place first and then start copying; everything is done in the stream. Obviously that's very efficient, and it also makes things very accurate, because the size of a file can change while you're copying it, and what you want to know is that once I've got the copy, I have the correct checksum and the correct size and I can store them on the remote.

There's also asynchronous compression and transfer for WAL archiving, so if you have a system that's generating WAL really quickly, you can offload that work and do it separately, or you can push the WAL synchronously. pgBackRest supports remote or local operation, and you don't have to do anything crazy: it will natively work locally, so if you want to back up to, say, an NFS mount, that works, or you can back up to a backup server; remote operation requires only SSH. pgBackRest does parallel compression and transfer. This is an important feature, and one of the things that originally pushed us away from rsync: now we can parallelize, you can dedicate as many cores to compression as you want, and those big backups can go a lot faster.

Then there's full, differential, and incremental support. [Audience question about how the manifest interacts with copying.] Basically, just like rsync, I build the manifest of the entire backup, the list of things to copy, right at the very beginning. Then I wait. Let's say that happens at exactly 10:00:10 pm: after the manifest is built, whatever second I'm currently in, I wait until the next second before I start copying. What that means is that if a file is modified during that time, I'll pick up the modification, because I don't start copying until the next second. If it gets modified after that, the timestamp will of course get updated; it may not get updated immediately, since some file systems won't update the timestamp until a sync, but I don't really care about that during this backup. All I care about is that the timestamp is modified before the next backup, and pg_stop_backup does a sync, so by the time you start the next backup those timestamps should definitely be on disk. Once the manifest is built I never look at any file metadata again, so I'm not concerned about the timestamp being updated late; I just have to know it's going to happen sometime down the road.

So yes, differential and incremental support. A lot of people ask what a differential is: a differential is just like an incremental, but it's always relative to the last full backup. Differentials are more flexible because they can be expired on their own schedule, whereas incrementals can depend on a long chain of backups, so for some applications it's better to use differentials. Incrementals are good if you have a huge amount of data changing every day and you really can't afford to accumulate that through the week between your fulls.

There are backup and archive expiration policies: you can define how many full backups you want and how many differentials, and you can define whether WAL archive expires, whether you keep archive for all full backups or just some of them. It will always keep the archive needed to make a database consistent, even if you're expiring archive for the older backups.

Backups are also resumable. If you're halfway through a 15-terabyte backup and it dies, or you have to kill it, or you have to bring the machine down, you can actually resume that backup. It will re-checksum everything in the backup directory to make sure it's kosher and then continue on, and it does those checksums in parallel: if you're running four threads or eight threads, it will checksum across those cores to make that as fast as possible.

You can also do hard linking, and this can be handy if you're doing uncompressed backups on ZFS. What this means, and I'll talk about it more in a minute, is that the backup directory structure looks like a consistent Postgres cluster, so you can actually point Postgres directly at the backup and bring it up: drop in a little recovery.conf and it will start running. You wouldn't want to do that without taking a snapshot first, of course. Tablespaces work here too, since the links are rewritten; I'll explain that in a second.

pgBackRest works with Postgres 8.3 and above. I just put in some experimental support for 9.5, but with some of the new recovery options I don't really have everything working in the regression tests yet. I do a lot of recovery scenarios, so the regression tests are currently broken for 9.5, but by release they will of course be working.
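The retention rules described above can be illustrated with a toy model. Everything here (names, tuple layout) is invented for illustration and is not pgBackRest's internal representation: keep the newest N full backups, and let every differential or incremental live or die with the full backup it depends on.

```python
# Toy expiration model: each backup is (label, type, parent_full_label),
# ordered oldest to newest; type is 'full', 'diff', or 'incr'.

def expire(backups, retention_full):
    """Return the labels that survive when only the newest
    `retention_full` full backups are retained. A diff/incr is kept
    only if its parent full is kept, since it is useless without it."""
    fulls = [label for label, btype, _ in backups if btype == "full"]
    keep_fulls = set(fulls[-retention_full:])   # newest N fulls survive
    kept = []
    for label, btype, parent in backups:
        if btype == "full":
            if label in keep_fulls:
                kept.append(label)
        elif parent in keep_fulls:              # chained backups follow
            kept.append(label)                  # their full backup
    return kept

history = [
    ("F1", "full", None),
    ("D1", "diff", "F1"),
    ("F2", "full", None),
    ("I1", "incr", "F2"),
]
# Retaining one full expires F1 and drags D1 down with it.
assert expire(history, retention_full=1) == ["F2", "I1"]
```

The same dependency logic explains the archive rule from the talk: WAL needed to make a retained backup consistent must be kept, while archive belonging only to expired backups can go.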
The backup structure is a really clear and simple layout. You have a base directory, which is the Postgres data directory, and a tablespace directory where the tablespaces go, and then there's a file called backup.manifest, which is a plain-text file, both human readable and machine readable, with information about all the files: checksums, timestamps, and so on. What I do is rewrite the tablespace links to be relative, so we can move the backup directories around and they still continue to work; Postgres is perfectly happy starting up that way. This isn't meant to be a production database, it's a backup, but for customers who have very, very large databases that would be almost impossible to restore anywhere, it's great to be able to bring the database up in place. You'd take a snapshot first, of course, bring the database up in place, let it do recovery, and then you can do exports, you can do forensics, you can do whatever you want at that point, without having to copy it someplace, which would be the real problem. [Audience question.] Yes, Postgres can be started directly in the backup directory if no compression is used. Another feature is that you can tell pgBackRest to copy the archived WAL needed to make the backup consistent directly into pg_xlog, so you don't even have to write a recovery.conf. If you want point-in-time recovery you still have to write one, but if all you need is a consistent database, the WAL will be right there. Very convenient. It's an optional feature; it doesn't do that all the time, because of course it takes up more space that you don't necessarily want. OK.
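To make the idea of a plain-text, per-file manifest concrete, here is a minimal Python sketch. The record format below is invented for illustration; pgBackRest's real backup.manifest has its own layout. The point is only that path, size, mtime, and checksum for every file live together in one human-readable file, which is what lets a later backup decide what to copy without touching file contents.

```python
import hashlib
import os

def manifest_line(path: str) -> str:
    """One record per file: path, size, integer mtime, and SHA-1 of the
    contents, tab-separated (an illustrative format, not pgBackRest's
    actual manifest layout)."""
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    return f"{path}\t{st.st_size}\t{int(st.st_mtime)}\t{digest}"
```

A subsequent incremental can compare current size and mtime (or, when paranoid, the checksum) against these records and re-copy only the files that differ.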
so let's look at some of the performance numbers so consists is important as part of the reason why this is done is to back up very very large databases and do it quickly so on so is a comparison and our same of the 2 programs work slightly differently so sometimes getting them working in the same mode was all weird in some ways of the 1st the 1st example works quite well the so in this case we're doing of 1 thread were doing on level 3 network compression of that section the default for back rests on a person defaults to 6 on but I find that level 3 actually works very well because you log-compression there's very fast so if your destination is uncompressed then the defaults work pretty well but all this is configurable of course Jesus Jesus yeah I I would it just because I wanted on backup directories just to be very accessible to people and so you know when if you are using compression and everything just end-user format in the directory so you can actually just do a recursive on Jesus and then you got your data back on so I figured that there is some better form of compression algorithms out there for speed up on but I thought I'd just go with the old standby this is like something that might not be made optional in the future but for now it works pretty well so here we can see on PG backrest doesn't know that so this is a young untrained member the size of think it was half gig k relation written this down to the 3 that but in the DB was but is big enough to you know get some useful benchmarks of of and for 500 In the case anyway and so Pakistan at 1 141 seconds interesting today in 124 so sink is clearly faster backrest is written in Perl on so you know I'm using the of z live in the store buffer management there's particle management all this kind of stuff so unfortunately not as fast but when but the next thing we end up with this sum multi-thread to events so the settings of the same still doing all 3 compression and were doing destination no destination 
compression. With two threads we do it in 84 seconds — about 1.5 times faster than rsync, even on my little dual-core machine, just because of the multi-threading. The next test was one thread with network compression at level 6 and destination compression at level 6. Here backrest came in at 334.4 seconds and rsync at 510. You might say: hey, rsync doesn't do destination compression — how did you test that? Well, I compressed the files on the destination after the rsync, just to give you an idea of how much of an advantage in-stream compression gives you over compressing on the destination. And the last test was two threads with the same settings — network compression and destination compression both at 6 — and backrest was 2.93 times faster than rsync. So it scales pretty well. These benchmarks were made on my laptop, which only has two cores, and of course in this case you've also got the SSH processes running, plus the compression and decompression. So one or two threads worked pretty well, but when I went beyond two threads, performance just went down and down — whereas when I tested a similar thing on a much bigger machine, I got much better scaling in that direction. I have clients running up to eight threads for big databases, and you can keep the cores busy doing compression if you run it that way. [Question.] There are two things. One is how the compression is handled: if destination compression is set to level 6, I use that same compression for the network. So basically what happens is that the file is compressed on the source side, goes across the network, and is stored — that's it, there's no recompression on the other side. You just keep that compression stream, store it, take the checksum, and you're done. The checksum is taken in-stream, so you don't have to go and checksum the file at the end.
And of course, checksumming a file on disk before you copy it would be dangerous anyway, because that file could be changing while you're reading it. The size is calculated at the same time, so the file goes across once and is stored. Now, if you have destination compression turned off, the default is to do level-3 network compression, and on the other side it's uncompressed and stored for you — but the checksums are still calculated in-stream, so all the destination does is decompress and store. So you can see that with one thread you get some advantage from the in-stream compression, and when you go to two threads you start multiplying that. It's not quite a full multiple — 1.5 × 2 should be 3.0 — because once you start running multiple threads there's synchronization and messaging being passed around, so you get a slight reduction per thread, but it multiplies up very well. Sorry, say again? Yes — that's a big part of it. Because I'm doing my own compression and not relying on SSH compression, I can keep the compressed data on the destination side without having to recompress. That's the big advantage: you compress one time, you take checksums one time, you do everything one time, and that's it. If you want it decompressed on the destination, that's fine — it'll compress and decompress — but the usual case is to keep it compressed on the destination side, and that's where the real benefit is: just not doing things multiple times.
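The compress-once, verify-by-checksum flow can be sketched with standard tools — this is my illustration of the idea, not pgBackRest's actual code:

```shell
# Compress a file once at the "source", then verify the stored copy by
# decompressing in a stream and comparing checksums, as the talk describes.
head -c 1048576 /dev/urandom > datafile
sha1sum datafile | cut -d' ' -f1 > expected.sha1   # checksum of the original
gzip -3 < datafile > datafile.gz                   # compressed once, stored compressed
gunzip -c datafile.gz | sha1sum | cut -d' ' -f1 > restored.sha1
cmp -s expected.sha1 restored.sha1 && echo "checksum verified"
```

The stored copy stays compressed; integrity is checked by streaming it through the decompressor and checksumming on the way, so the data is never read twice.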
Exactly — and we've seen this in production environments: if you try to parallelize without compressing, you can flood the network pretty quickly, especially when you have a 32-shard cluster doing backups and pushing them out over a network that was originally one gigabit. Even a ten-gigabit network can get flooded pretty easily by that kind of backup. So now let me talk for a couple of minutes about living backups. We talked earlier about why you need to back up, but I also have this philosophy of living backups. It's extremely important that your backups work when you need them, and one of the things you can do to make sure that's true is to subscribe to this philosophy: find a way to use your backups. Don't just have something that's part of the DR plan — find a daily way to use them. For instance: creating new replicas, offline reporting, offline data archiving — so instead of dumping tables from the production database, you can dump them from the backup — refreshing development systems, those sorts of things. Because in my opinion, code that isn't used will not work when you need it. And if people are using the backups on a daily basis, they'll be familiar with the procedures: familiar with the backup tools, familiar with the restore tools. It becomes a normal thing that they deal with, so when something big happens, everyone knows how to deal with it, you've got documented procedures in place, and you know your backups work — because you're using them every day, and if they don't work, you get alarms. Beyond that, there are things like doing regularly scheduled failovers instead of failing over only when you have a disaster, just to test these techniques and make sure everything actually works. If you don't do this, then when you actually go to use your backups, you
may find that the disk wasn't mounted somehow, or that you thought you should have been getting alarms but weren't, because some counter was messed up or the monitoring was misconfigured — all those sorts of things. So what are you losing here? If you find daily uses for your backups, you catch these problems; otherwise you take the chance of losing your data. The good thing to do is to find good ways to use your backups and make them part of the life cycle of your system, and then when things do
go wrong, you'll know what to do. (And yes, that's a picture of Ottawa on the slide — I try to incorporate one whenever I can.)
Now, I like to script my demos. This is real — we're actually going to go through and do real backups and restores. What I like to do is write a Perl program that goes through all the steps, rather than standing up here trying to type and typo-ing everything. If you go to the GitHub site you can get this program, and there's also a sample output, demo.out — that's the program run on my machine. So if you just want to see the commands, you can run through that and take a look at the commands and their output. The first command encoded in here creates a cluster, so
we can do some testing. Then we're going to create a backrest conf file. You can run backrest entirely from the command line, but it's a lot nicer to create a configuration file for things like the repository location, the database location, psql settings if you're running on a special port — all that kind of stuff — so you don't have to retype it at the command line every time. If you're using Chef or something like that, it can write everything for you, and then maybe pure command-line operation makes sense; but since recovery is often a manual operation, you want it to be as simple as possible. The last thing we do here is create the repository directory.
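A minimal configuration file of the kind described might look roughly like this — the option names follow the 2015-era pg_backrest.conf format as I understand it, and the paths are made up, so treat this as illustrative rather than authoritative:

```shell
# Write a minimal, hypothetical pg_backrest.conf: a global section with
# the repository path, and a stanza section with the database path.
cat > pg_backrest.conf <<'EOF'
[global:general]
repo-path=/var/lib/backrest

[main]
db-path=/var/lib/pgsql/9.4/data
EOF
cat pg_backrest.conf
```

The stanza name (`main` here) is what later commands reference with `--stanza=main`.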
Backrest has a default repository location, but it won't create it for you — you do have to create the repo directory yourself, and if it's not in the standard location, you point to it in your conf file. So now let's back
up. The command is: pg_backrest --stanza=main --type=full backup. So now we've made a backup, and we can see our first full backup here. We've also got a backup.info file, which contains information about this backup and subsequent backups, and "latest" is just a link that points to the latest backup. It's pg_start_backup, pg_stop_backup, copy the files — so it's a physical backup, not a logical backup. I'd like to add some logical pointers eventually, especially for large systems, but for now it's just a physical backup. We can take a look at the sizes: the database is 51MB and our backup is 4.9MB. You shouldn't generally expect that kind of compression — since I just created the cluster, the files are almost all zeroed out, so it makes a very small, very efficient backup. Next is the backup.info file. I'm not going to go through all of it, but there's a whole bunch of information here about the backup: archive start and stop, the actual size of the original database, its size in the repo, and so on — and we'll see a nicer form of this later, since you can export all this data as JSON. [Question about what the threads do.] The threads are enabled just to checksum and copy files. The main part of the program — all the control logic — is done in a single master process, and then the threads are brought up. What I do is build a manifest and segregate it by tablespace, so that the threads each start on a different tablespace: with four tablespaces and four threads, each one initially works on its own tablespace, and as you start to run out of files, they rebalance and start working on the other tablespaces as well. Generally speaking, compression is your biggest bottleneck, not I/O, so you can actually have multiple threads running on a
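The point about a freshly created cluster compressing extremely well — its files are mostly zeroed — is easy to reproduce with gzip:

```shell
# Mostly-zeroed files (like a freshly initialized cluster's) compress
# to a tiny fraction of their size, which is why the 51MB demo cluster
# shrinks to under 5MB in the repo.
head -c 1048576 /dev/zero > zeros     # 1 MiB of zero bytes
gzip -6 < zeros > zeros.gz
wc -c < zeros.gz                      # a tiny fraction of the 1 MiB input
```

Real, populated databases will compress far less than this, so treat the demo's ratio as a best case.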
single tablespace — you wouldn't want all eight threads on one tablespace initially, but in practice it works out fine. Oh — I should have mentioned a point back here: in this case, what I've done
is turn archiving on — wal_level = archive — and here's the archive command: pg_backrest archive-push. Backrest has WAL management built in; you're not scp'ing files somewhere and having something else pick them up. Backrest takes the archived WAL from cradle to grave. It can be asynchronous — stored locally and pushed up later — though we're not doing asynchronous archiving here. The WAL is checksummed as soon as it's pulled off disk, and that checksum follows it for its entire lifetime, so you can verify it at any point. So that's the setup. The documentation will of course tell you the basic setup of backrest with PostgreSQL, but the good thing about this demo is that it shows you every step of setting
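The postgresql.conf settings being described might look roughly like this — the binary path and exact flags are my guesses at the era's syntax, so verify against your version's documentation:

```shell
# Append illustrative archiving settings to a postgresql.conf
# (paths and option spellings are hypothetical, not authoritative).
cat >> postgresql.conf <<'EOF'
wal_level = archive
archive_mode = on
archive_command = '/usr/bin/pg_backrest --stanza=main archive-push %p'
EOF
grep archive_command postgresql.conf
```

Postgres substitutes `%p` with the path of each completed WAL segment, so every segment is handed to backrest as soon as it is ready.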
this thing up and running, including creating the Postgres cluster itself. One more point: there's a lot of information about your database stored here. If you misconfigure the system and start doing a backup from system B into system A's repo, backrest will fail and tell you: hey, these systems don't match, the database versions don't match, something is horribly wrong. It also does that for WAL — it reads the WAL headers, for versions 8.3 through 9.5, to make sure you're not archiving the wrong cluster's WAL. Sometimes you fat-finger a configuration, or copy a conf file from one place to another, and suddenly you're mixing WAL archives — that would be very bad. [Question: have you compared it with barman or WAL-E?] I haven't directly, and the reason is partly that I try not to be controversial. But barman is based on rsync, so you can get an idea by comparing against rsync, as we just did. The same goes to some extent for the others, though I think it's fairly clear how it works out, because backrest does multi-threaded backups with in-stream compression and leaves you with a compressed copy — I think you'll find the others are close to single-threaded rsync performance. I should do more benchmarking and compare them directly. WAL-E is a very special use case, because it's compressing locally and pushing to S3, so it's really hard to compare the different use cases — until I have S3 support it would be kind of disingenuous to make a comparison. Backrest is faster, but that doesn't mean WAL-E doesn't have its own value.
[Question: backups from standbys?] Yes — one of the things I'll be working on this summer is backups from standbys, which is relatively easy to do; it's more a configuration problem than anything else: backrest has to be aware of the standby, and a few other things like that. But generally speaking, the whole backup process is very CPU bound, so it doesn't have to be as much of a problem as you'd think: if you've got a machine with 32 cores, then on the weekend hopefully you can spare some of them to do the full backup. And the thing about backup is that backup can be slow. If you're doing one full backup a week and maybe a couple of incrementals or differentials, you can afford to spend a while creating that full backup. What really needs to be fast is restore, and in that case you can — and should — dedicate resources to it. When you do a restore, the compressed data comes across the network compressed and is uncompressed on the destination machine, so you can dedicate all your cores to decompression and checksumming, and you get a massive performance improvement in restore from parallelism. On the backup side it's more of a convenience — how long you spend backing up — provided you've got enough cores and enough I/O to support it. If you've got a lot of disk sets, a lot of tablespaces — I work on systems that have eight tablespaces at a minimum — then if you run eight cores with each core working on one tablespace, you've got one sequential read coming off each tablespace, which isn't perfect I/O isolation, but it's pretty good. And this is the archive directory: as you can see, the WAL segments have been copied over, and each one has a SHA-1 checksum attached to it,
which stays with it forever, so you can verify it later. When the archive files are requested, they're copied back and decompressed, and the checksum is actually verified on the way — so if there's something wrong with an archive file, if it got corrupted in place in the repo, you'll get an error from backrest rather than having it potentially corrupt your Postgres. Postgres would also detect it, because the WAL checksums would be bad and it would know there was a problem, but this way you find out in advance. And here's the same sort of thing again: an archive.info file, which tells you all kinds of things about the WAL archive and makes sure you don't mix WAL between multiple versions. You can also see that WAL is
stored in its own version directory. I haven't written the upgrade functions yet, so right now, when you upgrade to a new version, you create a new stanza and start backing up there. Eventually you'll be able to just issue an upgrade command to backrest: it will read the new cluster's information, store it, and start accepting WAL from the new database. You'll still be able to do expiration across multiple versions, pin the last backup of an old version, things like that — those features are on the way. Now
we'll do a differential backup: pg_backrest --stanza=main --type=diff backup. We can see we've now got a new backup type — it's based on the full backup, and its label ends with a D for differential. "latest" has been updated to point at it, and we've got more archive now. If you recall,
the previous backup was 4.9MB; adding the differential brings us up to 5.1MB. This is where differential and incremental backups really start to save you. If you have very large databases with a lot of partitioning — where you create new partitions every day but aren't modifying the old ones — this can be a lifesaver: it enormously reduces the size of your backups. For very large databases, incremental is an absolute must. [Question about checksum-based backups.] I have an
open issue to allow backups to be based on checksums. Right now, backups are based on timestamps; restores are always based on checksums unless you force them off. The theory being: when you've got a normally operating database, the timestamps are good and you can trust them; but if you're doing a restore, by definition something has gone wrong — you may not know what — and you may not be able to trust timestamps, so the stored checksums are always used. If you do the default restore, backrest expects the restore target to be empty. But there's a delta option you can use that says: OK, checksum what's already there, compare it to the manifest, and copy only what I need. This can be very efficient with multiple threads — you can checksum in multiple threads and pull from the repo only what you need. As an example, one customer with a relatively small database, 30GB, can do delta restores in under a minute with four threads from an NFS-mounted repo, when the database is mostly the same and you only copy what's needed — where a full restore takes about six minutes. So, where are we? Right — now it's release time. Before the release we decide to do an incremental backup, so that if we have to do some kind of restore we don't have to replay a lot of WAL segments — if you generate a lot of WAL, you want to do an incremental before something like this. To show where we are, we insert a message into our test table: "before release". Then we create a restore point, do the release, and then update the message to say "after release". This is like the version table most databases have: after the release, you update the version so you know what version the DB is on.
And now QA calls: the release is no good, please roll back. We go to do a restore and immediately get an error, because backrest attempted the restore while Postgres was still running. We really do need to stop the database and try again: stop the cluster, do the restore, start the cluster, and check for the message. The restore command was: pg_backrest --stanza=main --type=name --target=release --delta restore. So this is a delta restore doing point-in-time recovery to the named restore point. And you can see backrest actually writes the recovery.conf for you — it has access to all the information it needs, although you can override things on the command line if you want. Now we can see we've gotten back to before the release. Excellent. But then we also —
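The recovery.conf that backrest generates for a named-target restore would have roughly this shape — a hypothetical sketch with made-up paths, not the literal file the tool writes:

```shell
# Illustrative recovery.conf for the named-target restore above:
# a restore_command that pulls WAL back from the archive, plus the
# recovery target (values are hypothetical).
cat > recovery.conf <<'EOF'
restore_command = '/usr/bin/pg_backrest --stanza=main archive-get %f "%p"'
recovery_target_name = 'release'
EOF
grep recovery_target recovery.conf
```

Postgres substitutes `%f` with the WAL file name it needs and `%p` with where to put it, so recovery pulls exactly the segments required to reach the named restore point.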
and I forgot to mention this earlier, it's part of the script — after we got back to before the release and brought things back up, this very important data came in. And then QA says: you know what, we made a mistake, the release was fine — just go back to the point after the release. Rather than trying to reapply everything and keep everyone up all night, we'll go back to that point in time. So in this case we just do a plain restore — the default restore — which takes us to the end of the WAL stream, and now we're back to where we were after the release. We can see the message: "after release". But — oh no — what about that very important update? That's gone. We're back after the release, where we wanted to be, but we've lost that very important data that went into the database. So now, on this system — or more likely on another system; you'd probably do this on another system, this is an example of recovering lost data from the backup — we do a different restore. This time we restore to a different target: the time when the important data was created, which came after that first restore we did to get back before the release. That restore put us on timeline 2, so we recover on timeline 2, and there's our very important update. We can do this on another server, dump the table, and bring it back to production — you have lots of options to get to the point you want, and as I said, backrest takes care of writing the recovery.conf for you. It's a very simple command-line operation to get to any point you want. And here's the info function: it gives you a summary of the current status of your backups — the latest backup and so on. That isn't really very interesting by itself, but it gets a lot more interesting if you use --output=json, which gives you a really
comprehensive set of data about your repository: sizes, which backups reference which, timestamps — all kinds of information. This can be used to feed monitoring or some kind of admin program; there's a ton of information here, and it gives you full visibility into the structure.
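A mock of the JSON shape being described — the field names here only approximate what the talk lists (stanza name, status code, backup array with labels and types), not the exact pgBackRest schema:

```shell
# Hypothetical info --output=json result, suitable for feeding monitoring.
cat > info.json <<'EOF'
[{"name": "main",
  "status": {"code": 0, "message": "ok"},
  "backup": [
    {"label": "20150616-120000F", "type": "full"},
    {"label": "20150616-120000F_20150617-120000D", "type": "diff"}
  ]}]
EOF
grep -o '"type": "[a-z]*"' info.json   # a monitor might scrape backup types
```

A monitoring script could parse this to alert when, say, no full backup label appears within the retention window.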
Finally, we stop the cluster, clean up, and the demo is complete. Let's go back to the slides. So — have I
got any questions? [Question about thread/tablespace balancing.] Yes, it actually does that: if there are multiple tablespaces and you specify multiple threads, backrest makes its best attempt to balance them. If you've got four tablespaces and four threads, it will — at least initially — assign each thread a separate tablespace to do its reading and compression. As you work your way down, if the tablespaces aren't symmetric, one will run out and then multiple threads end up on the same tablespace. From an I/O perspective, I've generally found you can get away with several threads on one tablespace — it depends on how busy the system is, and compression is generally the bottleneck here, not the disks. What I want to add is a max-threads-per-tablespace sort of option, so that if it gets to that point you'd only run, say, two threads on one tablespace — some maximum, so that if you're running eight threads they don't all pile onto one tablespace at the end. That's why I haven't done it yet: it hasn't been a big deal for me, and I haven't seen problems with it in the field, because we're almost always massively CPU limited on backups, so I/O just hasn't been much of a problem. [Question about the network.] When you're doing your transfer, it still goes over SSH, but the compression on the network is handled by backrest itself — it's deliberately set up that way. And yes, we do see the network as a bottleneck as well: as I said, a 32-shard cluster can saturate the network pretty easily, but usually you have more cores than bandwidth these days. [Question about consolidating incrementals into fulls, as some backup utilities do.]
That's possible — you absolutely could do it — but it goes against the philosophy, if you will, of full backups: the idea of a full backup is that every file is going to be copied again, so you're sure it's right. But you could definitely chain backups off one another, or do what I'd call a synthetic full. What I would probably do in that case is a full backup, then maybe weekly differentials, and then hang daily incrementals off those differentials. The next step I'd really like to do is block-level incrementals based on checksums: right now, if a file changes at all, we copy the whole file, which can be painful for one big file. So I'm going to be using checksums to copy only the changed parts of files. Postgres files are already broken up into one-gigabyte segments, so you get some advantage there over, say, Oracle tablespaces, which can be quite massive. [Question about where checksums are calculated.] They're calculated in the protocol layer — SHA-1 checksums — and I'm checksumming the entire file as it comes across. This is separate from the WAL checksums or the data-file checksums: those page checksums aren't always available, since I support everything back to 8.3, so for the main backrest checksums I wanted something compatible across all versions of Postgres. But I would like to add support, for 9.3-and-greater
databases where you've got page checksums turned on, to do something more intelligent — still checksum the blocks myself and keep it backward compatible, obviously. And one nice thing is that you could tell backrest to verify the page checksums in your database while it does the backup: if you're doing a full backup anyway, you've got all the data in your hands, so it seems like a good opportunity to check your tables and see that the checksums match. If they don't match, the backup will still succeed, but you'll get some alarm bells going off: hey, I found checksums that don't verify in your last backup — you might want to think about doing a restore and some recovery here. That's exactly the kind of experiment we want to do first, though, because the pages are being actively written while we read them, so the checksums may not match for a page that's in flight — and on a live system we're not sure whether the kernel presents us a consistent view of that one page or not. So a little experimentation and research is required there, but it's definitely an interesting direction. [Question about recovery configuration.] Backrest writes the recovery.conf for you. The WAL is of course stored in the archive alongside the backups, and the generated restore_command will retrieve the WAL from the archive for you. You're also welcome to write your own recovery.conf, and there's an option to preserve the current recovery.conf — so if you've already got a complicated one in place and you're doing a restore, you can just keep it. You may have your WAL on S3, or somewhere else entirely; backrest doesn't assume it's always managing your WAL — maybe it's only managing your backups. But in the default case it automatically retrieves the WAL for you
from the WAL archive, which is stored beside the backups — or you can tell it to store the archive in a separate repo, though it's generally stored in the same one as the backups. And the archive command is always in postgresql.conf, so archiving continues whether or not a backup is running — it's not something that happens only during a backup. Backrest is always keeping track of your WAL, because the archive_command we saw at the beginning is invoked by Postgres itself, so it's continuously archiving no matter what's going on. [Comment about streaming replication.] Yes, exactly — I see people who don't have an archiving solution in place, just streaming replication, and when a replica falls behind and can't resynchronize, that's it: they have to do a new base backup, which is drastic. If you have good archive management in place, you don't run into that problem. [Question about filesystem snapshots of the repo.] The docs actually address this: if you're going to do that, you must take a snapshot first — because if you brought up a cluster on an incremental backup without doing any kind of restore, you're going to corrupt not only the incremental but the previous full and differentials too, if you've got hard-linking turned on; the whole set can be gone. That's a pretty dangerous thing to do, and there's no way for me to prevent it, because I don't have hooks into the filesystem. What will happen, though, is that if you then tried a recovery from that backup, it would not work: backrest would say no, the checksums don't match, these aren't the correct files, something went wrong — so you will not get an inconsistent recovery out of that situation. You can always go into the backup structure and just delete files; there's nothing I can do to prevent that. The only thing I can do is, when you actually go to do
your recovery, tell you: this is no good, sorry, I can't use this backup — you'll have to pick something else. I'll also be working on a validate function that just validates a backup offline: just tell me the backups in the store are good, before I need them for recovery. It's a really simple thing — there are just so many features and so little time. And it's mostly there already, because every time something is transferred, it's checksummed: when I read the backup off disk for a restore, I transfer the file across, and as it comes across and is decompressed on the database side, it gets checksummed at that point and compared against the checksums in the manifest — the ones I originally wrote into that manifest file. If those two things don't match, that's it: your recovery has failed and you'll get an error. Just to show you — that last thing was the info function
output. Unfortunately the display isn't cooperating — let me pull this up so you can actually look at it. OK, so,
here we go. If you run the info function with --output=json, this gives you the exhaustive listing. Here's the stanza: it gives you a status code — OK — plus information about the databases in that stanza, and then this backup list, which is an array of all the
backups, with all the information: archive start and stop, the format and version of backrest that did the backup, the database ID (which you can reference back to the database section), information about sizes and deltas, the label of the backup, the prior backup, any references to previous backups, the start and stop timestamps, and the type. So this one is a differential, and you can follow it down — here's the incremental, and the incremental references the prior backup it's based on. [Question about portability.] Currently it should run on any flavor of Linux. The threading implementation uses Perl ithreads, which some systems' Perls are compiled without, so I'm working on taking the threads out and replacing them with processes. That requires some changes in the protocol layer and some abstractions I'm working on, and then everything will be done with processes. I partly do that already, because the remote is a process that I start on the other system to do the protocol-layer work; the idea is to have, alongside the remote, a "local" process that does the compression and copying, which gets rid of the threads altogether — and that should also increase compatibility a lot. It also forks, which Windows doesn't like, so I'd have to find a way to do that differently — I'm not sure that's something Windows will ever be happy with, and I haven't really chased Windows compatibility yet, so that's definitely an open issue. Any more questions? We're over time — great. Well, thank you very much.