Logo TIB AV-Portal Logo TIB AV-Portal

Auditing PostgreSQL Databases Using Logical Decoding

Video in TIB AV-Portal: Auditing PostgreSQL Databases Using Logical Decoding

Formal Metadata

Auditing PostgreSQL Databases Using Logical Decoding
Title of Series
Part Number
Number of Parts
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
Have you ever been wondering what edits are happening inside your databases? Logcial Decoding, introduced in PostgreSQL 9.4, allows to keep track of changes commited to the database. This talk presents how this mechanism can be used to audit PostGIS/PostgreSQL databases. After an introduction to the concepts of logical decoding, two use cases are presented: Quality Assurance: writing an audit log into the database after each commit so that someone else can do a review of the modified data. Cache Invalildation: refreshing a GeoWebCache instance at the regions in which the data has changed after each commit. To support these two use cases, a little Java program able to be run as a mircoservice was developed and will be shared under an open source license with the community via github.
mathematics Software Databases Databases programs
Java time maximal Databases water Databases programs words mathematics MKS-System caches Computer animation Software CRM system Information System utilizes Right sum localization systems
mud runtime parsing files states Java code sets schemes Databases functions part metadata programs architecture Spring mechanisms mathematics round box record configuration model encoding architecture classes parsing response mapping forces boilerplate coma Databases attitudes applications mathematics category processes Computer animation Software orders configuration box Plant variety (law) table record libraries
area demo states demo projects water Databases applications part metadata programs attributes CAN-bus category Pointer mathematics Computer animation Software case speech law table record
Slides states time neural network schemes 8th part second attributes Pointer mathematics different level structure Databases bits Transactional Hidden Markov model applications substitutive category Computer animation Query table record
projectors states directions time Databases functions data replication sun programs consultant mathematics different logic Kanäle stable man formating Relational building Development binaries The list physical bits maximal data streaming mining orders Iris configuration log files Right Blocks record geometric point track server services files maximal 3rd similar Stream rules events van Emulation versions means level testing encoding structure CAMS sum Gravitationsgesetz addition server projects Content cores coma Databases directories applications mathematics Computer animation case logic CRM HMS <Fertigung> logic table
control Meeting/Interview different case Iris Information System Databases Right applications
caches batch system integration Meeting/Interview different time moment systems
man mathematics Meeting/Interview time maximal 3rd
Meeting/Interview directories data replication
Meeting/Interview configuration Em's addresses
mathematics level Databases applications data replication logic Codes record attributes
Computer animation Em's
so they let everyone I will talk about post-disco databases and the bald auditing the data inside so we and get to know what is happening inside and so I will make a shopping mall for a little program I wrote to eject those changes of the database and presented the software and the background and said post-Christian database so
just some words about me and the Java programming based in Munich and I work for the rest of the convention which is the local utilities company and we are concerned with the management of all the network infrastructure like Prolog us and so on water networks and and indigestible document and solution architect for the network information system so I'm concerned with GIS software for about 6 years now and during namely programming jealous of so now
about the problem we want to talk about today than you might have been well 10 or 20 people editing a database then the right and you might want to know what it's a really happening inside my database so which features were inserted which features were deleted you could ask when future change the last time where you could ask where did data change in general so you want to know the region where the data has changed or someone has deleted data and this might be of interest for technical purposes so you just think you have the rendering of your data and put it into a cash and the data changes in the database you might want to update the cash you're and the data and could be of interest as well for all process-related reasons because other people might want to review that change operates you have to trigger a workflow because of the data has to be adjusted as well you have to and from different systems in your manual system landscape
so I will now describe a little application of wrote to extract changes using mechanisms of oppose this database so little architectures lights you have some data somewhere in post database and you you edited with some might be curious but it doesn't have to MIT model which is and then we put onto so called logical decoding which is able to pull out the changes and you can consume them why a skill and then you can pass those changes and said this job application and then you have it inside the runtime of the job application of the change sets and then you can start doing fancy thanks for the 1 s 2 coats and making it so difficult to give a question stands as sending a little JC document saying well here that it has changed please up education make receipt of a truncated here's or you could also and from other applications like this 1 you would have to write itself because it only applications but you can write a little mini job class which processes the state and does something and the 2nd thing my application is doing is it's feeding was database into the poster so database into metadata at table which builds the complete history of all the records that have been changed and when you have to be careful that you don't consume that changes well and that was it would go round and round and then then you could have a look at into this table AGS program as again because we also compute the bounding box where the data has changed and then you see on your map where the data has been changing yeah and what are the responsibilities of this little program periodically check have the change sets from the post-disco database and then pass the output of special packing inside the posters could surpassed the real change compute bounding box and then for every change drawn every change table it will create an order record the other table and con taining all the old and although new variously so see the whole state of the future before and after the change and then can also do it it's difficult to do so to truncate the received the affected region and this piece of software is highly configurable you can switch parts of it on or off and then you can define the polling interval and which database schemes you want to watch and so on so that it could be helpful for many people hopefully and what the use as a technical step Java and the runtime container sprinkled which is a very cool every phone and getting productive very soon and very fast because you don't have to write much of boilerplate code so you just write the application code and the rest is nearly done for you as force and expanding the use under large for the parsing of the of the of the of the plug and then a geometric processing is tedious and other come and libraries you find if you do job programming this implication is it's a plant varieties of nearly stateless and you can inject all the confirmation from outside from a simple configuration file and then uh I give it the MIT license so you can do what you like with and I shattered and it up so have a look and try it out like but if you want to run at
a you don't need to write on at the application of properties file which contains you configuration saying Mr. a database and what do you want to do and so on and then you have drawn it's just saying job question John and then the just and so will do that now
I give you a little demo all this it's going to work and way it
on and I got a little to this project and I want to have a little the use case is I want to inform the public about the construction work that's going on in the public so we need to replace some parts and parts of us want to know that there there will be a construction site in this area so low at new feature to my to my database and say OK here we're going to do something and so I put some attributes and so that's me to the edited this record water networks and they give it a start and an end date so that's that's the day you want to and tool and a week later and put this speech into my database so up to now nothing special and then I start my my little program consuming changes so I say just to job and run this as an application properties configured correctly inset here to manifest this 1 you go that will wake up and tracking changes and states publishing 1 change metadata the check table because it's just found out this change we have just been
doing and divide the refresh now I see it then you wanna changes level as see came in this region something has been added to the database and then I could move this feature around and say OK I want to move somewhere else and save again and wait some seconds every 10 seconds on the background of a pool new changes and so we get a different region it's updated features is now the bonding of substitute of the state before and after it has been changed because also there's some headed off properties mismatches due to pronounced and compare itself the k m now if we look at the database now and in the end I don't delete this feature again so this is perhaps the most important interesting part because you can find was features of them there anymore so far going to delete this feature selector 2010 the and safe again and it's in seconds and then we will get a bit of Frederick saying OK here's something has been deleted and if we look into into a
database into the into the submitted at a table I would show that use well so comes from you see you see it 3 entries and then the stable because we have done 3 edits we have an entity something updating and deleting it and then you have some attributes the region the already been seen and we also get some more data which was the name of the table to name of the scheme which is the transaction ID which is the time stamp when the data has been changed and you see also and adjacent structure containing the records before state of what was changed and after it has been changed so if you if you know if you know SQL good and if you know post-disco good you can also make queries into this and structure to extract the changes that the so this for the application of a novel that to my slides
and describe a bit how this works in the background so and to understand that you need to understand what is logical decoding and this is a feature which faces which is based on the application capabilities of the post-disco so so the Forest Service of minor which were released with 9 country Australia you could do physical binary replication of data so on those write-ahead log entries binary and could be shipped to a different server to have it so but you have no chance to understand what is really inside this should you this war file you were shipped to the and uh the logic of coding feature which was inserted into post-disco 9 9 . 4 now it's able to decode this record back into the application level so you really see which rules in which tables and updated or deleted and this is what I used in the background and if you want to set it up for you own for yourself you need to know some more details so you need to know what is the replication slot this is the points where you can consume changes from use you tell the database so well I'm here and I want to consume changes in the future and that this replication slots keeps track of the state when did you visit the last time and all of those things and is as it has a name and you can consume a change exactly once from such a replication of the data is stored and the exploit directory so well all those right at log files of there and be sure that you consume it because otherwise this Friday at low pH i won't get away and you plugin and the other thing you need is a so-called logical decoding output Plotkin again which defines the data format of changed so this says the Britain in the inside so supplied inside a database and that luckily the developers and its dual test decoding proper in which is the use of a room and you human readable text from other may use that the past the output because I didn't want to write a supply and didn't want to threaten the stability of mine was so if you have all those 2 things and in case you can put it together so you need to to do some some configuration you post PostgreSQL service so his need to set the writer had lot level to logical and set up some that you're allowed to to some replication starts and then you can create 1 credit projector applications that you give the name of the slot and you give the name of the output cut so that test decoding and the other thing you need to do in my use cases that you have to say which Collins I use identify feature so I say I wanted was all columns because I want to have the full state of the feature before and after the change and if you have that you can start using so you during the could the data during edits and after that you can really starting fetching changes with the command as a skill command slot get changes and it is the name of the slot and then you get such a structure you get the beginning to the committee members and you get such a such a record well you can cause all all the relevant data and if you don't don't forget to disabled purification thought otherwise you data won't be discarded so this is for the consuming content and I want to talk a bit about related work on this topic so and use that people thinking about the version control and so on so this uh uh the GOT project going on there their talks here I think on Friday morning so they're they're they have approach for disconnected and distributed and branching managing and all those things so but more and but it's also more complex and if if you think more in the direction of fine-grained replication of data you should but I have a look into the geological which is a program which is the program released the 2nd quadrant company doing was fiscale consulting services and this focuses more on how to to get the data from 1 database to another but they also say they want to be open for other programs to consume those changes and I want to do some days formats to fetch and if you're looking to get up and do some searching you see other people of writing out the programs for logic for decoding for their needs I decided the different way in order not to threaten my server to put something in I don't really know yeah uh so to summarize we have a logical decoding is a way to produce an ambient event stream out of your database describing what it's all really happening inside a database and uh the logic according the gyrus able to process this data stream and can write an log back into the database aware of of the GUI aspects so it's it can pass geometries and to something sense full list and and can also trigger occasionally addition due workers and if you're interested so have a look at the it and check it out there also binary and release which is already built so that it's easy to to try it out this little tutorial for this example I guess if you so I'm interested on your comments on this and also give a reference to to the background documentation from for using this feature I think it's quite interesting and could solve some of the the problems we have every day and yeah I'm interested in your comments minimal open to questions thank you OK
here so we are quite a ways away from the minutes for questions so go ahead and the she the yes this with hi I like how you keep track of all the changes and be able to start of workflow was with with with those changes but at 1st I I I thought well you could set up something similar with just
triggers an and only in certain all the leads and and on the date there exist you comment on like how your solution is different or better than those things that some the difference is that
you don't have any they don't have the application doesn't have to be aware of those background checking so you put it at back into the back of the database you don't have to store triggers imagine you don't have the right to do a little soul so and it's possible to do it with different techniques as well as from normally this would be built into the application itself this is the solution and from other information systems but in this case if you don't have control of it you can do it the do you think this
approach will be possible to use on OpenStreetMap times on cache expiry I think so yes I got 1 from from uh poster from 2nd
quadrant saying keep it care ofyou performance if you do not have the decoding so that would have to be tested in advance because they have many small and it's coming in at the moment I think of rather system integration will which I want to do with it saying OK I know that in a batch job at the nights and that will come in and I have to pay player to 2 different system of strip this quite busy perhaps you have to test if it you really work but the workflow could you could do it this an advantage would be
that you could do with this stuff in times of the day the president that PC yeah that's right you you can you can wait for half a day and then consume all those changes it's a
for you until you pick a future check them all Jeffrey questions I myself how long is this it's forever
or at least pick it up or it is thought as long as you don't
consume it and you have to make sure that you consume it otherwise you're you're directory will will never get and the some money to your your replications of so how far they are behind what
questions the just a quick question on users science see you using your address I would expect know of the team tool this is an option of hasn't
non-Dutch not because of the database but if you if you look at it you don't have in terms of
logical replication you only see committed changes it don't see which database that commit the change so on that level you you have no chance to get it so if you want to have to pull it into the application that will write the current user onto the record and then you have a a yeah it has to be an attribute and directly but they you can't pull it out of the metadata because and this logic coding of this replication thing doesn't that just also the data that will go on to this and post-disco doesn't save who committed this data by itself so you you pull it you need to put it into the application and then it will be in the the
more questions now gave them the Greek and they missed the Korean thank you you you you you you you you