Building and Testing Database extensions with Nix

Video in TIB AV-Portal: Building and Testing Database extensions with Nix

Formal Metadata

Title
Building and Testing Database extensions with Nix
Title of Series
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
2018
Language
English

Content Metadata

Subject Area
Abstract
When building extensions for databases, it can become burdensome to make sure you build and deploy matching versions. This often involves changing the version number in multiple places or custom scripting, in particular when deploying extensions to other environments. In this talk I will cover two cases of building database extensions for PostgreSQL and Redis, as well as how to use Nix to manage Python environments to run tests against temporary instances of the database. Finally I will briefly discuss building docker images for the database with the extension installed. --- Bio: John is a software engineer at InfoSum, where he works on the core components of their customer data platform. At InfoSum, he uses Nix to build several C projects as well as managing python dependencies, and has been using NixOS as his primary operating system for over a year. In his free time, John enjoys experimenting with Haskell, playing board games and going bouldering.
Point (geometry) Axiom of choice Statistical hypothesis testing Kummer-Theorie Context awareness Building Functional (mathematics) Presentation of a group Multiplication sign Time series Mereology Perspective (visual) Product (business) Roundness (object) Field extension Operating system Software framework Descriptive statistics Area Scaling (geometry) Kummer-Theorie Software developer Projective plane Bit Database Mathematics Type theory Field extension Uniform resource locator Digital photography Process (computing) Sample (statistics) Angle Personal digital assistant Query language System programming Quicksort Communications protocol
Statistical hypothesis testing Windows Registry Web page Kummer-Theorie Server (computing) Scripting language Service (economics) Sequel Computer file Structural load Computer-generated imagery Compiler Methodenbank Function (mathematics) Independence (probability theory) Revision control Medical imaging Latent heat Benchmark Field extension Physical system Scripting language Service (economics) Host Identity Protocol Server (computing) Building Structural load Projective plane Electronic mailing list Propositional formula Database Bit Benchmark Windows Registry Statistical hypothesis testing Field extension Integrated development environment Function (mathematics) Revision control Website Right angle Writing Local ring
Kummer-Theorie Field extension Macro (computer science) Hash function Field extension Methodenbank Revision control Compiler Information Block (periodic table) Macro (computer science) Data structure
Complex (psychology) Kummer-Theorie Regulärer Ausdruck <Textverarbeitung> Sequel Computer file Distribution (mathematics) Code Directory service Methodenbank Revision control Field extension Methodenbank Flag output Physical system Email Building Internet service provider Bit Density of states Field extension Function (mathematics) Configuration space Gastropod shell Integer Flag
Derivation (linguistics) Regulärer Ausdruck <Textverarbeitung> Field extension Phase transition Revision control Flag Configuration space Configuration space Derivation (linguistics)
Kummer-Theorie Regulärer Ausdruck <Textverarbeitung> Kummer-Theorie Patch (Unix) Wrapper (data mining) Structural load View (database) Patch (Unix) Bit Methodenbank Directory service Function (mathematics) Mathematics Field extension Methodenbank Plug-in (computing) Library (computing)
Kummer-Theorie Kummer-Theorie Function (mathematics) Multiplication sign Moment (mathematics) Similarity (geometry) Function (mathematics)
Statistical hypothesis testing Functional programming Building Run time (program lifecycle phase) Linear regression State of matter Code Multiplication sign Database Methodenbank Function (mathematics) Derivation (linguistics) Methodenbank Physical system Scripting language Email Kummer-Theorie Linear regression Software developer Instance (computer science) Unit testing Control flow Statistical hypothesis testing Statistical hypothesis testing Category of being Process (computing) Phase transition output Quicksort Physical system Reading (process) Computer file Control flow Modulare Programmierung Regular graph Hypothesis Wave packet Goodness of fit Field extension Gastropod shell Integrated development environment Codierung <Programmierung> Metropolitan area network Data type Installation art Information Expression Projective plane Database Mathematics Field extension Revision control Gastropod shell Fiber bundle Object (grammar) Table (information) Library (computing)
Statistical hypothesis testing Derivation (linguistics) Field extension Methodenbank Hooking Building Expression Revision control Gastropod shell Methodenbank Gastropod shell Statistical hypothesis testing
Scripting language Building Server (computing) Link (knot theory) Kummer-Theorie Service (economics) Computer file Patch (Unix) Structural load Computer-generated imagery Expression Shared memory Bit Canonical ensemble Medical imaging Methodenbank Gastropod shell Configuration space Configuration space Library (computing) Resultant Library (computing) Directed graph
Windows Registry Statistical hypothesis testing Slide rule Building Kummer-Theorie Key (cryptography) Decision theory Structural load Multiplication sign Computer-generated imagery Basis <Mathematik> Online help Control flow Windows Registry Statistical hypothesis testing Medical imaging Software repository Software repository Video game Arc (geometry) Cloning
Point (geometry) Subject indexing Word Roundness (object) Kummer-Theorie Software repository Right angle Object (grammar) Reading (process) Twitter Number
we are ready for John was gonna like I said be talking to us about building and testing database extensions with Nix see Mike there you go give him a round of applause please [Applause] hello thanks having me this is kind of a presentation about how I've been using Nick's at work it's not really recommendations I guess more of a description of what I've done maybe things that I think could improve or from my side or from maybe things I could contribute to Nick's packages and just sort of a perspective from that angle I've been using Nick's for probably about a year and a half I initially started it for this project which is hangout described in the talk and I've switched to using it is next so as my main operating system of choice for work and home generally pretty proud of it or pleased with it so let's just begin the talk so I'm just going to describe I have a couple of Postgres extensions and a redis extension that I build and maintain as part of my job and I'm just going to cover how that's done and as I said it's a description and a prescription maybe there are some good ideas I don't know so background on database extensions particularly in the context of Postgres they've been around since about nine point one exactly since nine point one and they're actually pretty flexible in what you can do Postgres has always had the ability for custom types and was its original selling point but extensions can load in complicated functionality is that you might want to have in your database some examples situs or post kiss time scale DB situs is sort of a distributed database product it the private company that does a SAS product based on it but kind of a bit wild when it comes to changing what photographs actually does post cases for geographical data if you want to have a whole bunch of locations and find out a query on which locations are in a certain area you can do really cool stuff that time scale DB turns Postgres into a time series database which I mean these are pretty radical departures from what you usually do but still it would in the Postgres framework and the protocol protocol when you use SQL so quite handy for a development perspective there is a caveat though if
you are going to write an extension they only work with the major release that you compile against them this is quite handy for making sure everything's compatible when you deploy it but say you want to update your database you need to make sure your extension is the correct version otherwise I won't load so we were developing my sequel UDF which works on a very simplest kind of system and basically one of my co-workers didn't really understand how widely laid out the project we were deploying into a doctor file and the doctor file had major release of particularly my sequel version he updated the bill dependencies and it just meant it yeah if you forget to match the virgins it's just not going to work and it's gonna be difficult to figure out exactly why it doesn't work until you actually try and use the image so he'd heard about Nix so it's like well why don't we try that and we did so here's kind of how my workflow would work before I started using Nick's have to still the dev package was a bun to package mapped or something and still my other dependencies build a project locally build a docker image with that output matching the right server version and then started as a local container and then run all my tests against the service it's nice it's a lot of steps to remember how to do if you want to maybe write bash script doesn't list it's nice problem is if I want to intervene in something or do you think differently it's a bit more complicated for when I run want to run benchmarks instead of the regular tests if I want to run a specific test or yeah if I want to just connect to the database itself I could just write a complicated bash script but I'm really lazy so can I just push it to my CI pipeline that I've got because we're hip and trendy yeah okay but now it's got it down with the dependencies wait for the extension to build wait the docker image to build wait for the document should be uploaded to the registry it's got a pool the image again started as a service and then I've gotta wait for my test to fail to find out the site's got them wrong this is just motivated by my laziness so let's say I
want to update which major release I'm using for my base database I've gotta update the documentation got to get a new docker image gotta update my local pages and I've got to update the remote build environment so okay digression into how Postgres works just gonna talk briefly about how it stops you from loading modules aren't compatible and
that's called PG module magic which I like to think of as a hash tag rather than a macro so essentially as a magic struct that it uses you just add if and f PG module magic add it's your build sorry so you'll see extension and basically this just stops you from loading your extension dynamically so
this is what the magic struct looks like if you're interested
so if you were wanted to build something usually for the post-crisis extension system they provide PG access which is kind of just a big make file thing include that you can add and it will pull some module definitions and basically use that to hide a lot of the complexity from you this is an example from the documentation for Postgres basically its definition of the header and C file where you've got your code for your extension some versions SQL for telling you how to load and unload the module and then this tool called PG config which lacy tells you how Postgres was set up and then this is an example of what the sequel might look like so it
just to make file turns out next packages is very good at engine handling make files so it's actually pretty easy win if you want to add an extra dependency you can just use the LD flags and customize things of the C Flags bit of a pain with them PG conflict though show you why
essentially PG config will have a bunch of C flags from when Postgres is compiled and LD Flags as well and share them so when I am building an extension I'll do this kind of bad practicing we're adding the LD Flags here I have since writing this had a better idea about how I could do this but essentially you just have to add a pre-configured phase and you can integrate directly with PG excess because seems to work pretty well that way
so let's say you wanted to standard make derivation for post cross extension and you've got xx a sh pretty straightforward everyone here is familiar with next I didn't want telling you this and same pre-configure stuff so
let's look at extensions of nexpaq views there's actually quite a few already and I was not super familiar with where they were or how to discover them I knew post kiss was there but it's kind of in its own libraries directory whereas either in the Postgres directory so I think maybe discoverability first something like can be discovered it could be improved a bit more but the next OS Postgres module can just have these added to it and it works which is something I would want for my extension as well but I'm not really sure how you do that because how old is Postgres fine my extension so yeah how do we how do we use it well I just tried to copy the way next so next to us does it if you look into the ways in Nix OS loads extensions basically just has a thing that checks if you've asked for any extra plugins to be enabled and if you have just adds them into subpaths and symlinks them into the build output but it needs to do a bit of extra stuff with post build wrapping to basically just set this Knicks PG lib tear which yeah it turns out he's quite important basically post press will look in a couple of places for where your extensions might be and it's not it's not in the next door so you need to give it a bit more advice on where that would be so that's I think is why Nick's PG Lib dear is set which comes from a patch
that's made to Postgres when it's built so this is where that comes from so that wrapper will allow you to provide the Lib de to find your extensions
so can we just set it to our build output from our extensions and we're good to go right well not quite because there are some extensions from Postgres that you might want to use anyway so that's why you need to take the same approach and wrap it as well
here I've pretty much copied the same thing a very similar idea I'm just calling a package with some extensions in just some more some more paths but I've also exposed in it to be in create GB we'll see why in a moment because I'm using them to test but yeah this just defines my custom Postgres and I'm using a particular package from Nix packages so this is clear okay there'll be plenty of time for questions hopefully so I can build this package
it's all very nice what do I do with it I can probably deploy it somewhere and test it but can I just maybe start on my system and give it a go yeah okay but then I need to remember to delete that database so I'm still kind of at the same place I was initially and maybe I can add it to my script but then I've got to manage a whole bunch of other things to control that essentially what I found is there's some tools that basically provide you with sort of ephemeral databases that they will basically just kick off a database process with a empty database sorry empty tables and things and I'll just use the there's a few piping packages for example that would provide that so I'm using testing dot Postgres and testing common and essentially they are things I provided Nick shell so it the idea is to get a pretty good developer ergonomics so they can just pull up next she'll get all the dependencies for the project and then also get why we require for the running the tests as well and all the tooling as well see yeah I didn't really want to go into too much detail about the Python code that I've written generally having access to Python here is quite useful there's a really nice property based testing library called hypothesis which you can use to make sure that you're encoding and decoding works in the way you'd expect but generally it seems to be I don't know if many people who use this approach I haven't heard a huge amount about it we're using temperate databases there's a tool called PG temp oh wait PG temp which is like a bash script that will run an instance of place for us temporarily and it candles some of the cleaning up a bit better than what I've been doing the there's also a package vascular's welliton called ten Postgres TMP - Postgres which will also do a very similar thing I don't know how common it is to really need that kind of thing unless you're testing an extension most of time I think probably unit tests would cover this the issue is I'm kind of testing how my extension works with Postgres which is perhaps a unique problems face and yeah I just I end up using PI test fixtures for controlling the state of the database which is not very clean and can end up leaking databases which is an interesting problem as well if you forget to or if your script breaks improperly but with the shells I've written if you want to just jump in and give run all the tests run a copy of Postgres run all the SQL commands against it they're required to test extension it's essentially two commands which I can just put in the readme I don't need to describe how to apps install the various packages that people usually require and I think as somewhere where we're not really good functional programming and we don't really people haven't really gotten bored with the next train I think having something where it's just two commands is really helpful for getting people on board with it so yeah cool I've written this expression where I can just define it Postgres installation with the extensions enabled or extensions installed as well and what if I want to update it previously I had to update dr. file update all these other things now he just changed the package from Dix packages there are a few problems with this PG access has regression tests which are make install check phase which require a running Postgres instance to actually test them they essentially this is a descriptive expected SQL output and some commands and there I haven't really thought of a good way to do that it'll be really nice if there is a annex packages or make derivation phase the for install check but it's not actually a build input it's sort of a runtime dependency of regular tests and it'll be nice if I could just have my single expression for my extension and just be able to run the tests that way on that expression so in a standalone way rather than other man when I kind of bundle up all my extensions and test them together wit Postgres I have no idea how to do that the only thing is I'm always overriding and because I want to make sure that the post preservation I provide to my extensions is the same as the one that I built with kind of think maybe I should just be make my extensions unable to build without post Cresson kind of curry the expression but I haven't seen that pan very often in the next packages so I'm kind of worried that briefly reticence has a module system - it's not very interesting compared to Postgres because Redis modules are responsible for loading initializing themselves and they provide other information about themselves they do have an extension system you just write a make file and as long as it has a read as header you can load it as a shared object into Redis I when I initially submitted the abstract for this talk I kind of thought there'd be more to it but unfortunately not or fortunately somebody say um I haven't seen any
examples of Redis modules indexed packages I think maybe they're quite new I maybe I don't know how to discover them I certainly would Postgres as well there was an issue of trying to find the Postgres extensions I don't know where I would find a Redis module if it existed I kind of think maybe like Python packages or high school packages that maybe they could be a Redis modules or Postgres extensions attribute set but I don't really know much about that
so yeah similarly really simple derivation this is like basically the most simple derivation you can get BAM you've got a realist extension and
there's also testing Redis like testing for us so I can reuse all of my testing infrastructure for Redis as well which is nice small caveat when you have your
shell next you need a small shell hook to export your Redis modules path to your expression full of Redis modules just thought maybe that might be interesting a little trick to share and
now we're gonna talk about docker it's inevitable the people who I work with are very keen to ever he'd end up in the document owner so I have to somehow get my extensions to work with Postgres in the document owner or with Redis in the dock you know so there are a few tools in Nix packages that will help me accomplish that goal unfortunately with Postgres I can't they're ducks tools has pull image and build image there's also now built layered image the place where they would want to put my extensions it wouldn't be able to be found with the Postgres in the base image from just pulling the Postgres image like the official library image I could wrap it with a Postgres from next packages with the patch for Lib PG dear but it's kind of awkward to do there's like some nice shell scripts that the base library image has I really want to commit them so the practical side of me currently just uses as a doc file I feel really bad about myself as a result but I hope to improve it but I don't really know if there's a good way around doing this however for Redis oh sorry so I have this horrible hack where I use our sink to pull out all my extensions and then I put them in my docking toner and I really wish I didn't have to do that so just thought I'd share my shame with you
so Redis on the other hand I mean if you look up doc tools in the Nick service sorry next packages manual Redis is actually the canonical example which is quite handy so essentially I just ripped that off and if I want to load my module the lid thing is a bit contrived that's just where my expression is installing it from my make file probably I want to have a better place for it but yeah it's simple enough you just add a letch Redis config in and then your build image definition and yeah you just load it up when your server starts easy enough so now you know it's not great there cool
thing though all these tests I've written using temporary basis I can now run them against the images I've built with build image etc or docker if I'm feeling bad yeah so essentially I've shown you the arcing thing I'd like to find a better way to do that the problem is I'm using gitlab CI and it is not super compatible with Nix so there's a few issues one one thing that really irritates me is the docker sock thing so it once have privileged access to that socket so that it can upload things to my daughter registry but if I built my image with the old image I mean my next Runner and I don't have that access so I essentially have to copy the tarball of my image into another Runner which then loads the docker image and then sends it to the registry which is pretty annoying I think well why would like to maybe spend some time doing is build like a go tool that uses a bunch of the docker things to just upload images so it's just a tool that takes a tarball from build image and then you can just push it to an arbitrary registry if you have the credentials I don't know if something like that exists if someone knows something about that and making them Oh Zim burtom husband back sorry tell me in a minute okay Scorpio so that'll be really handy and yeah hopefully from his white images for Redis are just one layer hopefully the build layered image would be helpful but I haven't mentioned integrate that yet into my pipeline closing thought I was reading a key news yesterday and I like to base all of my life decisions on what random people are saying on Hackney's someone was saying it would be a good if Postgres had a package managed for extensions and maybe what I would like to see is the ability to manage extensions when when people think oh I'm gonna manage my post cross extensions they think of next packages
acknowledgment just to move help yourself my slides and I said I'd to give him an acknowledgment he has a cool repo showing you how to use next packages to build stuff with Beamer I
this is a point where I'd beg for Twitter follows and also this slides that you're watching reading now are bill and in the repo cool any questions
or Corrections of any questions yes they are are you aware of this pool request by told police that overhauls the Bosque SQL infrastructure index packages I am NOT that seems like something I should have known about before I gave this talk right yeah no I am aware that a lot of the Postgres extensions are written by him I don't know there's some things that they do differently that I'm not sure about so for example they won't use PG access directly they'll kind of copy the shared objects and seagulls are sub yeah do you have the pull request number button okay thank you very much anyone else No then another round of applause for this interesting talk thank you interesting is a good word [Applause]
Feedback