
Elephant Puppets


Formal Metadata

Title: Elephant Puppets
Subtitle: Deployment automation for PostgreSQL
Number of Parts: 25
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Production Place: Ottawa, Canada

Content Metadata

Abstract: Puppet is a platform for IT automation and configuration management that has gained popularity in the DevOps movement. This talk will provide an introduction to Puppet and describe how Afilias uses Puppet to deploy and maintain over one hundred servers running PostgreSQL that support key parts of the DNS infrastructure. Puppet allows DBAs to automate the installation and configuration of PostgreSQL across multiple servers. This talk will explain how Puppet can be used to install and configure PostgreSQL. Automated configuration of your database servers keeps servers looking the same, allowing you to control differences. Managing your database servers through Puppet also allows you to easily reproduce your configuration for development and QA purposes. The talk will cover the basics of Puppet and best practices for deploying Puppet to manage PostgreSQL, including: how to install PostgreSQL via Puppet and create database clusters; how to manage postgresql.conf, pg_hba.conf and .pgpass via Puppet; and best practices for dealing with configuration differences between different database servers and different environments. Afilias is a registry services provider running registries for numerous TLDs including .INFO, .ORG, .MOBI, .XXX, .IN, .ME and .ASIA. Afilias has been using PostgreSQL since the 7.x days and currently has more than 100 production database servers running PostgreSQL.
Transcript: English (auto-generated)
framework. And before we get into that, a bit about who I am. I've been a member of the Postgres community working with PostgreSQL for quite a few years. I am one of the maintainers of the Slony replication system, and I do this while working at Afilias. Afilias is a top-level domain registry operator.
So who in the room knows what a top-level registry operator does? A lot of people, good. We run registries, so .info, .org, the top-level domains
that we run. If you wanted to register a name in one of these registries, you would go to a registrar, someone like a GoDaddy or a Tucows. You'd go to their website, type in the name that you want. If it's available, you'd say you want to register it. Behind the scenes, they are talking to our servers.
They would open up a connection to our servers, check if the domain name is available, and register it on your behalf with us. We use Postgres as the database for storing the system of record for all of the top-level domains we run. And we tend to use Postgres in a lot of our back office processing systems for our operational needs,
such as billing and reporting. We've been a Postgres user since, I think, 2001 or so, back in the 7.x days. Back in those days, Afilias was one of the higher-profile, larger Postgres users. These days, there are a lot of other bigger Postgres installations, but we're still involved in the community. We're still heavily involved in Slony. And we are quite happy with our experiences with Postgres. So a bit about our systems. We have over 100 production Postgres databases. And when I say databases, I mean different servers running Postgres instances. There are people at the conference who have one database server with tens of thousands of databases inside of it. We run lots of servers with different databases. Many of these databases are replicas of each other; being big Slony users, we tend to have needs for lots of replicas.
So for many of our operational registry databases, we will have a master database and five or six replica databases. These replica databases we use for things like failover and reporting and offloading back office functions
too. And we have other systems that are not replicated. Our bug tracker uses Postgres. We have billing and data warehouse systems using Postgres. So we have a lot of databases. And we need to manage them.
About a year ago, a year and a half ago, it was time to replace our hardware. We had been running Postgres on AIX on IBM Power servers. These servers were over five years old. The warranties had sort of expired. They were not as fast as the machines you can buy today.
It was generally time for a hardware refresh. We went, did an evaluation, and decided we were going to move to Red Hat Linux on VMware virtualized. And at the same time, we decided, hey, we're moving all of our servers. Let's move data centers as well. Our data center contracts were up.
And we decided we were going to change our data centers. So we had a lot of new servers. We were going to have to buy, provision, install, and migrate to in a relatively short amount of time. So how could we use automation to help in this? We had already been using CFEngine, which is another configuration management and deployment system, to deploy our Postgres binaries. But we weren't really using CFEngine for a lot more than deploying binaries and maybe deploying crontabs. Other departments in Afilias had been using Puppet, particularly our sysadmins who maintain our registry application, the software we write in house.
They were deploying with Puppet. And we decided, let's standardize on Puppet. We'll use Puppet for the database stuff. And we'll do that as part of this migration. Question? Basically, because the other team had already been using Puppet, the three big players
that you hear about are CFEngine, Puppet, and Chef. It seems like CFEngine is losing popularity. You hear about Puppet and Chef all the time. So we picked Puppet because we were using it already. So deployment automation, for those who aren't familiar with it, the idea
is you want to automate the tasks in setting up a server. The non-automated Postgres installation, the extreme case: you have to install a database server. You download the tar file from the Postgres website. You uncompress it. You run configure, then make, then make install. You run initdb. You copy an init script into place, and so on.
This gets very tedious and long, particularly if you're doing this for more than one server. Anything that is more automated than what I just described is a form of deployment automation. Some people have been using shell scripts for this type of stuff for decades.
This is deployment automation. Some of the deployment automation packages, like Puppet and Chef, are frameworks for doing these common tasks easier and with more reuse than if you were writing your own shell script. We had a couple of sort of general goals
for our deployment automation project. We wanted servers to look the same, which doesn't mean we wanted them to be identical. Think of a web server where you have 10 web servers for the same web app, all doing exactly the same thing behind a load balancer. There, your servers are pretty much identical.
For us, while we have 100 databases, none of them are really identical. Many of them are running different versions of Postgres. Many of them are running different database schemas for completely different products. And then others are configured essentially the same. So we needed a way to keep things to look mostly the same, but manage the differences.
So before we did automation, back on AIX, we hit an instance where the directory we were writing the Postgres logs to, pg_log, on a couple of servers was one level higher in the directory tree in the file system than it was in 90% of the places. How did we discover this?
It was in the middle of the maintenance. A DBA was doing something, and the log directory was not where he expected it to be. He might have been running a script to check something, and it wasn't where the script expected it. And this is not the type of thing you like discovering in the middle of the maintenance because, yeah, you can deal with the other directory
moving, but it's a pain. We wanted ways of managing this and preventing that type of situation. We also wanted consistency between our production and our development and QA environments. As we move to more deployment automation and more automated
tasks in general, we wanted to be able to confidently test our scripts in QA and staging and know that they have a good chance of working in production. While we were on AIX, we had, I think, three database servers in our staging environment and maybe five in our QA environment
for all of our different products, all our different projects and systems. So out of those 100 databases to test something we might have in QA, we might have five servers to choose from. So QA would install different versions of Postgres on the same server, create multiple back ends listening on different ports, and do whatever
they needed to do so they could load the data and test the application functionality on the server with lots of other QA activities going on. This was fine for testing application functionality, but it tended to have some problems, particularly if you wanted to test maintenance type tasks,
it didn't work so well. Now I'll admit, while this was one of our goals, we've not been so successful yet in making our QA environments look as close to production as I would like. We're still working on making progress on that. We also wanted to reduce manual commands.
Before I gave the example of doing the install manually, if you have to upgrade Postgres on 100 servers, maybe you have to change the version rather quickly because there's a security patch you need to roll out. The less typing you have to do on each of your 100 servers, the better. So we wanted to reduce that
and deployment automation was a good way. As I mentioned, we've picked Puppet. Puppet is an open source project with a commercial edition that you can buy. You can get Puppet from puppetlabs.com. Puppet Labs does most of the maintenance of Puppet.
The open source version or community version of Puppet will probably do everything you want. We initially launched with the open source version when we did our stuff. And then we bought a commercial support contract with Puppet Labs, mostly for access to their professional services.
And then we migrated to the enterprise version. It has a bit fancier GUI, a bit nicer packaging, but most everything I talk about in this talk applies to both the community version and the commercial version. So a bit about how Puppet works. In Puppet, you have a Puppet Master.
The Puppet Master is a server that will have all of your Puppet module source code and a list of what goes on which server. Then you're gonna have your servers, which might be database servers, web servers, LDAP servers.
These servers do your real work and they will run a program called the Puppet Agent. The Puppet Agent contacts the Puppet Master and asks the Puppet Master for instructions on what it should do. And the Puppet Master ships stuff down to the agent to make changes on the agent server. Some terminology that I'll refer to.
In Puppet, when you tell a server what should be deployed on it, that's the node definition. A node definition consists of a set of classes. So for a particular node, you might be saying, we're deploying the pgsql class and the slony class.
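The slide itself is not captured in the transcript, but a minimal sketch of such a node definition, with illustrative host and class names, might look like this:

    node 'db1.example.com' {
      include pgsql
      include slony
    }

    node 'app1.example.com' {
      include tomcat
    }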
The Puppet Master will take this node definition, which is the list of classes and the actual source code for these classes and batch them up together and produce a catalog. This catalog is sort of instructions for the agent on what to do and how the agent should make the server look.
This catalog gets shipped to the agent. What's a Puppet Class? This is sort of a very basic Puppet Class. Puppet Classes can instantiate other Puppet Classes. They can instantiate Puppet Resources and they can have things like variables being set
and basic control structures like if statements. A Puppet Resource is something in Puppet that you instantiate to do something. Puppet ships with a whole bunch of default pre-provided resource types. One of them is the package resource. The package resource installs packages
with your operating system's package management functions. So this particular example will create an instance of the package resource to install the postgresql-server package using yum. So on a Red Hat system, this will install Postgres. And that's what the class will do.
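The class on the slide isn't reproduced here, but a minimal sketch of a class that does this, with an illustrative class name, would be:

    class pgsql::install {
      # Install the server package through the OS package manager (yum on Red Hat).
      package { 'postgresql-server':
        ensure => installed,
      }
    }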
Puppet is also declarative. Another resource type Puppet ships with is the file resource. The file resource manages files in the file system. So the idea is you declare that a file should exist on your server, /tmp/helloworld, and you declare what the content of that file should be. So in this example, I'm saying I want to have a file /tmp/helloworld, and it should contain the content "Welcome to Ottawa". When the Puppet agent runs, if there's no file in /tmp called helloworld, it'll create it and make it say "Welcome to Ottawa". When the Puppet agent runs and it finds a file at /tmp/helloworld, but it says "Welcome to Toronto", Puppet will change that file to say "Welcome to Ottawa". So you run Puppet once and it makes some changes. You run Puppet again, and if you haven't changed anything else, Puppet should really do nothing. It checks to see that everything that you've declared is still true, and it'll do nothing. If something has changed, Puppet will make what you declared to be true, true again. And this is the philosophy of Puppet. There are ways you can do things that don't follow this philosophy, but you're going to get into trouble. Puppet is meant to be declarative, and you want to build your Puppet classes to do things in this declarative fashion.
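Reconstructed from the example being described, that declaration would look roughly like this:

    # Declare the file and its content; Puppet creates or corrects it as needed.
    file { '/tmp/helloworld':
      ensure  => file,
      content => "Welcome to Ottawa\n",
    }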
So the way we deploy Puppet at Afilias is: we have a Puppet master and we have our servers, database servers, Apache servers, Tomcat servers, et cetera. We write our Puppet classes
and we check them into a Git repository. The Git server we push things to has some hooks that push things onto the Puppet master to make those codes available. When we do things in Git on a particular branch, that branch is then pushed to the Puppet master,
and we can have multiple Git branches of Puppet classes available at once. This allows us to have different servers running different versions of our Puppet classes, which is very useful if, let's say, you want to test your Puppet changes in QA before you want to push them to production. Then, you tell Puppet that you wanted to use
the test123 branch on a QA server while it's using the prod branch on a production server. Git allows us to do this rather nicely with the hooks. Now, about node definitions: you can write a file in Puppet
that says for server 1, it's running classes postgres and slony; for server 2, it's running classes tomcat and your Java app. But if you have a lot of servers, one file with all of this gets a bit unwieldy. We already use LDAP for managing a lot of our infrastructure,
what server is there, what the names are, etc. So with Puppet, you can have what's called an external node classifier. An external node classifier is sort of a program that Puppet will contact during the run that says for server 1, 2, 3, what classes should I install here?
And there's a bunch of pre-written external node classifiers, one of them interfaces with LDAP, and that's what we do. So in LDAP for server 1, 2, 3, we list the Puppet classes as attributes in LDAP that should be installed on a server. We also list the branch from Git,
the environment, as we call it, that we want those classes to come from. So a server contacts the Puppet master, the Puppet master contacts LDAP and gets what it should install, then the Puppet master builds up the catalog and ships it to our servers.
A couple slides ago, I showed you a web page with a management console. That's the Puppet enterprise management console. We can launch Puppet run on a bunch of servers. So I can say all of our servers in Texas, we want to do a Puppet run now that are database servers.
I select them on the GUI, launch Puppet, it'll go run Puppet now, and I can view the results in a report screen. If you're not using a fancy GUI, you can launch Puppet on the command line of each of your servers. You could write shell scripts to log on into each of your servers and do this.
A lot of people run Puppet on an interval, so they have Puppet automatically runs once an hour or once a day. We're not doing this yet. We might eventually get to that point. That's sort of the other way you can deploy Puppet, so it always runs. Make sure that the things you declare in your manifest are true.
We had some specific goals for deploying our databases. We wanted to deploy binaries, we wanted to deploy conf files and manage conf files. We wanted to handle the initDB because as we were rolling out our 100 databases onto these new servers, we wanted to get that done very quickly and easily.
We also have a lot of crontabs that we have to manage. Each of these things I'll talk about in more detail in a couple of slides. And we did some things with Slony that are easier to manage with Puppet as well. There is a module for Puppet to manage Postgres.
Puppet has these different modules, collections of classes. Many of them are available on GitHub projects and Puppet Labs websites. There's one for Postgres. It does many of the things I talked about in the previous slide. Starting, stopping Postgres, handling your conf files.
You can get this module at this URL. We don't actually use it. The reason being we had a perceived need to install multiple minor versions of Postgres on the same server. So we thought we wanted to install a 9.2.4 and a 9.2.5
on a server at the same time and have them running at the same time. The Postgres RPMs that you get from the community don't really do that. So we had to build our own RPMs. And it seemed to be easier to just write our own class for managing Postgres than it would
be to modify the community classes here to do that. But I would recommend you take this as a starting point and see if it will meet your needs. So what we instead do is we have a resource called the pgsql cluster resource. It manages a single data directory.
So it will create the data directory, perform the initdb, install the conf files we want there, and launch it. And then we have a whole bunch of infrastructure that sits above this layer for managing which conf files and which data directories and which users go to your particular server.
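Our class itself isn't shown in the transcript; a rough sketch of what a defined type along these lines might look like, with hypothetical paths and names, is:

    define pgsql::cluster (
      $datadir,
      $version,
    ) {
      # Run initdb only if this data directory has not been initialised yet.
      exec { "initdb-${name}":
        command => "/usr/pgsql-${version}/bin/initdb -D ${datadir}",
        creates => "${datadir}/PG_VERSION",
        user    => 'postgres',
      }
    }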
So one of the more interesting and more common things you're going to want to do is manage files like your postgresql.conf file. Puppet has a templating language. It's ERB, the Ruby templating language. If you've ever used Ruby on Rails and done web app HTML,
it's the same templating language that that uses. Puppet is actually written in Ruby. And there are many places in Puppet where you can break out in your classes and basically write pure Ruby code to do things. And often you will need to do that when there's no easy way to accomplish what you want in Puppet.
So the way these templates work is you might have a postgresql.conf and you want shared_buffers to be variable. So you put that in, and instead of a specific value for shared_buffers like 32 megs, you would say, within these angle brackets, shared_buffers.
And shared buffers is a variable and that will get evaluated at runtime. So when you're deploying your catalog, that variable will be substituted with, let's say, 300 megs. So the template, the ERB, gets turned into a specific conf file
with the variable substitution applied. The way you would use that template is using the same file resource we discussed before. You set the variables in your Puppet classes. So shared buffers might be 300 megs and we set the template we want to use. Then we're saying this file is for the Postgres conf,
which should be in the Datadir. And the content of this file, instead of being a static string like my previous slide, it should use the template and we pass it the name of the template. Puppet then goes, evaluates the template in the context of the variables you've set
above in the class and gives you your result. You can also do ordering on Puppet resources. So here, we're saying that we also want to make sure we run this file after we've done the initdb. I have another package called exec,
which execs the initdb command. Or sorry, another resource that execs the initdb command. And we're just saying, in order to put down the postgresql.conf file, we require the initdb to have been completed. If you install your template before the initdb, it'll get overwritten.
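Pieced together from this description, the fragment inside the class would look roughly like this (the template path and resource names are illustrative):

    # Variables set earlier in the class, or looked up from Hiera.
    $shared_buffers = '300MB'

    # The ERB template contains lines such as:
    #   shared_buffers = <%= @shared_buffers %>
    file { "${datadir}/postgresql.conf":
      ensure  => file,
      owner   => 'postgres',
      content => template('pgsql/postgresql.conf.erb'),
      require => Exec["initdb-${name}"],   # don't lay the file down before initdb runs
    }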
So, okay, in Puppet, each resource instance has to be uniquely named. So if I had a different Puppet class that was installed on my server that had this string here with Datadir being the same value,
it would be a conflict. Puppet would say, oh, this file resource in this class wants Postgres conf. And this other class is also defining Postgres conf as the same thing. Which do I choose? I don't know. I'm confused. I'm going to error out. So every resource and every class has a name.
In this case, the name of the file resource I'm instantiating, the instance name, is the path to my data directory, slash, postgresql.conf. The class that I'm including this in also has a name. And the dollar name evaluates
to the name of the current instance you're running at for the class or resource you're inside of. Does that sort of answer your question? So I have no idea. That's probably how I named it.
There's no good reason. The dollar name will give you the uniqueness. It could even be a typo on the slide or a typo in my code. So some parameters in your Postgres conf file, you're going to want to be based on information on the server you're actually deploying to.
A good example is the IP address. Often you will do listen address equals star to listen on all your ethernet devices. But you might only want to listen on the first ethernet device. Postgres conf requires you to type in the IP address of that first device. But this is something you can automate.
Puppet has a program called Facter that runs as part of the Puppet agent run, collects facts about your system, and ships them back to the Puppet master to make these facts available as variables during the Puppet run. For example, the IP address of the first ethernet device
and the second ethernet device and all your other ethernet devices are available as facts. The amount of memory on your machine is available as a fact. The operating system you're running is available as a fact. And you can also write custom facts. You can write a bit of Ruby code that will run on the server, the agent server, as part of the agent run
to do whatever you want to compute a value that gets shipped back to the server, the Puppet master, as a variable. Then inside of your templates, you can look up these variables and use them as values. Some variables you're using
are coming from your agent servers, but other variables you're going to want to control and set as configuration, to manage your configuration and get a lot of reusability between your Puppet classes.
So you have your Puppet classes and you have 100 database servers. Ideally, you would have one Puppet class for all your database servers to maximize code reuse. But if you're doing this, you need to take the configuration parts out of your class and put them somewhere else.
Puppet has a library or facility called Hiera, which is a data lookup library for looking up data in a hierarchical fashion. What it allows you to do is move your data out of your classes into these YAML files, and you can then do lookups and vary your configuration hierarchically.
So the way it works is you might have a DB server class and you might want to find out what port should Postgres be running on. And it might be that for different products you're selling or doing, you have different ports you want your Postgres to be using. So you'd say,
my application is Postgres, my product is org, and ask Hiera, what is my Postgres port? And Hiera will come back with the port it should use. How does Hiera do this? First, you have to define a hierarchy. This is an example of a hierarchy where what I'm saying is: for a particular fully qualified domain name, for my application, which is Postgres, I want to be able to set things on a particular server. But other things for Postgres I want to set for the entire domain. So for all of my tor.afilias.info servers I want to set some stuff there, and then for texas.afilias.info I want to set other values. But some settings might not vary based on where your server is located or the domain name of your server. Some of our settings vary based on
what we call environment type, which is whether it's a QA server, a production server, or a staging server. So we've made that another variable. And then other things, like maybe the port, varies based on the product. Everything in yellow is a variable that we've defined, and we've decided these are the switches, the control things
we want to use to build our configuration trees. You can pick other things that work for you. The important thing is, if it's a variable in your Hiera configuration, this variable should be set either by Facter or explicitly, as we did in our slide here.
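The hierarchy on the slide isn't reproduced here, but in Hiera's configuration file it would be declared something like this, with env_type and product being the custom variables we set ourselves and the level names being illustrative:

    # hiera.yaml
    :backends:
      - yaml
    :yaml:
      :datadir: /etc/puppet/hieradata
    :hierarchy:
      - "fqdn/%{::fqdn}"
      - "domain/%{::domain}"
      - "env/%{::env_type}"
      - "product/%{::product}"
      - common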
So how does this then work? You then write some YAML files. YAML is a language for writing data. It's kind of like JSON. Hiera's default is YAML, but there are other backends you could use. And you write these files that will contain your configuration in a tree structure on your file system. So this tree structure matches the structure of my previous slide. So some values, we might have a server called db1.tor.afilias, and we set some values for Postgres there. So when I go look up my port, Hiera will first check this file, this Postgres file for db1, because that's the fully qualified domain name. It doesn't find the value. It moves on. Oh, this is a production server. Have I defined it at the production level? No, I don't see it there. Oh, this server is, the product was org, because that's the value I set the variable to two slides ago. Yes, I see a port value. Hiera will return that value. If it found the port value at the top, the first file it looked at, that's what it would return. Hiera will walk through your hierarchy in the order you define it, looking for the values you've set.
What do one of these YAML files look like? They're pretty simple. You have a key and a value. So the key in this case is the Postgres port. The value is 5432. You can also build more complicated structures. So at the bottom here, we have our DB list, which is a hash. So the key is the Postgres DB list. The value is a hash. The key of the hash is pgcon, which is, let's say, the database name, and the value that it points to is the owner of the database. There's another value in the hash for a different conference with a different owner. We then have some Puppet code that walks through this hash, creating databases with the CREATE DATABASE SQL command for every database we've defined in the DB list, and setting the role in Postgres that owns the database to be the role we've set there. This allows us, we have some databases, or some clusters, I guess I should say, where the databases in the cluster vary.
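A sketch of what such a Hiera data file might contain; the key names and the second database are illustrative:

    postgres::port: 5432

    postgres::dblist:
      pgcon: pgcon_owner
      fosdem: fosdem_owner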
For a lot of miscellaneous things, we'll have one cluster with a handful, four or five different databases in it, and another cluster will have three of those four, but not all of them. We use this to manage what databases go on a server, and it works pretty well. As you manage your Postgres configuration, you're
going to need to come up with a strategy for differentiating different conf files and grouping things. I'm going to talk about two strategies. This is the one we actually use mostly. For every major version of Postgres, we start off, we do initdb, and we get the default conf file for that version of Postgres, and then we start to change it. There are some things that we change in the conf file that apply to all of our database servers, and then there are some things that differ based on the class of database. Some of the databases are small, they're not running on big systems, they don't have a lot of activity, so we created a grouping of servers called small, and we have conf files for 9.0, 9.1, 9.2, etc., that are small. Other databases are batch processing systems. These systems typically have few connections and use lots of memory, so we set up a set of files for these batch systems. Max connections are low, work_mem is high, the minimum duration for logging long-running queries is rather high, things appropriate for batch jobs. Then our OLTP systems, we tend to have a lot of very high max connections, much smaller work_mem because max connections are so high and we have a lot of connections, and our min duration, we consider a query running long if it's 100 milliseconds on these systems, so we log it, while on our batch processing systems, a query
of a minute is not long at all. So we built things into groups, and we try to manage things on these groups. Even though different applications are using the same group, we try to keep it together.
You probably could. The question, for the audio, was whether we get a lot of value out of having different version templates for each major version, and I like the fact that you can see the values that we've not changed. And sometimes the default values change in Postgres, and we sort of find that's helpful, but it wouldn't be wrong for others not to do this. There are not that many major versions of Postgres, so the overhead of this is not a big deal, and there are settings we've changed on different versions of Postgres. There are things that are appropriate for 8.3 that are not appropriate for 9.2, as an example. You could conditionalize some of this stuff, which sort of brings me to my next slide. Instead of having one file per class of systems, you could have one file for each major version of Postgres, make everything that you might change a variable, and then set up a Hiera level, or use Hiera to manage the different groupings. This is probably a better way of doing it, and I'd like to move from having one conf file per class to one conf file per version. What Peter suggested is, I guess, another level we could go to, of just one conf file for all versions. I'm hoping to go to this sort of level. We would say then that our shared_buffers and work_mem are all variables, and at some Hiera level we would say for big it would be certain values, and for small it would be other values, and some systems we would override.
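In Hiera data files that might come out something like this; the file names and the numbers are purely illustrative:

    # group/small.yaml (illustrative)
    postgres::shared_buffers: 256MB
    postgres::work_mem: 4MB

    # group/big.yaml (illustrative)
    postgres::shared_buffers: 8GB
    postgres::work_mem: 64MB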
Another thing we do a fair amount of is crontabs. We have a lot of crontabs for things like backups, purging jobs, both purging things in the database, purging log files on the file systems, etc.
It wasn't a big deal for us to rewrite the module for installing Postgres and doing the initdb. Probably 75% of our Puppet classes, in terms of volume of code, deals with all this other stuff that is not reusable by others, with respect to managing our crontabs and which files go where. We actually run most of our crontabs, we have database servers and we have what we call database application servers. A database application server does not run the Postgres daemon. It has Postgres installed, and it runs crontabs and our slons and other things that connect to a database and do maintenance. We do this to move load off of our database servers onto other servers, because our backups, between the gzip and the encryption (some of our backups we then encrypt and send places), are actually a fair amount of CPU load, and getting this stuff off of a database is helpful, we found. Managing these cron jobs is a big part of what we do. Question? That is probably OpenOffice misbehaving.
To manage crons, Puppet has a cron resource type. The cron resource type manages your crontab. For a particular line in your crontab, you will declare an instance of the cron resource type, saying I want a cron for backup underscore name. This is the name I was talking about again, because we have so many different servers and you cannot have duplicate instances of a cron. So name might be the product name in this case. Our class might be named dap-server-org; cron would then create a crontab entry with a name, which will be a comment in the crontab file, dap-server-backup_org, and it will put in the command we've specified here. And the command can have a variable, so this way we have the same Puppet class or Puppet resource that lets us back up all of our different products. And for some of our products we might be running the backup at one time, and for others we might make the minute be a variable instead of zero so we can control the time.
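A sketch of what such a cron resource might look like; the script path and the schedule are illustrative:

    # One crontab entry per product; $name is the product, $backup_minute comes from Hiera.
    cron { "backup_${name}":
      command => "/usr/local/bin/pg_backup.sh ${name}",
      user    => 'postgres',
      hour    => 2,
      minute  => $backup_minute,
    }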
We have a lot of different products and crontabs to do different things, and we've been managing those crontabs with Puppet, and we have switches to say which crontab should run for which products, and we have switches in our Hiera files to control what options we pass to them, and we have other switches to control what servers a backup should be launched from. So it gets kind of complicated, but Hiera has turned out to be a pretty good way of managing it, and it definitely is working better than trying to manage all of these crontabs manually with
CVS or SVN and pushing them everywhere. Another thing we use Puppet to help us do is manage some of our Slony stuff. For those of you who don't know what Slony is, it's a replication system, and to configure Slony, Slony has a command line language called slonik. Slonik is how you tell Slony what it is you want Slony to do, and this is an example of a typical slonik script. They start out, you name the cluster, and these different clusters in your environment will have different names. Then you tell Slony how to contact each of the nodes in this cluster, and you have to provide the hostname and users, etc., in the slonik file. Then you actually do the interesting things, like your Slony commands. This file will not work in both the QA and production environments, because at the
top we're specifying hostnames, and the hostnames are not the same in QA and in production. So how do we solve this problem with Puppet? We have a template in Puppet that generates the slonik preamble files. These files basically contain that top part of our slonik commands that is the same for any command. We go, oh, the cluster name should come from the variable cluster name. Then we do some more advanced templating stuff. We can do loops in the templates. We loop over all of our nodes. slon_conf is a hash that comes from Hiera, and we loop over it. We get the node ID, which is the key, and some values, and the values is another hash, and it might have a port, it might have the hostname. We could put users and passwords in this hash as well. We write out that admin conninfo line from the previous slide using our Puppet template. Slonik also has the idea of a variable that you can then reference later. In addition to the conninfo, this same information is available as a slonik variable, so we can reference that path to a particular host later in a Slony STORE PATH command. This is an example of how you can build up things in Puppet to separate the stuff that is the same on all servers, let's say the SET ADD TABLE, from stuff that varies. So then, when we run our slonik commands
on QA or production, we first include this preamble file, then we do our SET ADD TABLE or whatever it is we actually want to do. So how do our YAML files in Hiera look for something like this? That hash I showed you that we were looping over will look like this. The key might be something like slon nodes, and we'll do a Hiera lookup and assign that to a variable, slon_conf. And then we have the node ID, 9181, that's the first level of the hash, and that contains another hash that has host and port, and then we have another entry with a different host and a different port for a different Slony node ID. This might be our production conf, and then for QA we would have a different YAML file that had different values for these things, which would generate a different preamble on the server when we ran Puppet.
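A sketch of what that Hiera data might look like; the key name, hostnames, and the second node are illustrative:

    slon_nodes:
      9181:
        host: db1.tor.afilias.info
        port: 5432
      9182:
        host: db2.tor.afilias.info
        port: 5433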
So I'll talk a bit about what has been challenging for us. One issue is removing stuff. You declare these Puppet classes, you run the Puppet agent, it installs a bunch of things, and then you decide, oh, I no longer want to be running this database on a particular server,
maybe you decide to move it or maybe you're decommissioning it. You don't want to throw away the whole server and if you were doing this for web applications for a web server often what people do is they will just build a new web server and replace the old web server with a new one behind their load balancer and throw away the old one.
With virtualization this is very easy if you're doing web servers because you can just throw it away and replace it with another one if you're using a load balancer. This doesn't work so well with databases because you can't really unless you're throwing away the entire server you have to keep that data because if you're only
throwing away one of the databases on your server and keeping the other one you can't throw away the data of the one you're keeping by re-provisioning it. So what we've found we needed to do is for every class that does something in Puppet we've had to write a second class that undoes it. So we have sort of these removal classes and removal resources.
So we'll install things with the install class, and then we decide, oh, we want to actually remove this from a server. Instead of having that server list the install class, the postgres class, it might list the postgres remove class in the manifest, in the node list I should say. So it undoes things. Testing this is a bit of a pain, because you have to install things, make sure things are how they should look, deploy your removal class, and then look for any changes, any leftover things. And usually when you're installing something, you want to get it out there.
You're not worried about removal when you're writing the Puppet class initially. You just get it out there and it's only when you uninstall something that you tend to bother about writing these removal classes. Yep.
Yes. So in Puppet, for example, the file resource has an ensure present and an ensure absent. So just like we were ensuring that the file was there in the examples I gave, you can also add an attribute, ensure absent, where Puppet will make sure the file is not there. And for your resources, that's sort of how you remove things. And one way of doing this, instead of having a separate class for removals, is you can have an ensure-absent mode that will undo it. We've mostly not gone that route. Maybe that is a better route to go, but I've been doing separate classes. I don't have, off the top of my head, a good reason why. But in either case, you have to actually think about what should happen in a removal and spend the time writing the code to remove that.
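Going back to the earlier hello-world example, removal is just the same declaration with ensure flipped:

    # Puppet will delete the file if it exists, and do nothing if it is already gone.
    file { '/tmp/helloworld':
      ensure => absent,
    }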
We also have had some challenges. Yes? Well, the issue is, so let's say we have Postgres and we have two databases in Postgres, database 1 and database 2. And if you want to remove database 1 but not database 2, you can't do that. But even if we have a class that removes Postgres, and it stops Postgres, it deletes the datadir, because often if we're deleting something we want that space, we don't want that datadir lying around, because we're likely going to deploy something else on that server. So yeah, you can implement half the problem, or half the solution, by just stopping Postgres and not completely cleaning it up, but eventually you're going to want to completely clean things up.
So password management is also a challenge. If you put your passwords in your Hiera files, and you put these into Git, and lots of people have access to your Git branches, you've just exposed your passwords to lots of people. That's not so good. So when we first tackled this, we actually weren't using Hiera, and I wrote some custom Ruby functions that ran on the Puppet master to look up passwords. It was sort of like Hiera, but custom stuff that just looked up the passwords, and the passwords were only on our Puppet master, which only our sysadmins had access to. And that worked okay, except that for adding and removing passwords you had to ask a sysadmin to do that. Then, when we moved to Hiera, we discovered that there were different backends for Hiera. In addition to the YAML backend, Hiera has a backend that works on GPG-encrypted files. So you write up your YAML files and you encrypt them,
and you can check them into maybe a different source repository and get them somehow deployed onto the Puppet Master and then you deploy decryption keys on the Puppet Master that you control fairly tightly and then the Puppet Master can decrypt these YAML files to look up passwords and push them down into the
catalog, so they get deployed on your server. You can do this for things like writing .pgpass files, etc. Changing passwords is also a bit of a challenge, in that if you want to change your database user password, your superuser password, you might need both the old and the new password available at once.
You need the old password to have your psql command log in and you need the new password to set it to something different. This is something we've not yet solved. One way of doing this is to have the password available as two variables and another way is to have a different user that you're using for changing
these passwords but we haven't actually implemented a solution. There are some next steps we're looking at doing. One, I spoke about how we're managing databases as we define the list of databases we create. We're not doing this for our typical database
login users. Right now we're manually logging into our servers issuing a create role or a drop role and it's getting kind of tedious. I'd like to sort of have Puppet manage the login users, the roles for our databases. I think that's quite doable.
Also, when we push a conf file change, we are not restarting Postgres. So we change, let's say, shared_buffers in our template or in Hiera. We push this down, the conf file is changed on our database server, but it has no effect because we've not restarted Postgres. You could tell Puppet
to just restart Postgres but that means Postgres is going to be shut down when Puppet runs. If you're running Puppet every hour, you probably don't want to have your database go down unplanned if it's an important database. So I still have to figure out some good techniques
for managing restarting servers that can be restarted sort of at whim because a lot of our replicas actually can be restarted at whim because they're not yet being used for things but not restarting servers that can't be restarted at whim and then getting an idea about when
we've pushed something to a server that doesn't take effect until a restart. Changing the conf file but not having it take effect is a good way to confuse yourself. So that's sort of on my next list of to-do stuff for Puppet.
Any questions? We use, I think it's gitolite, as our Git server, and then we've written custom hooks. We've written something that involves RabbitMQ,
I believe. Those hooks are custom. We might be able to release them or open source them if there's interest. I can talk to the guy who wrote them and see what they look like and maybe push them to GitHub.
Any other questions? Thank you.