Managing FreeBSD at scale
Formal Metadata

Title: Managing FreeBSD at scale
Number of Parts: 26
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/19173 (DOI)
Transcript: English (auto-generated)
00:03
So, my name's Allan Jude, and I'm going to talk to you about how I use Puppet to manage a large number of servers by myself, because he's no help.
00:23
He's helpful to everyone except for me. So I've been a FreeBSD sysadmin for about 11 years now, and my biggest accomplishment is building a CDN.
00:42
Didn't start as a CDN, but eventually we needed kind of a CDN, and then we built it, and then that turned out to be more popular than the product it was built to support, so we just do that now. And I also host a podcast every week, TechSNAP.tv, where we talk about systems, network, and administration.
01:01
administration. And, before that, I was a professor at Mohawk College, although in Canada, college is different, so professor doesn't actually mean anything. I don't actually have a degree, but whatever. And I taught network engineering and security analysis. And then I've worked a lot with Puppet, mostly since last year when Edward Tan published
01:27
an article in BSD Magazine about how to use Puppet on FreeBSD, and I was like, I should be doing that. And I saw his talk at EuroBSDCon, and basically this picks up where he left off and goes further with it.
01:41
And I also worked with ZFS a lot, and I also pronounce it Z-F-S. So a couple things, we'll talk about what scaling actually is, and why it's important, why I don't trust the cloud to do the scaling for me, a little bit about what
02:04
ScaleEngine actually is and does and so on, how to get scale by buying things, and then advanced Puppet and the things we've learned and the things we still need to do after. Scale!
02:20
We'll do this. So basically scalability is the ability for your system or network or process or whatever to handle more capacity than it does now without breaking. But you can't predict how an app's going to scale ahead of time.
02:43
So scale's always just in time, because in order to decide how to scale, you have to know what part needs to scale, and you can't predict that because users are unpredictable. No matter what you think they will do, they will do something dumber.
03:05
So, design for eventual scale. If you build all your scale up front, it can end up that the pressure point isn't where you thought it was, and then you have all this scale for something that doesn't need it and no scale for the part of the system that does.
03:22
So if you try to predict where the failures are going to be, most of the time you're wrong. So you design the system with expectation to scale it out, but you don't actually scale it out too soon because you don't know exactly where the problem's going to be. So an example of something you can do is when we started naming our systems, we came
03:45
up with a geographic system for it; even when we only had one data center, all of our servers were named with the data center as part of the subdomain and so on. And that's worked very well for us now that we're in 29 different data centers. That way, just by looking at the name of the system, you can tell where
04:01
it is, which is something you never get in the cloud. You know, Colin Percival's definition of the cloud is when you don't know which state your server is in. So, planning. This is a postcard version of Google, and it says please allow 30 days for your search
04:22
results to come back. So there's a few different ways you can scale a system. You can optimize it, which basically means getting more out of it without spending anything other than time. You can scale up, meaning making, buying a faster system, or you can scale out, buying
04:44
more systems and spreading the work out. Or you can use the cloud, which is basically a different way of scaling it. So if you optimize a system, basically what you're doing is making each task on
05:00
the system take less work or less time, so that there's more time left over to do more work in the same amount of time, right? So if each step in your process takes two seconds and you drop that to 1.5 seconds, you now have more spare time that you can use to do more each second.
05:21
But if you try to tune too early, you don't know what to tune, and so you just spend a bunch of time tuning things that don't make a difference. And as Ivan talked about in his talk about benchmarking, there's not that much that you have to tune anymore. Most of the previous defaults scale automatically very well.
05:46
Ten years ago it was very common to be compiling a custom kernel that had a bunch of tuning in it. Now that's much less common, right? Almost everything is a module or a sysctl now. You rarely have to recompile the kernel and you rarely have to change that many sysctls.
06:05
So scaling up is basically attempting to make your system handle more work, right? So you can get a faster processor or more processors in the same box, or more RAM or more spindles, right? If you don't have enough disk I/O, more spindles means more I/O.
06:22
Or IOPS or whatever. But eventually that's limited by the hardware you can buy, right? You can't buy a 6 GHz processor, and you only have so much money for hard drives. Eventually you need a different solution. But in order to scale up you have to know which component is the problem, what is stopping
06:44
you from getting the performance you want, or getting enough performance. So you have to figure that out. But scaling up, it might work or it might not. Just because you bought a faster processor doesn't mean your processor is going to run any faster.
07:00
Or you can scale out, which is mostly what we do. More nodes, spreading the work out, and also this lets you handle failure a little better as well. So you just buy more servers. Although it depends if your app can actually scale that way, right? Some apps only use one processor on one server, so they're not even taking advantage of the scale you have on a single system.
07:28
And so it doesn't always help, but most of the time it works. But it comes at the cost of increased management complexity, right? You need to be able to manage more servers. And that's where the pain point came for us, right?
07:43
When it was 10 servers and then 20, it was fine. But then it got to the point where it's like, you need to deploy 10 new servers in 6 different data centers in 6 different countries this week. At that point, it became cheaper to learn Puppet than to do it all manually.
08:01
And it also requires some kind of load balancing, which we didn't have originally. And we built that, and we gave a talk about that at EuroBSDCon. It's on YouTube if you want to see it. And it also requires some market awareness. You have to know what you can buy and where you can get it.
08:22
Or you can use the cloud. Basically what the cloud allows you to do is get horizontal scale nearly instantly. You can go to Amazon and rent 20 servers and only use them for a couple of hours if you want. But if you're going longer term, it turns out to be much more expensive.
08:41
And there are a number of other costs that you don't think about ahead of time. It's basically automagically adding virtual machines to your cluster. But you still have most of the management complexity you would get from additional physical machines. But you also have to deal with the fact that your cloud has some API that you have to use in order to interact with the machines and so on.
09:07
So it's more complexity, not less, at the same time. And also you have basically uncapped costs. Amazon will keep billing you forever in larger and larger amounts.
09:22
Because you don't get anything for nothing. If your business isn't built to be able to scale up, it's probably not going to do much. But if you try to predict how you're going to scale up too far ahead of time,
09:44
like the startup that builds for 100,000 users before they have one user, then you've spent a bunch of money, probably in the wrong place. And you basically have misguided capacity to support some predicted growth for no reason.
10:03
So rather than building scale on predictions, you need to build it on measurements and telemetry and stuff. You need to know what the problem is so you can solve it. Instead of trying to guess what the problem might be. And then you can react to actual growth and your scalability issues.
10:24
So some of the things that I don't like about the cloud are the lack of visibility. You can't tell what's happening in the cloud. You have no visibility at all into what they're actually doing or where they're actually storing your files. Or when there's a problem, how do you find out?
10:42
We never know when there's a problem at Amazon. And then we see Colin Percival on Twitter saying, Oh right, yesterday he was talking about how one of the Amazon services is throwing 503 errors at an incredibly high rate. Amazon's status page doesn't say that there's a problem. But Colin has detected a problem.
11:03
If I was using Amazon, I wouldn't know that there was a problem until stuff was breaking. And I would have to investigate myself. Because I have no visibility on what's inside Amazon's part. Colin sleeps every once in a while.
11:25
Or, you know, he's on an airplane coming to BSDCan and he doesn't tweet, and then I don't know what's happening. But also you end up with vendor lock-in. If you build it into the Amazon ecosystem, there are other clouds but most of them you can't just pick up your stuff and move.
11:43
It's very difficult. There's different APIs and they have different abstractions for storage. There's S3 from Amazon, but Rackspace Cloud Files works entirely differently. And they have EBS for block storage, and I don't think Rackspace actually has something like that. And then so on and so on.
12:02
And so it means if you want to move, there's all this extra cost of redeveloping everything. And so you end up being stuck at Amazon. And there's the price. Basically for us it wasn't cheaper than doing it ourselves. The marketing people try to tell you, oh the cloud is so cheap, you know, it's like 10 cents an hour or whatever.
12:23
But if you leave an Amazon EC2 instance on for the whole month, it actually costs more money than renting a server somewhere. And that's before you consider that with Amazon when you rent a server, you pay for every bit of bandwidth after. Most of the time when you rent a server from a provider, they give you some to start with.
12:44
And so that makes a difference too. And then you have to consider how much is the loss of visibility worth. How much is it you can't tell what's happening or where your stuff is. There's another slide here that talks about on Amazon, somebody was doing some performance testing.
13:00
And they spun up like 200 plus instances and they found that certain instances were really slow. Probably because there was something wrong with the physical machine at some point or something. And they didn't know. And then they would see this noise on their performance tests.
13:20
And they were like, oh so if we throw that machine away and ask for a different one, we get assigned to a different physical machine and we don't have the problem. But they can't predict when, if they try to spin up a machine, they're going to get stuck on a piece of hardware that's failing. Or has some other problem. Or it's just busy. It's a shared system. When you're on an Amazon cloud or whatever, there are other people on the same physical machine sharing some of the same resources.
13:44
If they hog it all, then you have performance issues. You can't tell what other people are doing because it's the cloud. And there's the risk. If you use some cloud other than Amazon or Rackspace, then how many cloud providers have gone out of business in the last year?
14:02
Quite a few. So you'll be building your system, running your business, all happy happy. And then one day you get those, oh sorry, we're closed. And then you have nothing. So there's a link buried under that image. I'll post the slides later and you can get it.
14:24
They are out. The graph is still a little small, but basically this is the throughput they got from EBS. And as you can see, rather than a nice steady line, they're just all over the place.
14:44
Although on some tests, on some instances, this instance did very well. It's a nice straight line. For this one, it was just all over the place because what other people are doing is affecting what you're doing. Because it's not your own system.
15:06
Not exactly. This isn't my graph, this is data I got from someone else. Basically they were doing performance testing on Amazon, in this case specifically EBS, which is their elastic block storage. And they spun up 200 instances and did performance tests with read and writes at read 4k, read 4 megabytes, and so on.
15:28
And then graphed the throughput they were getting. And rather than getting some consistent level, they would get these wild fluctuations all over the place. Some instances would just stop.
15:47
These are... Yeah, sorry. This is a small on EC2. This is a small by itself. We're using one elastic block store. This is a small using 4 in RAID 0.
16:02
And this is large with a single EBS, and this is large with 4 EBS's stuck together. Yes. Their website has much higher resolution versions of the graph and an explanation of their testing methodology and so on.
16:21
So among a number of other reasons why we don't use the cloud, it basically reduces our autonomy. We're basically tied to some vendor and having to rely on them for more of the business than if we had the hardware ourselves. Also, other than Amazon where you can kind of use FreeBSD, most of the other clouds don't offer FreeBSD, and that's a deal breaker for us.
16:47
Also, using virtualization usually has a performance impact, where bare metal doesn't. In particular, virtualization is usually really bad at network IO.
17:01
The throughput you can get on a virtual network card is usually not that great. There was another talk about that yesterday, I think. And being that we're doing video streaming and we need to push a gigabit or more out of each machine, then that's a problem for us. Especially when it's not predictable. We can't say this machine can do this many megabits because in virtualization, how much throughput
17:24
you can have is based on how busy the machine is doing other people's virtual machines. At some point, when we can't predict how much capacity we have, we can't load balance based on that. And cloud hardware and network are more expensive.
17:43
If you get one of the first generation M1 instances from Amazon, you get a dual core processor and 7.5 gigs of RAM. I think that's the Windows price, but it might be the Linux price, but it works out to $262 for every 30 days that you leave it running per machine.
18:02
And that's an old 2007 AMD Opteron processor that you're getting through virtualization. And then the second gen ones, which are the M3s, which are the newer ones that you can run FreeBSD on without paying the Windows tax. You get 15 gigs of RAM and a quad core processor, and it works out to $360 for 30 days.
18:26
You can get slightly cheaper if you use reserved instances, but you have to basically pay for it for a year up front, which is a lack of flexibility in some cases. If we rent hardware from one of our various providers anywhere, on average for an
18:42
E3-1230 quad core with 16 gigabytes of RAM, we only pay $189 a month. And that includes 10 terabytes of traffic with it, which we wouldn't get from Amazon. Or from somewhere like OVH, one of the sponsors of the conference here, we can get a dual-processor quad-core Xeon
19:02
with 256 gigs of RAM for $389, as opposed to $360 for 15 gigs of RAM and dubious network performance from Amazon. So yes, why does that matter?
19:21
Basically, ScaleEngine, we do HTTP delivery and video streaming. We have about 80 non-virtualized physical machines spread out in 28 different data centers in 10 different countries. There's an insert in your goodie bag that has a list of all our locations and so on on it.
19:43
And basically, in aggregate, we can push about 50 gigabits a second out of the machines we have. And most of that's because we never try to push too much out of any one machine because with video, contention is a killer versus HTTP.
20:00
If you have more people, more demand for bandwidth than you have bandwidth, and people are downloading over HTTP, the file download slows down a little bit and they don't really notice or care. But with video, if you add one more user to the server and that causes contention, all of a sudden all 1000 users trying to watch video are buffering. And they hate that.
20:21
So basically, if we have any contention at all, it kills not just the new customer but everybody who's viewing on that box. And so we purposely never push a box beyond 65% of its rated capacity. Part of that is because our DNS load balancer system takes about five minutes before it stops sending new users to a system because of DNS TTLs.
20:44
So we have to cut off early to make sure we don't go over. So all of our boxes are FreeBSD 9. A lot of them have UFS but we're transitioning by building all new ones with root on ZFS. Part of the reason for that is we use Nginx and Varnish for the HTTP delivery.
21:04
And our Nginx cache is basically a bunch of hash directories with about 12 million files in it. And ZFS does a much better job of that. And also basically because of the 12 million files, an FSCK after an unexpected reboot means the machine's down for an hour or more.
21:22
Whereas ZFS comes back right away. And we have enough RAM now so we use lots of ZFS. And now we manage all the servers with Puppet. And we also make extensive use of jails. Partly because one of our apps is Java and so we keep it in a jail.
21:42
And over the last year I just pulled some stats out of our billing database. We've served 80 billion HTTP requests totaling over 500 terabytes of traffic. And served about 2 petabytes of video.
22:04
So why we use Puppet is it basically allows us to quickly scale up and scale out on our own. Basically we can rent another server somewhere. Usually it gets set up in a couple of hours. Then we throw Puppet on it and it pulls down our entire infrastructure and sets it up and the box is in production.
22:24
So to do that we had to basically deploy Puppet Master at scale. If you just install Puppet from the ports tree it won't scale. And so the next couple of slides will walk through what we had to do to make that work. We also use custom facts.
22:41
Basically we extract some specific information from FreeBSD and partly from our own infrastructure to make decisions about how to configure the machine. Puppet has a lot of built-in facts but it's mostly Linux specific. There are some that work on OpenBSD and not FreeBSD even though they're the same.
23:01
Somebody needs to write some patches. I guess that's me. And then instead of just using Puppet to deploy config files we use templates and basically customize some of the config files for each machine based on the facts that we extract. We also use Puppet to manage packages. Currently we use port upgrades like port install to install things from ports.
23:26
Building all the apps we need on each individual server takes about half an hour. Not really a big deal for us, but we'll be switching to pkgng at some point. And we also use Puppet to manage our ezjails. Basically we have a little Puppet recipe and it can pull in a jail and deploy it, because we just stamp out these video server jails on every machine.
23:53
Depends between 2 and 10. Most of the machines, basically each role is packaged up into a jail.
24:03
And so it depends how many roles the server has based on how much RAM it has and a bunch of other decisions. And mostly how much bandwidth it has. Our original product which was scaling your PHP app by spreading it out across a bunch of FreeBSD servers.
24:23
Used jails much more, because we had lots of FastCGI jails and so on. But yeah, there's a couple of jails for each machine. So what is Puppet? Basically it's a configuration management engine. It allows you to push out your config to machines.
24:44
And basically you describe what the server should look like. And Puppet analyzes the machine, finds the deltas and does what it has to do to fix it. So the machine looks like the manifest. So basically you make these simple declarative manifests and it describes the machine.
25:01
Compared to our old approach, which was to write a couple of scripts and run them on each machine, it will retry when one of the steps fails, or it will only do the next step if the previous step actually worked, and things like that. Compared to scripting it's much better to have the manifest, because it will also notice down the road when all of a sudden one of those conditions isn't true anymore.
25:26
Because of something. We also use it to manage packages. So all of our web servers need to have the latest version of Nginx installed. And users. I need my administrative user on every machine so I can log into it.
25:42
And all of those need to have my SSH public key so I can log in. And we can push other files and things like that. One of the other things it does is it's bundled with this program called Factor. That basically extracts facts about the machine and stores them at the Puppet Master.
26:01
And it also uses SSL certificates to verify each of the clients. So while it's serving our config files basically over HTTP. You can't walk up and download my config file. Because you don't have an SSL certificate signed by my CA.
26:21
So as I mentioned at the beginning this talk doesn't cover the very basics of getting started with Puppet. Edward Tan gave a talk at EuroBSDCon. And there's the YouTube link. And he also put an article in BSD magazine. It was early last year. In addition we also use two macros from the Puppet wiki.
26:44
shell_config, which is basically a macro for adding nginx_enable="YES" in rc.conf, and configuring all the variables in the various shell files that make up FreeBSD. So we can put values in loader.conf, rc.conf, make.conf, etc.
27:02
And then ports_conf does basically the same thing, but uses the rc.conf.d system, where you basically have a separate config file for each service that you have enabled. Basically all of our Varnish variables are in the file /etc/rc.conf.d/varnish, and it makes it easier to manage them that way.
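As a rough sketch of how those macros read in a manifest (the parameter names here are guesses at the wiki defines, not verbatim):

```puppet
# shell_config: manage one key=value line in an rc-style shell file
shell_config { 'nginx_enable':
  file  => '/etc/rc.conf',
  key   => 'nginx_enable',
  value => 'YES',
}

# ports_conf: same idea, but each service gets its own file
# under /etc/rc.conf.d/ (here, /etc/rc.conf.d/varnish)
ports_conf { 'varnish':
  variable => 'varnishd_enable',
  value    => 'YES',
}
```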
27:24
So the first thing you have to do to scale Puppet is replace the default web server that it comes with. Basically Puppet's a Ruby script and it uses WEBrick, which is this little Ruby library that provides a single-threaded, useless web server.
27:44
It's great for testing and making sure Puppet's working. But after that, before you start having machines get configs from it, you need to replace that. One of the options for that is Mongrel. There's a little checkbox option for it in the port.
28:00
Although it doesn't actually do anything useful anymore. So Mongrel basically is the FastCGI approach to Ruby. You have a bunch of workers forked off, listening on different ports, and you just load balance across those. Each one's only single-threaded because it's running Ruby, but you can basically have a bunch of separate workers and you have a front-end web server call back to them.
28:23
And they execute and return the result. But after Puppet 2.7 they basically decided to deprecate that. All the bits that made Mongrel work that were inside Puppet, they took out. They switched from Rails to Rack. But I'm not a Ruby programmer, so I don't care.
28:43
And basically it's not the best option anymore. Also, the pool of workers that would run the code was a fixed size. So you say, you know, spawn five Mongrel instances, and then you load balance them with the web server. The other option, the newer one, is Passenger. It's now basically a module for nginx.
29:03
So when you compile nginx, if you scroll down far enough on the options list you'll see Passenger. You check it off and it compiles it. Basically it provides dynamic workers and it's built into nginx. Whereas Mongrel is a FastCGI approach, this is the mod_ruby-style approach: the web server forks a bunch of Rubies and talks to them directly rather than over a pipe or a TCP socket.
29:27
The biggest thing is the dynamic workers. You say you can spawn up to 12, but it doesn't leave 12 running all the time. So in our original deployment, which was Puppet 2.6, we used Mongrel.
29:41
But when we upgraded to 3.1, I switched to Passenger because it was not clear how to make Mongrel work on the newer version. But our biggest challenge was the way we deployed the jails. Basically we have ezjail archives that are like 700 megabytes that have everything in them for the video server.
30:04
And we'd have Puppet deploy that to each server. And so it would be pulling it from the Puppet master. Because the jail archive has our license key for the video server and a bunch of other stuff in it that we don't want to just be laying on an HTTP server somewhere.
30:22
So to deliver files with Puppet, you're basically running through Ruby. So Ruby's reading the file and then writing it out to the socket. And that's slow and not good. And especially when you have mongrel, where you only have five workers, that means if you have five new machines deploying, they're all tying down those workers
30:43
and now the sixth machine can't check in and get its manifest or anything. So to avoid that problem, basically we wrote some nginx config that says when you're talking to Puppet and you request a file, it skips the Puppet side.
31:03
So because we have nginx as the web server and then it calls off through passenger to run Ruby, we can say for certain files, just deliver it directly in nginx from the hard drive. But we'd lose authentication in that case, but because Puppet's using SSL certificates,
31:23
where each client has a signed client certificate, we can have nginx verify that by just pointing it to our CA certificate. And that way, nginx will only deliver our jail files to authorized clients, but will do it without using Ruby.
31:41
So this is basically the config you would use in nginx. You can download the slides after. This is what you would have in a stock install of Puppet. And then the next slide shows the specific things we do. So in the production environment, when you're pulling the file content from the file directory,
32:03
we just point it straight to where those files live on our Puppet jail. Because even our Puppet master is actually in a jail. And then we have a second one here for modules. Basically, Puppet has modules, and the files for the modules are in a different place. So we have a little regular expression here that sucks out the module name,
32:26
and points it at the module directory with a small rewrite, because the URL doesn't really match the file layout on the disk, so we have a little rewrite rule that cleans it up. And this way, we can serve the files from our files directory
32:42
and the files that are in our modules directly without going through Ruby. And that way, we're not blocking our limited number of Ruby instances, and we're not using Ruby to send files when nginx can just say, hey, sendfile this, and the kernel does it.
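The exact config is on the slides; as a hedged approximation of the pattern being described (paths and URL layout assumed from the description, not copied from the slides):

```nginx
server {
    listen 8140 ssl;
    ssl_certificate         /var/puppet/ssl/certs/puppet.pem;
    ssl_certificate_key     /var/puppet/ssl/private_keys/puppet.pem;
    # only clients holding a certificate signed by our Puppet CA get in
    ssl_client_certificate  /var/puppet/ssl/certs/ca.pem;
    ssl_verify_client       on;

    # Static file content: serve straight off the disk, skipping Ruby
    location /production/file_content/files/ {
        alias /usr/local/etc/puppet/files/;
    }

    # Module files: a small rewrite maps the URL onto the on-disk layout
    location ~ ^/production/file_content/modules/([^/]+)/(.+)$ {
        alias /usr/local/etc/puppet/modules/$1/files/$2;
    }

    # Everything else still goes to the Puppet master through Passenger
    location / {
        passenger_enabled on;
    }
}
```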
33:00
This introduced one specific problem. In the nginx config, we have ssl_verify_client on, meaning nginx will not talk to you unless you have a client certificate. So the problem is, when you create a new Puppet agent, a brand new server,
33:21
it connects to Puppet and says, I would like... here's a certificate signing request, please approve me. And then normally, I would log into the Puppet master and manually sign that certificate. So what we did, we had two options. We could configure verify clients to optional, and basically say, let them connect either way,
33:41
and then in different location blocks, we'd have to somehow check and decide: for these things, you have to be verified, and for these things, you don't. Normally, Puppet takes care of all that, but we've wired around Puppet in a number of places, and so we'd have to add our own access checking. So rather than that, we created a second nginx config on a different port,
34:03
basically the normal port plus one, and we have ssl_verify_client optional on, but we don't have our workarounds, so everything goes to Ruby. So when you build a new Puppet agent,
34:23
as part of the first command where you tell it to wait for a certificate and so on, you just add the option that the certificate authority is actually on a different port. And so it will do pure Ruby to talk to the certificate authority, get a certificate, but it will still do pulling files and stuff over the regular port.
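The setting in question is Puppet's ca_port (ca_server is its sibling); a minimal sketch of the agent side, with a hypothetical port number:

```ini
# /usr/local/etc/puppet/puppet.conf on a new agent
[agent]
    server  = puppet.example.com
    # CSR and certificate traffic goes to the Ruby-only vhost
    ca_port = 8141
```

Or on the first run: puppet agent --test --waitforcert 60 --ca_port 8141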
34:43
Part of the reason for this is that you can have multiple Puppet masters in a setup, but you can only have one certificate authority, because it's the authority, right? So in addition to this being useful for our access control workarounds, you can have every Puppet master proxy this port back to your single certificate authority.
35:03
So we have a Puppet master in North America and one in Europe, but the one in North America is actually the certificate authority. If you connect to the one in Europe on this port, it basically just proxy passes you to the certificate authority to the one in North America. But if you connect on the regular port, it serves you locally.
35:22
And with this, we can do DNS load balancing and our geo-discrimination on our Puppet master domain, and now clients just go automatically to the right server. This way, when we deploy servers in Europe, we don't have to tell them a different Puppet master URL.
35:45
And some of the other things we do are templates instead of just config files. So instead of just pushing out a sysctl.conf for each machine that has a bunch of tunes in it, we can analyze each machine and decide how to write the different lines.
36:04
And basically we can incorporate facts, the information we pull out of the machine. And we can also, in our node definition, in Puppet where we tell it, hey, this machine needs these roles installed on it, we also define some of our own variables there.
36:24
So this is a node definition for one of our machines. I prefix all of my variables to create a namespace. So we have a machine in Chicago, it's called Chicago2. So we have its fully qualified domain name, and then we specify the location,
36:40
which we use to group the machines. And then we have this identity that we use for various other things. And then we tell Puppet which IP addresses on this machine are for what. So Varnish, we have an array so that we can handle more than 65,000 concurrent connections. And then our Wowza, which is our video server, and our Nginx have dedicated IP addresses.
37:05
And for load balancing we also specify at what traffic level should this machine stop accepting new traffic, and at what traffic level should it be ready to take more again. And then we include our various roles at the bottom. So an edge server has Varnish and Nginx, and a video server has our video server.
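A rough reconstruction of that node definition (every name, address, and threshold here is made up):

```puppet
node 'chicago2.chi.example.com' {
  $seng_location     = 'chicago'
  $seng_identity     = 'chicago2'
  # an array of IPs so Varnish can exceed ~65,000 concurrent connections
  $seng_varnish_ips  = ['192.0.2.10', '192.0.2.11']
  $seng_wowza_ip     = '192.0.2.12'   # dedicated IP for the video server
  $seng_nginx_ip     = '192.0.2.13'
  # load balancer thresholds: stop taking new traffic / start again
  $seng_traffic_max  = 800
  $seng_traffic_ok   = 600
  include varnish   # an edge server is Varnish plus Nginx
  include nginx
  include wowza     # a video server adds the video server role
}
```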
37:27
So for some things we needed to create custom facts because either the Puppet fact didn't understand FreeBSD, or there was no Puppet fact for what we were trying to do. So we extract things like how much memory is in the box.
37:42
So we just pull the sysctl and say how much RAM is in this machine, or how many CPUs does this machine have.
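A custom fact is just a few lines of Ruby; a sketch, with a hypothetical fact name:

```ruby
# Expose the CPU count from FreeBSD's sysctl as a fact
Facter.add(:seng_ncpu) do
  confine :kernel => :freebsd
  setcode { Facter::Util::Resolution.exec('sysctl -n hw.ncpu') }
end
```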
38:02
Then, in ERB, which is like a Ruby templating language, we take that information and do things with it. So we pull in that list of IPs. Basically we had an array of the different IP addresses that Varnish was supposed to listen on. And we loop through them and add :80 to the end of each, and then join those in a comma-separated list without putting a comma after the last one. That was interesting for me because I had never written anything in Ruby before.
38:25
And so this is actually creating the rc.conf entries for Varnish. So it points to our config file, and then it says for storage we want to make an area of 25% of whatever RAM this machine has. So it looks at the facts and is like, oh, this machine has 24 gigs of RAM,
38:42
so we'll use a quarter of that. Or this machine only has 8, so we'll use a quarter of that, instead of having to define that value somewhere. And then we use that identity to identify the servers. And then we also do things like we create the number of thread pools based on how many CPUs the machine has.
39:02
So if it has 8 cores, then we create 8 thread pools. And then how many threads in each pool, we divide our max of about 8k divided by the number of CPUs. So this dynamically sizes how many threads Varnish will use on each machine based on how many CPUs and how much RAM it has.
39:23
Because in our setup we're renting from a bunch of different companies, so no two machines are really alike. Even when we rent from the same company, we come back six months later to get another machine, and it's going to be a different model. And so this basically allows us to adapt to the fact that all of our servers are heterogeneous.
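A hedged sketch of what that template might look like (the fact and rc variable names are guesses, not the real file):

```erb
<%# rc.conf.d/varnish template: sized from facts, per machine %>
varnishd_enable="YES"
varnishd_listen="<%= @seng_varnish_ips.map { |ip| "#{ip}:80" }.join(',') %>"
<%# storage area: 25% of whatever RAM the box has %>
varnishd_storage="malloc,<%= @memorysize_mb.to_i / 4 %>M"
<%# one thread pool per CPU; ~8k threads split across the pools %>
varnishd_pools="<%= @seng_ncpu %>"
varnishd_threads="<%= 8192 / @seng_ncpu.to_i %>"
```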
39:47
So our solution for installing packages is we use portupgrade and its portinstall, but we also deploy a pkgtools.conf that basically has all of the options
40:01
that you would normally answer in the dialogs defined. So we say nginx with Passenger, nginx with caching, nginx with SSL, and all the options for all of the packages we use are basically answered in this file. Originally we looked at pushing the port options file
40:21
that you'd end up with in the /var/db/ports directory, but that stops working if your ports tree is newer. If the port is newer than that options file, then it asks again, and you end up without the options you wanted. So we do that. Also, the files in our Puppet are named differently. We actually have four different pkgtools.conf files based on different roles.
40:44
So we have pkgtools.edge for edge servers, and then that gets named pkgtools.conf on the remote system.
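pkgtools.conf is just Ruby, so the entries look roughly like this (the ports and knobs here are hypothetical):

```ruby
# /usr/local/etc/pkgtools.conf sketch: pre-answer the options dialogs
MAKE_ARGS = {
  'www/nginx'          => 'WITH_PASSENGER=yes WITH_HTTP_CACHE=yes WITH_HTTP_SSL=yes',
  'security/denyhosts' => 'BATCH=yes',
}
```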
41:04
Then we create a class for each app, in this case DenyHosts, which is like fail2ban. It basically monitors our SSH log and blocks you with TCP wrappers from connecting to SSH if you're being a bad person. So basically we say, you know, include the package denyhosts, and also deploy a denyhosts.conf, which comes from our repository.
41:25
But this file depends on the package, so it won't install this .conf file until the port has already been installed. And then we use our ports_conf macro, and we basically enable denyhosts in rc.conf,
41:40
so it'll start up. And then we define the service, and we say, ensure that denyhosts is running. So every time the Puppet agent checks in, which is by default every 30 minutes, it double-checks that denyhosts is running, or as soon as it installs it, it starts it for us. So we say enable, true, and we also subscribe it.
42:02
Like Brad was saying in his talk, when you create the subscription there, if we ever change our denyhosts.conf, it will refresh the file, but it knows this service depends on it, so it'll restart the service so that it picks up the new config file.
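Put together, the class just described looks roughly like this (the ports_conf parameters are guesses at the wiki macro):

```puppet
class denyhosts {
  package { 'denyhosts':
    ensure => installed,
  }
  # config from our repository; don't deploy it before the port exists
  file { '/usr/local/etc/denyhosts.conf':
    source  => 'puppet:///modules/denyhosts/denyhosts.conf',
    require => Package['denyhosts'],
  }
  # enable it in rc.conf via the wiki macro
  ports_conf { 'denyhosts':
    variable => 'denyhosts_enable',
    value    => 'YES',
  }
  service { 'denyhosts':
    ensure    => running,
    enable    => true,
    # restart whenever the config file changes
    subscribe => File['/usr/local/etc/denyhosts.conf'],
  }
}
```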
42:23
And then I have a package for ezjail. This installs ezjail, but it also takes care of the initial setup. So we have the regular stuff we need: ensure that ezjail is installed. But then we also have an exec line, which is just run a shell command. And we're saying, with the shell, you want to run ezjail-admin install,
42:40
and then whichever version of the OS we use on that particular host. And then we tell Puppet what directory that creates. So before it ever runs this command, it checks: does the basejail directory exist? If it doesn't, it runs the ezjail-admin command and installs it, and when it checks later, that directory does exist,
43:02
so it doesn't run it. So it allows you to basically make shell scripts deterministic. And we say that it requires the ezjail package, so this won't be run until ezjail is installed. Or if ezjail fails to install for some reason, it won't try to run this command.
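A minimal sketch of that exec (release and paths assumed):

```puppet
exec { 'ezjail-install':
  # populate the basejail for the release this host runs
  command => '/usr/local/bin/ezjail-admin install -r 9.1-RELEASE',
  # Puppet skips the command once this directory exists
  creates => '/usr/jails/basejail',
  require => Package['ezjail'],
}
```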
43:23
So in Puppet, nothing happens in any specific order unless you create these interdependencies to create an order. And then we enable ezjail. And then this is where all the magic happens. It's a class for creating jails. So in a node configuration, you can just say,
43:42
I need an ezjail with these settings. So it takes the name and the IP address of the jail, optionally a jail archive, which would be the tarball created by ezjail, so that it can stamp out a bunch of the same ezjails. And you can specify an alternate root directory if you don't want to install the jails under /usr/jails.
44:03
This class file is actually bigger than this, but I trimmed some of the stuff out. We actually have an if case for whether or not there's an archive, because we need to not include the -a flag if there isn't. But it wouldn't fit nicely on the slide, so I trimmed it down a bit. So basically, we use the jail root to see if the jail directory that we created actually exists.
44:26
And so this way, it can tell whether that jail is already installed on the machine. And we say that creating a jail depends on having ezjail installed, having ezjail set up, which is installing the base jail,
44:41
and it depends on having a copy of the jail archive. So it has to download that ezjail archive from our repository before it can try to create the jail, otherwise it fails. And it notifies the ezjail service every time this is changed,
45:01
so that basically it'll start the jail as soon as it's finished creating it. And this is the file definition that tells Puppet to download our archive, which comes from... That's where the nginx override comes in.
45:22
And so we create a service, basically, a service definition for each jail. We say make sure the jail is running, it's called ezjail, it's enabled. But the service command can't start each jail individually, so we had to teach it how to do that. So to start a jail: ezjail-admin start jailname, stop, restart,
45:44
and we created a little hack for figuring out if a jail is running. It basically uses the ezjail-admin console command to run /usr/bin/true, so we get a 1 or a 0 from the jail. This will return an error if the jail isn't running, and it will return true if the jail is working.
46:04
And then in our node configuration, we just say, create an ezjail, call it this, with this IP, and use this archive, and it'll spin up the jail.
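Pulling the pieces together, the trimmed define plus its per-jail service looks roughly like this (parameter names and paths are guesses from the description):

```puppet
define ezjail::jail($ip, $archive, $jailroot = '/usr/jails') {
  # the archive has to arrive before the jail can be created;
  # this download is what the nginx override accelerates
  file { "/var/tmp/${archive}":
    source => "puppet:///files/jails/${archive}",
  }
  exec { "create-jail-${name}":
    command => "/usr/local/bin/ezjail-admin create -a /var/tmp/${archive} ${name} ${ip}",
    creates => "${jailroot}/${name}",   # only runs if the jail doesn't exist yet
    require => [ Package['ezjail'], Exec['ezjail-install'],
                 File["/var/tmp/${archive}"] ],
    notify  => Service["ezjail-${name}"],   # start it as soon as it's created
  }
  # the stock rc script can't address a single jail, so teach the service type
  service { "ezjail-${name}":
    ensure => running,
    start  => "/usr/local/bin/ezjail-admin start ${name}",
    stop   => "/usr/local/bin/ezjail-admin stop ${name}",
    # succeeds only if the jail is actually up
    status => "/usr/local/bin/ezjail-admin console -e /usr/bin/true ${name}",
  }
}

# In a node definition:
# ezjail::jail { 'video1': ip => '192.0.2.20', archive => 'videoserver.tar.gz' }
```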
46:21
So basically, to get our servers, rather than having... We have our own rack for one set of servers, but it doesn't make sense to try to have our own rack in 28 different data centers in countries where we aren't or have never been. So we basically rent servers from a bunch of different places. One of the advantages to this is that we have transit
46:42
from a large collection of providers this way. We're not dependent on one place or buying transit from one place. And it also means that we can add nodes really quickly. One of our providers in Europe has servers sitting in the rack, and a robot gives you IPMI access within 5 minutes as soon as you pay them.
47:03
And you can install FreeBSD and have the server up and running in no time. But it also gives us predictable monthly costs, and as we add more servers, we can start negotiating with places like OBH to get discounts and so on. And it allows us to choose our disk and memory size more adaptively.
47:25
We've had... The way our demand was at one point, we needed more RAM, so we canceled some of our servers that didn't have a lot of RAM and bought different ones that had a lot. And then our needs changed. All of a sudden, we needed more disk instead of RAM. And so we could throw out some of the servers that were over capacity on RAM,
47:44
but didn't have enough disk. We could get rid of those and rent different ones that had a lot of disk. And this gives us the flexibility to adapt our inventory based on our needs, whereas if we had bought the physical machines, it's a lot harder to do that.
48:02
And yeah, we can provision new servers in under 10 minutes sometime. But compared to a cloud provider, you basically lose control over geography a lot of times. You can pick a general region, but you don't get that much choice. You don't have any choice about transit provider or anything.
48:22
And we don't know anything about the physical hardware or what's happening. I was talking to Cullen and he said, depending on what type of node you buy from Amazon, it's a different version of Zen. If you get a Linux node, it's like Zen 3.1, an old Windows node is 3.3, and the newer ones are a much newer version of Zen.
48:41
So it's hard to predict what you're going to get when you use virtualization, but when you rent hardware, you know exactly what you're getting. But most importantly, it allows us to use FreeBSD, because that's what we like. So some things that we still need to do to improve our setup are switching to pkgng,
49:02
basically set up Poudriere and build all of our packages and push them out. It would take 20-plus minutes off our deployment time and give us a little more control over what's installed where. And it would definitely make upgrading the packages on the servers a lot easier.
49:21
Our approach now is just uninstall all packages and let Puppet install fresh ones. Luckily, it only takes 25 or so minutes to do that. Also, we're looking at ways to automate our deployment of root on ZFS. Unlike a traditional shop, we can't really PXE boot, because we have one server at this provider in Germany,
49:42
so there's nothing for it to PXE boot off of. So we're looking at different ways we can basically plunk down FreeBSD as quickly as possible. Not all of our providers give us IPMI, depending on the hardware, so our custom ISO that plunks down an image would be great, except for the certain providers that don't provide it.
50:03
So one of my ideas was basically to create a small disk image, maybe four gigs or whatever, that has our boot, a small swap, and a small ZFS partition; dd it onto the drive, and then use gpart resize to fill out the rest of the drive.
50:22
And we just do that for each drive in the machine. And inside of it, it would have the OS with Puppet already installed and everything, so it can just start up. The advantage to that approach might also be that if the provider plunks down an OS for us, we can just overwrite it in place, instead of having to try to get out-of-band access somehow to install the OS ourselves.
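A sketch of that idea, with hypothetical device and image names (and note it overwrites the disk):

```sh
# Write the small pre-built image: boot + swap + small ZFS partition
dd if=seed-image.img of=/dev/ada0 bs=1m

# The disk is bigger than the image: repair the backup GPT,
# then grow the ZFS partition (index 3 here) to fill the drive
gpart recover ada0
gpart resize -i 3 ada0

# Let the pool expand into the newly resized partition
zpool online -e zroot ada0p3
```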
50:44
We'd also want to look at auto-tuning our web and video cache sizes. So in nginx we say you're allowed to use this much disk space to cache files,
51:03
so we'd like to change it so that it's auto-tuned based on how much disk space there is on the server. Although ZFS makes that easier and harder at the same time, because right now the blended environment makes it kind of difficult to tell, because basically in df, ZFS lies about how big the disk is,
51:27
because it shows the disk as being the size of the free space, and so it changes constantly. And that also makes for bad facts in Facter, because it changes all the time. So we might be looking at a specific ZFS dataset that maybe has a quota,
51:43
and that would solve that for us. And we'd also look at tuning more loader.conf variables for ZFS. Limit the ARC size based on how much RAM the server has, because we'd end up with memory pressure otherwise, because we have this Java video server that needs 4 gigs of RAM at least,
52:02
and we have Varnish using a quarter of the RAM on the box, so we need to limit the ARC size to make sure we're not putting pressure on Varnish and Wowza.
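Both of those fixes are one-liners once we get there; a sketch with made-up sizes:

```sh
# A dedicated dataset with a quota gives df (and Facter) a stable size
zfs create -o quota=200G -o mountpoint=/var/cache/nginx zroot/ngxcache

# And in /boot/loader.conf, cap the ARC so Varnish and Wowza keep their RAM:
#   vfs.zfs.arc_max="8G"
```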
52:20
Puppet-specific things we'd like to look at: some FreeBSD patches for Facter. A lot of the facts, like how much memory there is, are provided for almost every OS, including OpenBSD, and the way to determine how much RAM is installed in the machine on OpenBSD and FreeBSD is actually exactly the same. So by adding one case statement to the Facter repository, all of a sudden it would do this for FreeBSD as well.
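A hedged sketch of the kind of one-case-statement fix meant here (the real Facter code is organized differently):

```ruby
# FreeBSD can reuse the same resolution OpenBSD already has
Facter.add(:memorysize_mb) do
  confine :kernel => %w{FreeBSD OpenBSD}
  setcode do
    bytes = Facter::Util::Resolution.exec('sysctl -n hw.physmem').to_i
    (bytes / 1048576.0).round(2)
  end
end
```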
52:40
So there's a bunch of little things like that, where I suppose I just need to make some Git pull requests, and there will be better FreeBSD support in Puppet. But somebody needs to do that. I don't like GitHub, so... Anyway. Stored configs are a big thing we're looking at.
53:02
Basically this means the Puppet master has a copy of the config file from each of the agents when you create them, and basically they had an ActiveRecord way to store that in a database, but they deprecated it in 3, and basically the only way to do that is PuppetDB,
53:21
which is a big Java thing that doesn't have a working port, last time I tried. And we'd like to use that, because in addition to better inventory management and stuff, what it allows us to do is have our Nagios instance be configured by Puppet. So every time we add a new server or service,
53:41
it would actually create the corresponding monitoring stuff in our Nagios for us. So every time we deploy a server, it would be monitored automatically. Basically it allows you to configure one server based on the config of another server. And without storing the configs, Puppet can't do that.
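That mechanism is Puppet's exported resources; a minimal sketch (needs stored configs/PuppetDB to work):

```puppet
# On every node: export a Nagios host entry describing ourselves
@@nagios_host { $::fqdn:
  address => $::ipaddress,
  use     => 'generic-host',   # hypothetical Nagios template name
}

# On the Nagios box: collect everything the other nodes exported
Nagios_host <<| |>>
```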
54:00
And it would also allow us to manage the SSH known hosts. Every one of our boxes could know the public key fingerprint for every one of our other hosts. That way, when you SSH around, you know that you're hitting the right thing. And also we'd like to look at using Augeas, I don't know how you pronounce that, the tool that Puppet uses for managing config files.
54:20
Basically instead of using those shell macros to check if this line exists, and then using sed and awk and things to modify the config file in place, this is designed specifically for writing key-value-paired config files, and it probably works better.
54:40
And that's it. Any questions? I probably want to bring up the lights so you can see me in the back. Yes, that's good. I apologize. Now I have a different question. If it's not a religious issue, how would you compare this to, why wouldn't you use something like CFEngine, which uses less Ruby?
55:00
I hadn't looked at anything really, and I didn't know what worked on FreeBSD, and then I saw an article, I got an email saying, hey, Puppet on FreeBSD in a magazine. So I looked at it and started using it. They do look similar, but I know that there's the Ruby issue. Right. The Ruby really isn't a problem except for when delivering 700-meg tar files.
55:22
So we worked around that. Just to throw something out there, you can use SSHFP records. We do that, yes. Although my Windows machine doesn't like those.
55:44
Separate problem. But it was just an extra example I could think of of what you could do with stored configs. I haven't used them yet, so I don't have very many examples of what you could do with them.
56:03
Yes, our mail server is managed there. Some of the stuff on our mail server is still done by hand, but yes, our mail server is a jail. Yes, so the mail server is like, there's only two of them.
56:24
But yes, we manage them with Puppet as well. Basically, every host has Puppet on it. So it's just a matter of, hey Puppet, make this jail. We don't have an easy jail archive of the mail server. We can't just stamp it out. But yes, the machine runs Puppet, even for the mail server.
56:54
Kind of, but not really. We don't have much sensitive data on the machines. Mostly these are edge servers that are just caches of JPEG files and videos.
57:05
But we should have a better defined process for destroying most of that data before we return the machines. The ones that have IPMI are easy: boot DBAN. The ones that don't are the problem.
57:23
All of our Puppet configs are in SVN, and we have a dev, a QA, and a production branch. And we merge the stuff through. We have a couple of jails that we use to test the deployment and make sure it goes smoothly before we push it out.
57:42
Also, some of our jails, like the jail that runs the media server, actually have Puppet installed inside of them as well. So Puppet actually treats that jail as a separate host and configures it. So basically that jail, since it has its own public IP address, has to have DenyHosts installed as well, and so on and so on.
58:03
Some machines are running Puppet two or three times. I only have FreeBSD machines, but Puppet has better support for OpenBSD than FreeBSD because somebody wrote it.
58:25
But a lot of the things are the same between OpenBSD and FreeBSD. So by adding a second case above or below the OpenBSD one, it would make it work on FreeBSD too. So there's a lot of really low-hanging fruit for patches that I really should do someday.
58:46
No, mostly because since we switched to it, there hasn't really been anything to upgrade. On the wiki, there are macros that we use to run portsnap a lot, and freebsd-update. So we've done security updates, but we've not ever done a freebsd-update upgrade, or a switch from 9.0 to 9.1, using Puppet.
59:08
We haven't done that. A lot of times, yes. For the machines where we don't have IPMI access, that's more complicated.
59:24
But for the ones we do, yes, that's how we treat the packages. It's just pkg_delete everything, and then let Puppet put everything on fresh so you don't end up with version conflicts and stuff.
59:42
Well, that's why we're trying to come up with this ZFS thing that'll plonk down and make it faster. And that way, when it's time to upgrade, we just dd the hard drive with an image over top of the system and reboot.
01:00:05
Yes, but I don't know how. Yes. We specified present rather than latest, and then periodically just purge all the packages and have them reinstall, because portupgrade isn't the best way to upgrade your packages. I think so.
01:00:28
Yeah. Yeah, no, because that requires some kind of back-end database, right?
01:00:47
Right, uh, I haven't got to that. It's stuff we'd like to do. But you know, I just learned Puppet not that long ago, and I was kind of stuck on 2.6. And then, when it was time to write these slides, I upgraded to 3 and kind of learned while I was writing the slides.
01:01:14
Puppet has... right, Puppet has classes to write out the Nagios files, but it requires
01:01:22
storing the config of every server in the Puppet master, so that you have the variables you need to plug in, and that requires a database. And the one way they had to do a database, using Ruby's ActiveRecord class to just talk to any MySQL server, is deprecated now, and the only way to do it is with PuppetDB, which is this big Java thing and
01:01:46
isn't ported to FreeBSD. So we're not auto-configuring Nagios yet, and it's something we would like to do. We have to figure out how to make PuppetDB work.