Configuration Management 101
Formal Metadata
Number of Parts | 199
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/32501 (DOI)
FOSDEM 2014, 111 / 199
Transcript: English (auto-generated)
00:23
I have been a systems administrator since I was a little kid.
07:24
For $i, do that with SSH: you already know that one. Func commands and message queues, still super useful. I think I might be the only person in the universe who actually used this program called ISConf in production. Really interesting, it basically keeps a
07:43
journal of the root commands that we type and plays them across all the machines in the ISConf domain. It's kind of an artifact of the config management wars from the USENIX era of this sort of thing. Very interesting, good to
08:01
check out in the config management history. But ultimately it has kind of the same problems as a lot of the other techniques. So lost history, image sprawl, you're dealing with 17 different kinds of servers, you're going to have 17 at a minimum
08:21
kinds of image artifacts, like Docker IDs or whatever, floating around the place. However, it is kind of easy to manage change across nodes if you have an external die with SSH. If you do this, we'll get to that later. Then finally we get to this.
08:43
This kind of came from outer space and landed in Oslo, I guess in the late 90s. This is kind of where we're at. I don't know if you can see this with the light, but there's Mark Burgess, there's us, huddling around the monolith.
09:05
It's telling us something, and we're building tools with our newfound knowledge that was delivered from space. So in this talk: everybody likes to compare these tools and ask how they are different. I'm going to talk about how they're the same.
09:21
I'm going to talk about CFEngine. I'm not going to talk about Bcfg2, because I don't really know anything about it. I'm going to talk about Puppet, I'm going to talk about Chef, I'm going to talk about Salt, I'm going to talk about Ansible. It's all coming out. That's what I'm going to talk about next. So here we go. Let's do this.
09:43
So part two, policy. At the highest possible level, modern config management is about describing policy. So what's policy? It's stuff like this: /etc/passwd should be mode 0644 and /etc/shadow should be mode 0643.
10:01
This is the very first thing they teach you in Unix class. It's like, how do you log into the machine? Username, password. This is the next thing they tell you. The password database needs to be protected. Seems like a pretty good policy to enforce: the permissions mode on your password database. So we're going to count that as policy.
10:22
The kind of things you find in military security guides. Make sure you do this. And these sorts of things pop up a lot. Like, we should have this user. They should exist and be in this group. Favorite Ponzi and the Muppets all hanging out in the system.
10:40
Then you start seeing that you actually want to run software on your machines. Make them do useful things. You start installing packages. You start configuring your clocks. All this information is policy. This is what the machine should be doing. Sometimes it's not as easily translatable as this.
11:02
Sometimes you can get a little more hand-wavy when you start talking to your Java devs. Like, we need this JDK installed on this web server. That's all this stuff. But at the end of the day, it's policy. And the important thing here is that policies are declarations about the state of things on the system. So, vocabulary word: declaration.
11:20
State. Policies are applied repeatedly and repair the system when needed. So, this is important. There's a notion of control. You can't just do stuff to the server and walk away. That's not management. You have to revisit it every now and then to make sure it's right.
11:41
Sometimes people log into it and do evil things when they're pre-prepared. So, the policies are applied repeatedly. Policies often change. This happens a whole lot. Let's say I'm working for WidgetCo. And we make a piece of software called Widget Factory. And we sell it. So, I'm going to want this version of Widget Factory installed on all my systems.
12:03
Version 1.2.3. Well, there's a new release. They announce that version, and now I need 1.3.3 installed. So, there's an example of policy changing and the need for the control loop to come in and actually manage the system.
12:22
Repeatability. So, are we convinced that we want to repeat our policy? In our quest for repeatability, we start seeing words like this: idempotent, convergent. Since y'all gave me a soapbox, I'm going to go ahead and stand up and say, stop using the word idempotent the way you all do.
12:45
Pretty much every time you say the word idempotent, you mean convergent. I'm going to show you the difference. So, scripts. Scripts, in general, are not repeatable. Here's an example of a non-repeatable script. So, I'm going to boot Debian.
13:02
I'm going to run my script. So, what happens after I run the script for the first time? Can the ECT piece run? What happens if I run the script the second time? Broken the ECT piece?
13:20
The reason is that the script is not safe to repeat. I'm actually concatenating data into a file here. This is an example of a non-idempotent function. Running the script. Likewise with this.
13:41
I'm going to set up the initial state. I'm going to actually put some data into the file. I'm going to run a non-idempotent command that you'll very often find in scripts: sed. So, the first time I run the sed, it's going to take the string. It's going to dig out this dog and replace it with dog.
14:02
The second time I run this sed command, I get an empty string. So, another example of a function that is not safe to repeat that you're calling on the system. But they can be, right?
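The append and substitution behavior described above can be sketched in Python. This is only an illustration under stated assumptions: the file name, the host entry, and the substitution pattern are hypothetical, not the ones on the speaker's slides, and a temporary directory stands in for the real system.

```python
import os
import re
import tempfile


def append_entry(path, line):
    """Non-idempotent: every call appends another copy of the line."""
    with open(path, "a") as f:
        f.write(line + "\n")


def read(path):
    with open(path) as f:
        return f.read()


# A throwaway file stands in for a real configuration file.
hosts = os.path.join(tempfile.mkdtemp(), "hosts")

append_entry(hosts, "10.0.0.5 widget01")
after_first_run = read(hosts)

append_entry(hosts, "10.0.0.5 widget01")
after_second_run = read(hosts)

# The second run changed the result: the entry is now duplicated.
duplicated = after_first_run != after_second_run

# A substitution whose output still matches its own pattern is
# not safe to repeat either (hypothetical pattern).
once = re.sub("dog", "hotdog", "lazy dog")
twice = re.sub("dog", "hotdog", once)
```

Running either operation a second time produces a different result from running it once, which is what makes a script built from them unsafe to repeat.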
14:21
Scripts can be safe to repeat. They totally can. But you have to write them with this in mind. Here's an example of an idempotent operation. This is idempotent because it's truncating the file. Idempotent just means that you can apply the operation
14:41
an infinite number of times and you will always get the same results. It says nothing about having to actually do the work or not. Two similar, very closely related concepts. But they're different. We need a word to describe each of them and we have them.
15:00
The word idempotent is this one. Convergent is the other one. I'll show you that. Stop it. Stop saying idempotent. I'm going to prove that it's idempotent. I'm going to write this data to a file. I'm going to run the script over and over again
15:20
and it'll yield the same output. Safe to repeat. Idempotent operation. But the problem is I'm doing the work every time. So, idempotent is good, but it's not good enough. What we actually want is convergent operations. We want this test and repair behavior.
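A minimal Python sketch of the idempotent-but-wasteful case described above, assuming a hypothetical file name and content. The counter exists only to make the repeated work visible.

```python
import os
import tempfile

writes = 0  # counts how often the operation actually touched the disk


def write_file(path, content):
    """Idempotent: truncates and rewrites, so the end result never changes."""
    global writes
    with open(path, "w") as f:  # mode "w" truncates the file first
        f.write(content)
    writes += 1


path = os.path.join(tempfile.mkdtemp(), "ntp.conf")

for _ in range(3):
    write_file(path, "server pool.example.org\n")

with open(path) as f:
    final = f.read()
```

After three runs the file holds exactly one copy of the content, so the operation is safe to repeat, but `writes` is 3: the work was done every time whether or not it was needed.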
15:44
I'll talk about these in depth. Convergent operations test state and repair if needed. Here's my repeatable shell script. You'll see here it says test and repair.
16:02
At the top here I'm going to test. I'm querying my RPM database. And then depending on the output of that test, I will take action to repair the system. Test and repair. Test and repair. If you squint your eyes a little bit, you'll see three blobs of code on the screen.
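The test-and-repair shape can be sketched in Python as follows. This is an assumption-laden stand-in for the shell script on the slide: it converges a file's content rather than querying an RPM database, and the function name and paths are hypothetical.

```python
import os
import tempfile


def converge_file(path, desired):
    """Test the current state and repair only if needed.

    Returns True when a repair was performed, False when nothing was done.
    """
    current = None
    if os.path.exists(path):
        with open(path) as f:
            current = f.read()
    if current == desired:  # test
        return False
    with open(path, "w") as f:  # repair
        f.write(desired)
    return True


path = os.path.join(tempfile.mkdtemp(), "httpd.conf")
desired = "Listen 80\n"

# The first run repairs; later runs only test.
results = [converge_file(path, desired) for _ in range(3)]

# Someone changes the file by hand; the next run repairs it again.
with open(path, "w") as f:
    f.write("Listen 8080\n")
repaired_after_drift = converge_file(path, desired)
```

Unlike the plain truncating write, this version does the work only when the test fails, and it notices drift on a later run.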
16:21
Each one of those is a convergent operation. So, three convergent operations. Likewise, HTTPD. Same kind of thing. It looks almost the same as NTP. This is a pattern. We're repeating this pattern across two things. So we have NTP and HTTPD.
16:42
Both things that you can install from your Linux distribution's native package management system. Very often you can get away with this pattern. Install the software, write the configuration, and start the service. Very common pattern. Probably the most common pattern you'll ever see.
17:02
Package, template, service. Finally, you have this controller that runs and repairs the system. Each one of these is an operation. Each individual thing needs to be thought about individually, and it's not a controller. You end up with this.
17:23
Autonomous agents fixing the system. So, let's talk about convergence.
17:44
This is always fun for me to explore. You always find something interesting when you go digging around in convergence land. So, this is pretty easy to crack. A convergent operator on a system repairing the state of the system. It's pretty easy.
18:02
What happens when you put two on a system? What happens if they have conflicting policies? What if you have one convergent operation starting a service and one stopping a service? What happens if you have a lot? All kind of running on the system at the same time.
18:25
This is how you need to think about config management code. A swarm of these autonomous agents infesting your system and repairing various parts of it. If you have this in your brain while you're writing your config management code,
18:41
you will win at it. You ever see something cool? So, I wrote you a new config management language in Bash. Because Bash is the lingua franca of DevOps.
19:04
So, if you want to check it out, you can go here. It's on GitHub because everything is on GitHub. So, this is what we're going to do. We're going to start out. I wrote a script called HAL status. It's basically an integration test script. It's going to check parts of the machine
19:21
and see if they're the way I want them to be. So, this is what happens when I run HAL status. It looks through various things on the system. I've observed users, a directory, and the presence of a file. It says they are all broken. Everything I wanted to test about the system has failed.
19:41
So, what am I going to do? I'm going to write some Bash. I'm going to do this in four slides, by the way, so don't be scared. So, the first one is I'm going to take two convergent operations and put them together to form a type. So, test and repair.
20:01
Test, DBT, DBT group, type box. Do some wrapping. Yeah, we find it. Test and repair. So, I've made a group type in Bash. At the top, I'm defining what my variables are doing. So, I'm actually creating an interface. So, I have two parts, interface and implementation.
20:23
Test and repair. Second one, user. More of the same. Test and repair. Three, four, five, six, for a total of eight convergent operations now. I'm making a user, so again, at the top, interface and implementation.
20:41
Very easy stuff. Just running the usermod command. Finally, for directory, I use three more. So, again, interface, implementation. Set some variables, do something with them.
21:02
Test and repair, test and repair, test and repair three times. And then, for my last one, actually, it turns out cp -u is actually convergent, so I didn't really need to do that. So, a total of, what am I at, 12? Four types, 12 convergent operators
21:22
all on my system, working through the machine. So, I've written my functions. Let's consume the functions. I'm going to actually write some policy now in Bash. So, I'm going to say, group space, give a parameter, HAL, my thousands without the ones. Does this look familiar?
21:47
So, cool, I'm going to run that, repair HAL machines.
22:04
So, he's offline, good, stay up. Want to see something really cool? What happens if we do this in the wrong order?
22:22
So, we're going to do unordered repair, HAL. So, I'm going to do this the exact wrong way. I'm going to first try to update a file and put it in a directory that doesn't exist yet. I'm going to create the directory and give it an owner and permission to the user that doesn't exist yet. Well, I guess that'll work.
22:44
So, let's run it, see what happens. So, I'm going to run my policy. And look at that, actually fixing some of them. So, I have stuff at the top, actually online. The user actually failed to create.
23:02
The directory actually exists, but it has the wrong permissions and the file has not been copied yet. Second iteration, right? More of it is repaired, but not all of it, right? We still have, it looks like, a directory permission. And then third iteration, I finally converged
23:21
to the state that I want my machine to be in. That is convergence in action, test and repair, test and repair, on a loop. And that's kind of like the lizard brain of configuration management. Does order matter? Obviously it does not,
23:41
because you just showed me that, Sean. Actually it does, it matters a lot. It matters a lot, a lot, a lot. You run into situations where you're mounting file systems and disks and changing the actual view of the system and starting services and packages and all these things.
24:01
All of my operations actually were super low key, right? They were doing very specific things. Sometimes convergent operations actually have pretty big side effects on the system and you need to watch out for them. So there's that. And here's a question. What would happen if I actually went into my type implementations, like my user, and rearranged all of the functions in the file?
24:22
So I went into the group in the user thing and actually reversed my test and repair things and try it again. It would converge. Because the system is, the system is, it's going to stabilize because there's no conflict in the set of convergent operators
24:43
that are happening, right? There's nothing fighting with each other. So yes, order matters a lot. So write things in the order that you need them, especially inside of type implementations.
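The unordered repair run described above (file before directory, directory before user) can be modeled with a small Python sketch. The dictionary of booleans and the four operations are hypothetical simplifications of the Bash types in the talk; each operation tests, repairs if it can, and silently gives up when its prerequisite is not there yet.

```python
# Desired state is "everything True"; the machine starts fully broken.
state = {"group": False, "user": False, "dir": False,
         "dir_owner": False, "file": False}


def group(s):
    if not s["group"]:
        s["group"] = True


def user(s):
    # Creating the user fails until its group exists.
    if not s["user"] and s["group"]:
        s["user"] = True


def directory(s):
    if not s["dir"]:
        s["dir"] = True
    # Ownership cannot be fixed until the user exists.
    if not s["dir_owner"] and s["user"]:
        s["dir_owner"] = True


def copy_file(s):
    # The copy fails until the directory exists.
    if not s["file"] and s["dir"]:
        s["file"] = True


policy = [copy_file, directory, user, group]  # deliberately the wrong order

passes = 0
while not all(state.values()):
    for operation in policy:
        operation(state)
    passes += 1
```

With this ordering the model needs three passes, matching the three iterations in the demo: the first pass creates only the group and the bare directory, the second adds the user and the file, and the third fixes the directory ownership. Nothing here conflicts, which is why repetition is enough to stabilize it.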
25:04
I'm going to jump in: we're going to rap about promise theory from 11 to 27. So this is kind of like what the monolith dropped off, right? This is promise theory stuff. This is what kind of binds all this stuff together.
25:20
So real quick, agents are autonomous, right? Each one of those little boxes, the test and repair loops, they're autonomous, right? They are their own little things. A promise is a singular message perceived by an observer, okay? So within a set of these things they can actually look at each other and see what's going on. That's how they can cooperate, okay?
25:42
Promises may or may not be kept, just a fact of life, right? Sometimes machines break. We need to account for this in our system. Agents can observe other agents, right? This is how you can restart services when config files are repaired. Agents have local information, very important.
26:03
There's a star next to it. This is the implementation of an agent and all of the information that it needs to repair what it's concerned about, okay? So like a user's agent knows how to manipulate
26:21
/etc/passwd on Linux, or knows how to manipulate the Windows database, knows how to do everything it needs to do to actually take care of itself. It's also important because it's why dry run mode cannot work properly, right? They only have local information about themselves; they can't see what other things are doing on the system
26:41
in a real run mode, right? You can't query one of these agents. You're like, hey, what are you going to do later after these 15 other agents have popped off, right? It just doesn't work. If you want to know more about that, get me after this. And then the inner workings of agents are assumed to be unknown, also important, right?
27:01
This is what lets us build black box providers, right? So I can implement a provider in Bash or assembly or C or Go or whatever, but it's behind this interface. I've walled off the implementation of the agents.
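One way to picture the interface and implementation split is the following Python sketch. The class names and the in-memory "user database" are hypothetical; the only point is that the caller sees `test` and `repair` and nothing of what happens behind them.

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Interface: all a caller may know about an agent."""

    @abstractmethod
    def test(self) -> bool:
        """Return True if the promise is currently kept."""

    @abstractmethod
    def repair(self) -> None:
        """Bring the system back to the promised state."""


class InMemoryUser(Agent):
    """One possible implementation; it could just as well shell out."""

    def __init__(self, name, database):
        self.name = name
        self.database = database

    def test(self):
        return self.name in self.database

    def repair(self):
        self.database.add(self.name)


def converge(agent):
    """The control loop only talks to the interface."""
    if agent.test():
        return "kept"
    agent.repair()
    return "repaired"


users = set()
outcomes = [converge(InMemoryUser("hal", users)) for _ in range(2)]
```

Swapping `InMemoryUser` for an implementation written some other way would not change `converge` at all, which is the property the talk attributes to black box providers.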
27:22
You're not allowed to know what's going on. So, agents have intentions, possible behaviors, things that they can do, right? Package agents can typically promise to install a package or uninstall a package. Users can promise to be present or disabled. And then agents can make a session to develop other agents on the system.
27:42
Actually, they can observe each other and send each other signals. And then configuration management tools embody tenets of promise theory intentionally or not. And it turns out that the current crop actually do, like the current crop that's derived into forms you have to do. I think they all kind of do this on purpose or not.
28:02
So domain-specific languages, a very popular thing to talk about. What do they do? DSLs, this is all they do. They restrict the machine instructions that you're running to convergent operations. That's it. So they don't let you make non-idempotent statements
28:22
unless you're abusing the exec statement, and if you're doing that, stop. And then they also manage ordering. The different tools have different philosophies about ordering. CFEngine will actually order by type. And then it'll actually apply a number of times
28:41
to take advantage of that convergence thing I showed you with the Bash scripts. Puppet sorts a graph to determine ordering. Chef just does what you tell it to. And then Salt and Ansible will kind of do that as well. Only they can spray those things at machines and pull them in. So here we go.
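Sorting a graph to determine ordering can be illustrated with Python's standard `graphlib` module (Python 3.9 or later). This is a generic topological sort over hypothetical resource names, not a description of how Puppet itself is implemented.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
dependencies = {
    "service[httpd]": {"template[httpd.conf]"},
    "template[httpd.conf]": {"package[httpd]"},
    "package[httpd]": set(),
}

# static_order() yields every resource after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

The declarations are written here in the reverse of the order they must run in, and the sort still puts the package first, the template second, and the service last. A tool that simply runs what it is told, in the order it is told, would instead rely on the author to write them in that order.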
29:01
A little CFEngine policy here. So I'm going to install Puppet. Let's see how they're doing. We can actually look at them and see that indeed we have a type and a subject, and that this package type, this autonomous agent that knows how to repair packages, can intend to have a puppet.
29:26
So again, type subject intentions. So we have a package type that knows how to ensure that packages are installed or disabled. Here we have a signal. The agents are observing each other. This one's notifying service to restart.
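The signal between agents can be sketched in Python as below. The `Template` and `Service` classes are hypothetical stand-ins, not any tool's real API; they only show that the restart is triggered by a repair, not by every run.

```python
class Service:
    def __init__(self):
        self.restarts = 0

    def restart(self):
        self.restarts += 1


class Template:
    def __init__(self, rendered, desired):
        self.rendered = rendered  # what is on disk now
        self.desired = desired    # what the policy says
        self.subscribers = []     # agents observing this one

    def converge(self):
        if self.rendered == self.desired:  # test: nothing to do
            return
        self.rendered = self.desired       # repair
        for subscriber in self.subscribers:
            subscriber.restart()           # signal the observers


service = Service()
template = Template(rendered="old config", desired="new config")
template.subscribers.append(service)

template.converge()  # repairs the file, so the service restarts
template.converge()  # already correct, so no signal is sent
restarts_after_two_runs = service.restarts

template.desired = "newer config"  # the policy changes
template.converge()
restarts_after_policy_change = service.restarts
```

The service is restarted once for the first repair and once more after the policy changes, and never on a run where the template was already correct.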
29:43
Chef installs some salt. Again, package, subject, intention. Exactly the same. DSL doing the same things across all the languages. Again, observation. Agents observing each other. So the service can actually subscribe to the template
30:02
to see if it's repaired, I need to restart the service. Here's some salt. Admissions, subject. And then, Ansible. Again, same stuff. On what's speaking.
30:22
Package type, it has a subject, and then it's going to say it's new. I intend to make sure that this is at its latest version. Agents can observe each other. That was exhausting. Here's a basket of kittens.
31:05
Part three, composition. Let's write some Chef. Chef is my native tongue.
31:23
So, recipes. Here's a Chef recipe. Pretty freaking simple. Now that we've just got done talking about promise theory, the autonomous agents, and all this fun stuff, you will see what it does very easily. I have three resources.
31:40
Again, that same pattern from back at the beginning of the talk. Package, template, service. I'm going to install the software, make sure that its configuration file looks correct, and then if that needs to be repaired, I can restart the actual service. I feed this into a control loop. You change the state of the system
32:02
by updating the policy. So I can change the template source for the inputs, and actually make the service restart. I group them together into a body of testable intent. So in the recipe, order matters,
32:21
you put them together, and you can actually test them. It's a whole different topic that I can talk about for days and days and days. I'm not going to get too far into it, but do it, because we're writing software. So I write my recipe, I compose it with three resource statements,
32:42
and then in Chef I can refer to it like this. I want the recipe in the httpd cookbook named server. It's kind of like a Chef. Pretty easy. I'm referring to the source thing in my template.
33:02
So I'm saying, Autonomous Agent, I want you to write this file, I want you to use this template source. Well, I need to put that somewhere. I now have a cookbook. So the policy, the recipes, and then everything that they need to function properly.
33:23
So templates, custom types, I'm going to ship in the cookbooks. The recipes and supporting files. Types. Types are super, super, super easy to make in Chef. So I'm going to make a yum cookbook
33:42
that has no recipes in it. It's only going to ship a custom type. So here we are. So you have the interface defined in the resources directory, and then we have the actual implementation
34:01
defined in the provider script. So if you remember back in the bash script, I said interface implementation. The interface is the top where I'm dealing with the variables and the arguments and everything.
34:35
I dress up like a programming language for Halloween. So interface, very good.
34:40
I'm saying my yum repository is going to have repository ID, enabled, gpgcheck, true, false, and so on. So I can list my intentions at the top. I want this yum repository created or deleted, and then the parameters that I need to make it all work. Pretty easy.
35:00
And then for the actual implementation, I'm going to use other Chef resources. So I'm going to render a template into /etc/yum.repos.d, and feed it the parameters that I pass in. So here at the top, this is a special Chef thing.
35:21
It creates a new scope inside of a provider implementation. It walls off that implementation. So, as was said, autonomous agents' inner workings are assumed to be unknown. Right here. So a new scope. I just so happen to be implementing this with more Chef.
35:40
I could totally shell out to bash and do this. Fine. It doesn't count. That was the best check. It's kind of bad. So here we go. I'm going to be on time. Pass my parameters directly into my template there. Chef, Chef, Chef.
36:04
Now we're going to talk about artifacts. You're right, this does depend on you. So artifacts. You need to think about releasing software.
36:21
You need to write cookbooks. I don't know why for some reason. So once upon a time, in the beginning of Unix or whatever, before we had proper package management and everything, we used this thing called the vendor branch pattern to ship software. Here's a tarball.
36:40
Then you unpack the thing and you can use CVS or whatever branch thing. And then you've got the different packaging. For whatever reason, a lot of the config management people want to revert back to the vendor branch model and I don't know why. Stop.
37:01
But Chef makes it super easy to actually artifact cookbooks. Artifact means you have a written, peer-reviewed, tested piece of software with a version number on it and a well-documented behavior released into the world for your consumption.
37:21
So we're going to make an artifact. Chef cookbooks have metadata. Super important. It's mandatory. It has a name and a version. So I'm going to name this software artifact HTTPD. It's going to be at version 0.1.2.
37:45
I'm going to take my YUM cookbook and I'm going to give it a name and a version and its metadata. Artifact. So this artifact's name is YUM and its version is 3.0.0.
38:02
Something we like to do in Chef land is follow a thing called SemVer. It's semantic versioning. It's an English language API about what these numbers mean. So in semantic versioning, we say that the major number, the top number here,
38:21
is reserved for major changes. So you can actually break backwards compatibility and break interfaces with a major number. The middle number is reserved for new features, and then finally the end number is reserved for bug fixes and patches and things. So if all the software in your ecosystem
38:42
follows semantic versioning, it's safe to actually only lock down to the first two digits and you're guaranteed the last digit and unfortunately, not everybody does that. And people that do not follow semantic versioning within a software ecosystem tend to break things. This happens a lot here in the middle.
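The three-number convention and the idea of locking to the first two digits can be sketched in Python. The helper names are hypothetical and this is not Chef's own constraint syntax or resolver; it only encodes the rule as described in the talk.

```python
def parse(version):
    """Split 'major.minor.patch' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def bump(version, change):
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = parse(version)
    if change == "major":   # breaking change: interfaces may break
        return f"{major + 1}.0.0"
    if change == "minor":   # new feature, backwards compatible
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # bug fix or patch


def satisfies(version, locked):
    """True if `version` shares the first two digits of `locked`
    and is not older than it, so only bug-fix releases float."""
    major, minor, patch = parse(version)
    locked_major, locked_minor, locked_patch = parse(locked)
    return (major, minor) == (locked_major, locked_minor) and patch >= locked_patch


next_major = bump("3.0.0", "major")
next_patch = bump("0.1.2", "patch")
accepts_bugfix = satisfies("0.1.5", "0.1.2")
accepts_feature = satisfies("0.2.0", "0.1.2")
```

Locking on major and minor accepts `0.1.5` as a drop-in replacement for `0.1.2` but rejects `0.2.0`. That is only safe when everyone in the ecosystem actually follows the convention.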
39:02
Sorry. So use semantic versioning. Please. So to return from that tangent, I'm going to upload my artifact to Chef server. So Chef server you can think about as an artifact server. You take your cookbooks and you put them in there
39:21
and all the clients can request them. And so with the artifacts, you can have multiple versions of individual artifacts simultaneously installed by Chef server. The Chef server also handles things like authentication, authorization, that sort of thing to actually get the artifacts out of the artifacts when they're not serving a machine. Delivery.
39:41
So in Chef, at first, nodes request their initial run list and it looks like this. So node. And then it pulls down the HTTPD artifact onto the machine, executes the recipe, does it. Then you can do this,
40:02
add more things to the run list, add more recipes. So inside of each one of these recipes is going to be a list of resources that will become agents, all running and repairing the system and notifying each other, all the fun, terrific things and stuff. Then you can kind of do this.
40:21
So I just keep adding things to the run list. And so for everything you add, say the NTP cookbook's client recipe, it's going to just retrieve the artifact that it needs from Chef server and then actually compile and execute the client recipe that you've requested inside.
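To make the run list idea concrete, here is a small Python sketch that turns run list entries written in Chef's `recipe[cookbook::recipe]` form into the list of cookbook artifacts a node would need to retrieve. The helper names are hypothetical, and a real Chef client does far more than this (roles, version constraints, dependency resolution).

```python
import re

# Matches "recipe[cookbook]" or "recipe[cookbook::recipe]".
RUN_LIST_ITEM = re.compile(r"^recipe\[([^\]:]+)(?:::([^\]]+))?\]$")


def parse_run_list(run_list):
    """Return (cookbook, recipe) pairs; a bare cookbook means its default recipe."""
    parsed = []
    for item in run_list:
        match = RUN_LIST_ITEM.match(item)
        if match is None:
            raise ValueError(f"unrecognized run list item: {item!r}")
        cookbook = match.group(1)
        recipe = match.group(2) or "default"
        parsed.append((cookbook, recipe))
    return parsed


def cookbooks_to_fetch(run_list):
    """Cookbook artifacts the node must retrieve, in first-seen order."""
    seen = []
    for cookbook, _recipe in parse_run_list(run_list):
        if cookbook not in seen:
            seen.append(cookbook)
    return seen
```

For a run list of `recipe[httpd]` followed by `recipe[ntp::client]`, the sketch reports that the `httpd` and `ntp` artifacts are needed, and that the recipes to execute are `httpd::default` and `ntp::client`.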
40:40
Push versus pull. If you prefer push, you're wrong. Pull is always better. There's all sorts of reasons for this. Not the least of which is network considerations. So firewalling, those sorts of things. Machines that are down for maintenance. If you have more than one machine,
41:01
at one point, one of them is going to be off. Systems administration is a constant battle with entropy. Motherboards explode. I've seen RAID controllers burst into flames in my house. This stuff happens. Machines break. They do it often. Capacitors pop off and go flying across the room.
41:22
This stuff happens. Machines are going to be down for maintenance. If you're pushing out your policy, machines that are down are going to miss it. So pull, please. And then machines that don't exist yet. So you're modifying your cluster, you're adding a web server node, and when it actually comes up, it pulls the policy down.
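The argument for pull can be shown with a toy simulation. This is only a sketch under simple assumptions: the class and function names are hypothetical, policy is reduced to a single version number, and networking is not modeled at all.

```python
class PolicyServer:
    """Holds the current policy, reduced here to a version counter."""

    def __init__(self):
        self.version = 0

    def publish(self):
        self.version += 1
        return self.version


class Node:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.applied = 0  # last policy version this node converged to

    def receive_push(self, version):
        # A push only lands if the node happens to be up at that moment.
        if self.online:
            self.applied = version

    def pull(self, server):
        # A pull happens whenever the node itself is up and asks.
        if self.online:
            self.applied = server.version


def push_to_all(server, nodes):
    for node in nodes:
        node.receive_push(server.version)


server = PolicyServer()
web1, web2 = Node("web1"), Node("web2")
web2.online = False                # down for maintenance
server.publish()
push_to_all(server, [web1, web2])
web2.online = True                 # back from maintenance
missed_push = web2.applied         # web2 never saw the pushed policy

web2.pull(server)                  # catches up on its own
web3 = Node("web3")                # did not exist when the policy was published
web3.pull(server)
```

In this toy run the machine that was down misses the push, while both it and the machine added later end up on the current policy as soon as they pull.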
41:43
Dependency. And that comes in super handy. So we're back at Widget Factory Co. now. So I'm going to make my Widget Factory cookbook here, and I'm going to go in, I'm going to edit the default recipe
42:01
of the Widget Factory cookbook. So what am I going to do here? I'm going to say, all right, well, I want to include this other recipe. It should be the new server. And then I want to make a YUM repository to access my widgets. It goes on the Widget Factory. I like to put widgets in the YUM repositories. And then I'm actually going to install a package.
42:20
Well, so I've actually referred to two things that depend on something else. So I have my Widget Factory cookbook. What the hell is a YUM repository? I have no idea. So what I need to do is actually gain access to the YUM repository type, so I need to depend on it in the metadata.
42:41
So Chef will actually resolve these dependencies for you. So you say, I want to run the Widget Factory recipe on the node. It goes, okay, pulls down the artifact from the artifact server, looks at its metadata, parses it, and says, oh, look at that, dependencies. Let me get those. It does a recursive dependency resolve
43:03
until you finally get all the things you need, and then you're able to actually run it. You'll see this a lot in just normal programming, because this is programming. Integration testing. So I said that artifacts are testable.
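The recursive dependency resolve described a moment ago can be sketched in a few lines of Python. This is not Chef's resolver: the metadata is a hypothetical dictionary mapping each cookbook to the cookbooks it depends on, the `httpd` dependency is an assumption for illustration, and version constraints are ignored entirely.

```python
def resolve(cookbook, metadata, resolved=None, in_progress=None):
    """Return cookbooks in an order where every dependency comes first."""
    if resolved is None:
        resolved = []
    if in_progress is None:
        in_progress = set()
    if cookbook in resolved:
        return resolved
    if cookbook in in_progress:
        raise ValueError(f"circular dependency involving {cookbook!r}")
    if cookbook not in metadata:
        raise KeyError(f"no such artifact on the server: {cookbook!r}")

    in_progress.add(cookbook)
    for dependency in metadata[cookbook]:
        # Recurse: each dependency may itself declare dependencies.
        resolve(dependency, metadata, resolved, in_progress)
    in_progress.discard(cookbook)
    resolved.append(cookbook)
    return resolved


# Hypothetical metadata for the Widget Factory example.
METADATA = {
    "widget_factory": ["httpd", "yum"],
    "httpd": [],
    "yum": [],
}
```

Calling `resolve("widget_factory", METADATA)` lists `httpd` and `yum` before `widget_factory`, which is the order in which the artifacts would have to be available before the recipe can run.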
43:21
So the goal of integration testing is to test to see whether the set of agents has achieved their desired goal. So just because your configuration policy runs and finishes doesn't mean you actually did what you're trying to do. So you see this kind of stuff a lot
43:41
in config management integration testing: lsof, so check a port, or, you know, curl a URL; make sure the machine's actually doing what the policy has told it to. Here's a set of tools. They're all awesome to use. Berkshelf's a client-side artifact metadata dependency
44:04
resolver thing that's built into Kitchen CI now, which is an awesome way to do integration testing on cookbooks. This comes from the Chef community, but it's going to support other config management types soon. Right now we only support Chef,
44:21
because it's like the Chef people that wrote it. So it's written in such a way that it will soon support Puppet and Ansible and CFEngine and all the things. Pretty cool. And then outside of the config management tool, very important, it runs the integration tests. So Bats and Serverspec and all sorts of things. Environments. I'm going to zip through this real quick,
44:40
because I only have a few minutes left. Environments constrain cookbook versions, and they allow you to actually set data. Zipping through, so here's the environment. I want the 1.0 version of the HTTPD artifact. The staging environment, cool. I'm saying I can use the 2.0 version of that artifact.
45:01
So it gives me a way to actually test the code and install things on the server without blowing things up. Environments can be used to test branches, environments can be used to segregate machines, and environments can be manipulated programmatically, or a Chef-y thing. Here's an example of me modifying my web server cookbook.
45:21
I'm just going to change the template things so that it's looking at some variables. I can give it the new version, upload it to the artifact server, along with some other things that I want, and then it exists simultaneously with the old one. Then again, whichever environment I run this in, it only pulls down the version that it's pinned to.
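Here is a minimal Python sketch of environments pinning artifact versions. The data structures are hypothetical stand-ins for what the artifact server and the environments hold, and calling the first environment "production" is an assumption made for the example.

```python
# Versions of each cookbook that have been uploaded and coexist on the server.
ARTIFACT_SERVER = {
    "httpd": ["1.0.0", "2.0.0"],
}

# Each environment pins the version its machines are allowed to use.
ENVIRONMENTS = {
    "production": {"httpd": "1.0.0"},
    "staging": {"httpd": "2.0.0"},
}


def version_for(environment, cookbook):
    """Return the pinned version a node in this environment would pull down."""
    pinned = ENVIRONMENTS[environment].get(cookbook)
    if pinned is None:
        raise LookupError(f"{cookbook!r} is not pinned in {environment!r}")
    if pinned not in ARTIFACT_SERVER.get(cookbook, []):
        raise LookupError(f"{cookbook!r} {pinned} has not been uploaded")
    return pinned
```

Uploading 2.0.0 next to 1.0.0 changes nothing for production in this sketch; only a node in the staging environment gets the new version.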
45:45
Part four, you're going to see me talk real quick. Clusters. This is what we're really all trying to do, is manage groups of machines. So far, most of the stuff I've been talking about has kind of been on one machine,
46:02
so let's talk about this. Here's a typical cluster. This is how I think about clusters in my head, because many things are different. I'm going to have different layers. At the top, I'm going to have a load balancer, and then I'm going to have a group of machines, like an application server pool, and some database stuff going on down here.
46:23
There's actually topology between these. IP addresses need to be rendered into configuration files for various services and all this stuff. You've got to actually manage that on top of the actual services themselves. Chef gives you an easy way to actually search out different machines
46:43
and actually dig out the information about them, but we're not going to go into that because we don't have time. So check this out. Here's a way to do change management. So I can say, production cluster. This is how I do it. So production cluster,
47:00
and it's running with an environment set to, what, the 0.1.0 version of the HTTPD cookbook, and the 2.3.1 version of something else, whatever. I'm pinning down my artifact versions in my production environment. So from there, I can actually spin up
47:22
an entirely new copy of it, say for doing a disaster recovery test. DR testing is a very important thing that we like to do. It has the exact same environment. The third one, where I actually want to test changes, so I want to develop config management code
47:40
and develop application code, and see if it all works. I can spin this up over here, completely separate of everything else, but I update its environment with the new version of the stuff I want to do. Or actually, if I need to run it finally, just pull the latest. So what I can do then is I can actually update the staging environment to match that of the development one,
48:01
and then start running the config management code. Sometimes it won't take. Sometimes you hit an error, and it explodes. Well, in that case, you just have to back off and figure out what the hell went wrong. Figure it out. Then you do, and suddenly, you can converge the new code
48:21
onto the existing old infrastructure. Cool. Now you know it's actually safe to do in production. So if you update your production environment, you've just changed it. You've managed your change. So that's good. Here's an interesting thing. Again, I'm going to zip through here. This is something that we all struggle with.
48:45
So check this out. Here's part of that cluster. I have a load balancer. I have an application server pool, and my application is written in such a way that I need to do the following. I need to take a machine out of the pool, drain the connections,
49:01
modify the configuration, and insert it back into the pool. So what I do is I change my configuration here, change my configuration here, here, here, here, here, here. I need to do this in a particular sequence in order to actually make this work. How do I do this? Well, here's that word: orchestration.
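The sequence just described (take a machine out of the pool, drain it, modify it, put it back, then move to the next one) can be sketched as a small Python script. All class and function names here are hypothetical, and the load balancer and servers are in-memory stand-ins rather than real services.

```python
class AppServer:
    def __init__(self, name, config, connections):
        self.name = name
        self.config = config
        self.connections = connections

    def drain(self):
        # Stand-in for waiting until in-flight connections have finished.
        self.connections = 0


class LoadBalancer:
    def __init__(self, servers):
        self.pool = list(servers)

    def remove(self, server):
        self.pool.remove(server)

    def insert(self, server):
        self.pool.append(server)


def rolling_update(balancer, servers, new_config):
    """Update one server at a time so the rest of the pool keeps serving."""
    events = []
    for server in servers:
        balancer.remove(server)
        events.append(("remove", server.name))
        server.drain()
        events.append(("drain", server.name))
        server.config = new_config
        events.append(("configure", server.name))
        balancer.insert(server)
        events.append(("insert", server.name))
    return events


app1 = AppServer("app1", "old", connections=12)
app2 = AppServer("app2", "old", connections=7)
balancer = LoadBalancer([app1, app2])
events = rolling_update(balancer, [app1, app2], "new")
```

The returned event list records the order of operations, which is the part that matters: each server goes through all four steps before the next one is touched.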
49:22
So in general, there's three ways to do this. A conductor sending signals to a group of agents. So this is basically creative policy manipulation. So on your policy host or Chef server or whatever you're using, right? The server needs to do that or whatever. You're letting the pull happen.
49:41
That's hard, right? An external actor controlling sequencing, execution management, so we're falling back into the land of the monkeys, right? We have an external agent, like, doing gates and phases. Like, you do this, and you do that, right? So you have that going on. And then application level sequencing,
50:00
which is the correct way to do it. Move it up into the application. So lots of modern applications actually have this built in. So things like Riak and like MongoDB and these sorts of things that can actually take care of their own ordering considerations without having to have a lot of orchestration for this.
50:24
So I guess I'm almost out of time. So infrastructures are snowflakes, right? Unlike an individual machine inside of an infrastructure, the cluster itself is a snowflake. It is yours, okay? And that's hard, because it means solutions
50:44
to handle this ordering and orchestration thing by their very nature are going to be unique to your snowflake, okay? So we need to figure out how to do that, and that's kind of the next topic. So I guess things you need to walk away from here, remembering, is there is no separation
51:00
between infrastructure and application code. They're the same thing. It's all machines. Distributed systems are hard, and specialists need to work together, right? So like your hardcore Unix sysadmin guy and like your application tier people, they need to get together to figure out how to manage this snowflake properly.
51:21
That's kind of the essence of DevOps. So do that. Talk to your peers. And then there's this: study promise theory, study distributed systems, develop high-quality primitives, and ship them as artifacts, and be excellent to each other.
51:49
If anybody has any questions, we have like two minutes left while the next speaker is preparing, so...
52:00
Questions? No, because I have to restart this. Can I show you a signal for the cut? You can let as many people in as we've got out.
52:21
So that's four. You can go see the champions here. Four is going in, so... I'm going to try to be back for the Docker talk at two here. Okay. Okay, so they're done.