
Formal Metadata

Title: SaltStack
Subtitle: Configuration Management Meets Remote Execution
Number of Parts: 199
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract: SaltStack is arguably one of the best of the "new breed" of configuration management solutions. In this talk, Corey takes the audience through a stand-up of a Salt environment and leads into some examples of how you can leverage the message bus to automate not just configuration management, but your entire infrastructure. Corey will show you how his sausage is made... Built with simplicity and speed as its overarching design goals, Salt has taken the configuration management world by storm over the past three years. Built atop a ZeroMQ-based message bus, real-time analysis of your existing environment and rapid deployment and orchestration of your entire environment are now within reach. In this talk, we'll go through an initial setup of SaltStack in a lab environment, investigate signal passing from one node to another, and demonstrate how Salt can orchestrate entire environments both simply and effectively.
Transcript: English (auto-generated)
All right, so next up is Corey Quinn, who's going to introduce us to SaltStack. Again, I'm quite interested, because I literally haven't seen anything about SaltStack.
Okay, before I dive into this, can I see a show of hands of who's ever run Salt before? Roughly half of the room, that's quite reasonable.
The reason I ask is just because whenever I give these talks, it seems that the audience is sort of varied as far as their level of experience and level of exposure. Some people have never even heard of it before and just wanted a seat in a room they could actually get into. Other people have been contributing to it for years and just want to ask fun questions at the end.
It's an interesting place. First off, I'd like to apologize a little bit. There's a lot of cutting of time in these presentations, so I promised a demo at the end, but unfortunately we're not going to be able to record that at this point. But we do what we can. Sorry for the false advertising on this.
Okay, so who am I? To answer the question initially, I do not work for SaltStack, although Dave in the corner does. He'll be fact-checking me on everything I say that's stupid, or at least most things. I'm a technical consultant at Taos, a company based out of the greater Bay Area in California. I'm also staff for the freenode IRC network in my spare time, where we contribute things like Fosnap.
I was number 15 to contribute to Salt, which is kind of awesome; we're above 500 now, so I can talk about the things I remember from way back when. At the moment, I'm currently packaging it for Homebrew on the Mac, but I spent an interesting year as the budget packager as well. And I didn't make it onto the slide, but three days ago, I wound up passing the newly created SaltStack Certified Engineer certification.
Fantastic. That went on forever. Okay, so what is Salt? This is not going to be an in-depth explanation of everything that Salt does,
because at this point, the project has grown and blossomed in such a way that this would take several hours to go through. And frankly, no one has that kind of attention span, least of all me. We're going to stay relatively high-level, an overview of what it does, and we're going to come back toward the end with a couple of the high-level features that Salt has that you may not have heard of before.
So what we're going to do is we're going to start with a thesis of what Salt actually is. It's configuration management meeting remote execution. And as far as what those are, let's break them down one at a time. Configuration management, as I see it, really consists of five primitives. And there's no configuration management system out there that I'm aware of that doesn't do these five things.
Once you wind up achieving all of these, you effectively have control of it all. And that's not particularly difficult these days, and it's not hugely interesting. Everyone does it, but the question is, what else is actually out there?
Analyst studies have shown that anywhere between 4 and 10 percent of companies out there are running a configuration management system that was not developed properly. So far and away, the most common thing people are still using in 2014 is SSH in a for loop, or something built around rsync. And this is somewhat unfortunate, given that people are reinventing an awful lot of wheels.
And it doesn't necessarily have to be there. Now that we've covered what configuration management is, there's really only one primitive for remote execution, which is to do the thing over there. You can think of it like a flamethrower. Basically, I want to set that thing over there on fire, and done.
Problem solved. The next thing about Salt is that it leverages a message bus to do this. So you can do the thing everywhere, or on a very clearly defined subset of things. You're not going to wind up doing this in series; you're actually going in parallel throughout your entire environment. And at this point, it does scale to tens of thousands of nodes.
And this has been tested in a number of environments on LinkedIn, for example. They're running this at that scale. It's kind of an interesting thing to see. So one of the things that makes it a little bit interesting, compared to other contenders in this space, is the simplicity of the configuration.
It doesn't have anything approaching a DSL. It's pure YAML at this point. In fact, this is not an abbreviated example; this is actually taken out of a running production environment. It starts off by defining... all this does is handle httpd. Unlike a lot of entrants in this space, one thing that's probably worth pointing out
is that it does have a dependency model that goes top to bottom, which is nice. It doesn't randomly order things. That was actually added in relatively recent generations. So it starts off with, obviously, defining the package. It has to be installed. It does that. It then winds up managing the file itself, obviously the configuration file.
And it gives it a source. That source can be a static file on disk. It can also be templated with Jinja, which we're not going to dive heavily into. And one thing that I want to point out as well is that it then defines a service as running. And I forgot to call that out there.
It watches the file, so it will restart the service when that file changes. And lastly, what I want to point out that makes this a little bit interesting is the require statement at the end. If it goes top to bottom, then why would I bother to put in a require there? Simply put, because if I don't require it, if something goes wrong putting that configuration file into place,
it'll just continue iterating through. If that require fails, it will not start the service, which actually winds up providing a nice fail state. Picture a scenario of adding a load balancer, or adding a web server to a load balancer, where you don't actually have static assets in place on the web server. That's how you tend to have embarrassing moments in production.
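For reference, here is a minimal sketch of the kind of state being described; the package name, paths, and source are illustrative, not the exact example from the slide:

    httpd:
      pkg.installed: []
      service.running:
        - enable: True
        # restart the service whenever the managed config file changes
        - watch:
          - file: /etc/httpd/conf/httpd.conf
        # and refuse to start it at all if the file could not be put in place
        - require:
          - file: /etc/httpd/conf/httpd.conf

    /etc/httpd/conf/httpd.conf:
      file.managed:
        # a static file on the master; this could also be a Jinja template
        - source: salt://apache/httpd.conf
        - require:
          - pkg: httpd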
Okay, one other interesting thing as well that takes it a bit beyond this is something called the event reactor system. And this is where Salt really comes into its own. You just saw the dependency model that I laid out. That was not a random thing, because what this does is effectively the same thing.
If a file changes, we start the service. If this, then that. What the event reactor winds up buying us is that same type of dependency model, only we're no longer talking about a single node. We're talking about things like: if this web server comes up, then add it to the load balancer. If that server exceeds a certain threshold,
remove it from the load balancer. You essentially wind up being able to map dependencies and have cause-and-effect relationships throughout your entire environment. This is something that traditional systems, particularly our old friends rsync and SSH, tend not to do as well as you would hope. It's environmental orchestration, and that's sort of a new and interesting thing.
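As a rough sketch of how that wiring looks in practice (the event tag is Salt's standard minion-start tag, but the reactor file path and the helper script it runs are hypothetical, not from the talk), the master maps event tags to reactor states:

    # /etc/salt/master (excerpt): run a reactor state whenever any minion starts
    reactor:
      - 'salt/minion/*/start':
        - /srv/reactor/add_to_lb.sls

    # /srv/reactor/add_to_lb.sls: tell the load balancer minions about the new node
    add_new_node_to_lb:
      local.cmd.run:
        - tgt: 'lb*'
        - arg:
          - /usr/local/bin/add-backend.sh {{ data['id'] }}

The same pattern works for the "remove it from the load balancer" case: a monitoring event fires, and a reactor state on the master reacts to it.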
Something else that's been included in Salt for a while as well is called Salt Virt. It's actually a built-in part of Salt; it's not a separate project. And this buys us a few interesting things. Specifically, it lets us deploy virtual machines. Today, KVM is the number one first-class citizen, although support for LXC is coming up, as well as Xen. At SaltConf three days ago, we were also given an interesting presentation on how to integrate Salt with Docker, which was presented here two talks ago. It's really turning into an interesting space. What this winds up doing is adding a great abstraction layer on the business of instantiating and running VMs. Being able to preseed the image with Salt's configuration management system means you can decide to spin up a VM on a particular hypervisor in your environment, and it comes up automatically populated, which is rather nice. But at that point, that's a neat idea, but it also starts speaking to something that's a little bit higher level, and that's called Salt Cloud. What this does is it winds up provisioning into both private and public clouds. Yes, welcome to the cloud; it's where we have ops. For the next Salt release, this is actually going to be merged in as well, as a component of Salt, no longer a separate project. It's already been done in the current release candidate, which we're expecting to release shortly. What this serves as is an interface to cloud providers, and the coverage is fairly comprehensive at this point.
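To give a feel for that interface (the provider name, credentials, and AMI below are placeholders, and the exact key names have shifted slightly between Salt releases), a provider plus a profile is just more YAML:

    # /etc/salt/cloud.providers.d/ec2.conf (sketch)
    my-ec2:
      provider: ec2              # newer releases call this key 'driver'
      id: 'AKIA...'              # placeholder credentials
      key: 'secret...'
      keyname: deploy-key
      private_key: /etc/salt/deploy-key.pem

    # /etc/salt/cloud.profiles.d/web.conf (sketch)
    web-small:
      provider: my-ec2
      image: ami-00000000        # placeholder AMI
      size: t1.micro

Spinning up a populated, Salt-managed instance is then a one-liner along the lines of: salt-cloud -p web-small web01.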
And it's not just a list of "ooh, look at all the things we support"; it represents something that we tend to be driving towards as an industry, whether we realize it or not. For example, as a consultant, I wind up speaking to a number of companies that are doing migrations either into or out of AWS: moving in because of the rapid provisioning and instantiation of environments,
and moving out because, holy crap, is it expensive to do at scale. When you can build a data center for what you're spending in Amazon in less than three months, it's really time to consider maybe doing something else. The problem, and the reason I pick on Amazon for this, is that they are obviously the market leader with a lot of very interesting platform-specific services.
Take RDS, their database service. That's great, but no one else really offers a database service at that same layer. So what this really speaks to, as we start building out cloud environments that are truly portable, is that we also have to wind up building services ourselves rather than relying on the ones that providers give to us.
Rather than going with RDS, potentially, you instead design a state within Salt for whatever database you're using, and this doesn't need to be Salt specifically. You wind up spinning up your database service inside of a container. You wind up restricting your interaction with AWS, or any other cloud that you're using,
to basic primitives: stand up an instance, spin down an instance, add it to a load-balancing layer, and then have everything else defined in terms of what happens once those instances are up. You've bought yourself tremendous portability. Obviously, it does mean that you've got to figure out how to scale it yourself, as opposed to finding out, once you try to move off, that reinventing the wheel has already bought you a tremendous amount of technical debt.
This is sort of becoming a new best practice in the industry that some of us haven't quite fully realized yet, until it winds up smacking us in the face a couple of times. As a consultant, I get to see it a lot, but people who are embedded in a variety of shops are just starting to realize it,
and don't have the luxury of moving on to a new project in six months so they can forget and do it better next time. So it's definitely worth calling out as one of those tendencies: if you manage your targets through common APIs, your migrations wind up hurting a lot less. As I mentioned earlier, we're definitely looking at a point now where there's a transition
going on in the configuration management space. Node management has been relatively solid for a number of years. It started with CFEngine, and Puppet and Chef wound up addressing it. But now, with things that are coming up like Salt and Ansible, we're really starting to talk a little bit more about the ability to manage entire environments.
When you start thinking more holistically and being able to interface between different services, that really becomes a powerful thing, and it drives towards a new way of looking at your environment. So the real fun part about this is: I've talked a little bit about what this is telling us, but specific to Salt,
what's next? What is it that we're looking to wind up doing? It's a hard question to answer, because probably in the time I've been giving this talk, three new pull requests have come in, each with a new feature. To say that Salt moves rapidly is understating the case a little bit. There's currently a project in the works that's going to significantly shore up
the integration with Docker. It's already working in an early alpha stage, but it's not there yet. And there's a separate project that I want to go into very slightly that's actually... Sorry, before I dive into this, one thing that's interesting about Salt today is that it's built on top of ZeroMQ, which serves as the transport layer.
Very soon, that's going to be replaced with a transport that uses UDP by default, but can use TCP as well. It's currently in development and was just announced this week; it's called RAET. This is sort of new. This hasn't really hit the configuration management space yet, so you're pretty much seeing it here first.
Hooray, secrets. What this does is it makes the communication protocol very pluggable. You can drop in ZeroMQ, which is there today. You can use Salt SSH and run the whole thing agentless and just use our old friend SSH. But now you can also drop in this as well, which is sort of forcing a refactor of how things have been addressed before, which means other transport layers
are available in the future. What this means is that you wind up with multiple queues per socket, which allows for things like packet prioritization, and it scales up rapidly. This very realistically takes Salt from being able to scale to tens of thousands of nodes to hundreds of thousands of nodes, which is turning into something relatively interesting.
Not a lot of companies are at that scale yet, but it's coming, and it's really neat to be able to see it. At this point, it also winds up kicking encryption down to the socket layer, which is rather convenient, because at that point, encryption becomes pluggable as well. The value of that is that it winds up
using published crypto libraries, which means that at this point, Salt can actually get out of handling the crypto space itself, which in the past has led to some interesting challenges, as I'm sure some of you have heard. And at this point, that's been my talk. This is something I've been talking about for over a couple of years now, and it's something that I really hope people wind up seeing a bit more of. The next talk, just so you're all aware, is David Lutterkort from Puppet Labs, who's going to talk about a project that's near and dear to my heart: provisioning, and Razor. So it's definitely worth sticking around if you're on the fence about it. Are there any questions I can answer for anyone?
Yes? To whom? Never heard of it. So your guess is as good as mine. Sorry. Yes?
You're right, it is. What's interesting is the way
that you can actually wind up structuring your state trees. You can keep a very clear separation. There are a number of shops today that are using Salt to orchestrate existing Puppet or Chef environments: using those other systems entirely for the management of nodes, but then turning to Salt for the ability to orchestrate them as environments.
At a very basic level, it acts as an incredibly powerful replacement for SSH in a for loop. If all you're ever using it for is to kick off runs on your Puppet nodes, just so you don't wind up having everything fire at once and destroy your Puppet master, that alone is sometimes worth looking into. Does that answer your question?
Anyone else? Thank you all very much for your time. So we have like 15 more minutes if you have something of like dinner or something. No problem.
In that case, we'll Climb on the table. Alright, then I guess we'll have a little break otherwise we're going to be too much ahead of schedule. So we'll just take a 15 minute break and continue with the next talk.
Thank you very much. One of us should remember that. Thank you very much. There's more in that room.
Exactly. We have a lot of animals coming through.
Oh, hey, there you go. We're going to take you home with your friends. Oh, we're doing video?
We have like 10, 15 minutes left.
So we're going to be filling out the whole room. Over there, all those empty seats in the middle. Everybody move to the middle. No empty seats.
Yes.
I wanted to come here. Actually, yeah. Oh, right. Um, maybe five more. Alright.
Uh, yeah, the, uh, the, uh, the talk had already started. We'll start when the room is full. Alright. Thank you.
A bunch more seats over there. Three. There's three seats over there.
So, um, not to confuse anyone, everyone, but, um, the, uh, SaltStack talk already finished. I don't know if somebody came here before that, but, um, we're waiting for the Razor talk, which will start in, I don't know what time it is now, but will start in like 15 minutes.
Sorry.
Yeah, there's one in the middle. If all of you could just move one, if you guys could just move one seat over, yeah, otherwise somebody's gonna have to jump
seven seats to the middle. It'd be easier if you could just move one seat over. There's one more seat here, and there's, and there's some more seats.
I see two seats over there in the last room.
I'm not sure if there's something to say here.
So, uh, we'll do the same, uh, exercise as we did this morning since we have, uh, another, uh, 15 minute, uh, 13 minutes until we, uh, start. I want everybody to, uh, uh, speak to the person that, uh, that's next to you, give them a hand, and ask them why they're here,
and what they're trying to learn, so you can actually talk. Go ahead.
I see one more seat. If anybody has an empty seat next to them, raise your hand, there's one empty seat over there, right? Mina, are you sitting there?
Ah, let's get started. All right, so next up we have, uh, David Lutterkort. Um, as I see in his slide notes here, which is an incredible tool, but with an incredibly difficult, uh, grammar rule set. Uh, anyway, he'll be talking about, uh, Razor, a provisioning tool, and, uh, you have to have a tool. Thanks, Walter. Thanks, everybody, for coming. Um, yeah,
as I said, I'm, uh, I'm David. I write provisioning software. Um, and I've been with Puppet Labs since May of last year, about nine months now. Um, but even though I'm pretty new to Puppet Labs, I've been in the Puppet community for much longer. I ran across Puppet pretty early on
in late 2005 or so, and did a bunch of stuff on it, pushed it into Fedora, and, uh, I've been around it one way or another for a long time. And, yeah, one of the things that came out of my exposure to configuration management through Puppet, because I came at this with the background of a developer, um, one of the things that came out of it
was Augeas. And if you still modify your config files with, say, sed and grep, stop doing that right now and, um, check out Augeas. But this talk isn't about Augeas, it's about Razor. It's about provisioning. Provisioning is one of these words that means a lot of things to a lot of people, kind of like configuration or systems management.
Um, for the purposes of this talk, what I mean when I talk about configuration, about provisioning, is, you know, this sort of situation. Um, you have a lot of machines, and you need to get them to do something useful. Hopefully your machines aren't sitting in the back, in the backyard, and hopefully somebody's, like,
racked them and cabled them up and they're ready to go. Um, but traditionally, Puppet has had kind of a first-mile problem when it comes to getting going, um, because Puppet really only starts after you have enough stuff on your machines that you can run an agent on there. And, yeah, Razor's a tool to kind of close that
first-mile gap. It's of course not the only tool to do that. There's a ton of tools out there to help you with PXE provisioning. There's a ton of open source tools that do it. Um, each of the big management packages has some provisioning functionality,
and there's of course, sure, everybody in here has their favorite Perl script, you know, the 1000-line script that does everything anybody might ever want to do with PXE provisioning. But if you look at them, they all, you know, kind of fall into two piles: one of them, they do too little, and the other one, they do too much. The tools that do too little are the ones that
just stop once they've installed all the packages, the kind that end once your kickstart file has run through during the install. Because you're not just installing for the hell of it; at some point you need to manage that machine too. And so the other pile, the tools that do too much, they realize that that's a problem, and you
need to do something to get more fine-grained management than just plonking packages down, and they grow all this config management functionality. But that's of course the wrong base for config management functionality, because your provisioning tool is only involved the very first time you build your machine. But you need config management on an ongoing
basis. So Razor tries to be the Goldilocks of provisioning tools: don't do too little, don't do too much, do just the right amount. And the way it does it is that it makes it very easy, once the system has been built, to hand it off to a config management system for further maintenance. So the philosophy behind Razor is that you just install the
bare minimum of whatever operating system you're installing, and then enroll it with Puppet or Chef or some other config management tool, and then do the actual personalization of the system with that. Since there are so many variations of provisioning tools out there,
to get a better idea, I made a little user survey. Unfortunately, I didn't have time to talk to any users, so I just made up the answers too; that's how a lot of software engineering research happens. But to
give a little prehistory about why Razor came about and why it does what it does: it was started by two guys at EMC, they're now at VMware, Nick Weaver and Tom McSweeney. They launched it in the spring of 2012 at EMC World, and then in the fall of that year at PuppetConf
they announced that they would move maintenance of Razor over to Puppet Labs, because it's a really good fit with Puppet, but also because they felt they didn't have the time and resources to really push it forward. And what's happened since then is that over the last six months or so,
in the early summer we took a look at where the code base was, the initial code base, and lessons learned from people using it. One of the lessons was that it was really hard to get the initial code base installed and going, and of course to maintain it. And so we decided to re-write the whole thing.
And my talk is about the rewrite. At this point the initial code base is legacy; if you have an installation of that, great, but nobody should be newly installing that code base. Use the rewritten code base instead. So one of the things that makes
Razor unique is that it deviates from the general approach of these provisioning tools that try to make you look at your machines as pets: these things that you know intimately well, and you have a personal relationship with them, and you really care about them. When somebody comes, hey,
build me a web server, database server, and whatnot, you go down the data center and look at your 500 most favorite, most beloved servers, and pick the one that is going to do what you need to do, and go back to the office, enter the MAC address into your provisioning tool, and then hopefully you've got a machine at some point.
So Razor, taking inspiration from how people use the cloud, and trying to move that a little bit into the bare-metal, PXE-provisioning world, wants to look at your machines more like cattle: as things that are largely interchangeable. They have different characteristics, but within each group they're pretty much the same. Just like
with cattle, you have dairy cattle and cows that you raise for meat, and maybe for breeding, or for showing off at some show, but within each group all milk cows are pretty much interchangeable. And the way Razor does that is that when a system,
when Razor first encounters a system, it boots it into what in Razor lingo is called a microkernel. It's really just a small Linux image that it puts on the machine; it runs Facter and sends the facts back to the Razor server. And because of that, your Razor server has an inventory of the hardware that you have,
and then later on Razor decides what should go on there based on policy that you've set up. And in your policy you talk about what should happen with machines that have this much RAM and this many cores and whatnot, and based on those policies and rules
Razor decides: oh, this should get RHEL, or this is a node that should get ESX installed. As I said, with the rewrite we changed a few things around. One of the things is we use Postgres now as the database, just because Postgres is awesome.
But yeah, the database with Razor is not a huge concern; we literally store tens of kilobytes of data for each node, so you can do the math on how many nodes you would have to have before the database gets to a respectable size. We also use Sinatra: the server is written in Ruby, and we use a Ruby web framework called Sinatra. And if you haven't encountered Sinatra,
you can think of Sinatra as Rails after a very, very serious diet. It's a really nice framework for writing a web service. So far that's a pretty standard web stack; the one thing that's probably a little unusual is that we use TorqueBox, which is a plugin to JBoss that turns JBoss into a
Ruby app server. I don't know how many of you have deployed Ruby apps, but before you know it you have a simple application that consists of like ten daemons, web workers and some background workers, which as an admin is a nightmare to manage, because now you basically get to babysit ten different things and all that.
The nice thing about TorqueBox is that it lets me as a developer do all these things, but it does them in one process, so as an admin you're just watching this one process instead of this myriad of them. So the one thing that's missing from here, since I've been talking about PXE provisioning so much, is a little bit
of the rest of what PXE provisioning needs: what about DHCP? What about TFTP? And there Razor also deviates from a lot of the PXE provisioning tools, which kind of naturally branch into managing all that for you. Razor does not do that. We don't really care what you use for DHCP or TFTP,
ISC DHCP, MS DHCP, what have you. All we need you to do is put two files onto your TFTP server, and that gives you the PXE boot, the usual thing. Of those two files, one of them is the iPXE firmware, and the other one is a little script for iPXE
that basically tells nodes, once they come up: go and talk to this other server over here, the Razor server. The genius of iPXE is that it gets you out of the TFTP malaise where you can't really do anything, and lets you do all the booting from a web server that has interesting behavior and does useful things,
just to boot machines. Once you've got those two files set down, you don't ever have to touch them again; everything happens on the server. In terms of topology, Razor has really two APIs. One is a public API;
you can think of that as the management API. That's what you use to tell the server what the policy and rules are. And then on the other side there's a private API that nodes use to talk to the server while they're getting installed or while they're booting. And the private API
really only comes into play for you if you decide to write your own custom installer to do useful things, because then you need to know how to get files from the server, how do I tell the server to log something. The thing is that for the public API we have proper authentication:
we use HTTP authentication, and we use a library called Shiro that makes it really easy to plug that in, or a bunch of other things, so the public API is pretty well secured. The private API, just by its nature, you can't really secure,
because when a node comes up and says, hey, I'm a machine that looks like this, we just have to believe it. So on the back end you have to secure that network by physical means; maybe it's on a VLAN, or just segregated from the rest of the network. I think for people who do
PXE provisioning, that's enough. How many of you actually do have to manage physical machines, PXE provisioning? When I started doing this I would never have thought that there'd be that many hands, because everything is called cloud now. I mean, it's still a real problem to do provisioning.
Yes, that too; I'll talk about that at the very end a little bit. And so the public API is a fairly garden-variety REST API. The one wrinkle is that, while it's usually really easy to model things
over REST, it gets really awkward to change things, so we have commands: you issue a command to create a policy or modify a policy, instead of doing gymnastics with representations of REST objects. But the objects you need on your server are kind of the ones
that are here. Policy is the most important thing; it ties everything together. You need a repository, or multiple repositories, of what you eventually want to install. You can either just point the Razor server at an existing repository, like the yum repo that you have sitting somewhere, or you can hand it an
ISO and import it on the Razor server itself; that's what people usually do for Windows and ESX installations, they just import an ISO into the server. Broker is kind of Razor's lingo for the thing that does the handoff to the config management system at the end, so there's a Puppet broker, there's a Puppet Enterprise broker,
somebody in the community wrote a Chef broker; we don't ship that, but somebody in the community actually wrote a broker that just sends a signal on an AMQP message bus for their internal infrastructure. So with that setup you can do much more than just hand off to a config management system. And I mean, a broker, at the end of the day,
is a fancy word for a shell script, not much more. Tags are named rules, essentially. The way Razor works is that when a node comes in, Razor goes through all the tags it has and the rules that are associated with them, and checks whether those
rules match that node. Your rule might say: more than 8 cores and 16 gigs of RAM, you tag it as a medium-big machine. And at the same time, a policy also carries tags, and once the tags on the policy and the tags on the node match, the policy matches and gets applied.
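As a rough illustration of that idea (the exact operator and fact names in Razor's rule language may differ from what is shown here, so treat this as a sketch rather than the shipped syntax), a tag is essentially a name plus a rule over facts:

    # illustrative Razor tag for "medium-big" machines
    name: medium-big
    rule: ["and",
           [">=", ["num", ["fact", "processorcount"]], 8],
           [">=", ["num", ["fact", "memorysize_mb"]], 16384]]

A policy that also carries the medium-big tag would then bind to any node whose facts satisfy that rule.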
And tasks, at the end, are the actual things that do the installation, the kickstart scripts. Actually, for tasks we went through a bunch of naming gymnastics, because we initially called them installers, but we want these things to do more than just installation,
so eventually we settled on tasks, after a few detours. And to write an installer, or to write a task: once you have the installation automated, once you have a kickstart script and maybe a post-install shell script together,
getting that onto the Razor server is a matter of writing five or six lines of metadata. Out of the box you have installers for the things on the right: we have an installer for ESXi,
that was one of the initial use cases that Nick and Tom had for Razor. They wanted to deploy ESXi, and that's a real joy if you haven't done that; they wanted to deploy it automatically. We also have installers, of course, for the various Linux flavors: RHEL, CentOS, Debian, Ubuntu.
And then the thing I'm really excited about, which I didn't think we would get, not that quickly, is that we also have a Windows 8 installer now. I don't know how many of you install Windows on a regular basis; it's fun. But yeah, we have, by all accounts (I haven't tried it, but by all accounts),
something that actually works. So you can use Razor to provision pretty much all of the operating systems you usually encounter. The installer itself is kind of a linear process. You can say: the first time we boot with this installer, you do this; that's usually
a download of a kernel that is actually the installer. And then the second time we boot we do something else, and the third time, and so on, until eventually you're done installing and the thing is just set to boot locally from then on. You have the machine in production, and it just runs.
You could, for example, write your own installer where the very first step does some configuration of a RAID card, right? Boot into some special image that lets you modify the RAID config with whatever tools you use, and then after that boot into the real operating system: a pretty easy thing to do.
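As a rough sketch of those few lines of metadata (the file layout and key names here are recalled from the razor-server documentation and may not match the shipped format exactly), a task is little more than a named boot sequence:

    # tasks/raid_then_install.task/metadata.yaml (illustrative)
    os_version: 6
    boot_sequence:
      1: boot_raid_setup     # hypothetical first boot: an image that configures the RAID card
      2: boot_install        # second boot: the actual OS installer kernel and initrd
      default: boot_local    # every boot after that: just boot from local disk

Each named step points at a small template that tells iPXE what to fetch from the Razor server on that boot.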
So, everything is about nodes in Razor. Those are the machines that we really do all this for, and why we do that: because we need to put something on these machines.
And from Razor's point of view, a node largely consists of those four things. We have a little bit of information about the hardware that iPXE sends us, MAC addresses, the serial number, I think; you don't get very much information out of iPXE, just because these firmwares are really restricted in what they can tell you. We have facts,
which is right now a standard run of Facter, particularly block devices, how much memory and cores, and stuff like that. A fairly recent and really interesting addition is metadata: you can associate just a bunch of key-value pairs with a node,
and what makes this really interesting is you can do that both through the API, so you can make a call and set some metadata, but you can also do that from the installer; the installer can call back. Or you can of course read those in the installer and make decisions based on certain metadata tags. So if you're totally crazy, you could push whatever
partitioning you want to have on your machine into metadata on your node, and then the installer can pull it out and lay down your custom partitioning scheme. And then the last thing is the state, which tells you whether the machine is installed; that's right now the only thing, but we will add more
flags there about what the node is doing. And those four things are all accessible when you write rules, so the decisions about what gets provisioned you can base off all these pieces of data, which gives you a good amount of flexibility. Another recent addition that's not on the slide
is that we also added IPMI support, so now you can enforce power states and say this thing should be off and keep it off, check that it's off every so often, and you can reboot it and turn machines on or off. Right now it's kind of simple:
basically we support what ipmitool does, but we want to add support for other remote management interfaces. And so, just a few examples of what Razor can do for you: of course you can build machines with it
and add them to Puppet or other config management systems like Chef. The initial use case of building nodes and setting them up with vCenter, that's actually Puppet modules that help you do that, but Razor is well integrated
with that. One of the things I find really cool is you can use it to provision OpenStack, because at the end of the day you just use Razor to lay down the basic operating system, get the Puppet agent going on the machine, and then you use the OpenStack Puppet modules to actually turn your machine
into a Nova compute node, or Swift storage, or what have you. And then something that we're taking baby steps towards, but I think that's where Razor will go in the longer term, is something that manages the life cycle of your machines.
One of the really important differences between Puppet's notion of what a node is and Razor's notion of what a node is, is that Puppet thinks of a node as something that lives as long as the operating system is on there. If you take a machine and reinstall it, Puppet will think: oh, that's a brand new node that I've never seen before,
whereas Razor really follows the machine, the hardware itself, so it doesn't get confused. And so you can use Razor to do complex life cycles. One of the things that people seem to do quite a bit is, when you decommission a box,
do a secure wipe before you install something else on there. And that you can actually trigger just by setting metadata flags and writing rules in a somewhat clever way. Something where we need to do more to make it really smooth is updating BIOSes:
once you know that the BIOS on these machines needs to be updated, it would be really cool if you could tell Razor to do that, and then on the next scheduled reboot you update the BIOS, so you don't need to have a specific reboot schedule just to do the BIOS update; you might have a reboot-machines-every-two-weeks policy,
so when a machine reboots, Razor would then make sure that it first boots the BIOS-updater image, and then once that's done it goes back to booting locally and running whatever is on the machine. I've got a few pointers here;
I think we have a couple of minutes for questions, if you have questions. Yeah, and if you have questions after the talk, like tomorrow or so, we have a mailing list, an IRC channel, and so on. The thing that's kind of awkward
is the flipping back and forth for the BIOS updates. The thing is that Razor right now, once you've run an installer, considers the node installed,
and that keeps it from going through the policy table. And so we need a way to mark some of these tasks as non-destructive. Actually, installers are destructive, and you would never want to apply them to a node that is already installed, but some of those things are non-destructive,
like a BIOS update, and you would want to distinguish between those two and allow applying the non-destructive stuff even to nodes that are installed.
Yeah, TPM brings back flashbacks to a previous life. I don't know that I want to go there, and I don't know how much people
would actually want to use TPM for that, or how much they're actually using it for anything, so that was not the idea there. Yeah.
So, the initial implementation had a state machine, and then when we actually looked at the installers that people actually wrote, which in the initial thing were called models, they were all linear processes. There was no real use of the state machinery; it just made
things very complicated. And so one of the decisions we took with the rewrite was: installs are just linear steps. You do step one, you do step two, you do step three, but there's no branching and cycling and all that. But some of that will come back with the additional data we have about a node, and things like what I just said,
that we have to distinguish between destructive tasks and non-destructive tasks, and allow non-destructive tasks against installed nodes. So behind that is probably, implicitly, a notion of the life cycle of the machine, but it's not really exposed, because I think that that's just too hard for people to really make use
of. So can it revoke certificates and pull stuff out of the Puppet Master?
Out of the box, we don't have anything for that, but it's a matter of writing a shell script that actually does that and sticking that into an install. So it would be a pretty easy exercise to do. Yes?
I've heard about Ubuntu's Metal as a Service. I haven't really looked at how they do things. My understanding is that they push much more into a cloud-like mode of operation. I think Razor tries to be very careful to strike
a balance between being fairly hardware-centric, and fairly close to the way people are used to doing physical management, and the kind of cloud-like features, because I think there's a gamut of these uses. I don't think you'll make everybody happy by giving them a Metal-as-a-Service cloud tool.
So because of that, I would expect that there are quite a few philosophical differences between MAAS, Razor, and so on. Yes, Steve? Is it better to do that?
So generally, I would say whatever you can do with Puppet, do it with Puppet,
because Puppet is the thing that worries about the ongoing maintenance and configuration changes of your boxes, and only do the things you absolutely have to do outside of Puppet in Razor. If it's something you need to do to get the operating system on, then you do it with Razor, and everything else you do with Puppet. OK, thank you.
Yeah. And I mean, yeah, some platforms are about to do you something to do that. Nobody wants a system, and others, yeah, like the tool. Yeah, but I mean, if you can't do it with Puppet,
you can't do it with it. Yes? Yeah. I'm sorry. I'm sorry. I haven't thought about that. There are many questions. Yeah, so for starters, I hate the word microkernel,
because it's a misnomer; it's really just a small Linux image. Another thing we changed from the initial product is that we moved it to Fedora. Right now, we're using Fedora 19. The rationale behind that move is that we as Puppet Labs can't be in the business of hardware support.
Those companies that do that, they're way bigger than Puppet Labs. So the idea here is that you can actually move to an enterprise Linux microkernel, so that if you have a support agreement with one of the enterprise Linux vendors, you go and talk to them if your microkernel has problems,
because it doesn't like the network card. But yeah, right now, it's Fedora 19. Yeah, I mean, I actually just noticed that OpenOffice very helpfully made these things pretty much illegible.
The first one, the server repo, is puppetlabs/razor-server, and that's where all the documentation lives too, on the wiki for that Git repo. I try to keep everything there, and the other repos are kind of offshoots. But all the documentation and most of the information
is on that repo. Any more questions? OK, thank you. Thanks very much.