
Habitat 201: Habitat in the Ecosystem


Formal Metadata

Title
Habitat 201: Habitat in the Ecosystem
Title of Series
Number of Parts
50
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Habitat is an open-source framework that gives modern application teams an application-centric automation platform. Build, deploy, and manage modern and legacy applications with Habitat. Habitat plays well with many container technologies such as Docker, rkt, Mesosphere, and Kubernetes. This talk will explore some of the ways Habitat fits into the broader ecosystem.
Transcript: English(auto-generated)
So I'm going to say I'm really glad that we get to do this talk again. I think there's been a really good call out to context in both of the keynotes that we saw yesterday and today.
We heard from Alaska Airlines that writing code absent context is meaningless. And today, Adam's talk focused on understanding what we're doing, understanding the context around the things that we are building, so that you can go forth and just be awesome with them. So that's what we're going to look at a lot in this talk about Habitat and the ecosystem.
But I should introduce myself. So hi, I'm George. I'm a product marketing manager at Chef, but I've worn a lot of different hats in my time there. Before product marketing, I was in business development, working with our partner ecosystem, which I think is appropriate for this talk. Before that, I was doing training services. I was writing a lot of Chef code.
I'm an engineer by background. But what that meant is that in my time at Chef, I have spent a lot of time on airplanes. I do a lot of travel. All of those different roles required a boatload of travel. And just in my time at Chef alone, I have covered the distance from the Earth to the moon
and entirely back. Not quite entirely. I did the math yesterday, and once I get back from some trips in September, I will be back on Earth. But for now, I'm still a little bit of a space cadet. But what that means is I've gone some really interesting places in my time at Chef, and I think by far,
the most interesting place that Chef sent me was India. And I think I find it the most fascinating by far because I was struck by this massive paradox just moments after leaving the airport. And the paradox is this. Indian traffic. Has anybody been to India? Raise your hands. All right, we have a few folks.
So if you haven't been, if you don't know what I'm talking about, I think Southeast Asian countries, like maybe Thailand, Vietnam, have very similar traffic patterns. But Indian traffic by far is like the gnarliest that I've seen. It is this chaotic swarm of cars and pedestrians and tuk-tuks and just all sorts of things coming at you.
And it has no clear sense of lanes or rules that were discernible to me. Or personal space, and it's a very active thing, right? Because when you're driving along, you're diving through potholes, you're trying to manage road conditions.
There are not only pedestrians, there are cows. And there's this swarm of motorcycles and mopeds and tuk-tuks, and it's everywhere, and it all just flows. So being in this situation, right, like my westernized little mind is like, oh my god, this cannot be safe. This is not gonna work.
So I did what any good westerner would do in that situation, and I pulled out my iPhone, and I was like, all right, let's look this up. Is this really true? And so with my international data plan, I started poking around like, what can I find? Is this really reality or not? And I found this study from the University of Michigan in 2014
that looked at per capita mortality rates for vehicular accidents per 100K of the population. And what I found when I looked at that study is it showed me this, right? Like, it is way more dangerous to be in that system in India. There's like a higher rate of vehicular fatalities.
And so I thought, like, that's it, man. Like, if I stick around, like, I am going to die. And I was in India for a couple of weeks. I didn't die, I'm not dead yet. And that made me wonder, well, what's really going on? Because when I looked, it actually seemed to be a fairly safe system despite all of the chaos. So I started to dig a little bit deeper.
Like, I completely nerded out on traffic stats during this trip. I have a really good international data plan. And I'll give you the TL;DR. This is not a direct one-for-one comparison when you look at it this way. There are five leading factors that expose you to a higher rate of traffic fatalities, things like drunk driving rates,
you know, safety standards that have been mandated. Are you required to wear a seatbelt? Things of that nature, right? And on all of those factors, India lags significantly behind the US, which means their numbers skew hard in one direction. So it's not a real direct one-for-one comparison in terms of their driving system versus ours. So I decided to try to normalize those numbers
and make it more of a pound for pound comparison. And when I did, I took just the first factor alone, drunk driving rates, which are like huge in India. It's like 70% versus 31% in the US. We've made a lot of inroads over the last couple of decades. So, you know, we have an advantage there. And just trying to normalize those numbers,
taking one step closer to pound for pound comparison, I saw this, right? And like, whoa, that is not a small difference. That is a stark difference. It's almost like safer by a factor of two to drive in that chaotic swarm than it is to drive in the US. Like, what is going on there, right? My initial reaction was, I don't understand.
How is that even possible? Like, this cannot be a thing, right? Like, how could that, right, and that, and things like that, like, I saw that all over the place. Like, how can that possibly be safer than this, right? What's going on? So I decided to dig in a little bit deeper. And it turns out that Indian drivers
have to be hyper aware of their surroundings. It's a very active process. And if you've ever seen Indian traffic go, it's very noisy. There's a lot of honking that's going on. And it's not like rude honking. It's almost like a system of radar, right? And like, you see that somebody's approaching you. So like, you slow down or you move over and the person next to you moves over and like, the entire fleet adjusts real time.
And what you end up with there is this system of hyper aware, hyper vigilant, autonomous actors. And it requires those actors to be more engaged, ultimately to be safer and more aware of their surroundings. And what you get is a much stronger resilient system
than you get with something that has rules dictated by a central authority, like the Department of Transportation, right? Rules that maybe make things so safe that we're not as engaged as we could be when we're driving. So India really challenged my assumptions about things that I thought I knew about large scale traffic systems and how they work.
And I think in a similar way, Habitat challenges the things that we think we know about how large scale distributed applications should work. So that's what we're gonna do in this talk. We're gonna look at two things. We're gonna look at first the question, what is Habitat? If you were here for Josh Zimmerman's 101, I think he had a lot of good detail. We're gonna look at it from a slightly different angle
to give you a little more context, right? Like, why does Habitat make the choices that it does? And understanding that scope of what is Habitat and why does it operate this way, then we can look at how Habitat plugs into the larger ecosystem and where it fits into a larger whole. So with that, let's jump into it.
What is Habitat? I've gotta tell you, in this new role of product marketing at Chef, one of my biggest challenges has been trying to figure out how to succinctly talk about Habitat. When you start digging into Habitat, there are a lot of moving pieces and there are a ton of different concepts. And I usually get the question of,
hey, well, what is Habitat? Can you give me the short pitch? And it's like, do you have 20 minutes? And I've gotten a few iterations of this and I don't think that my timeframe is necessarily getting shorter, but I think it's zeroing in on the right concept. So let's see if we can make that a little bit better. I tweeted out this picture a few weeks ago.
I call it Bach's box. I don't know if any of you have met Bach on the sales team, but he wanted to help visualize what's really happening in Habitat. And so we put something together, we put it out on Twitter and it seemed to get a lot of uptake. People really responded to it. I think it's a really good visual model
of what's happening. But just by looking at the picture alone, I don't think it's always entirely clear what's up or what's there. So we're gonna go through it. We're gonna go through it from left to right. So this is basically what's happening. In the traditional computing model of the world when we're managing servers, the challenge here is that we tend to think
of our applications as just those bits of code that we wrote, like the basic inputs and outputs that are giving our business value. That is the only thing that we usually care about at the end of the day, and we have this layer cake of dependencies around it. We think that it's just that one box, but in order to manage that one box,
there's a bunch of things we need to do before and after we go there. So at Chef, we call this infrastructure first development. And we call it infrastructure first development because if you're really trying to manage that application, what you need to do is you need to start with a server, and that server needs an operating system. And in the days of physical data centers, that meant you had to rack a box,
or maybe in the cloud, you have to go provision it, start some VM that starts with some sort of operating system choice. And then you need to think about where that machine is placed, maybe which rack it was going to live in or in the cloud, which availability zone. But once you have those two things, where can this thing live and how is it going to operate, cool. Then I can now install a runtime.
What is that platform on which those functions that I wrote are going to be executed? And then once I have all of those things, great, now I can get to my application. And that's a long time to value. It's a lot of things you had to do first before you can actually start writing any of the things that give your business some actual worth.
You are at ChefConf, so hopefully you're doing all of this with Chef. Congratulations, you are awesome. And maybe you're even doing the next part with Chef. The next part is, well, how do I manage this application once I have it ready to go? How do I deploy it? How do I make changes to it? How do I patch my operating systems? How do I do that ongoing maintenance?
Maybe you're doing it with Chef. Maybe you're doing it with an external tool. Maybe you have a different way that application artifacts are created, so you have to manage the dependencies that that thing brings with it. And so you may have a tool for this. Maybe you're doing it with automation already. But then once we run our application in production, we need to make sure this process
is doing the right thing. So we're gonna get to monitoring. We bolt that on as an external. We usually need to have some sort of remote execution strategy. For various things, various events that occur in our infrastructure, we will need some sort of orchestration tool or some way to execute remote commands.
We have historically very complicated deployment patterns. In order to deploy this object, I need to stop server A so I can stop server B, and then update server B before I update server A, and then update server B again, and then start B before I start A. And like, holy cow, how do you do that without remote execution? So we'll usually have a need for that there.
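The ordering constraints in a deployment like this can be written down as a dependency graph and sorted mechanically. The sketch below is illustrative only: the step names and the dependency edges are assumptions loosely based on the server A and server B example (the second update of server B is left out for brevity), and it does not describe how any particular orchestration tool works.

```python
from graphlib import TopologicalSorter

# Hypothetical deployment steps. Each step maps to the set of steps
# that must finish before it can begin.
STEPS = {
    "stop A": set(),
    "stop B": {"stop A"},
    "update B": {"stop B"},
    "update A": {"update B"},
    "start B": {"update A"},
    "start A": {"start B"},
}


def deployment_order(steps):
    """Return one execution order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    for number, step in enumerate(deployment_order(STEPS), start=1):
        print(number, step)
```

Even with the order computed, something still has to run each step on the right machine at the right time, which is the remote execution need described above.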
Service discovery, maybe not so much of a thing in hard-coded legacy data centers, but even in the cloud when you have ephemeral instances coming and going, you need to figure out how to wire them all together. And so there are some really great tools for that. But again, another thing that we need to manage that just bolts onto the tool chain.
And that is how it comes together, right? And it's usually in that order, because that's just how we built it, right? We start cobbling these things together as we needed them, and that's how it evolves, right? It's messy, it's disjointed, it's a little bit slow. I think containers definitely took a step
in the right direction. Containers give us a way to package up our applications and the runtime into this portable immutable artifact, and that immutable artifact works the same way everywhere with consistency, right? And it allows us to do really interesting things like take that artifact, put it up on the internet somewhere, right? Download it, run it with one command, and like, boom,
we are off to the races, right? And like, I can start writing things for my application right away. And we shortcut that time to value, right? And I think this is why Docker resonates so heavily with developers, because at the end of the day, we just wanna write that application, and this is a nice, short, easy way to do that. I think containers resonate a little bit less
with the operations crowd, because if you run these things in production, right, as we've mentioned before, you know that all we've really done is move that complexity further up the chain, right? As we take steps towards running in production, all of those dependencies from the layer cake start creeping back in, right? Now that I have this immutable object,
how am I going to make changes to it? How am I going to update software on it? How am I going to deploy it once I have it, right? Monitoring, right? How can I tell what's happening inside of this obscure, immutable process? Do I put monitoring inside my container? Do I put it outside the container? Like, what am I supposed to do? But that's still a real thing.
Remote execution and orchestration, right? We never really solved for that, so that's still a need. And then of course, right, with a bunch of microservices coming in and out of service potentially, they all need to find each other. Service discovery is a very real concern in that ecosystem. And so, right, we need to bolt that on and again, right, somehow like all of those dependencies
of the layer cake still came back around to bite us. And I think that that layer cake really is just a remnant of how we have built machines in the past. Like, this is just the way that we have always done it. And I think part of the problem is that when we think about our application, we think it's just that one box right there, right? It's just like that one central piece
in the larger layer cake, but I think that's only like those inputs and outputs, those are just the most obvious part of where we derive value. Anybody that's run these things in production knows that where we really have value, right, is when we put that in production into the hands of users, right?
It's not just that one slice, it is that entire stack that brings you value, right? It's that entire stack that you are writing your business on, right? It's that entire stack and its availability that is really key to running those applications and serving them to market, right, like that. It's one big cohesive thing.
And then we sort of just look at our application as that one box with this layer cake of complexities around it, but it's really more cohesive than that, right? It's the whole thing that matters. But we don't treat it that way. We never see it that way. Why? I think DevOps has taught us many things.
And in the, what, seven or eight years since DevOps was even coined as a label, I think one of the more important lessons there has been that if we take a step back to holistically look at everything that we need to manage around our applications, the only way that we can really do that effectively
is if we not only look at everything that happens in development, but if we look at what it takes to really run in production as well, right? And so having learned the lessons that we have from the DevOps movement, I think maybe it's time to step back and re-examine what is the right way to run applications knowing that what really matters is that entire layer cake of complexity, not just that one slice that we think
is where we get the value. And so that's the assumption that Habitat challenges, right? So when we look at the question of what is Habitat, Habitat basically takes that layer cake and posits that your application is everything
that your application needs in order to run, right? It's the entire life cycle of things that your application needs. That is actually the application. So it's the entire layer cake, right? It's not just that one tiny slice of the application box. So that is what we call application automation.
That's when you hear us say things like automation that travels with the application, and we say that a lot for Habitat. That's really what that means, right? That we are taking that external set of dependencies and looking at everything that your applications need to run, right? From initial deploy to normal software operation, like life cycle events,
all the way through to decommission, right? And like all of that for that entire span, everything that's necessary, that is really your application. And I think what's been super mind blowing to me, like to my sysadmin, sysengineer, like distributed systems mind, is that if you do that, right, if you manage your applications that way,
what happens is when you deploy software, you deploy everything that your application needs along with it in this immutable self-contained, like self-managing package. And if you do that, it has a super, super interesting side effect, right? And that side effect is that if we bundle everything that our application needs
into that one immutable artifact, what it means is that we can build that artifact once, we expose an interface that alters the behavior of that artifact for different points of its life cycle. And when you deploy it, and when you say, hey, I want to go to this particular infrastructure, great, right, like set up a reconfigure event. And whether I'm going to deploy on-prem,
or whether I'm going to deploy in, I don't know, AWS, right, or whether I'm going to a PaaS, great. Any other configuration that I need to do for that is already built in. And so you get real, true portability and repeatability and consistency benefits from treating your applications that way. So Habitat tries to flip that traditional model
of developing from the infrastructure up on its head, right? And it just looks at, well, what does the application really need, right? And let's start with that and manage that, and then work our way down to the infrastructure. So what is Habitat? Habitat is a way to holistically package self-contained applications and run them anywhere. Habitat in 12 words or less.
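The build-once idea described above, one immutable artifact whose behavior is adjusted through a configuration interface at deploy time, can be sketched in a few lines. This is a minimal sketch under stated assumptions: `Artifact`, `build`, and `configure` are hypothetical names invented for illustration and are not Habitat's actual API or file format.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Artifact:
    """Hypothetical immutable package: built once, never modified afterwards."""
    name: str
    version: str
    defaults: MappingProxyType  # read-only view of the default settings


def build(name, version, defaults):
    """Freeze the defaults so that later deploys cannot mutate the artifact."""
    return Artifact(name, version, MappingProxyType(dict(defaults)))


def configure(artifact, overrides):
    """Return the effective settings for one environment.

    Only keys the artifact already exposes may be overridden, and the
    artifact itself is left untouched.
    """
    unknown = set(overrides) - set(artifact.defaults)
    if unknown:
        raise KeyError(f"settings not exposed by the artifact: {sorted(unknown)}")
    merged = dict(artifact.defaults)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    web = build("web", "1.0.0", {"port": 8080, "log_level": "info"})
    print(configure(web, {}))            # for example, an on-prem deploy
    print(configure(web, {"port": 80}))  # for example, a cloud deploy
```

The same `web` object is used for both deploys. Only the overrides differ, which is the portability property the talk attributes to this packaging model.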
So those are very deliberately chosen words, right? And I started saying that even yesterday when we first did this talk. I think there are a lot of densely packed concepts in those very deliberately chosen words. So the very next question usually is,
all right, well, how does that work? So I think some of you were here for Josh Zimmerman's 101 talk. But here's just enough of a high-level view of how Habitat works to keep the conversation going. So we're gonna talk about how Habitat works in two minutes or less. All right, ready?
Go. So whenever you're developing, you have access to a Habitat studio. The Habitat studio is usually on your local machine, but conceivably it could be hosted elsewhere, like in a hosted build service. Jamie Winsor's gonna talk about that in the 301. And in that studio, what you do is you can model all of the automation that your application needs inside of a plan.
And that plan includes directions about how to build this software, how to configure it, how to run it, right? All of those things that you need for the entire life cycle. Habitat includes a build system, and that build system processes those instructions and spits out a .hart file. It is .hart for Habitat artifact.
It is pronounced "heart" because we love you. Habitat includes a supervisor process, hab-sup, and the supervisor knows how to manage those .hart artifacts that you just created. So supervisors connect with other supervisors to form a gossip ring, and that ring gathers and shares information about the state of all the other supervisors
and the software that they are running. The supervisors basically make observations about their neighbors, right? Spread rumors about what's happening. The rest of the supervisors then go try to verify those rumors, and what you end up with is this eventual consistency of the entire ring through the census. Those supervisors can self-organize into service groups,
which is just a subset of supervisors that all need to act together. And what you can do in those service groups is you can use that to define topologies or update strategies or how these things manage secrets securely, or just anything that you need to do to manage the life cycle of that entire application.
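The rumor spreading behind that eventual consistency can be illustrated with a toy simulation. This is only a sketch of a generic push-style gossip round, assuming every informed supervisor tells one randomly chosen peer per round. It leaves out rumor verification entirely, and it is not the protocol Habitat supervisors actually implement. The function name and parameters are made up for the example.

```python
import random


def rounds_to_converge(ring_size, seed=0, max_rounds=1000):
    """Count gossip rounds until every supervisor has heard one rumor.

    Returns None if the ring has not converged within max_rounds.
    """
    if ring_size < 2:
        raise ValueError("a ring needs at least two supervisors")
    rng = random.Random(seed)  # seeded so that a run is repeatable
    informed = {0}             # supervisor 0 makes the first observation
    for round_number in range(1, max_rounds + 1):
        # Iterate over a snapshot: supervisors that hear the rumor in this
        # round only start spreading it in the next round.
        for sender in list(informed):
            peers = [p for p in range(ring_size) if p != sender]
            informed.add(rng.choice(peers))
        if len(informed) == ring_size:
            return round_number
    return None


if __name__ == "__main__":
    for size in (5, 50, 500):
        print(size, "supervisors converged in", rounds_to_converge(size), "rounds")
```

No supervisor in the sketch has a global view or acts as a coordinator. Agreement emerges from repeated local exchanges, which is the property the talk is pointing at.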
That's kind of how it all comes together. So that's not bad, that was about a minute and a half. So what is Habitat, right? Habitat is a way to holistically package self-contained applications and run them anywhere. And if you look at any of the podcasts that we've done or the webinars
or any of the press around it, Habitat gives those same sorts of self-manageability, self-aware benefits to all applications. They don't just need to be containers or microservices. And when you approach your applications this way, what you end up with is a network of hyper-aware,
self-managing autonomous actors that run in a distributed system that does not have any one central authority. And when you do that, that creates large-scale, highly distributed, fault-tolerant, resilient systems that get more resilient as we make them larger. One of the things that I didn't mention
is in that gossip ring, we have a certain number of Habitat supervisors. If you add more supervisors, what you get are more observations and more verifications. So as you add more supervisors to the ring, it gets stronger, right? The bigger you make it, the stronger it gets. And that's a fundamentally different approach,
not only in managing distributed systems, right, but for managing your applications, what you end up with is portability, systems that are safer, systems that are more reliable. And I think it challenges some of our assumptions of what our applications really are and what all they need to run.
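One way to build intuition for why more observers can make a ring stronger is a simple probability model. It is purely an illustrative assumption, not a measurement or a description of Habitat internals: suppose each peer independently notices a failure with some fixed chance in a given interval. The chance that at least one peer notices then grows with the number of peers.

```python
def detection_probability(observers, per_observer_chance):
    """Chance that at least one of several independent observers notices.

    Both the independence of observers and the fixed per-observer chance
    are assumptions of this toy model.
    """
    if observers < 0:
        raise ValueError("observers must be non-negative")
    if not 0.0 <= per_observer_chance <= 1.0:
        raise ValueError("per_observer_chance must be between 0 and 1")
    return 1.0 - (1.0 - per_observer_chance) ** observers


if __name__ == "__main__":
    for count in (1, 2, 5, 10, 20):
        print(count, round(detection_probability(count, 0.3), 4))
```

Under these assumptions the result never decreases as observers are added, which mirrors the claim that a bigger ring produces more observations and more verifications.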
So looking at what Habitat is, right, how it operates, why it makes those choices, let's look at Habitat in the larger ecosystem. So we launched Habitat almost exactly a month ago, happy almost one month birthday, Habitat. And when we did, there was a lot of press.
I read a lot of what was happening. I kept tabs on that. And I think one of my favorite pieces of press came from Wired Magazine. And in this Wired Magazine article, the author looked at different advancements in computer science and biological computing models and what different markers were there along the way.
And it put up this blob of text. I'm gonna read it so I make sure I get it right. The article says, "Biological computing is also how Google, Twitter, and other internet giants now think about building and running their massive online services. This isn't software that runs on a single machine. Serving millions upon millions of people around the globe,
it's software that runs on thousands of machines spread across multiple computer data centers. This software runs this entire service like a biological system, a vast collection of self-contained pieces that all work together in concert. It can readily spread those cells of code across machines. And when machines break, as they inevitably do, it can move that code to new machines
and keep the whole alive." And then that article goes on to compare Habitat to this organic model of computing where you care about managing this overall larger organism with these strong, hyper-aware, hyper-enabled cells. And if you look closely at those cells,
those cells should have everything that they need in order to manage their entire life cycle, which is how we start putting this all together and plugging in. So the container ecosystem aims to treat machines like vast biological systems that act together in concert. But if you recall the challenge of managing containers, there's all of this external layer cake dependency on top
that is external to that cell, and we don't treat it as one cohesive object. So what Habitat is trying to do is take that layer cake of complexity and bundle it all into one small, strong, self-contained, truly portable package.
So Habitat focuses on making very strong computing cells. And if you were paying attention to boxes, one of the things that I didn't talk about when we talked about containers was workload placement. Conveniently, I skipped over that because that's where we're going next. But when we look at managing the entire larger biological organism that is our software-defined business
and all of the different applications that we run, that's where we start plugging into a larger whole. Let's talk about Google for a minute. So Google has been running containers at scale for over a decade. And it's this vast containerized approach to computing that they've staked their entire business around.
This is what Google does. And Google has been doing it long before we had things like Docker or rkt or this recent explosion of technology around the container ecosystem. Containers have been around for years. They just used to be a horrible pain to manage. Anybody here ever try to work with Solaris zones?
Yeah, you know what I'm talking about. So Docker helped make containers more mainstream and more accessible and a lot easier to consume and set up. However, Google has been doing this for a long time. And Google, interesting thing I found out,
Google has seven products with over one billion users, billion with a B. Let that sink in. Seven products with one billion users. Google is managing the computing needs of the planet at just unprecedented scale, and they do it in containers.
So Docker came along, and then Google open sourced Kubernetes. And Kubernetes is Google's container management platform. And Kubernetes was open sourced as a way of showing the rest of the industry how Google does it. How do they approach these large scale problems of analytics, compute, and storage? How do they tackle it well?
And so what's happened over the last couple of years in the container ecosystem is now there's this escalating arms race of technology to help the Fortune 2000 figure out how to do it the Google way. And if you look at the container ecosystem, that's pretty much all that's happening. And the way that you get there is pretty much with this magic trifecta.
You have containers for that portability, so you can shift these workloads around with consistency. And then you have a scheduler for workload placement. I remember, so my first experience with schedulers was through Apache Mesos. Anybody work with Mesos? Few folks, right?
And so what Mesos allows you to do is have these micro processes that are running for X amount of time. And as you need them, within milliseconds, it will wake up this process, it will allocate resources to it, you can call on it, you can get functions out of it, and as soon as it's done, cool, we can put it to sleep and let it go back into the pool.
And it does so with remarkable speed, remarkable scale, and I remember the first time I saw a scheduler like that I thought, this is the future of computing. And it was one of those aha moments, like the first time I used Chef, or the first time I cobbled together a bunch of tools to make a continuous delivery pipeline and saw it go, and it's like, that is the future.
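To make the idea of workload placement concrete, here is a toy sketch of what a scheduler decides. It is not how Mesos works internally; the `Machine` and `Task` types and the first-fit rule are assumptions chosen only to illustrate allocating resources to a task and returning them to the pool when it finishes.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A node in the pool with its currently free resources (illustrative type)."""
    name: str
    cpus: float
    mem_mb: int
    tasks: list = field(default_factory=list)


@dataclass
class Task:
    """A unit of work with the resources it asks for (illustrative type)."""
    name: str
    cpus: float
    mem_mb: int


def place(task, machines):
    """First-fit placement: pick the first machine with enough free CPU and memory."""
    for machine in machines:
        if machine.cpus >= task.cpus and machine.mem_mb >= task.mem_mb:
            machine.cpus -= task.cpus
            machine.mem_mb -= task.mem_mb
            machine.tasks.append(task.name)
            return machine
    return None  # nothing in the pool can hold this task right now


def release(task, machine):
    """Return a finished task's resources to the pool."""
    machine.cpus += task.cpus
    machine.mem_mb += task.mem_mb
    machine.tasks.remove(task.name)


pool = [Machine("node-a", cpus=1.0, mem_mb=512), Machine("node-b", cpus=4.0, mem_mb=4096)]
web = Task("web", cpus=2.0, mem_mb=1024)
chosen = place(web, pool)
print(chosen.name if chosen else "no capacity")
```

Real schedulers weigh far more than this (constraints, priorities, failures), but the allocate-then-release cycle is the part the talk is describing.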
And I'm telling you, if you haven't played in the container ecosystem to this degree, you will hopefully have the same experience. But you have these schedulers that move your workloads around as needed, on demand, and you get enormous benefits in terms of optimizing for hardware,
you get massive compute density and it's really rather impressive. But when you do that, there's a lot of complexity that comes with that. So if I'm going to move a container across a grid of machines, there are certain challenges that you have with that, like data persistency, for example. What do containers do for data persistency?
They write to a local file system. But if I'm gonna move that across machines, oh crap, that local file system needs to be shared across the network somehow. How do I do that securely? Is my networking even set up to be able to do that for me? And so there's usually some sort of cluster management layer on top of all of that that helps that scheduler actually do its thing
across a wide swath of machines. And so it's that cluster management system that really starts kind of tying those bits into that larger biological organism. So how does Habitat plug in? Joshua Timberman talked a little bit about post-process packaging. Post-process packaging is a fancy way of saying
that Habitat lets you take the two things that you care about, the supervisor (hab-sup) and that .hart artifact that we created, and package those into an infrastructure-appropriate image format. At launch, that means that we have support for Docker, ACI, and Mesos format containers
as the images that we have, right? But happy one month birthday, Habitat. Still very young. It's conceivable to think that we will have other runtime image formats like an AMI, right? And just like send this to EC2 or like whatever VMware's format is. And you'll probably see things like that. However, at launch, we launch with support
specifically for containers because the larger container ecosystem I think is very complementary to the types of problems Habitat is trying to solve. So we've worked with a number of partners at launch. I'm gonna start with Mesos. So again, right, Mesos, a distributed systems kernel,
does the type of scheduling that we were just talking about. Mesosphere has DC/OS, their Datacenter Operating System, which is that cluster management platform that gives you a number of tools to manage that type of grid computational model. And within that, Marathon is the container management platform.
So the way it works with Habitat, or I guess rather the way that Marathon works, is that Marathon supports two different types of container formats, both Mesos native format and Docker format containers. And in order to make this thing run, you need to create a little bit of JSON that says I need this much compute, I need this much memory,
here's where I'm gonna live, and basically describes the resources that are needed for this application. And then you take that definition along with the container image, you feed them to Marathon, and then Marathon goes and does its thing, and it schedules it and runs it, and it's cool. So when you use Habitat to create a Mesos image
or a Mesos format container with something like hab pkg export mesos, this is what you get. You see on standard out right now a big blob of JSON that describes, hey, this is what you would need to give to Marathon along with this container image I just created for you, and you're good.
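As a rough illustration of the kind of JSON being described, here is a sketch of a Marathon application definition built in Python. It is not the actual output of the Habitat exporter; the app id, command, and tarball URI are hypothetical placeholders. The field names (`id`, `cmd`, `cpus`, `mem`, `instances`, `uris`) are ones Marathon's app definition has accepted, though newer Marathon versions prefer `fetch` over `uris`.

```python
import json


def marathon_app(app_id, cmd, cpus, mem_mb, image_uri, instances=1):
    """Build a minimal Marathon app definition as a plain dict."""
    return {
        "id": app_id,            # where the app lives in Marathon's namespace
        "cmd": cmd,              # what to run once the artifact is fetched
        "cpus": cpus,            # how much compute the app needs
        "mem": mem_mb,           # how much memory the app needs, in MB
        "instances": instances,  # how many copies to schedule
        "uris": [image_uri],     # where Mesos picks up the exported image
    }


# All values below are hypothetical and only show the shape of the document.
app = marathon_app(
    app_id="/redis",
    cmd="./run-redis.sh",
    cpus=0.5,
    mem_mb=256,
    image_uri="https://storage.example.com/images/redis.tgz",
)
print(json.dumps(app, indent=2))
```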
So a couple of things to note here. The URI portion is actually very important for Mesos. Like Mesos needs to know where do I go pick up this image that I just created for you. And so in this example, it could be storage like on the cloud, in Google, it could be S3, but it could be any URI. It could even be local file system. So that's kind of what we have today,
and that's what we have at launch. We know that it works, and it's a little bit clunky because you have to take that, and you have to manually feed it to Mesosphere right now, but we know that it's there. If you look at the project page for Habitat, and a lot of the Mesos work, you'll see that there's kind of a wish list of next steps and places that we could go.
So if you're going down this route, here are the things you need to know. The Habitat supervisor has specific UDP ports that it needs for that gossip protocol. Those have to be opened manually today by hand when you give things to Marathon. Marathon though has an artifact store,
which if they have an artifact store, we should be able to take this tarball and give it to the artifact store, and since we're already creating that JSON for you anyway, if we have those two components, well then we should just be automatically able to load that to Marathon through the API, and just smooth that whole transition.
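The "load that to Marathon through the API" step could look roughly like the sketch below. It builds, but does not send, a POST to Marathon's `/v2/apps` endpoint using only the standard library. The Marathon host name and the app definition are hypothetical, and this is one possible way to automate the handoff, not something Habitat does today according to the talk.

```python
import json
import urllib.request


def build_marathon_request(marathon_url, app_definition):
    """Prepare a POST to Marathon's /v2/apps endpoint with the app JSON as the body."""
    body = json.dumps(app_definition).encode("utf-8")
    return urllib.request.Request(
        url=marathon_url.rstrip("/") + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical Marathon address and app definition, for illustration only.
req = build_marathon_request(
    "http://marathon.example.com:8080/",
    {"id": "/redis", "cpus": 0.5, "mem": 256, "instances": 1},
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would actually submit it; omitted so the sketch runs offline.
```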
So right now it's like a manual kind of clunky process, but that seems like an easy contribution to make and take that forward. Marathon has a few different package formats that it understands, but conceivably we could do a little bit of work there so that Marathon understands what a .hart file is, and then you don't have to kind of jump
through all of those hoops. You just give it the .hart, and Marathon knows what to do from there, but there are a few other things going on there, and you can see those in the Habitat docs. Kubernetes, we did a lot of work with Kubernetes. Again, Kubernetes is Google's container management platform.
Kubernetes is at the heart of Google Container Engine. It's at the heart of things like Tectonic, which is made by CoreOS. And Kubernetes supports two different types of container formats. It's either ACI or Docker, both of which we have support for as well, along with Mesos, and it's hab pkg export, the format, the thing you want to export,
and like bam, it just goes. So the pattern here is that Habitat makes creating those image formats a lot easier and more manageable, right? So Habitat worries about creating those containers, and then Kubernetes worries about managing the cluster and running those containers across a distributed grid of machines.
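On the Kubernetes side, the description that accompanies an exported image could be sketched as a single-container pod manifest like the one below (Kubernetes accepts JSON as well as YAML). The pod name and image are hypothetical. The port numbers are assumptions too: they stand in for the supervisor's gossip and HTTP ports, which have differed between Habitat releases, so check the docs for the version you run.

```python
import json


def single_container_pod(name, image, gossip_port=9638, http_port=9631):
    """Build a minimal single-container Pod manifest as a plain dict.

    gossip_port and http_port are assumed defaults, not authoritative values.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [
                        # Supervisor gossip traffic (UDP, per the talk)
                        {"containerPort": gossip_port, "protocol": "UDP"},
                        # Supervisor HTTP endpoint
                        {"containerPort": http_port, "protocol": "TCP"},
                    ],
                }
            ]
        },
    }


pod = single_container_pod("redis", "example/redis:latest")
print(json.dumps(pod, indent=2))
```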
If you look at the future state work for Kubernetes, Kubernetes runs what it calls pods. Pods are interconnected groups of containers that all need to work together in some way, shape, or form. At launch, we have single container pod support
for Kubernetes, and we show you how to do that in the Habitat docs. But Kubernetes is very similar, right? It has its own language for describing, well, what does this container need in order to run? Tell me some things about it, and give me the image format, and if you give me those two things, great. I can run with it. And so it's very similar to the work
with DC/OS in that regard, right? Future state is also, I think, very similar. Well, one, we wanna get multi-container pod support. Still working on that. That is very actively being developed. There's a lot of very active development work going on with Kubernetes right now. And when we export your container
into either Docker or ACI formats, well, why couldn't we just generate that YAML that you need for the application the way that we do for Mesos, right? Should be able to do that. That seems like a fairly easy contribution to make. And then the same thing, right? You need to make sure your gossip ports are open because they aren't by default inside of that pod. So there are a couple of to-dos there,
but I think we see the same sort of pattern emerging. And the same sort of integration pattern would probably be true with Docker. So Docker has Swarm, right? And with Swarm, you have DABs, distributed application bundles, and it would be the same thing there, right? And so the pattern is,
Habitat simplifies the management of creating containers and running them, and your vendor of choice then does the scheduling and placement work. We launched with support for Rancher. Rancher is a container management and catalog system.
So it's actually really slick if you look at it. It shows you collections of containers and lets you present different application bundles to external consumers. It's a very nice portal interface for being able to do that. And Rancher works with Mesos and it works with Kubernetes, both of which we support. Also works with Swarm,
which conceivably follows the same patterns. And it also manages Habitat catalog templates natively. The docs on this are really sparse. And so if you follow that link, which will be shared in the slides, you can see what all the steps are you need in order to manage Habitat with Rancher.
Before we dive into a general Q&A, I'm gonna talk about some general questions that I seem to get when we talk about Habitat and the ecosystem. So there are a couple of common questions that we hear when we look at the larger ecosystem.
One, does Habitat replace containers? And hopefully from the talk it's been made clear that the answer to that is no, right? Habitat is a different way to think about what your applications need in order to run and package those up and make it easy to run. In the container ecosystem, what that means is we make it easier to generate those containers
and put in all the things that you need because it's just another Habitat endpoint. So the answer to that is it's not Habitat or containers, it's Habitat and containers. An extension of that, and we just covered this, does Habitat negate the need
for a container management system? No, right? Habitat, again, manages creating those containers but you still, in order to get the benefits of what the container ecosystem is trying to achieve overall, you still need that scheduling capability, you still need the cluster management in order to be able to pull that off. So it's not Habitat or a container management platform,
it is Habitat and a container management platform. Josh covered this one but I'm gonna cover it again because we get this question all the time, does Habitat replace Chef? And I want it to be a very clear, unequivocal, no, it does not replace Chef.
So Chef is amazing for managing your infrastructure. Chef models a ton of infrastructure complexity and I think it does a really good job at that and in fact, most of us have probably hacked Chef enough to work our way up through that layer cake of dependencies and we can do it and it's cool. Habitat doesn't care about your infrastructure. Habitat can't manage your infrastructure.
You still need to manage your infrastructure, you still need to provision servers, you still have networking and storage provisioning needs and all of that still needs to happen. You still need to do that methodically, you still need to do it with compliance. Chef is a great tool for being able to do that. Habitat starts with the needs of your application and it works its way down to the infrastructure
as opposed to Chef's infrastructure first approach. So the two probably meet in the middle somewhere. Where is that line exactly? I don't know, let's talk again in a few months after people are using it very heavily. But it's not Habitat or Chef, it's Habitat and Chef. I've gotten this question a couple of times.
Does Habitat replace Jenkins? And when I got that question, my first reaction was like, what? What is even happening there? Why is that a question? But as I thought about it, I think it's because we use language that says things like Habitat helps you build application artifacts
and Jenkins helps you build application artifacts. So on the surface, I think what the question is really asking is, not just that actual build system, but do I actually really need a CI system? And so Habitat doesn't replace CI systems. It's a very simple packaging technology
and you'll even see one of the things you can do is you can run some tests as part of that process. However, it's not a refined system for testing all of your upstream dependencies or having a series of phases and gates that you would get with any good workflow solution.
And if you use Chef Automate and the workflow feature, Chef Automate brings with it a build cookbook that actually helps you do all of that and go through the standard build phases that you would with the proper governance models and tracking of everything that's happening. And that build cookbook actually has a really neat effect in that we go past continuous integration
and we even go past continuous delivery, which is the focus of Automate and the workflow feature, but we go all the way to continuous deployment because Habitat knows how to look in its depot for updated artifacts and then just go ahead and deploy those. So when you use them all in tandem, you actually can get to a continuous deployment scheme very, very quickly.
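The "look in the depot for updated artifacts" idea can be sketched as a small comparison over package identifiers. This is not the supervisor's actual update logic. It assumes identifiers of the form origin/name/version/release with a timestamp-style release, assumes purely numeric dotted versions, and takes a hypothetical list standing in for whatever the depot reports.

```python
def parse_ident(ident):
    """Split an identifier like 'core/redis/3.2.1/20160701120000' into its four parts."""
    origin, name, version, release = ident.split("/")
    return origin, name, version, release


def sort_key(ident):
    """Order by numeric version first, then by release timestamp string."""
    _, _, version, release = parse_ident(ident)
    return tuple(int(part) for part in version.split(".")), release


def newer_release(current, available):
    """Return the newest identifier for the same package if it beats `current`, else None."""
    package = parse_ident(current)[:2]
    candidates = [i for i in available if parse_ident(i)[:2] == package]
    if not candidates:
        return None
    best = max(candidates, key=sort_key)
    return best if sort_key(best) > sort_key(current) else None


# Hypothetical depot listing, for illustration only.
depot = [
    "core/redis/3.2.0/20160801000000",
    "core/redis/3.2.1/20160712094500",
    "core/nginx/1.10.1/20160720000000",
]
running = "core/redis/3.2.1/20160701120000"
print(newer_release(running, depot))
```

A deploy step would then act only when the function returns an identifier, which is the hands-off behavior the talk calls continuous deployment.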
So it's not Habitat or a CI system, it's Habitat and Chef Automate or whatever CI system you want to use. But that's it. I think that is my tour of Habitat and where it plugs into the ecosystem and I think we still have a little bit of time for Q and A.