Habitat 301: Building Habitats
Formal Metadata
Number of parts | 50
License | CC Attribution - Share Alike 3.0 Unported: You may use, adapt, and copy, distribute and make the work or its content publicly accessible, in unchanged or adapted form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in adapted form, only under the terms of this license.
Identifier | 10.5446/34633 (DOI)
Series | ChefConf 2016 (22 / 50)
Transcript: English (automatically generated)
00:05
Cool, so thank you everyone for coming. My name is Jamie Winsor, and I'm going to give you a talk today about Habitat: specifically, building Habitat and building Builder with Habitat.
00:21
Okay. So here's some contact information for me. I'm a longtime game developer, and I'm also a longtime Chef user. I wrote Berkshelf and Berkflow and Ridley and a bunch of supporting libraries for Chef. I'm also an evangelizer of cookbook patterns,
00:42
and I worked on League of Legends, Guild Wars 2, Lord of the Rings Online, and most recently a mobile game called Moonrise. But now I work at Chef; I joined last year to work on an R&D project. Like I said, I'm a longtime Chef user.
01:02
I've been using it since '07 or '08, something like that, and I learned a lot about operations and application automation from Chef. It got me super, super far, but it didn't get me 100% of the way there. I think that Chef is great for infrastructure,
01:22
but it's just good for applications. It's almost there, about 95% of the way, but there are little code smells that you'll pick up, and there's a lot of training required to get people going with Chef. I developed Berkshelf and Berkflow to try to help people understand and have a
01:41
simple path to get somewhere, and through developing those things I realized a couple of principles that I was chasing after. Specifically: automation needs to live with your application. You can see that because I promoted that you should have your cookbook live with your software, and it should share the same version number, and then if you change the software you rev the version number of the cookbook. And
02:04
a little bit of a code smell there is: you want to make a change in the cookbook, and now your whole application has to have a version rev, right? So you can feel a little bit of resistance there; it doesn't really want to be used that way. The other thing that I realized is that you should prefer build failures over runtime failures, always.
02:26
This is really simple, but in Chef a convergence can happen where you tested it on your build server, you tested your cookbooks, but for some reason it fails as soon as the cookbook converges in production, and you're not exactly sure why.
02:44
That would be a runtime failure. The build failure would be that your cookbook failed its tests, or your software was unable to compile, or something like that. And then the last one is that you should prefer choreography over orchestration.
03:00
Basically, what that means is that with an orchestrator you tell a fleet to do something. You instruct it, and say there's a thousand machines across two different data centers, and half of them fail, or a quarter of them fail: you need to handle those cases. Orchestration is basically you trying to wrap another problem and
03:24
solve it with a giant hammer, whereas choreography is things working together. You put a bunch of musicians on stage, and they know what song they're supposed to play, and they play off of each other. So if the drummer speeds up, which he's probably going to do, everybody else has to speed up, right?
03:43
And you can do that with applications. I actually wrote a failed orchestrator, and I'm glad it failed, because it was the wrong solution. I really don't believe that I could have gotten it right. A lot of people have tried to create orchestrators to orchestrate our infrastructure, and not one of them has really done the job for me, including the one I tried to make myself.
04:07
Some other things that I think are really important to having your automation live with your application are these four other principles, which are: your application or deployment process should be atomic, meaning it's all or nothing.
04:22
It should be isolated, and you can see that with the emergence of Docker. Docker basically isolates your environment; your application lives inside Docker and nothing else, right? And that's because if you go onto some random machine and you install a package, like say you upgrade OpenSSL, but
04:40
multiple applications are running on the machine, you don't know what you did to the other ones, right? That's not an isolated environment; that's a shared environment. Deployments should also be repeatable and auditable. So if I do the deployment one time, it should always work. And one of the things that you'll notice with cookbooks is that a lot of people use the package resource with no version number,
05:02
and you don't know what version of that software is going to get deployed, so technically you do not have a completely repeatable installation there. And then auditable is: what is running on my machines right now? So you might have heard of InSpec earlier today; that test suite can run on your machines to tell you when you have an issue.
05:23
These four things we packaged into the R&D project I was talking about, called Habitat. Habitat moves those runtime failures that I was talking about into the build time. It also embodies choreography instead of orchestration: Habitat has supervisors that run together and
05:41
tell each other about what's running, and the supervisors react to the presence of other supervisors by telling your application about new supervisors and new applications turning on. It also lets you have isolated builds and atomic builds, so any number of versions of glibc can be on the machine, with any number of versions of software consuming them, and the same for OpenSSL and any
06:06
of the other libraries in the Linux toolchain. But I'm not here to talk to you about Habitat; I'm going to assume that you have a basic knowledge of Habitat and you at least saw the 101 or the 201 talk.
06:20
Today I'm going to talk about Builder, which is what my primary goal was after the initial push of the Habitat supervisor happened. Builder is a software-as-a-service platform to build your packages, basically. It's not finished just yet; new features will be rolling out over the next few months, and we don't have an exact time frame yet,
06:43
but it's both a package repository and a build service. It's open source, and you can run it on your own infrastructure; you don't need to use our public one. This is actually live today; you can see it, and it looks like the Supermarket for Habitat. The first address is the web client; the second one is the URL for if you were going to program against our API.
07:05
So I've had a lot of questions about why I am building another build server, right? There's Travis and there's Jenkins. We believe that a purpose-built build service, built into the community site, would allow you to do things like
07:23
provide public hosting of packages (we needed that anyway), public builders for those packages, and a history of where those packages came from. You can look at any time and say, oh, this version of OpenSSL that I use, what was the build output of that? And theoretically, ten years from now, you should be able to go back and look at how that version of OpenSSL was built.
07:44
I don't know why you'd want to, but you could technically do that. Another thing that building our own build server lets us do is automatic rebuilding and republishing of your packages when a dependency of yours changes.
08:05
And for people that aren't familiar, this is a plan file. This is for a component of Builder that I've built, and there are two lines there: one is pkg_deps, and then there's pkg_build_deps. pkg_deps are what we depend on at runtime. So when I say automatic rebuilding of things, I
08:26
mean that when core OpenSSL gets updated, we will automatically rebuild your software to be linked against that new version of OpenSSL, if you want it to; if you don't want it to rebuild, then it won't. But this is important, because
08:43
Habitat assumes the happy path: you don't need to have this on, but it automatically updates your software for you if it's running and connected to the public depot. So if we find an SSL vulnerability (core is our origin) and we publish a new version of OpenSSL out,
09:00
automatically your software will know there was an issue, rebuild against the new OpenSSL, publish to the public depot, and then your software that's running, as long as you have auto-updating on, will pick up those updates, and you won't even be woken up in the middle of the night. Right? Like, if you had InSpec set to wake you up when there was a vulnerability over a certain threshold, you most likely won't even see it.
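For reference, a minimal Habitat plan file of the kind being described might look like the sketch below. All names and versions here are illustrative assumptions, not the actual Builder component's plan; the point is the split between runtime deps (which can trigger automatic rebuilds) and build-time deps.

```shell
# plan.sh -- illustrative sketch of a Habitat plan file
pkg_name=builder-example
pkg_origin=core
pkg_version=0.1.0

# Runtime dependencies: linked against at run time, so an update to
# any of these (e.g. core/openssl) can trigger an automatic rebuild.
pkg_deps=(core/glibc core/openssl)

# Build-time dependencies: only needed while compiling the package.
pkg_build_deps=(core/rust core/gcc)
```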
09:24
We'd catch it and publish this out, or if it did wake you up, it would be too late: five minutes ago your software would already have been rebuilt in order to be running on your stack. So I want to talk a little bit about building Builder, and
09:41
that's basically the premise of this talk: to educate people about Builder, for users and also for people that are potential collaborators, and to show how Habitat helped us build Builder. So a little bit of a warning here: it's going to get technical, and I'm going to take a drink of this coffee to get ready for the technical parts of this talk. Great coffee.
10:12
Okay, if you haven't seen the 101 talk or the 201 talk yet, this might be a little rough. And I'm going to assume some knowledge of Rust, and I know that I asked the room before this if anyone knew Rust, and
10:24
there wasn't a hand. So feel free to ignore the slides that have Rust on them, and I will just tell you what they say. I hope that this talk helps people as they get more familiar with Rust, and like I said, we'll have a hack day, and I'll be present, trying to help people out and get up to speed.
10:42
So the first thing you need to know about Builder is that it is a service-oriented design, and it's really important that we built a scalable system here, because the thing that I just mentioned about the dependencies automatically rebuilding is, like, a
11:01
stampede problem. Everything depends on glibc, and if we change glibc and we update it, every piece of software in that depot that's still active will need to be updated. So there are clever ways that we're solving this, and we're going to go through those. So there are three tiers here. One is gateway nodes:
11:23
their job is to authenticate you and format your requests for the cluster, and then they send those through router nodes, whose job is to send them to the appropriate service. Service nodes are things like the session server,
11:41
the place that stores all your personal information, and the job server. And this is the service layout. Each one of these is written in Rust, except for the databases and maybe the clients (you could write the clients in Rust). At the top we have gray boxes; those are the two clients that connect into the HTTP gateway.
12:01
The black lines represent connection paths, so clients connect into the gateway. The HTTP gateway is a RESTful interface. It's one of the edge nodes that performs authorization and authentication, and it also formulates your request for the inner cluster. It's also in a public subnet, and everything else is on a private subnet,
12:20
or at least, if you deployed it, that's how I would recommend it to be. The gateways connect to the routers, and the routers are fully meshed, so there are two connections between them. Those are the only fully meshed nodes, and that's important because those are the only things that can talk to each other. Below that, there are services that connect into the route servers, and they don't know about each other.
12:41
So a job server can't message a session server directly; it has to message a router and then bounce the message off of it. So, say session wants to talk to job: it sends a message all the way through the router and then to the job server. Then at the bottom we have the databases (the services connect into their databases), and we have a worker pool, which connects into the job server to pull off jobs to do the builds.
13:04
So one of the big questions that we've been getting is: why Rust? Go is a really popular language right now; it's great. I'm a big fan of Erlang and Elixir; I programmed with them for a couple of years and had a great time. Why Rust? I didn't even know it when I first started;
13:20
I just knew it was this fringe language that I should maybe look into at some point. But I'm actually in love with this language. It's a systems programming language that is blazingly fast, and it also offers a thing that's unique among systems programming languages, which is memory safety without a virtual machine. What that means is you cannot have double frees, use-after-frees, or
13:43
dangling pointers; there is no such thing as a memory leak unless you are using unsafe Rust, and we won't talk about unsafe Rust right now. It also has thread safety guarantees and great tooling, which makes it basically a modern C.
14:00
Completely awesome to work with. So this is the really technical part. I want to educate people in the room about how Builder works and how we process your job, because I want developers to help us build this. So this next bit here is going to be us starting a new build and how a worker picks it up,
14:22
does the work, and then publishes it to the depot. And there are ten hops. Each one of these hops represents a place that a message gets handled, then transformed, and then sent on to the next node. I'm going to go through how you authenticate, then how you create a new job from a project, and then how the worker comes in and picks it off.
14:44
So the first hop is the client. The client is going to request that a job is done, and they're going to hit the REST gateway. Or you're going to use the web client, and it's going to hit the REST gateway as well. The web client is written in Angular 2, and it's just a client-side MVC JavaScript app that one of my teammates, Nathan Smith, made, and it is really good.
15:05
I was never a fan of JavaScript, and that guy showed me some really crazy, awesome things that you can do with it, and it's probably one of the best front ends that I've had the pleasure of touching.
15:20
So the HTTP client is going to send a message to the HTTP gateway. This is the second hop. Like I mentioned, the REST gateway, or HTTP gateway, is the public-facing edge node. It performs external authentication to make sure you're okay to do something, or to identify you. We do that with OAuth, and right now we do it through GitHub.
15:41
So basically, you take your personal access token that you generate from GitHub, and you put it into an authorization header to say, this is who I am. And when it comes through, we transform your request into a net request, and then we forward that along to the router. After we forward it along, we sit there and we wait for a response. We call these network transactions.
16:01
So the authorization bit is specified in the Authorization header. We rip out the value of that, we put it into something called a protocol message, which is a binary message, and then we forward that along to any route server available. And then the route server is going to forward that request to the appropriate session server, because the session server handles authentication for you. So you can see the path of the request, and at the bottom
16:25
we're now on the session server. So, I talked about the binary protocol that we're speaking, or the network protocol. We call that builder net, or the builder net protocol, and it's comprised of two components: one is ZeroMQ sockets, and the other one is protobuf.
16:45
Protobuf is a language-agnostic DSL, basically. That bit at the bottom there represents a struct, and that could be compiled into any language. So when I go to create the protocol messages for Rust, it comes and reads these proto files and changes that
17:05
representation into Rust code. What's neat about this is that any language can talk to any one of the servers in the Builder cluster. They're all written in Rust right now, but they don't need to be written in Rust. As long as you can talk protobuf, which almost every language has an implementation for (not every one, but a lot of languages have an implementation),
17:26
you can talk to our services. Something that is really powerful about protobuf is that it also provides backwards compatibility between service versions. So if we update a service and it receives a message from an out-of-date server,
17:43
it will still be able to read and decode that message. And that's really important for the way that we want to operate: we want our services to be choreographed together. I don't want to take an outage because I need to update; I want to do a rolling restart, and I want Habitat to coordinate that for me. I also want to store these in a database at some point,
18:03
so it's really nice to be able to have that backwards compatibility. The other bit is ZeroMQ. These are just sockets on steroids; it's not a message bus, it's not ActiveMQ or RabbitMQ. They are literally sockets that have queuing in them, and they're very customizable.
18:23
Super easy to work with. It's also not specific to Rust; implementations of this exist for many languages, and there's an excellent C library, so if your language doesn't support ZeroMQ, you could tie into that. But most major languages have it. So these sockets speak protobuf.
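As an illustration of the protobuf DSL being described, a message definition might look roughly like this. The message and field names are assumptions for illustration, not copied from Builder's actual .proto files:

```protobuf
// Hypothetical sketch of a session lookup and its reply.
// Real names and field ids in builder-protocol may differ.
message SessionGet {
  optional string token = 1;  // OAuth token pulled from the Authorization header
}

message Session {
  optional uint64 id = 1;
  optional string email = 2;
  optional string token = 3;
}
```

A definition like this can be compiled into any language with a protobuf implementation, which is what lets non-Rust clients talk to the Rust services.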
18:41
They turn protobuf messages into a binary protocol, and then they get sent through to each and every service. This is the basic envelope for all of our messages. When we talk about network services talking to each other, they always have an envelope wrapping the content, and this is what that is. There are two bits here, and then the content of the message.
19:02
So if I were to have an envelope here on the stage, I would have the destination, which is the route information (that's what the post office would use to know where to send this thing), and then I'd have the message ID, which is like: hey, what kind of thing is this message? That's a hint for decoders and for dispatchers to know how to handle the message. And then there's a serialized message body.
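The envelope being described (routing info, message ID, serialized body) can be sketched in Rust roughly as follows. Every type and field name here is an assumption for illustration, not Builder's actual code:

```rust
// Sketch of a builder-net style envelope: where the message goes,
// what kind of message it is, and the serialized protobuf payload.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Protocol {
    SessionSrv, // session server family
    JobSrv,     // job server family
    VaultSrv,   // vault (project) server family
}

#[derive(Debug)]
struct RouteInfo {
    protocol: Protocol, // which service the router should deliver to
    hash: Option<u64>,  // routing hash, present when the data is sharded
}

#[derive(Debug)]
struct Envelope {
    route_info: RouteInfo,
    message_id: String, // hint for the dispatcher, e.g. "SessionGet"
    body: Vec<u8>,      // encoded protobuf message
}

// Wrap an already-encoded protobuf body for routing.
fn wrap(protocol: Protocol, message_id: &str, body: Vec<u8>) -> Envelope {
    Envelope {
        route_info: RouteInfo { protocol, hash: None },
        message_id: message_id.to_string(),
        body,
    }
}

fn main() {
    let env = wrap(Protocol::SessionSrv, "SessionGet", vec![0x0a, 0x03]);
    println!("{} -> {:?}", env.message_id, env.route_info.protocol);
}
```

The router only ever needs to read the route info and message ID; the body stays opaque until it reaches the service that knows how to decode it.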
19:25
So anytime you see another protobuf message on the screen here, the body is an encoded version of one of those, and this is just the envelope that wraps it. I've also included a path, so if you clone our source tree, you can find the builder-protocol component, and inside there
19:40
there's this protocols directory that contains every protocol. So with the authorization, we basically take that header out, we create a brand-new SessionGet, and we jam the token into the token field. The third hop is the route server. I mentioned that the HTTP client talks to the gateway; the gateway says, I need to do some authorization and
20:04
authentication, and then sends it through the route server to the session server. Well, how does the route server know how to get it to the session server? What it does is read that envelope information, like the post office reads the routing info: where's the destination, right? So the route info is composed of two parts. One is the protocol,
20:21
which is what service you're trying to send this message to, and the second one is the routing hash; we'll talk about that in a little bit, but that's for sharding. So the route server reads the routing information and forwards it across to the appropriate service, which brings us to the fourth hop, which is the session server.
20:41
So the session server is going to receive that message from the route server, and it's going to dispatch that SessionGet message to a handler. What's important here is that developers coming in don't need to worry about this very complex, multithreaded environment. It feels like you're working in something I like to call a single-threaded server, or an STS server.
21:04
When you get a message into a handler, you can do anything you like with it; it's basically your space. You don't need to interact with any other threads, so it's very easy to work in. So the handler reads the token and looks in the data store for the session. If you have one, it's going to say, here's your session,
21:20
and if you don't have one, it's going to give you back a net error, and we'll talk about those in a second. That's basically what a session message looks like, and that would be in the session server's .proto file. So when we reply, if it's an error, we return this thing called a net error. Net errors are the response to a transaction in a failure case, and they have two parts.
21:41
One is the error code, and the other is the error message. The error code is for the user; it's so that we can do localization. Basically, error code 8 means session timeout, error code 1 means generic timeout. I just happen to know what a bunch of them are, but
22:00
any client could then represent it in any language. The next bit is the message. It's confusing because it's called message, but it's not for you; it's actually for the developer. It's composed of three parts: one is a two-character code for the service, two is what operation or function was happening at the time of the failure, and three is the case. So, for instance, I know that RG:auth:1 means that we were unable to go talk to the external service to verify that your session was valid.
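The two-part net error being described could be modeled along these lines; the codes, names, and "service:operation:case" formatting here are illustrative assumptions, not Builder's exact scheme:

```rust
// Sketch of a net error: a numeric code for clients (localizable)
// plus a developer-facing "service:operation:case" message that is
// greppable in the source tree.

#[derive(Debug, PartialEq)]
struct NetError {
    code: u16,       // e.g. 8 = session timeout, 1 = generic timeout
    message: String, // e.g. "rg:auth:1"
}

impl NetError {
    fn new(code: u16, service: &str, operation: &str, case: u32) -> NetError {
        NetError {
            code,
            message: format!("{}:{}:{}", service, operation, case),
        }
    }
}

fn main() {
    // Hypothetical failure: the gateway could not reach the external
    // OAuth provider while validating a session.
    let err = NetError::new(8, "rg", "auth", 1);
    println!("net error {} [{}]", err.code, err.message);
}
```

The split matters: the code is stable API for clients to localize and display, while the message is a breadcrumb for developers debugging the distributed system.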
22:27
You could also find that out by just grepping the source code, which makes it really easy to debug what happened in this large distributed system. Normally people wouldn't sit up here and tell you about errors in their system and how to handle them,
22:40
but I think it's really important, because the code base is large and it's service-oriented, so there are a lot of moving pieces, and knowing how to debug problems is really helpful. But we're going to assume that this was an okay case, so we're sending back a session. The fifth hop is that the route server receives the transaction reply from the session server and sends a response to the appropriate
23:01
HTTP gateway. This is important because there could be multiple HTTP gateways that the route servers know about, right? And how does it know? How does the message know how to get back to the appropriate gateway? We'll get to that in a second. So the gateway receives the reply, and then it knows which client thread to send it to again.
23:22
No one in this room needs to know how to do that; it automatically does it. The client thread then creates a new ProjectGet message from the HTTP parameters that you originally sent when you said, "I need a new job." Basically, a project is associated with every package that you have. So a package could be core/nginx, right? In the build service you'd have
23:45
build output, or the ability to say, "I want to build core/nginx." So that's the project, and the project is associated with, like, a GitHub repository, the path to the plan file it's going to build, and some additional information there.
24:01
Unfortunately, the UI is not ready yet, so I can't show it to you, and it might make more sense later when I can show something. But the ProjectGet stuff is specified inside of the vault protocol file. So we say what project we want, and what we're going to do is route that message all the way to the vault server.
24:25
We'll talk about the vault server in a second, but this is a good time to stop and explain how the messages are actually getting to the right place, because I mentioned that we sharded our data. For anyone in the room that's not familiar with sharding: basically, it means that there are X amount of buckets that are identical, but they contain different information in them.
24:42
So in this room, if there were a hundred people here and we had a hundred shards, theoretically one of us would be in every single bucket, if we were able to distribute evenly across the shards. What it also does for us is that if a shard goes down, we don't have a full outage, and this also allows us to scale out easily by moving shards to additional machines.
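As a toy sketch of that bucketing (the function names and sequential IDs are just for illustration), even distribution across identical buckets can be as simple as a modulus:

```rust
// Toy illustration: deterministically map an ID to one of N identical
// buckets. With 100 sequential IDs and 100 shards, each bucket gets
// exactly one occupant, matching the room analogy above.
pub fn shard_for(id: u64, shard_count: u64) -> u64 {
    id % shard_count
}

// Count how many IDs land in each bucket.
pub fn distribute(ids: &[u64], shard_count: u64) -> Vec<u32> {
    let mut buckets = vec![0u32; shard_count as usize];
    for &id in ids {
        buckets[shard_for(id, shard_count) as usize] += 1;
    }
    buckets
}
```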
25:02
So here what we have is two routing servers, and they're connected to four vault servers. And those four vault servers have these numbers at the bottom; they represent shards. So, shards zero through 127: we have 128 shards, because we pre-shard our data 128 times. So vault server 4 over here, if that goes down, only 32 shards go down,
25:24
which means that only a percentage of the room would have an issue, and they would only have an issue if they were working with that vault server. The vault server is then connected to the databases, which have those shards as well; so the databases have their data sharded too. So Builder is service-oriented and sharded.
25:41
Like I mentioned about outages, one of the reasons that we did this was to prevent full outages of our system. We also did it to distribute load, but if a service goes down, we only have a partial outage for that service. So if the job server goes down, it's okay; you can still log in and download packages and everything's fine, but you can't make new jobs. If a shard goes down, you only get a partial outage of that service for a subset of users.
26:05
So the more machines we have running our shards (up to, we'll say, 128, because I pre-shard 128 times), if one node goes down, only 1/128th of the population experiences an issue with that service. This also allows us to add new machines by rebalancing those shards.
26:20
So I add a new machine, and then the shard migrates on over to it. So now that we know what sharding is, I can kind of explain what the hashes are for: the hash determines the destination shard. At this point we have a message, those protocol messages, and there's one field on every message that's important.
26:41
It's called the route key, and the route key is able to be defined in Rust by specifying a function that returns a value. I know that not everyone in the room knows Rust, but basically what this says is that a message, a protocol message, has a function that then returns something or nothing, and if it returns something,
27:02
it's a value to then hash and put into the hash value of the envelope. What this allows us to do, then, is mod that hash against the amount of shards that we have, and then deterministically route your message to the appropriate place. This is the implementation of the trait that I just showed.
27:22
The top is the ProjectGet message, which would get compiled in the Rust code, and then this is me implementing the Routable trait for ProjectGet. And you'll see that what I'm returning in this function that I've defined is an InstaId that I'm creating from the ID of the message. Which brings us to my next important bit, which is how all of our entities are catalogued.
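A rough sketch of the routing idea just described; the trait, method, and message names here are modeled on the talk but are assumptions, not Habitat's exact API:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Assumed trait: a message either yields a route key or it doesn't.
pub trait Routable {
    fn route_key(&self) -> Option<String>;
}

// Hypothetical stand-in for the compiled ProjectGet protobuf message.
pub struct ProjectGet {
    pub name: String, // e.g. "core/nginx"
}

impl Routable for ProjectGet {
    fn route_key(&self) -> Option<String> {
        Some(self.name.clone())
    }
}

// Hash the route key and mod it by the shard count, deterministically
// picking the destination shard for the envelope.
pub fn shard_for_message<T: Routable>(msg: &T, shard_count: u64) -> Option<u64> {
    msg.route_key().map(|key| {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish() % shard_count
    })
}
```

The same key always hashes to the same shard, which is what lets any router send the message to the right place without coordination.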
27:44
They all use something called an InstaId. An InstaId is a 64-bit integer, but it's not opaque; it has three parts in there that you can pull out. One of them is the create time, which is exactly what time it was created since the epoch, which allows us to create IDs for entities for 75 years or so before we run out of IDs that we can
28:05
give out. I'll be dead by then, so it's okay; you guys, that's your problem in 75 years. The sequence ID is automatically generated by the database, which allows us to create 1,024 messages every second or so. And then the next bit is the shard. This was inspired by Instagram engineering;
28:25
it's slightly modified from theirs. So what this allows us to do is, every time we create a new thing on a database, we know exactly what shard it came from, because the data of where it came from is hidden inside that identifier. And the identifiers are guaranteed to be unique, because they're sequentially increasing over time.
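A minimal sketch of packing and unpacking such an identifier. The bit widths here (32 for the timestamp, 13 for the sequence, 10 for the shard) follow the talk's description, and the function names are assumptions; the real implementation may differ:

```rust
const SEQ_BITS: u32 = 13;
const SHARD_BITS: u32 = 10;

// Pack timestamp, sequence, and shard into one 64-bit identifier:
// the high bits hold the time, then the sequence, then the shard.
pub fn make_insta_id(secs_since_epoch: u64, seq: u64, shard: u64) -> u64 {
    (secs_since_epoch << (SEQ_BITS + SHARD_BITS)) | (seq << SHARD_BITS) | shard
}

// Each component can be recovered later just by masking and shifting.
pub fn time_of(id: u64) -> u64 {
    id >> (SEQ_BITS + SHARD_BITS)
}
pub fn seq_of(id: u64) -> u64 {
    (id >> SHARD_BITS) & ((1u64 << SEQ_BITS) - 1)
}
pub fn shard_of(id: u64) -> u64 {
    id & ((1u64 << SHARD_BITS) - 1)
}
```

So the shard an entity lives on, and when it was created, are both recoverable from the ID itself, as described above.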
28:45
I can also take every job that's ever been created, or every entity in the system, and tell you what time it was done, just by looking at the ID. So there's a hell of a lot of data here in just 64 bits. This is the implementation in Rust. I'm not going to leave this up there very long,
29:00
but basically what it shows you is that the first 32 bits is the time, the next 13 bits is the incrementing ID, and the last 10 bits is the shard ID, and that's how we generate one. So let's get back to things. What we were just talking about was ProjectGet: we were getting a project on the HTTP gateway, through the vault server, to then be able to ask the job server,
29:22
"Hey, I want to build a job based on this project, and that project is core/nginx," let's say. So the vault server receives the message from the router; it then dispatches it to a ProjectGet handler, just like we discussed before with the session server. It reads the project from the database, and it replies to an HTTP gateway through the route server.
29:42
So the message comes back to the HTTP gateway. Important here is: what is that vault server? Why was the project there? Basically, the vault server is persistent storage for origins, projects, any shared entities, and anything not specific to your exact account; your account stuff lives on session.
30:00
And then every other thing that exists in our system lives in a distributed entity store called the vault server, which is backed by Redis. Which brings us to how we persist protobuf messages into the database. What's really neat is we have that backwards compatibility, and Redis is a key-value store, right? So if a new version of the software comes out and it pulls an entity out of that database,
30:22
because we stored it as a protobuf message, we are able to translate it into the newest version and then restore it into the database, and things should be just fine. We can persist the protobuf messages directly into the database with a trait that's just like that Routable trait that I showed, which basically just needs you to define two functions: a primary key, and how to set that primary key.
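A sketch of what such a persistence trait might look like; the trait and type names here are assumptions for illustration, not the project's actual code:

```rust
// Assumed trait: all a message needs in order to be stored is a way
// to get and set its primary key.
pub trait Persistable {
    fn primary_key(&self) -> u64;
    fn set_primary_key(&mut self, value: u64);
}

// Hypothetical stand-in for a compiled Project protobuf message.
pub struct Project {
    pub id: u64,
    pub plan_path: String,
}

impl Persistable for Project {
    fn primary_key(&self) -> u64 {
        self.id
    }
    fn set_primary_key(&mut self, value: u64) {
        self.id = value;
    }
}
```

With a blanket implementation over the trait, the same storage code can then persist any message type into the key-value store.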
30:44
And this is the implementation of it, just as simple as the last one. The top is the project that you'd have, and then the bottom is saying that the ID is the primary key. That's all that says. So this is hop 8 of 10. We're almost there. Is everyone still with me here?
31:02
No one's napping yet? Okay. So we received a reply from the route server, and we create a new job spec from the session and the project. A job spec is saying, like, "I want a new job, and it's from this project." And then we're going to message that through the route server to the job server, just like every other service. And that's what a job spec looks like.
31:20
It has an owner ID, who started the job, and the second bit is the project that the job originated from. The job server receives that message; it dispatches the job spec to a job-create handler, creates a job, and persists that to the database. And the database happens to also be a queue for us, which is OS-specific, so there's a Linux queue and a Windows queue, right? And the job server stores the whole job history that you have, the state of the job,
31:45
and active jobs, like I said, are placed in an OS-specific queue. And then job-create success or failure eventually gets sent back to the client, through the route server, through to the HTTP gateway. So
32:00
now the HTTP gateway says, "Okay, I got my reply. It was successful; we created your job. Now we're going to serialize your job into JSON, and we're going to send you a 200 OK with the JSON representation of the job." And again, we use a trait to say this protocol message looks like this in JSON. So every time you've seen an implementation of a trait, it's saying: this is how we route it,
32:24
This is how it persists to the database. And now this is how we represent it in JSON. So the JSON response is simple It's just the ID in the state So the jobs are there your jobs done. You were told the job is created, but it needs to get picked up by a worker So there's a pool of workers that are running connected into the job servers and they're OS specific
32:44
So there are four workers here: two are Linux, two are Windows. And there can be any number of workers connected to any number of job servers at a time; they're always OS-specific, like I said. Builds are done within a studio on the worker. This is huge. Fletcher Nichols is on my team;
33:00
he spent a lot of work on this thing called the studio. I'm not sure if anyone here has sat through the talks yet, but I'm getting some nods. It is amazing. When he first made this thing, I didn't know what it was for. I know that he's a really smart guy, and I just trusted him. I was like, "Fletcher, take the wheel on this one." And eventually, when I got to the part where the worker needed to go build the software,
33:21
my mind was blown. I had to do nothing, simply nothing. It shells out to the habitat binary we already have and says, "hab studio build this," and then we have a container that runs the whole job that you asked for, in an environment isolated from everything else on the system, which allows us to safely run jobs.
33:41
So, like, you can't give me something bad that will destroy my worker node. Nothing will be left over between jobs: if one member of the audience runs a job, and then another one runs it, and then I run something, I don't have any of your artifacts or anything like that, or any of your secrets. It's a fully contained environment that gets spawned every single time a worker picks up a new job.
34:00
So the next bit is that Habitat Builder is actually self-hosted. So when I said that I used Habitat to build Builder, this is what I mean. Habitat supervises all the Builder processes, and the supervisor watches the public depot, which is itself, to wait for package updates of itself.
34:21
Builder builds itself and sends its package to itself, and then it automatically updates when a build is published to itself, because it was watching itself. And this totally works, and I can't believe it. This is pretty much the part where it all came together,
34:43
and I just did not expect it to happen; I had to go for a walk right after. My job is done here. So, this is what the source tree looks like for one of the Habitat components. We have the plan.sh file, which is the entry point. We have a config directory there,
35:02
and you'll notice inside the config directory that there's no default.toml. The default.toml is the thing that is the translation layer; it's like a template in Chef for how you take things that the gossip layer is telling you and turn them into configuration for your node. The reason there's no toml is because Builder uses the native configuration
35:21
that the Habitat supervisor just dumps out anyway. And this is some of that. You can see that there's some generic configuration up there, and then down here at the bottom, the routers' members: that's actually dynamic. That was found and written to our config file because of something that we call binds, where this service found more routers and will automatically configure itself on the presence of new routers.
35:47
And by using the native configuration file, it made my plan template even simpler, because I didn't need to have any sort of translation layer. This is not required, but it's really neat: if you're building services with Habitat in mind, your plan gets even simpler.
36:03
Who in the room has heard of the director? Cool, that's more than yesterday. Excellent. So the director is a thing that supervises supervisors, and this is important for Builder, because Builder has a lot of components, and what we want to do is have one thing run
36:21
and then monitor every supervisor that's on the node supervising our software. So this is a systemd config of the director, which starts from a configuration file. This is the configuration that it's starting and passing to the director, which is very simple; it's TOML again.
36:40
The important bits here are the start parameters; those are what you would pass on the command line. And you can see that I'm binding a database, redis, and an environment, router; and I'm binding router, the builder router, from an environment as well. So this says that I'm going to look for a database named redis in this environment, and above it, I happen to also be starting a Redis database in that environment.
37:06
So in the last slide I showed you, the routers' members: this is how that gets populated, using binds. Bind is basically service discovery. The last bit that I want to talk about, that Habitat really helped me out with,
37:21
was shard allocation. So, this infrastructure that I just described to you: if you've ever seen a talk that I gave a year or two ago, not at ChefConf but somewhere else, about Moonrise, the game I was working on, it should look very familiar, because this is a video-game back-end architecture that I've just repurposed for a build server.
37:40
It's a generic distributed system, right? But one of the big problems that I had when I was working on Moonrise was allocating those shards. Turning on the initial environment was a pain: you had to handcraft some environment JSON for your environment, put it in the Chef server, run everything, pull it from the Chef server,
38:03
and then if you made any changes, you would need to go edit the central Chef server again, and then run Chef and hope, and basically build into the application all the logic for when shards need to be rebalanced. The way that I was able to solve this is using the topology flag inside of Habitat, by having a master router be elected through this flag.
38:23
Basically what happens is a master router turns on, and it's going to expect n number of nodes of a service to come online and ask it for shard assignments. So the master router turns on and says, "I expect four session servers." So it will take the 128 shards, divide them up four ways,
38:41
and hand them out to servers as they come in. What this allows us to do is: all you need to do to add a new server is add a new server with the topology flag set to leader. It will join the ring. You increment the number of servers you expect, and as that new guy comes in, the master router says to the other nodes,
39:01
"I need some of those shards back." So we take a partial outage, and it hands them off to the new server. And I didn't need to do any of the leader election, because Habitat did it for me. And there's one more thing that I want to talk about, which is bring your own worker. I talked about gateways, the REST gateway,
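The hand-out scheme just described (divide the 128 shards among the expected number of servers, then rebalance when that number grows) can be sketched like this; the function is an illustration, not the actual router code:

```rust
// Round-robin the shards across the expected servers. With 128 shards
// and 4 servers, each server gets 32 shards; bumping the expected count
// to 5 implies handing some shards over to the newcomer.
pub fn assign_shards(shard_count: u32, servers: u32) -> Vec<Vec<u32>> {
    let mut assignments: Vec<Vec<u32>> = vec![Vec::new(); servers as usize];
    for shard in 0..shard_count {
        assignments[(shard % servers) as usize].push(shard);
    }
    assignments
}
```

Recomputing the assignment with the new server count tells the master router which shards to reclaim from existing nodes.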
39:21
and I also talked about how your packages could be auto-built if one of the dependencies you depend on changed. Well, what's hard about that is we only have a certain number of public workers available, and if I change OpenSSL for core, everyone's package is going to update. And another thing is, like,
39:41
if there are no public workers and you have a build that must go out, well, you need a fast lane. And one of the things that we're going to be providing at some point is a gateway for you to connect your own workers to. You can make your own workers; since I described the protocol, it doesn't matter what language you use. You do not need to use Rust. Or you can use our worker.
40:02
Connect in, authenticate through a gateway for workers, and then you can have a fast lane for your builds. So bring your own worker is basically any machine that you want to connect into the public cluster. You don't need to set up or maintain the cluster; you just connect the workers in. And that's basically how we're going to be able to handle auto-rebuilding the entire world:
40:22
by telling you that you need to bring your own workers, or else the public workers will never be available. So, we have a Habitat hack day tomorrow. It's 10 a.m. to 3 p.m., and I'll be there. I'm going to be trying to help people with Rust and the code base,
40:41
and I'd love to see anyone come that's interested in either Builder or Habitat or Rust. And that's it. Thank you very much, everyone, for coming.