Keynote: So, I have all these Docker containers, now what?
Formal Metadata

Title | Keynote: So, I have all these Docker containers, now what?
Part Number | 81
Number of Parts | 173
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers | 10.5446/20156 (DOI)
Production Place | Bilbao, Euskadi, Spain
Transcript: English(auto-generated)
00:04
Thank you for coming. Welcome. Thank you. Hey, thanks everyone. Can you all hear me? I think you can, good. Excellent.
00:20
Thanks for having me at EuroPython. My last big Python event was PyCon JP in Japan last year. I didn't get to speak though, but it was really fun. Most of the talks were in Japanese, and although my Japanese is getting better, it's not so great. My Spanish is really, really bad. But Spanish and Japanese are very similar,
00:40
so maybe I should learn both together. No, no, really, seriously, they are. No, it doesn't seem to make sense. Then we're gonna talk about containers, and having containers, and having lots of containers, because ultimately, everything's gonna be containerized, and we're gonna have lots of containers we won't know what to do with.
01:00
And I'll ask you some questions later, and see how far you are, along with moving towards containerization. So basically, when we have lots of containers, what do we do then? And this is a problem we face at Google. So this is a data center. This is a Google data center in Iowa, in the US. It's a place called Council Bluffs,
01:22
and this is one of our bigger data centers, and if I leave it out for long enough, you'll probably be able to count all the machines, and how many there are. But this is a cluster. So clusters are one of the constructs we have internally, but these clusters are broken down into cells. So cells are smaller, we have many cells per cluster,
01:41
and this will probably, a cell we're gonna look at today is gonna have about 10,000 machines in it. So they're quite large, and this is a huge amount of compute power. Lots of compute power, but we need to make this available to our engineers, our software engineers, our developers. So how do we go about making this compute power available to our own developers?
02:04
And it works something like this. This is what a developer does. Well, first some context. The one thing, given what you see there, given what you see there, we don't want the engineer to have to kind of select a rack, select a machine, and say, hey, I'm gonna run it on that machine.
02:21
I'm gonna SSH, SFTP a binary over to the machine, SSH into the machine, stand up my process, my server or whatever, maybe log into many machines and do that multiple times. That's not gonna be possible. Huge amounts of machines, huge numbers of engineers, huge amounts of jobs to run. So how does it happen?
02:41
So basically, we have a configuration file. In this case, it's called a Borg configuration file. I was in India recently, and nobody there had heard of Borg. How many of you are familiar with Borg in Star Trek? Okay, right. So we never used to be able to talk about Borg, because Paramount Pictures owned it. It was kind of like one of our worst kept secrets, that we had this thing called Borg running internally.
03:02
But now we talk about it all the time. Because it's fun, and it's really good to show this in the context of what we're gonna talk about later, which is Kubernetes. So basically, this is a Borg configuration file. And what the developer does is he creates a job, JSON file, pulls a job, hello world,
03:20
says which cell he wants to run it in. Going back to what we said earlier, a cell is a few thousand machines. In this case, he's saying it's called IC, some random cell name we chose. And he specifies what binary to use. In this case, hello world web server. So he wants to run hello world on a web server. And this is gonna be a fat binary, statically linked, all of its dependencies with it.
03:43
So effectively, we can run it pretty much anywhere without having to worry about the underlying operating system. And that includes the web server as well. So this thing is quite big, probably about 50 megabytes. So he specifies the path to his binary, or her binary. And unfortunately, we have too many
04:01
male software engineers, not enough female software engineers. So let's encourage women to be software engineers. And arguments, we have to specify some arguments to our binary, pass them in via the environment. In this case, we wanna specify what port to run on. This is parameterized. Then we have some requirements in terms of resources.
04:20
Now this is important, and we'll circle back to this in a minute. So we can specify how much RAM, how much disk, how much CPU. And ultimately, we can say how many we want to run. So in this case, we wanna run five replicas of this job, five tasks, effectively. And why five?
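The job description so far can be sketched as a plain Python mapping. This is only an illustration: real Borg jobs are written in Google's own configuration language, so the field names and the binary path below are assumptions made up for this sketch, not the actual syntax.

```python
# Illustrative only: real Borg configs use Google's internal config
# language; these field names and the binary path are invented.
hello_world_job = {
    "name": "hello_world",
    "cell": "ic",                      # the ~10,000-machine cell to run in
    "binary": "path/to/hello_world_webserver",  # fat, statically linked binary
    "args": {"port": "%port%"},        # parameterized, passed via the environment
    "requirements": {                  # per-task resource requests
        "ram_mb": 100,                 # 100 megabytes of RAM
        "disk_mb": 100,                # 100 megabytes of disk
        "cpu": 0.1,                    # a tenth of a CPU
    },
    "replicas": 5,                     # five tasks of this job
}
```

Scaling the job up is then just a matter of changing `replicas`, which is exactly the knob the talk turns next.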
04:40
Why not do it Google scale, 10,000? Makes more sense, right? We have all those machines. We saw how many machines we have. Let's run 10,000 copies of this. So once we finish this, our software engineer, she types in a command on the command line, passes in the config file, and that gets pushed out to somewhere,
05:01
gets pushed out to this Borg scheduler. And what happens then is this. Over a period of time, in this case, about two minutes, 40 seconds, 10,000 tasks start, 10,000 instances of that job start. And it takes two minutes, 40 seconds, roughly. We do phase the rollout of all of these jobs
05:22
to make sure we don't do them all at once. One of the key factors here is the size of the binary, 50 megabytes times 10,000. It's about 20 gigabits per second of IO. We're gonna be caching that binary quite a lot, but we had to move it around between 10,000 machines, so there's a huge amount of IO going on. But eventually we get to a point where we have 10,000 running,
05:43
or nearly 10,000, maybe not quite 10,000. We'll talk about that in a second. And Borg looks like this. This is what Borg is to Google. It's not gonna assimilate you, but I think we came up with a name because it's probably gonna assimilate everybody eventually. So this is Borg, and Borg runs within a cell.
06:02
So each cell has its own Borg master, its own Borg configuration. In this case, we have a Borg master, which is highly replicated. We have five copies of it for resilience. And we have lots of other things. These down here are our machines. These are our machines we saw in the racks. They're all running a thing called a Borglet. We have a scheduler.
06:20
We have some configuration files in the binary. So what happens is the developer, the engineer, crates his or her binary, and they use a massively distributed parallel build system called, well, I won't say what it's called, but it's externally available now called Bazel. So we made this open source.
06:41
So our own build system is now available open source called Bazel, B-A-Z-E-L, or if you're American, B-A-Z-E-L. Or if you're Canadian, B-A-Z-E-L. It gets very confusing, believe me. You go to Canada, it's so confusing. Like routes and routes.
07:02
So basically, he or she creates a binary, pushes it out, and it gets stored in storage for the cell. And then they push their configuration file. The configuration file gets copied to the Borg master. We have a persistent, Paxos-backed, consensus-based store. And what happens then is this: the scheduler, looking around, comes along and says,
07:22
hey, what is the desired state? We should have this running. Do we have this running? And it sees 10,000 new tasks and says, hey, they're not running. We should have 10,000 of those. Let's make sure that's happening. Let's fix that. And so it goes about planning the running of these 10,000 tasks.
07:41
And it creates a plan, and then the Borg master makes decisions and tells the Borglets on these machines to run this particular task. So those decisions get communicated. The task will ultimately run inside a thin container wrapper. So it has a container around it. It's not just running the binary. It is containerized.
08:01
A very lightweight shim container that's not Docker, not standards-based. The Borglet ultimately will pull the binary over from storage, and it will start running. And we will see this happening all over our data center. So now we're running multiple copies of that. And so that's what we had, 10,000.
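What the scheduler does here, comparing desired state against observed state and fixing the difference, is a reconciliation loop. A toy sketch of the idea (not Borg's actual implementation; the data shapes are invented, and CPU is tracked in millicores to keep the arithmetic exact):

```python
def reconcile(desired, running, machines):
    """Toy reconciliation loop: for each job, compare the desired replica
    count with what is actually running, and plan the missing tasks onto
    machines that still have spare CPU."""
    plan = []
    for job, spec in desired.items():
        missing = spec["replicas"] - running.get(job, 0)
        for _ in range(missing):
            machine = next(
                (m for m in machines if m["free_mcpu"] >= spec["mcpu"]), None
            )
            if machine is None:
                break  # out of capacity: the task stays pending for now
            machine["free_mcpu"] -= spec["mcpu"]
            plan.append((job, machine["name"]))  # i.e. tell this machine's agent to start it
    return plan

machines = [{"name": "m1", "free_mcpu": 200}, {"name": "m2", "free_mcpu": 100}]
desired = {"hello_world": {"replicas": 4, "mcpu": 100}}
plan = reconcile(desired, {"hello_world": 1}, machines)
print(plan)  # [('hello_world', 'm1'), ('hello_world', 'm1'), ('hello_world', 'm2')]
```

The same loop run again with nothing missing produces an empty plan, which is why a few tasks being down at any moment is fine: the loop simply schedules replacements on its next pass.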
08:22
But if we look at it a little bit closer, we find there's 9,993 running. Not quite the 10,000 we expected. But this is a highly available service. We expect some lessening of the number of tasks we're running over time due to the way we operate. And that's interesting. So let's look at that in a little bit more detail.
08:44
So failures. Things fail, but failure is kind of more of a generic term here. There are many reasons for failures. And one of the main reasons for failures, particularly for low-priority jobs, is preemption. If you look at the top bar, which is our production jobs, we have very few failures.
09:04
And most of them are down to machine shutdown, where we've actually scheduled some maintenance on the machine, and we've taken the machine down. That task, any task running on that machine, would then be rescheduled elsewhere in the cluster. We have a very small number of preemptions. Down here, our non-production jobs, which are things like map producers, batch jobs,
09:23
they get preempted all the time. They're happy to be preempted. And in fact, the calculation generally says that for about 10,000 tasks, about seven or eight of them will be not running at any given time because of preemption. They'll be about to be scheduled somewhere else, but they won't be running at that particular time. And we see other things here.
09:41
We see, again, the blue bar, I can't see my pointer, which is the machine shutdown, which is pretty much the same as production. And we have some other things as well: out of resources, a very small number of machine failures. And when you have as many machines as we have, machine failures are a given. We expect that.
10:00
We don't panic when machines go down. It's part of the normal running of our business. And another interesting thing is how we try to make efficient use of our resources. So we have CPUs, we have memory, we have disk IO, we have network IO. And sometimes it's quite possible for one task to be using lots of memory, but very little CPU.
10:22
Or vice versa, lots of CPU and very little memory. If you put one of those on a machine, then you may be wasting one of those resources. It's what's known as resource stranding. And these are the available resources, these white bars here. In this example, the tasks are actually our virtual machines.
10:41
Our virtual machines are actually containers, believe it or not. This is Google Compute Engine. So these are all virtual machines, these bars, individual bars. And what we can see here is that some of these machines have available capacity, available RAM, available CPU. And if we look over here,
11:01
we see a different situation where we have maybe some with available CPU, and others with no available RAM, and vice versa. This here and this here is called resource stranding. It means we're not actually making use of that resource. So we have spare memory capacity or spare CPU capacity that's been wasted effectively.
11:22
So one of our challenges is like a Tetris puzzle to try to stack these things in a way where we get the best possible utilization out of our clusters. So we will mix and match them to make sure that we have low CPU, high memory jobs running with high memory, low CPU jobs.
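Resource stranding, and why mixing complementary shapes helps, is easy to see with toy numbers (the machine capacity and task shapes here are invented for illustration):

```python
# Toy machine: 4 cores, 8 GB of RAM (numbers invented for illustration).
CAPACITY = {"cpu": 4, "ram": 8}

cpu_heavy = {"cpu": 2, "ram": 1}   # CPU-hungry, little memory
ram_heavy = {"cpu": 2, "ram": 7}   # memory-hungry for its CPU share

def leftover(tasks):
    """Free resources on the machine after placing `tasks` on it."""
    return {r: CAPACITY[r] - sum(t[r] for t in tasks) for r in CAPACITY}

# Packing like with like strands memory: two CPU-hungry tasks exhaust
# the CPU, leaving 6 GB of RAM that nothing can use.
print(leftover([cpu_heavy, cpu_heavy]))  # {'cpu': 0, 'ram': 6}

# Mixing complementary shapes uses the machine fully:
print(leftover([cpu_heavy, ram_heavy]))  # {'cpu': 0, 'ram': 0}
```

The scheduler's "Tetris" job is to find pairings like the second one across the whole cell.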
11:41
And of course we run multiple tasks per machine. That's extremely important. That can come back to all this with Kubernetes shortly. And another interesting thing is this, which is gonna be a huge challenge in the future when it comes to Kubernetes, but it's gonna be really important to all of us. So we saw earlier that our developer,
12:01
she specifies what resources she wants to use, or he wants to use: 100 megabytes of RAM, 100 megabytes of disk, 0.1 CPU. And that would be this blue line up here. So everything that's running will fit under this blue line. These are the resources that were requested by these jobs.
12:21
In reality though, it's like this. And so we have all of this wasted space, which we can't use because it's been allocated for those running jobs, but we'd like to use it. So what we do is we effectively estimate, based on the run patterns of the current jobs, how much they're gonna use.
12:41
And that's this blue line here. So this is our reservation. This is how much we reserve specifically for those jobs. And what we can then do is reuse all that space. Now we can reuse that space for very low priority jobs. Again, those batch jobs, those MapReduce jobs. Things that we wanna run, we want them to finish eventually,
13:00
but we don't really care when it happens. It could be like running some kind of monthly report that nobody ever looks at that gets logged, or running a map reducer across a huge amount of data that may be important at some point or just needs to be done, but we don't really care when it needs to be done. So all of that stuff, we can reuse it, and we can run jobs within it. So that's really important. That's how we can get maximum utilization
13:20
out of all of our machines in that data center. And so moving on to Kubernetes now, gradually. One of the observations is that if you have your developers spending time thinking about machines, or thinking in terms of machines, you're probably doing it wrong, because it's too low a level of abstraction.
13:43
Now today maybe it's fine, but in the future this is not gonna be the case. We need people to be thinking in terms of applications and not having to worry about the infrastructure in which they run. I mean anybody who's used a platform as a service knows how important that is anyway. You don't care about the infrastructure. You wanna write your work, configuration file, build a binary, and just say run this for me.
14:01
I don't care where you run it. I don't care about how you do it. Just run it for me and make sure it stays running. We get efficiency by sharing our resources and reclaiming unused allocations. And containers, the fact that we containerize everything allows us to make our users much more productive.
14:21
So everything we run runs on a container. Two billion containers a week we estimate. We never really thought that was very important until Docker came along and containers became the next big thing, right? LXC, then Docker, and Docker became huge. And so now one of the things we talk about all the time now is we run containers all the time.
14:42
And we are pretty good at running containers, which is why we created Kubernetes. If you're interested in more details of what I've just talked about, Borg, there's a paper here, goo.gl, one capital C for N-U-O. And that's the white paper in Borg. That's got all of the details, all of the graphics you just saw.
15:01
It goes into much, much more detail, of course. So let's look in terms of a simple application and how we can do this externally with containers and through Kubernetes. So this is a very simple pattern.
15:20
Generally when we give this talk it's PHP in the middle; here it's Python in the middle, MySQL, memcache, and we have a client. We have many of these Python servers running. This could be many instances of Flask, it could be some kind of event system, but we have the ability to run many, many concurrent requests. And we're probably gonna wanna scale this thing on demand.
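The usual way such a front end combines memcache and MySQL is the cache-aside read path. A minimal sketch, with a plain dict standing in for memcached and a function standing in for the MySQL query (both stand-ins are assumptions for illustration):

```python
cache = {}                          # stands in for memcached
database = {"greeting": "hello"}    # stands in for a MySQL table

def query_db(key):
    # In the real app this would be a MySQL query over a connection pool.
    return database[key]

def get(key):
    """Cache-aside read: try memcache first, fall back to the database,
    then populate the cache so the next request for this key is cheap."""
    if key in cache:
        return cache[key]
    value = query_db(key)
    cache[key] = value
    return value

assert get("greeting") == "hello"    # first call: cache miss, hits the DB
assert cache["greeting"] == "hello"  # now cached for subsequent requests
```

Because each request is served this way, the Python tier can be scaled out freely while keeping a single MySQL and a single memcache, exactly the setup described above.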
15:42
We may not want to scale MySQL that much until we get to a point where we have to do replicas and sharding. Memcache, we're probably gonna wanna scale as well, but we're gonna keep it simple for now. Just keep one MySQL, one memcache, and a few of these Python instances at the front end. So let's talk about containers. So how many of you are familiar with containers?
16:05
How many of you have actually spun up a Docker container? Hey, lots of you. It's almost the same number, right? Again, last year we'd asked this, how many of you have heard of containers? Lots of hands. How many of you have spun up a Docker container? Not so many, so things have changed now.
16:21
Docker is the future, or containers are the future. We now have this thing called the Open Container Project. Docker have kindly made what they have into a spec, and we're all gonna get behind it. We have this common specification from which we can write containers. Things like CoreOS with their Rocket (rkt) container, they're all gonna fall in line
16:41
and we'll have a common format for containers, which is gonna be great. But just for those of you who are not really familiar with containers, just a few slides, very few slides on containers. Just to kind of give you some of the concepts. This is the way we used to do things in the old days. We have a machine, maybe next to our desk, in our bedrooms, or in a colo, or in a server room. And the machine would run our operating system.
17:03
It would have all of the packages installed that provided libraries, things like OpenSSL. On top of that, we would run applications. And how many of you have had a situation where you're running one application and all of the other applications on the machine fail because that one application went mad? It used all of the CPU, used all of the RAM,
17:23
it crashed the machine and took all of the other applications down. And this may have been a very low priority app, one that you didn't really care about, taking down some really important ones. Now this is never a good idea, running multiple applications on one machine, because there's no isolation between them. Whatever affects one application will probably affect all of the others.
17:41
There's no namespacing. They all have one view of the machine in which they're running. They have one view of the CPU, one view of the memory, one view of the file system, one view of the network. They share libraries, and so you're in a situation where maybe one day you update a version of a package, it updates the library, and one of your applications says, hey, I'm not gonna run anymore,
18:00
because that library is not compatible with me. So dependency hell. And if it's on Windows, it's DLL hell, and it's probably even worse. Applications are highly coupled to the operating system. This is a problem. And so we created virtual machines, and what we did basically is stuck a layer on top of the hardware called a hypervisor,
18:21
and we now had an idealized piece of hardware on which we could run multiple operating systems. So now we have this thin layer. It looks like a piece of hardware to these running virtual machines. And that gives us some isolation, because now we can run applications in their own virtual machine. So each application is now isolated. If one application crashes, it doesn't affect the others.
18:45
But it's extremely inefficient, because we have this red bit at the bottom here. We have the operating system, the kernel. And you know when you install a virtual machine, you pretty much have to install the entire Debian stack, or the entire CentOS stack, or the entire Windows stack.
19:01
So that's not very efficient at all. There's still the same tight coupling between the operating system and the application. And as anybody who's tried to manage lots and lots of virtual machines to provide isolation, you know it's hard. So the new way is containers. In this case, we move up a layer. So we move above the operating system
19:21
and provide an idealized operating system, no longer idealized hardware, an idealized operating system on which we can run apps and their dependent libraries. So the libraries here are part of the container. So the container has an application.
19:42
It has all of its dependencies. It has its entire environment. So we can move this container around anywhere we want to. We can move it from one machine to another, from one runtime to another, from a laptop to a virtual machine, one in the cloud, to a bare metal server, to a set-top box, hopefully, eventually, maybe even to a phone when we had Docker on Android and iOS.
20:01
I'm sure it's gonna happen, right? And let's look at an example. So we have our application, our PHP in Apache. It should be Python in Apache. Sorry.
20:21
I do apologize. So that is Python. So wherever you see PHP in this deck, read Python. I will change it before I share the slides. So I was trying to think of what could offend a Python audience most, and it's probably talking about PHP, right? Moving on. Okay, so we have containers. So we want to run these components of our application,
20:42
Python and Apache, memcache, MySQL, not Apache, obviously, Python and Flask and Bottle and all of the other things we could potentially use. memcache, MySQL, and MySQL has its own libraries. It doesn't have any common libraries with the others. So we're gonna stack those libraries with the container in which MySQL runs.
21:03
And memcached and PHP, or Python and Apache, have their own, I keep saying it, Python and Apache, Python and whatever, Gunicorn, anything. They have their own dependencies, but they also have, when we installed them, some shared dependencies as well. So some common libraries. So when we actually create the image,
21:21
we can actually share some stuff between them. But that's not shared at runtime, so when we create the container, they will have their own dependencies packaged together in the container. And underneath that, we have a server. And again, this could be a virtual machine, it could be a laptop, it could be a bare-metal server, it could be anything, pretty much.
21:41
And underneath it, we have the actual hardware. And all of this is being maintained by a Docker engine. So Docker is the thing that runs this. When we talk about containers, mostly synonymous with Docker nowadays, but again, there are other container formats. And hopefully, they will all comply with a standard. And that's the nirvana we're all heading towards.
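To make that concrete, a container image for the Python front end might be defined with a Dockerfile along these lines; the file names and base image here are hypothetical, not from the talk:

```dockerfile
# Hypothetical image definition: the app plus all of its dependencies
# travel together, so it runs the same on a laptop, a VM, or bare metal.
FROM python:2.7
COPY requirements.txt /app/
RUN pip install -r /app/requirements.txt   # e.g. Flask, a MySQL client
COPY . /app
WORKDIR /app
EXPOSE 8080
CMD ["python", "app.py"]
```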
22:01
So Docker effectively controls the creation of these containers and the management of these containers. So at the end of it, we will have Python and Flask, Angular, memcache, MySQL, all running in containers. So why containers? So there's many important reasons for having containers, but you can see, just by looking at what we do,
22:21
that that's the only way we can do it. We can't do it any other way. This is a perfect solution for the kind of scale that we want. But it's also perfect for smaller scale as well. Why? Because it's much more performant. It's much more performant in terms of the fact that we don't have to do all of that installation stuff. They are pretty much like they're running on bare metal,
22:42
so the performance is pretty much the same as a virtual machine, but they're much quicker to get up and running. Which means you can swap them out quicker, you can do upgrades quicker, you can do pretty much everything quicker. Repeatability, so the whole problem where we have the development, QA, build, test, production,
23:02
where we want to have repeatable environments, where we have a situation where, when we test something in QA, and then run it in prod, it fails in prod where it works in QA. How many people have had that situation? Like, have your head in your hands remembering those days, right?
23:21
So what containers give us is the ability to have a consistent environment, because the environment's packaged with a container. So basically, when we run it in QA, and when we run it in prod, it's exactly the same. It's exactly the same environment. So that's one of the great use cases of containers today. But much more is the portability of it, which we're gonna talk about in a second.
23:40
Quality of service. We can now do resource isolation as well, using things like cgroups in Linux and namespaces. We can actually isolate the resources. We can say, we only want this to have 100 megabytes of RAM, 100 megabytes of disk, 0.1 CPU. And ultimately, accounting. These things are easier to manage, they're easier to trace, they're easier to audit.
24:01
They're small, composable units that can be tracked very easily. And ultimately, portability. You can move these things around from one cloud provider to another. Images, specifically. You can't just pick up a running container and move it, but you can easily run the same container in a different cloud provider, on a bare metal machine, on a laptop.
24:21
You can move them from one machine to another as the shape of your cluster, if you have a cluster of machines, changes. You can move them around to be more efficient. So we can go back to what we had before with the efficient allocation of resources. We can do that if we have containers. And ultimately, this is a fundamentally different way of managing and building applications.
24:42
So, a demo, I'm not gonna do this demo. I left that slide in by mistake. This would have been a containers Docker demo. And I don't think I'm gonna bore you with that. It's very easy to find a tutorial on Docker and get up and running with it. Let's not talk about that. Let's talk about Kubernetes instead. How many of you have heard of Kubernetes? How many of you can say Kubernetes?
25:04
It's a hard word to get your head around. Probably easier if you're Greek, because it's a Greek word. But if you want help pronouncing it, I'll be outside in the Google booth after this talk. So I can definitely provide assistance on that. Maybe I'm saying it wrong.
25:21
Maybe I've been saying it wrong all this time. So I'm happy to be corrected. So Kubernetes, let's talk about that. And we've given you an introduction to what we do at Google, so that should provide the context on why Kubernetes is necessary. Something we often miss out when we give talks is that we don't really provide that kind of context.
25:41
So I'm hoping that the introduction to Borg has probably provided that for you. So Kubernetes, a Greek word, means helmsman, and it's the root of the word governor, for some reason. So Arnold Schwarzenegger's governator comes from Kubernetes. And it's effectively an orchestrator or a scheduler for Docker containers,
26:00
ultimately for other forms of containers. I think CoreOS are already using it to schedule and orchestrate rocket containers. It supports multiple cloud environments. So, Mesosphere, I always forget them. VMware, even Microsoft are involved. You can run Kubernetes on Amazon.
26:20
You can run it pretty much anywhere. You can run it on your laptop with Vagrant. So you can just create a four machine cluster, virtual machines, with Vagrant up, and you'll have a Kubernetes cluster. And ultimately, eventually, we may have a situation where we can run Kubernetes across multiple cloud providers. It might be difficult, it might be possible. It may be one day you'll have your fleet
26:42
and machines will be running in Google, in Amazon, and Microsoft Azure as well. Possible, I'm not sure if it's gonna happen. So this is kind of inspired and informed by everything we saw previously, everything with Borg. And it's based on our experiences. Open source, written in Go,
27:01
like many good programs nowadays, but completely respect Python. I love Go, I love Python. I used to be a Java developer. I spent 15 years developing Java. No, 11 years, then I moved to Google. And I haven't written a line of Java code since. Now I write Python. It's like Java programmers anonymous, right?
27:26
It's been four years since I wrote my last line of Java code. So now I write in Python, and I write in Go, and I write in Angular, and I write in JavaScript, and all of those more interesting and useful languages. Java is getting better, Java 8 is a big step forward.
27:43
And ultimately, we want to be able to talk about managing applications and not machines, which is actually what we talked about earlier. And some very quick concepts, and I'm not gonna introduce them, but I want to show you the icons so that when you see them, you'll know what they mean. Container, pod, service, volume, label, replication controller, node, are all of the key concepts.
28:03
How many of you are familiar with SaltStack? How many of you like the terminology in SaltStack, like grains and such like? I think it's really hard to get your head around. And I think one of the dangers about an abstraction is that you get too far away from the terms that are familiar to people.
28:20
Most of these are familiar to people with service, the idea of a replication controller, a node, a label, a container. The pod is probably the most difficult one to get your head around. So let's talk about pods. And let's talk about nodes first and clusters. So we have a cluster, kind of maps back to what we talked about earlier
28:40
with Borg, where we have a master, and the master has a scheduler, and it has an API server that can be used to talk to nodes. The nodes are all running a thing called a kubelet, and they have these things called pods running containers. And we'll talk about pods shortly. They also have a proxy by which we can expose
29:00
our running containers to the outside world. And we have many nodes. And a cluster, this is an abstraction, so a cluster could be different depending on which cloud provider you're using. And ultimately what you want to have is a fabric of machines that looks like one flat space in which we can run containers. You don't care about it, you just care that they're all joined together.
29:21
And it's one big flat space in which we can run stuff. And we'll let this thing, the scheduler, take care of running stuff for us, ultimately. And so basically the options for clusters are laptops, multi-node clusters, hosted or even self-managed, on-prem or cloud-based, using virtual machines,
29:42
or bare metal. Many, many options, and there's a matrix down here, a short link. Hopefully we can share these slides afterwards. And the short link will give you a matrix of how you can run Kubernetes on whatever you want to run it on, from CoreOS on Amazon. We have different ways of doing the networking. The networking's quite tough.
30:02
Google Compute Engine makes it easy because of IP addressing, but often we have to put this other layer called Flannel in to actually provide that ability to give an IP address in a group of subnets to a running machine or a running pod. So let's talk about pods. How many of you are familiar with the concept of pods?
30:23
Okay, not so many of you. So in the diagram here, we have a pod. It has a container. This is a container. This web server is a container. And it has a volume. Like Docker containers can have volumes. Little bit different, but very similar.
30:41
And so we wanna run this web server. And the construct we use within Kubernetes is to create this thing called a pod that's like a logical host. So like if you wanted to run Apache and something else alongside it, you would run it on a host machine. That's the same as a pod. So anything you would run together on the same machine
31:01
will run in a pod. These are the atomic units of scheduling for Kubernetes. This is what Kubernetes schedules. We talked about jobs earlier when we looked at Borg. Kubernetes schedules pods. And your containers run inside the pod. So a thin wrapper around them. These are ephemeral. These are like...
31:21
I've got this analogy. So everybody uses this pets versus cattle analogy, and I don't really like it because I'm a vegetarian. So crops versus flowers. So pods are like crops. You don't care about them. You have a wheat field. You don't care about your individual plants that are growing. When you have flowers, you probably give them names
31:41
and you water them and you talk to them as well. So you care about them. You don't care about your crops though. So pods are like crops. They can come and go. They can be replaced. They're all absolutely the same. You can take one and replace it with another. And ultimately, to make things simple now, you don't have to worry about a pod
32:00
if you wanna run a single container. You just say, run this container for me. It will create the pod for you. And you still have to think in terms of pods when you're doing monitoring, but you don't have to create a pod. You just say, run the container for me. It will create the pod for you. Okay? So pods are an abstraction. It's difficult to get your head around. A little bit more information about them.
32:21
Imagine this scenario where you want to have something that synchronizes with GitHub. This may be a push-to-deploy type scenario where whenever your developers do a merge into GitHub, you want those changes to be immediately pushed out into production or maybe on staging servers. So you have a thing called a Git synchronizer and it's talking to Git and monitoring your project in Git.
32:43
It pulls down any changes and it writes them to somewhere on a disk and your web server can then serve that latest content. Those things are tied together. They work together and it makes sense for them to run side by side. So when one goes away, the other goes away. So we can run them both in the same pod.
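As a sketch in the Kubernetes v1 API, such a two-container pod might be declared like this; the image names and mount paths are made up for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sync
spec:
  volumes:
    - name: content
      emptyDir: {}          # shared scratch space, lives with the pod
  containers:
    - name: git-sync        # pulls changes down from Git
      image: example/git-sync
      volumeMounts:
        - name: content
          mountPath: /data
    - name: web             # serves the latest content
      image: example/python-web
      volumeMounts:
        - name: content
          mountPath: /var/www
```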
33:00
So now we're saying, on this logical host, this pod thing, let's run two containers. In this case, Git synchronizer and a Node.js app or a Python app. And we have a shared volume, the concept of a volume, which we'll talk about shortly. These are tightly coupled together. So when the pod dies, they die together. It doesn't make any sense to have them running separately. It might do in the way you architect things,
33:22
but it doesn't have to. They share the network space and port space. They have the same concept of local host. They are completely ephemeral and think in terms of things you would run on a single machine. So a volume, what's a volume? And I don't normally talk about volumes,
33:41
but they are very important. So not talking about them seems a bit stupid, really. So a volume is basically bound to the pod that encloses it. And this is something where we can write data or read data from, okay? We have many options when it comes to volumes. Docker already has volumes, and this is slightly different, but very similar.
34:03
So to a container running the pod, the volume looks like a directory. And what they are, what they're backed by and such like, and where they're mounted is determined by the volume type. So the first type we have is an empty directory. So whenever we create a pod,
34:20
it creates this space somewhere on disk, on the local disk, and they can basically share that volume between them. But it lives and dies with the pod. It only exists while the pod is there. So it could be your git synchronizer is writing stuff to this volume being read by the Apache server or whatever server, and you don't care when the pod goes away
34:42
if that space goes away. It's just scratch data, just temporary data. There's nothing stored there that's important to you. And it can even be backed by memory as well. So it could be a tmpfs file system. And that's great because it's really efficient, much faster as well. So that's what an empty directory is. That's the default you get for a,
35:00
well, I don't know if it's a default, actually. You have to specify what type it is. So empty directory is one of the options. The next one is host path, where we can actually map part of the file system of the node on which the pod is running into the pod. So this volume is actually effectively a snapshot of, not a snapshot, a link into the file system of the actual running machine.
35:22
That's useful to read configuration data and stuff. But it's also kind of dangerous as well, because the state on the node may change in such a way that whenever the scheduler runs the pod on a different machine, it may see a different view of what's happening.
35:40
So it no longer becomes completely isolated. So it's a kind of dangerous thing to do, but it might work for you. The other one is NFS and other similar services like GlusterFS, I can never say that. Anything with a G on it, I can't say for some reason. So again, NFS, we can mount NFS paths on our pod
36:01
and expose them to our containers as directories. Or we could also use a cloud provider's persistent block storage. We call them persistent disks in Google. Amazon calls it Elastic Block Store, that kind of thing. So this is persistent disk. So basically they can write and read the data from the disk and it will always be there,
36:20
whether the pod goes away or whatever. So what we're likely to do in this case is create a volume, a volume in the cloud provider. I call it a disk. We create a disk in the cloud provider which stores data and we'll mount it onto the pod. Whenever that pod goes away, the data's still there. Pod comes along, can mount it as well.
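Pulling those options together, the volume types just described would appear under a pod's spec roughly like this; the server names, paths, and disk name are hypothetical:

```yaml
volumes:
  - name: scratch
    emptyDir: {}                # temporary, dies with the pod
  - name: scratch-in-ram
    emptyDir:
      medium: Memory            # tmpfs-backed variant, faster
  - name: node-config
    hostPath:
      path: /etc/myapp          # a window into the node's filesystem
  - name: shared
    nfs:
      server: nfs.example.com
      path: /exports/data
  - name: durable
    gcePersistentDisk:
      pdName: my-data-disk      # survives the pod
      fsType: ext4
```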
36:41
And also with Google Cloud platform, we can actually mount it read-only on multiple pods as well. So some patterns for pods. The first one is the sidecar pattern because basically it's a motorcycle and sidecar. I guess in this case the Node.js application or the Python app is the,
37:00
you don't get offended when I say Node.js, right? The Node.js application is the bike and the Git synchronizer is the sidecar in this case. That makes a lot of sense, right? Ambassador, in this case something that acts on behalf of the actual running container. So this is a secondary container, a Redis proxy
37:21
that effectively allows the PHP application to make calls and then have the Redis proxy call out to shards. So we can just have one service that the PHP application calls for reads and writes and the Redis proxy can do all the hard work of deciding whether to read from a master or read from a slave or write to a master.
37:42
And the final one is an adapter pattern where in this case we have Redis running and we want to monitor it. We want to monitor all of our pods. But we need a common format for monitoring. So in this case we actually adapt the output from the Redis monitoring using an adapter container. And the adapter container will be plugged into the monitoring system. So it kind of adapts what's happening within the container.
38:04
So these are kind of examples of where it makes sense to have a pod. I'm hoping it does make sense. And I'll be interested to hear from you afterwards about whether pods make sense to you. So labels. Labels are basically the single grouping mechanism within Kubernetes. They allow us to group things so that we can build applications like a dashboard.
38:22
So we have a running pod, we give it a label. Labels are key value pairs. So in this case type equals fe. Completely arbitrary metadata. Some of these things are meaningful to Kubernetes. But most of these can be anything that's meaningful to you. So we've put labels on pods. And we can say, I can build a dashboard application
38:41
that uses the API to say, give me the pods with this label. And I can show you all the status of that. And we can have different labels for different pods. So in this case we have a version two pod. We have a different dashboard application that's monitoring those. And that makes a lot more sense, okay. Pods can have many labels.
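The grouping idea is simple enough to sketch in Python: a selector is just a set of key/value pairs, and it matches any pod whose labels contain all of them. The pod names and labels below are invented for illustration:

```python
# Toy model of label selection: labels are arbitrary key/value pairs,
# and a selector matches pods whose labels contain all of its pairs.

def matches(selector, labels):
    """True if every key/value in `selector` appears in `labels`."""
    return all(labels.get(k) == v for k, v in selector.items())

def select(selector, pods):
    """Return the pods that form this selector's constituency."""
    return [p for p in pods if matches(selector, p["labels"])]

pods = [
    {"name": "fe-1", "labels": {"type": "fe", "version": "v1"}},
    {"name": "fe-2", "labels": {"type": "fe", "version": "v2"}},
    {"name": "db-1", "labels": {"type": "db"}},
]

# A dashboard for all front ends would select on type=fe ...
frontends = select({"type": "fe"}, pods)
# ... while a v1-only dashboard selects on version=v1 as well.
v1_only = select({"type": "fe", "version": "v1"}, pods)
```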
39:01
I surprise myself with my slides sometimes. It makes more sense with replication controllers, because replication controllers are things that actually manage the running of pods. Now remember I said before that we created 10,000 tasks. And we pushed them out to persistent storage in the Borg master.
39:21
And the scheduler comes along and says, yeah, these should be running, but they're not. I'll fix that. So this is the same thing. The replication controller is responsible for managing your desired state. You say, this is the way I want it to be. I want to have X number of these pods based on this container template. Or X number of these pods based on this container template. And I want you to maintain that state for me.
39:41
That is the job of the replication controller. So basically what they do is they work on a constituency defined by a label selector. So in this case, version equals v1 is what they select on. So this replication controller is responsible for all pods with label version equals v1. And we tell it, I want to have two of those.
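Written down as a config sketch in the v1 API (hypothetical names and image), that desired state might look like:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: frontend-v1
spec:
  replicas: 2                  # I want two of these running
  selector:
    version: v1                # its constituency: pods with this label
  template:                    # the pod template it starts pods from
    metadata:
      labels:
        version: v1
        type: fe
    spec:
      containers:
        - name: frontend
          image: example/frontend:v1
```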
40:02
So its job is to make sure there's always two running. In this case we also have another replication controller that has v2 of our pod, version equals v2. I only want one of those though. So make sure there's always one of those running. And the kind of way it works is that this is kind of like a control loop. So the replication controller is one big control loop. Simple as that. It says, look at the desired state.
40:21
How many have we got running? We should have four running. We've got four running. We've got four running. We've got three running. That's not good. Let's start another one. We have four running. We have four running. We have five running. That's not good. Let's take one away. So it just continuously monitors the state to make sure we have the ones running. It also works with a template. So we provide a template which is the pod template which contains the container image definition
40:42
and how many we want to run. We pass that into the replication controller. It doesn't create the pods. But when we create the replication controller and we say we want two of these pods, it says, hmm, there's not two of these running. I should start them. So it starts them. That's how it works. And we can also plug in replication controllers after we've created the pods.
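That control loop can be sketched in a few lines of Python. This is a toy model of the idea, not Kubernetes code; pods are reduced to names in a list:

```python
# Toy replication-controller loop: compare observed pods against the
# desired count and start or stop pods until they converge.
import itertools

_new_names = itertools.count(1)

def reconcile(desired, running):
    """Return the set of running pods after one pass of the loop."""
    running = list(running)
    while len(running) < desired:       # too few? start another one
        running.append("pod-%d" % next(_new_names))
    while len(running) > desired:       # too many? take one away
        running.pop()
    return running

# We should have four running but only three are: one pass starts one.
state = reconcile(4, ["pod-a", "pod-b", "pod-c"])
# A stray fifth pod appears: the next pass removes one.
state = reconcile(4, state + ["stray"])
```

The real controller runs this comparison continuously against the cluster's actual state, which is why killed pods come back on their own.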
41:01
And as I say, you're managing containers with this label. And finally, we get to services. And services are how we actually expose our running stuff. And we do this through this service here which creates a virtual IP address which has a constituency of pods based on a label selector.
41:20
Again, we don't have labels on here. We'll show it on the next slide. So basically, certain pods with a certain label are the constituency of this service. And when requests come in from clients, it will load balance them across the running pods regardless of which node they're on. So there could be 10,000 nodes. We could have 10,000 pods running on different nodes. And it would load balance them across the running pods.
41:43
At the moment, it only works in round robin. But eventually, it will have much more intelligent support for load balancing. This is used for exposing internal services within Kubernetes and also expose running services to clients externally which we'll see shortly.
42:01
It not only provides a virtual IP address but also a DNS name so we can do service discovery. And I want to move on. So this is a canary example. So who understands the concept of canary? I've got a few of you. So basically, when you have a situation where you have a running application, you want to try out a new version of it. You may have one instance or two instances
42:22
of that running application that are different. So some of your traffic will be pushed to new versions. Some will go to the old versions. You can then do AB testing against them to make sure that the new service works. If it doesn't, you can roll it back. If it does, you can push that to change all of them. This is a similar situation where we have
42:41
version equals v1, version equals v2, replication controllers and pods. But a service, all it cares about is labels type equals fe. And so the service has its constituency of all three of these pods. But these pods are managed by different replication controllers. So that's how it works. Virtual IP address exposes that to a client.
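A service like that, selecting only on type equals fe so it spans both the v1 and v2 pods, might be sketched in the v1 API as follows; the name and ports are invented:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    type: fe          # v1 and v2 pods both carry this label
  ports:
    - protocol: TCP
      port: 80        # the virtual IP's port
      targetPort: 8080
```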
43:02
And so we map that to Kubernetes. It all looks kind of like this. We have pods, remember all the symbols? This is why it's important. So that's a pod and a volume and a service. And we have, oh, a memcached with the d dropped off. A pod, a service, and replication controller
43:23
with a service. How does that look to a developer? Remember how it looked to a developer on Google? So this is how it looks. They specify a name. They can specify the image. This is a Docker image now. It could be a different image format in the future for a different type of container format.
43:44
Well, I left it in deliberately just to upset you.
44:03
Yeah, PHP. You can specify resources. 128 mebibytes of memory. You can specify how much CPU. And see, Kubernetes unfortunately has its own idea of slicing up a CPU.
44:22
And I'm not gonna get into it, but it's like 500 millicores, 500 thousandths of a CPU, in Kubernetes terms. So you have to read the manual for that. Otherwise it won't make any sense. I'd probably rather have like a percentage, but that doesn't work, because you can't have a percentage of a core because you don't know how powerful it is. So that's how we specify CPU. The ports, protocol, TCP, number of replicas,
44:43
one or maybe 10,000. Again, we cover that case as well. So that's how it works within a replication controller. There's other configuration files as well for services. And scheduling, at the moment, we saw the complexity of scheduling at Google.
45:01
It's a bit simpler for Kubernetes currently. It's based on pod selection. So we want to have the pods running based on selectors. And it's based on node capacity. So how much capacity does that node have? Is it capable of running my pod for me? If I have multiple nodes that can run my pod, I'm gonna run it on the one
45:21
that has the least resources consumed by running pods. And that's a priority. In the future, we'll have resource aware scheduling. So we can do kind of what we did, what we do back in Google, where we try to make maximum utilization out of our CPU and memory. Kubernetes is 1.0 as of this week.
45:42
It went 1.0 on the 21st of July at OSCON in Portland, Oregon. It's been open sourced for over a year now. And we have a product called Google Container Engine, which I'm gonna talk about shortly. Not so much, but it is a good way of running Kubernetes. But it's not a product pitch. Hosted Kubernetes,
46:02
I'm gonna talk more about Container Engine shortly. And the roadmap for Kubernetes is there. It's kind of sparse at the moment because we've just gone through 1.0. So they're now deciding on the roadmap for the next releases, V1.1. And the one on the roadmap currently is auto scaling. The ability to auto scale your nodes dynamically based on the amount of work you have.
46:23
Container Engine is a managed version of Kubernetes and it manages uptime for you. You don't have to worry about the master in this case. It will take care of the master for you. You can't even see the master. You can't connect to the master. So one of the problems we have at the moment with Kubernetes is high availability. So we don't have that replicated master scenario
46:41
we saw with Borg. So the only way to do it is to have multiple clusters to do high availability. But if we look after your master for you and make sure it's running, then you don't have to worry about it. We will make sure that your cluster is highly available by making sure that your master is always running. We can resize. Using a thing called managed instance groups which we'll look at in a minute.
47:01
Centralized logging. We can pull all of our logging into one place in the Google Developers Console. And it also supports VPN. So you can actually have your pods inside your own network, your own private network. So demo very quickly. And we had to change the setup earlier to make all this work.
47:22
This is a cluster. We have kubectl get nodes. So we have two nodes running. So these are machines in our cluster. And I can look at them here. This is the Google Developers Console. And I can probably make that a bit smaller. If I go into VM instances here, I can see my running machines.
47:43
I have a couple of other machines as well, but these two in the middle are the nodes for our cluster. I have this thing called an instance group which has two instances, and this is the thing that manages the size of our cluster. And below here we have container clusters, and we can see we have one cluster.
48:03
Okay, and if we go to... I've got very little screen real estate so I can't see everything that's going on. So we go here and we can see
48:21
a representation of what's running currently. So we have, these are pods. So this is a pod. This is a service. This is exposed internally. This is a service. And this is another service. Oh, MySQL's not running, which is a real pain. I'll have to run it. Okay, I don't know how that happened.
48:40
Okay, so we have a front-end service. We have a memcache service. And we have MySQL. We don't have a pod running. So we need to have a pod running. So I'm gonna start the pod very quickly.
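Starting that pod means feeding a manifest to kubectl create -f. A minimal sketch of what such a MySQL pod manifest could look like on the v1 API of that era — the names, labels, image tag, and password here are my assumptions, not taken from the demo:

```yaml
# Hypothetical pod manifest (mysql-pod.yaml); names and values are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: mysql
  labels:
    name: mysql
spec:
  containers:
  - name: mysql
    image: mysql:5.6            # assumed tag
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: example            # demo only; use a Secret in real deployments
    ports:
    - containerPort: 3306
```

It would then be started with kubectl create -f mysql-pod.yaml, which is presumably the command being run on stage here.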
49:17
That's why it's not running. We've just gone to 1.0.
49:20
All my demos break. That's why I know. I had it all running, but we had to reboot my machine because we were having problems with the display. Now we have a pod. Hey, a pod running MySQL. Have you ever spun up MySQL so quickly?
49:44
I'll bet you haven't. The next thing we want to do is run some PHP. Unfortunately, the front-ends are PHP currently. I was trying to find time to update them completely, but I had some problems with Flask and Angular. Has anybody else had problems with Flask and Angular?
50:01
No? Okay. I should talk to you all. Basically, my badge here says that my Python skills are rated as three stars, so I probably need to talk to you guys about doing it. kubectl create -f: we're gonna create a controller.
50:22
We have a file already created, and we're gonna create from that. And now we have pods and a replication controller. I might need to make that smaller. So now we have some front-end pods and a front-end replication controller. So the next thing we want to do is actually look at the running application.
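The file passed to kubectl create -f here would define a replication controller: the desired number of pod replicas plus a pod template. A rough sketch on the v1 API — the names, labels, and image path are assumptions for illustration, not the actual demo files:

```yaml
# Hypothetical controller manifest (frontend-v1.yaml); names are illustrative.
apiVersion: v1
kind: ReplicationController
metadata:
  name: frontend-v1
spec:
  replicas: 2                   # desired pod count; the RC keeps this many running
  selector:
    name: frontend
    version: v1
  template:                     # pod template stamped out for each replica
    metadata:
      labels:
        name: frontend
        version: v1
    spec:
      containers:
      - name: frontend
        image: gcr.io/example-project/frontend:1.0   # placeholder image
        ports:
        - containerPort: 80
```

Creating it with kubectl create -f frontend-v1.yaml yields both the controller and its pods, which matches what appears in the visualization.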
50:41
Because my window's all screwed up, I might struggle. We have it running. So this is the IP address of the service, as we can see here. And this is the application running. But it says Devoxx. Anybody been to Devoxx? You don't wanna go to Devoxx. You wanna go to EuroPython, right? So I told them to fix this beforehand.
51:01
So we have an update, and we can roll that update out really easily to our cluster. So let's do that. Let's roll out an update to our cluster. And I'm gonna close that down so we can see the visualization. And I will go, I'm gonna reverse my history for this. And I'm going to update to V2 of our front-end controller.
51:25
So what's gonna happen now is it creates a new controller, and then it's gonna change those pods one by one to roll out our new version. So now we have three pods: one 2.0 and two 1.0s. We're gonna get rid of one of the 1.0s. Then we have a 1.0 and a 2.0. And then we're gonna create a new 2.0 pod.
51:42
And then we're gonna get rid of the other 1.0 pod. And eventually we only have two 2.0 pods, and we get rid of the 1.0 controller; we don't need that anymore. And if we go back to our app.
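This rolling update is typically driven by a second controller manifest that differs from the first only in its version label and image tag. A hedged sketch, with the same assumed names as before:

```yaml
# Hypothetical v2 manifest (frontend-v2.yaml); only version label and tag change.
apiVersion: v1
kind: ReplicationController
metadata:
  name: frontend-v2
spec:
  replicas: 2
  selector:
    name: frontend
    version: v2               # distinct selector so v1 and v2 pods are separable
  template:
    metadata:
      labels:
        name: frontend
        version: v2
    spec:
      containers:
      - name: frontend
        image: gcr.io/example-project/frontend:2.0   # bumped placeholder tag
        ports:
        - containerPort: 80
```

Rolled out with something like kubectl rolling-update frontend-v1 -f frontend-v2.yaml, which scales the new controller up and the old one down one pod at a time, exactly the dance described above.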
52:00
Nothing's working. And refresh, we should get... yay! I'm hoping it works. I'm hoping MySQL is running properly.
52:20
So, okay, that works, brilliant. The other thing I can do as well, I should mention it. I'm probably getting close to running out of time. What is the command? Is it RC or not?
52:41
I can't remember; I always forget the scale command. So I'm gonna take v2, and I'm gonna scale it to six replicas, or five replicas. And we go back to our viz. So now we wanna add replicas to this.
53:00
We can do that by scaling like that. And then we have five running pods. Okay, that's as simple as that. So now we have five of them running. We can do that also within the developer's console. And just to wrap up on the whole thing, just a quick talk about the last bits and pieces.
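Scaling is really just a change to the controller's replicas field; the kubectl scale command (the one being reached for here) edits it for you. Roughly, assuming the hypothetical controller name frontend-v2:

```yaml
# Equivalent to: kubectl scale rc frontend-v2 --replicas=5
# (controller name is an assumption, not from the demo)
spec:
  replicas: 5    # the controller creates or deletes pods until the count matches
```

The replication controller then converges the running pod count to the new value, which is why five pods simply appear in the visualization.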
53:24
That's how we visualized it: using the API and a proxy. kubectl supports a proxy; we just point it at some JSON. The JavaScript is all jsPlumb. So if you want to know what we used: jsPlumb.
53:42
In terms of Container Engine cluster scaling, we have this thing called a managed instance group, and that runs all of our nodes; nodes run within the managed instance group. And we have this thing called an instance group manager that creates them and is responsible for making sure they're running, so that's actually monitoring the cluster of nodes. And we have a template by which we can create new nodes on demand.
54:01
So we can resize that managed instance group very easily. And yeah, I think that's about it for cluster scaling. We can also create clusters using tools such as the Google Developers Console, Google Deployment Manager, and Terraform.
54:21
I was gonna give an example, but it's very basic. Terraform will create a cluster for you, but it won't allow you to resize it; if you want to resize it, you have to replace it completely, which isn't really what you want. So you can create clusters in various different ways. Oh, that's the visualization. Some frequently asked questions are answered in the documentation. I could spend entire hours on each of these subjects.
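For reference, the basic Terraform example being alluded to would look roughly like this — a hedged sketch: the resource arguments are from memory of the provider of that era, and the name, zone, and credentials are placeholders:

```hcl
# Hypothetical Terraform config; values are placeholders, not from the talk.
resource "google_container_cluster" "demo" {
  name               = "demo-cluster"
  zone               = "europe-west1-b"
  initial_node_count = 2

  master_auth {
    username = "admin"
    password = "change-me"   # demo only
  }
}
```

As noted, changing initial_node_count after creation forces a replacement of the whole cluster rather than a resize, which is the limitation being described.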
54:44
So if you have questions, I'll be outside all day on the Google booth. Come and see me. And Kubernetes is open source, so we want your help making it even better. So please contribute to Kubernetes. If you have questions, go to IRC: irc.freenode.net, channel #google-containers.
55:01
It's a very popular place. And also on Twitter: @kubernetesio. You can tweet questions to me, but I'll answer them now, or you can find me on the booth. And ultimately, that's it.
55:21
Is everyone okay? Yes, we have time for one or two questions. Okay. At the beginning, you were talking about Borg and the five masters that you run,
55:40
and are those figures based on the data center, or how does that work? Based on the cell. So we break it up into cells, and each cell has its own Borg master. All right. And that's about the limit of my knowledge of the complexities of how Borg works; I can't go into more detail, sorry. But yeah, that's how it works.
56:00
That's why we have multiple: so within Google, you're gonna find multiple Borg clusters, or Borg cells, effectively. Hi, thank you for your talk. Very interesting. When you compared VMs and containers, you said even if the user of a VM has root access,
56:23
it's very difficult to escape from the hypervisor, et cetera. How do you see the security in the current container implementations? It's a work in progress. So this is about security with containers. I'm not really gonna comment too much on it, but it's getting better all the time. Initially, we had problems at the kernel level,
56:42
with syscalls and such being made back into the running operating system, but it's getting better. So ultimately, Docker and suchlike are becoming more secure all the time. Ultimately, doing multi-tenancy currently, with multiple customers' applications running side by side, may not be the best idea. But we have to tackle that.
57:00
So ultimately, we have to make sure people are more confident that they can run all of their jobs on containers securely. I don't think we're quite there yet, but we're working on it. So that's one challenge we need to crack. Does that answer your question? One more question, or is that done? No, that's enough. Okay, come and find me outside.
57:22
Come and find me outside. We can talk about PHP. Oh, it's Python, sorry. From our organization, we want to thank Manu for coming, and give her a present.
57:41
Thank you. Should I open it? Oh, no. Oh, that's wonderful. Ah, fantastic. Exactly what I need. Thank you very much. Thank you. Thanks for having me.