Intro to Kubernetes
Formal Metadata
Title: Intro to Kubernetes
Number of Parts: 95
License: CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/32298 (DOI)
FrOSCon 2017, Part 49 of 95
Transcript: English (auto-generated)
00:07
Hello, everybody. Our next talk is an introduction to Kubernetes, and it's held by Max, so please welcome our speaker.
00:23
Can everybody understand me well? Is that fine? Good to go. All right, introduction to Kubernetes. Today we're going to rethink scalable infrastructure with containers. Let's see how many buzzwords we can put in this presentation. I'm Max. I'm a test engineer
00:44
at CoreOS. I'm working with the Kubernetes team and helping them keep their quality up. We have an entire hour, so I would like to have this as interactive as possible. If you have questions, feel free to ask during the talk or after the talk, but I'm
01:02
also here a little bit after the conference, so you can ask me afterwards or reach out to me over any kind of social media or via email. That is appreciated as well. Okay. Why is somebody from CoreOS standing up here, and what does CoreOS have to do with containers
01:21
and Kubernetes? CoreOS is a company in San Francisco that is also based in New York and Berlin, and what we do is we secure, simplify and automate container infrastructure. And now that might not tell you a lot, so I'll just go over some stuff we do. So we have enterprise products. I'll not go into detail at a free and open source conference
01:44
on our enterprise products, but I want to briefly touch on our open source projects. That would be Container Linux, a very minimal Linux distribution that you just run your containers on top of; then Rocket as a container engine, you can think of it
02:02
as an alternative to Docker; and Flannel, for example, as an overlay network, or etcd as a database, which is now pretty much the brain of Kubernetes. These are our open source projects. We also contribute to a bunch, so that will be Prometheus and Kubernetes. I hope you can see where I'm pointing here, I can't really point at both. So that will
02:25
be Prometheus and Kubernetes, and so we are very involved in those upstream projects. Okay, that's it for CoreOS. So what is Kubernetes? That's going to be the title and the topic of the talk today. If I want you to take anything away from this talk, it would be this: it's
02:45
a platform for running your applications. That's it. Nothing very complicated, so whenever you have a chitchat you can just shoot out that sentence. But we want to dive a little bit deeper into Kubernetes than just that. So let's first look at what kind of problem
03:00
is Kubernetes actually trying to solve. And I want to do that with a little example here. So let's just imagine we are a startup and we have that really, really great idea. We have our phone and we do an augmented reality app. So we walk around in the streets and then we find creatures all over through our phone, we can see them, and then we throw
03:22
balls at them until we catch them and we, I don't know, swap them and fight them against each other. Apparently that's very popular. So we got our location service at the very beginning. That just returns true or false when we give it GPS coordinates. Very highly sophisticated, and we run that on a normal server, right? We probably
03:46
put just Debian on it and do an apt-get and install all the dependencies. We're good to go, we got our first service running. Then in addition, our startup is slowly evolving. People want more than just a location, so we develop a user service and we develop in
04:04
some app X service, I wasn't creative enough with the name. And we put those on servers as well. So now we have three applications, each on a separate server, just managing the dependencies via apt-get. So it's really easy, not very complicated. And whenever we want to roll out
04:24
a new version, we take down the old one and just start the next one. Downtime is not really a problem at the beginning. Now this is getting really popular. People really like it and a lot of people are actually running around with their smartphones in their hands. Oh hang on, I'm sorry. So we probably have to scale this. So we don't only need one
04:48
instance of all of our applications, but we probably need multiple instances of all of our applications. And now slowly it's getting a little bit more complicated. So first of all, dependency management. How do you do that on all of these servers? Well you have to keep
05:03
all the versions the same, right? And whenever you want to roll out a new version, you have to SSH into every single server, take down the application, put up the new application. Now you probably slowly worry about downtime. You might want reproducible environments,
05:21
so you want your developers not to say anymore it works on my machine. Then in addition, you probably want networking in between these services. You want the location service maybe to know where the user service actually is, so do you configure it, hard code it every time you get a new server and put it on there?
05:43
In addition, you might want monitoring around all of this, so you want to be woken up in the night if something is going very, very wrong. And plenty of other stuff you want to do here. So it's actually a lot of work. Three servers per app might not be a problem
06:00
at the beginning, but the more servers you get, the more complicated it gets. And maybe that is all developed on different technology stacks, like one is written in Go, one is written in Python and so on, and suddenly you can't run the location service on the user server and the user application on the location server because the dependencies are
06:21
simply not there, right? And now it slowly comes down to utilization. I want to use as much of the hardware as possible, so maybe the user service does some analysis in the night, but really doesn't need any resources during the day. So that could be taken over by the location server, but then again our dependencies are a problem, so we can't really run our location
06:42
service on the user server. Okay, a bunch of problems. Well, we're here for a solution. So the suggestion I'm making here is: why would we have specific servers when we can abstract in between them? So we just have servers that are all configured the same, and we have applications
07:03
on top. And we need something in between that does the abstraction for us. Abstracting from hardware, abstracting from network, and abstracting away from processes. And as you might guess, that would be Kubernetes today, and today I want to explain what this box is, and so that you can
07:22
explain to others as well, and maybe even use it or set it up yourself. A little bit about Kubernetes. We've already had our number one sentence: it's a platform for running applications, abstracting away your infrastructure. That's the basic thing you have to think about when the name Kubernetes comes up. A little bit of history. What is Kubernetes all about,
07:46
or where does it actually come from? Well, it comes from Google, and it turns out Google actually has a lot of experience running containers. They've done that for many years, and actually a lot of the main technologies that we use today in the Linux kernel to run
08:01
containers, Linux containers, were contributed by Google. For example, the concept of cgroups. So, Google has a lot of experience with that, and in 2014 they open sourced the way they manage their infrastructure. But they don't do that by open sourcing code; they do that by open sourcing ideas and learnings. And that is picked up by the community
08:25
and by the engineers of Google on GitHub, and now it's an open project developed by a lot of companies altogether. It's very influenced by Borg, as I said, by the learnings and ideas from Google. Borg is their internal cluster management system, I think still to this moment. And,
08:45
well, this open source project developed and in 2015 version one was released, and with that version one it also joined the CNCF project, so now it's not part of Google anymore, but it's actually part of the CNCF, the Cloud Native Computing Foundation.
09:02
Google is still heavily invested in it; their GKE is actually running on this, so it's still very important for them. Okay, let's dive into a little bit of detail on Kubernetes, and I just want to get a sense of your experience with it. So, here, who has ever, let's say, read about Kubernetes? Okay, that's really good. Who has ever run Kubernetes or
09:27
ever interacted with kubectl or something like that? Okay, cool. Who is actually using Kubernetes in production right now? One, two, three. Okay, three. Cool. Okay. I hope after
09:40
this talk all of you are just changing everything. All right, let's go into core concepts, so core components, so you first understand all of this. Well, first of all, let's go back to our location service. I wrote that highly sophisticated location service in Go, of course, and now as a good hype-driven development start-up, I wrap a container around
10:04
it. And let's hope I don't only read Hacker News, but have actually put some thought into this. So, what I do is I take my application and put everything that application needs inside that container. That would, for example, be the Golang environment.
10:22
Now these software containers, what are they actually? I talked about cgroups earlier. We have namespaces. So, the Linux kernel actually has no clue what containers are. It's just a higher level concept that you build from smaller concepts, namely cgroups and namespaces: what can a process see and what can a process actually use? I'll not go deep into
10:46
containers today. That's an entirely new talk, but I'm very happy to point you towards more information here. Now, this higher-level abstraction is provided by, for example, Docker and Rocket, and these are the container runtimes that we can use. How does it help us,
11:03
this container idea? We've been shipping software forever; why do we need this suddenly? Well, first of all, portability. I talked about the works-on-my-machine problem. You probably want your developers to run in the same environment as the one it is deployed in in the end. So, now your developers can develop in the same Docker
11:24
container and have the same versions around. In addition, in terms of operations, it's very important to have everything isolated. We were able to do that on the machine level with virtual machines before, but now we can drill a lot deeper and do that on the process level.
11:41
And lastly, we want resource accounting. We want to see who is eating how many resources and we want to be able to restrict this as well. So, all of these things are combined now in this hype of containers. Now, Kubernetes doesn't stop here. Kubernetes not only wraps
12:00
a container around our application, but it wraps another concept around it, and that would be a Well, a pod is the smallest deployable unit that you can possibly have in Kubernetes. And the idea behind it is sometimes you don't only want to have one process. So, for example,
12:24
in case of our location server, we might also put a little network proxy in front of it, or some logging framework in front of it. So, something that has to be deployed with that container every single time. So, why don't we all put it together in one pod,
12:40
and then we can schedule a pod somewhere. So, that's just a simple concept around it. Okay. Remember this picture. I have to jump a little bit, but we'll come back to that. Okay. So, we want to run software. What do we need for that? Of course, hardware. Let's first of all buy a bunch of servers. That's probably a lot of fun. And let's call this
13:03
one the master server. And a master in Kubernetes is just a server. It can be bare metal, or it can be a VM, or you can get really creative. And it needs a Linux kernel. That's the main idea. We need a Linux kernel as a basis, and that could, for example, be Container Linux, but that could also be RHEL or Debian, and so on.
13:25
Now, on that master, we deploy Kubernetes. And as a user, you could just think of Kubernetes as just that one black box, or in this case, green box. But let's go a little bit into detail. So, that thing is called the Kubernetes control plane. And the Kubernetes control plane,
13:43
it has the API server, and it has the controller manager in there, the scheduler, and the kube proxy. I'll go a little bit into detail of each of those components, but just to wrap it up really quickly, the API server is the thing you talk to. The controller manager is actually taking care of all your objects inside your cluster.
14:01
The scheduler schedules the workload on each node, and the kube proxy, for example, takes care of the networking. We'll dive a little bit into detail there. Running one master is probably boring, so we buy more servers, we buy a worker, and that worker, again, is just a bare metal machine and need Linux, and that is actually where
14:23
our workload is actually running. You can still run your workload on the master node, but you probably don't want to when you run Kubernetes at a scale of more than just two servers. Okay, here again, we deploy a little Kubernetes that is called the kubelet. It listens to the
14:41
big Kubernetes, and whenever the big Kubernetes says, hey, please deploy something here, then the little Kubernetes says, okay, I'll do that, and it starts that up on your worker. Okay, one worker, again, boring. Let's have a couple more, and tada, we have our infrastructure. Cool, we're good to go. Let's deploy some applications on this.
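The pod concept from a few minutes ago can be sketched as a minimal manifest like this (a sketch only; the names, images, and ports are invented for illustration):

```yaml
# A pod grouping the location app with a small logging sidecar.
# Both containers are always scheduled onto the same worker together.
apiVersion: v1
kind: Pod
metadata:
  name: location-pod
spec:
  containers:
    - name: location              # the Go location service itself
      image: example/location:1.0
      ports:
        - containerPort: 8080
    - name: log-collector         # hypothetical sidecar deployed alongside it
      image: example/log-collector:1.0
```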
15:02
So, Kubernetes has this very declarative style of interacting with the cluster. You don't really tell Kubernetes what to do or how to get there. You simply describe the state you want to get to, because Kubernetes knows a lot better how to get there than you, because Kubernetes knows entirely what its current state is and how to get to the next one.
15:23
So, what you do is you write deployment YAMLs. You can write deployment JSONs as well, it's just a format in the end. And what you do is you write this deployment YAML, you give it a name, you give it a replica count, so, for example, now I want to run three replicas, and you give it a container. The container you would just push through a
15:43
container registry, like, for example, Docker Hub. And, sorry, I skipped that a little too fast. We give that deployment now to Kubernetes, namely the Kubernetes API server. And the Kubernetes API server picks it up and saves that. The controller manager picks
16:03
that up and goes over it to check that it's valid. Then the scheduler sees, oh, there are three replicas, but they are nowhere deployed, so it picks that up, and then it schedules those on the worker machines; then the kubelets see, oh, something got scheduled on me, so let's start that. And that's it. Now you have your applications up and running
16:25
just by giving that YAML to your Kubernetes server. Now, I talked earlier about rolling deployments or how you can now roll your versions in your cluster, which is very difficult if you just have a bunch of bash scripts doing this, and if you want to scale there. So, in terms of Kubernetes, it's very
16:43
declarative: you just change the version of your Docker image, don't forget to push that to a registry, and give that to Kubernetes, and Kubernetes knows what to do from there, and will do a rolling release, slowly moving all the pods over to the new version.
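The deployment YAML described above, with a name, a replica count, and a container, could look roughly like this (a sketch; image name and labels are made up):

```yaml
# Deployment: name, replica count, and a container image.
# Rolling out a new version means bumping the image tag and
# re-applying; Kubernetes then performs the rolling update itself.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: location
spec:
  replicas: 3                         # run three replicas of the pod
  selector:
    matchLabels:
      app: location
  template:
    metadata:
      labels:
        app: location
    spec:
      containers:
        - name: location
          image: example/location:1.1  # change this tag for a rolling release
```

At the time of the talk the API group would have been a beta one such as `extensions/v1beta1`; `apps/v1` is the later stable form.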
17:00
And that's it. That's the whole deployment process. Now, you might be saying, okay, the location service, come on, you give it, I don't know, latitude and longitude, and it returns true or false, that's really not difficult, we've solved this for years, that's nothing new at all. So, everybody can do stateless applications, and that's where Kubernetes probably really shines, right, because it's not that
17:24
difficult. But stateless is easy, stateful is actually hard, and I don't want to just talk about stateless application, I want to go a little bit into stateful applications and how you could possibly run those in Kubernetes. So, let's first look at the problem. Stateless applications, we don't really care if they die, we can just start up a new one.
17:44
But stateful applications, for example, here a MySQL database, it connects to the local disk, and writes its stuff there. And if this fails, all your data is gone, or you can maybe restart your machine, but you will definitely have downtime, and so on.
18:01
So, I suggest a different approach, and some might look mad at me here. I propose having network storage, and putting that idea of replicating and so on into your network storage, and having just mount points into your server, and your MySQL database attaching to that. So now, whenever my server, let's say, dies, we can just move the disk
18:27
to a new server, and start the process there again, and we're happy. We had a little bit of downtime, but we can optimize that to make it really quick. That's the main idea of how to run stateful applications on Kubernetes. What Kubernetes here introduces
18:42
are different concepts. First of all, persistent volumes. In the end, state has to be saved somewhere, so we need persistent volumes. That would be network storage. It can either be statically or dynamically provisioned. So, the idea is either you give Kubernetes a bunch of disks, or you're, for example, running on a cloud provider, and then Kubernetes
19:03
already knows how to spin up new disks on AWS. And there are drivers for GCE and AWS, I think. There are a bunch of open source projects that you can integrate with, so there's a huge community around that. And, of course, we have a new concept, not deployment YAMLs,
19:21
but now we have StatefulSet YAMLs, and these are just for stateful applications. And the key takeaways here are unique network identities, I'll go into detail on how that works, and persistent storage, of course. Now, your application in a stateless
19:41
environment, you don't really care what application it is and where it lives, but in a stateful environment, we actually really care about that. So, how would this look, again, in the Kubernetes idea? Well, you give it the StatefulSet YAML, and if you look at it, the change here is that we have a volume claim template, and thereby
20:02
you describe to Kubernetes what your application needs. So, in this case, we need a storage class of anything, that could be SSD or spinning disk, it's not very specific here, and let's say one gigabyte. And then Kubernetes deploys your MySQL, for example, somewhere, starts the process, and mounts the disk in there, and you're good to go. Let's go through
20:27
a failure example, this thing dies, then you can just start up a new one on the same node, let's say the entire node dies, or I don't know, you cut a cable or something, then you can just start this process on a different worker, and mount the same disk
20:42
in there. And now I talked about network identities, the network identity moves from the upper one down to the lower one. So, your applications really just notice a downtime, but they don't notice that their entire data is gone, as it's not. Okay, cool.
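The stateful YAML with its volume claim template, as just described, might be sketched like this (names and the storage class are assumptions for illustration):

```yaml
# StatefulSet: a stable network identity plus a persistent volume
# claim per replica. The volumeClaimTemplates section asks Kubernetes
# for one gigabyte of some storage class, as in the talk.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql   # mount the network disk here
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard        # could be SSD or spinning disk
        resources:
          requests:
            storage: 1Gi                  # "let's say one gigabyte"
```

If the node dies, the pod is recreated elsewhere, keeps its name (`mysql-0`), and the same volume is mounted into it, which is the network-identity behavior described above.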
21:02
This is for basic stateful applications. There's also the idea of operators; I went through that a little bit in my previous talk. The idea is that we have all that knowledge around how to operate stateful applications, which is really difficult, and people put that knowledge into code and wrote operators. That would, for example, be the etcd operator
21:23
or the Prometheus operator. Okay, do we have any questions so far? I'll still cover the network part, don't worry. So, how do we apply the concept of pods here in this
22:02
scenario? Here in this scenario, containers and pods are pretty much the same. It's just one MySQL process; you don't have any sidecars next to it. And here that container would just be in a pod, and there would just be one container in that one pod. And
22:24
then you would mount the volume into the container itself. Let's imagine, for example, we want more logging around this, so for example, we want to log whenever there was a request to our MySQL database, then we might not want to change the MySQL source
22:40
code, but if we want to deploy a little application next to it, we would just place it in the same pod inside a new container, and then we would have two containers in one pod. Does that explain it? You're hesitating. Yeah. Yeah. Then you don't put it in the
23:09
same pod. Only if the numbers always match, right? If you have an application that always lives with the other application, you always deploy it in one pod. If the numbers
23:20
are not equal, you don't do that. Very good. All right. I'll move on to networking, then we can go over questions again, and then I'll go into more expert details on Kubernetes. Okay. We've got the operators covered, right? And
23:40
now I talked about networking, and that is actually a huge pain. It's not only storage, but also networking in total, and the problem is our pods can really move around in our architecture over and over and over again. They can die and come back, and we don't really know about it. We don't want to know about it, because Kubernetes manages
24:01
it a lot better and a lot faster than we do, but the problem is how do we tell one pod to communicate with another pod if they really don't know where the other pod is, in such a fast-moving infrastructure? So here we introduce the idea of services. You can think of a service as pretty much just a proxy in front of the pods; it basically just
24:24
groups the pods. So, for example, if I have the location pod, and I deploy three of those, so I have three location pods, then I put this location service in front of that, and whenever another application needs to talk to my location service, I just point it to that service, and I don't really care which pod that a specific application
24:43
is talking to; I just want to talk to anything of the type location. With these services, the idea is that you create the service, and the service will keep a static IP inside your cluster at all times, and you can obtain that IP as a different application via environment
25:03
variables or DNS. You probably want to start off with environment variables, which are automatically mounted in your pod, but you probably want to go over to DNS soon. So how does this work, and how can one IP address be in the entire cluster, and how can I call that IP address from everywhere and still talk to different pods? That's
25:23
a little bit strange and crazy. We don't do any port mapping here in Kubernetes; every pod has its own IP in the end. So let's say we have a front-end pod and that talks to our back-end, and somehow the front-end needs to talk to it,
25:40
so it goes over the network and just makes a request to it. It reads the IP address from the environment variables inside the container and sends a request to that IP address. So what actually happens here? The front-end pod really doesn't know anything about what happens, and it doesn't need to. So it just talks
26:02
to the IP address, and something answers from that IP address. But what is actually happening underneath is that we build a virtual network on top of our normal network, and the kube-proxy, which all the traffic goes through, talks to the API server, gets the
26:21
pod IPs behind that service IP, and writes those into iptables rules. And now the Linux kernel does the load balancing, translating the service IP into pod IPs. So the front-end doesn't know anything; it just talks to that service, and the Linux kernel automatically translates to one of the pod IPs. And now this would look like that. It would just,
26:43
for example, pick the first one if we do round robin, and would just talk to the first worker. And the next time it might talk to the next one. So it's fairly random, which is fine for stateless applications. Okay. How can you get started on this entire Kubernetes
27:02
idea? Well, first of all, I think to really try it out in a quick way, you can just start up Minikube. Minikube is really nice. It's just a VM on your computer running a single-node cluster. You're up and running right away, and you can deploy your stuff in there. Of course, you're bound by the resources of your laptop.
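Once Minikube is running, you could try the service idea from earlier with a manifest like this — a sketch, where the `location` names and ports are assumptions, not from the talk:

```yaml
# A Service that groups all pods labeled app: location and gives
# them one stable cluster IP (names and ports are made up).
apiVersion: v1
kind: Service
metadata:
  name: location
spec:
  selector:
    app: location        # matches the three location pods
  ports:
    - port: 80           # the stable port other applications talk to
      targetPort: 8080   # the port the location containers listen on
```

Other pods can then reach it via the DNS name `location` or via environment variables such as `LOCATION_SERVICE_HOST`, which Kubernetes injects automatically.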
27:24
Then next is of course Tectonic, which I want to talk a little bit about in a minute if I still have time. And of course there are hosted versions like GKE, or tools like kubeadm, that you can use to spin up Kubernetes clusters. Okay. Let's do a real quick recap, then
27:41
I'll go over questions, and then if there are still people interested, I can tell you about more ways to run Kubernetes. So, our sentence again: Kubernetes is a platform for running applications. You have your application, you wrap it in a container, and you can wrap multiple containers in one pod if they are
28:02
still meant to always run together. That is your infrastructure: you have a master, you have the control plane on the master, you have kubelets on the workers. You build that infrastructure, you write deployment YAMLs and give them to Kubernetes, and you write stateful set YAMLs and give those to Kubernetes.
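A deployment YAML like the ones just mentioned might look like this minimal sketch — the image name is made up, and at the time of the talk the apiVersion would likely have been `apps/v1beta1`:

```yaml
# A minimal Deployment: wrap the app in a container, ask for three
# replicas, and let Kubernetes keep them running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: location
spec:
  replicas: 3
  selector:
    matchLabels:
      app: location
  template:
    metadata:
      labels:
        app: location
    spec:
      containers:
        - name: location
          image: example/location:1.0   # hypothetical image
          ports:
            - containerPort: 8080
```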
28:20
In the end, Kubernetes is just abstracting away your infrastructure. All right. Okay, cool. Yeah, I'll go into questions now, if you still have some. And yeah, please go for it. Yes. Yes. By scaling this.
28:51
Yes, I'll repeat the question. Because of write applications, can you go a little bit more into detail?
29:11
Okay, so the question is: what happens if we have race conditions — if we have two containers mounting the same volume
29:20
and both writing to it at the same time? Well, first of all, Kubernetes, I think, cannot solve everything. There's probably a solution for that situation out there that some users have built. But I think the idea is, you don't really want to have two MySQL databases writing to the same file system. Or if you do, you really want them to write into different folders or something like that.
29:40
So you have to take care of that race condition yourself. And what I would suggest is that you have two network storage volumes and attach one to each container. So each container has its own network drive.
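With a StatefulSet, Kubernetes can do exactly that — give each replica its own volume via `volumeClaimTemplates`. A sketch; the names, image, and size are assumptions:

```yaml
# Each replica of this StatefulSet gets its OWN network volume through
# volumeClaimTemplates, so two instances never write to the same files.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql   # each pod sees only its own disk
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```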
30:02
And I hope you replicate your data not through a shared file system, but by having the databases talk to each other. Let's go into detail afterwards; I think there are a lot of different scenarios for how to do that. Yeah, sure.
30:21
You have to build it yourself. But if we still have time, I can show you a project where you can just run one command and it builds everything for you. Right?
30:41
Maybe not just one command. All right, I don't know who was first. Sorry, go for it please. How to magically — okay, let's push this question back a little bit, and I'll talk about how CoreOS thinks it's best to bring up Kubernetes clusters.
31:01
Okay, all right, yes. Okay, do I always deploy a service in front of a pod? It really depends. If, for example, you have a batch job, nobody needs to talk to that, and you don't really need a service in front of it.
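Such a batch job might look like this — a sketch, with a made-up image and command:

```yaml
# A batch Job: runs to completion, nobody needs to talk to it,
# so there is no Service in front of it.
apiVersion: batch/v1
kind: Job
metadata:
  name: report
spec:
  template:
    spec:
      containers:
        - name: report
          image: example/report:1.0      # hypothetical image
          command: ["/bin/report", "--once"]
      restartPolicy: OnFailure           # retry the pod if it fails
```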
31:21
But whenever you need to talk to that thing, you probably want a service in front of it. It's a whole different story around stateful applications. But I, yeah, I can't cover everything in the talk. Feel free to reach out afterwards. Sorry about that. Yeah?
31:48
I'm sorry, can you? It's basically difficult.
32:03
So the question was, what happens if I have a local disk, local volume claim, and now my node dies and now I have to move it to a different node, right? So I think I probably explained that the wrong way.
32:21
Let's go back a little bit. So you don't have local storage — you have network storage, right? So you don't care about the hard drive on your node, on your machine; you let that be managed, for example by S3 or something from your cloud provider, or by something you built yourself with open source solutions.
32:42
And then you mount your network drives into it. So whenever this node dies, we have never written to the local disk, so we can just move the application to a different server, mount the same network storage there, and we don't have to care about moving the data.
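The pattern just described — network storage that follows the pod to whichever node it lands on — can be sketched as a claim plus a mount; the `storageClassName` and names here are assumptions that depend on your provider:

```yaml
# A PersistentVolumeClaim backed by network storage: if the node dies,
# the pod can be rescheduled elsewhere and mount the same data again.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard   # e.g. a cloud- or Ceph-backed class
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: example/app:1.0   # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /data     # data lives on the network volume
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```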
33:02
Sorry? Okay. All right, okay. I'll just go into a little bit of detail on how CoreOS runs this, and if there are still questions, I'm around afterwards; I think that's probably more helpful. Let's go — I'm sorry about all the slides.
33:25
So we are contributors to the upstream project of Kubernetes. And we have a very opinionated way how to run Kubernetes. And so we know how to make it most resilient as possible. And I want to cover really quickly how we do that.
33:41
So the one-click solution that I proposed earlier — I might have overpromised a little bit — is the Tectonic Installer. Don't be afraid of the word Tectonic here. In the end, you can still spin up just a vanilla Kubernetes cluster with it. It is CoreOS's way to run Kubernetes.
34:01
And it is actually built with Terraform. Probably a lot of people are familiar with Terraform; it's a very nice way to integrate with pretty much any provider, so you can extend it nicely. We are right now supporting AWS, bare metal, and Azure; there is code for OpenStack, and there is going to be support for VMware.
34:23
So what does this installer actually do? Well, it starts up Kubernetes in a very special way, and that is the idea of self-hosting. Is anybody familiar with self-hosting compilers, for example? Has anybody ever heard of them?
34:41
So the idea is if you write your own language, in the end you can compile your own language with your own language. So it's kind of a chicken and egg problem here. And what we do is we self-host Kubernetes. So the idea of Kubernetes is you have your application and now you can nicely scale it up and down. You have great tooling around it.
35:01
If it dies, we just restart it. Now why don't we do that not just with our applications, but also with our cluster itself? That's the idea of self-hosting. So we run Kubernetes inside Kubernetes itself. Now that's a chicken-and-egg problem. And I think someone over there is quite surprised. I'll go into detail on how that works.
35:21
But first of all, we have different stages. So for example, DNS and add-ons — those are easy to deploy on your own cluster; in the end they're just deployments. Then it gets more difficult at level three with the scheduler, controller manager, and proxy, because try to schedule a scheduler without a scheduler — that is really difficult. And then the second level, where we are at right now
35:43
with the Tectonic Installer, is the API server started by the same API server. Then you can go experimental with etcd, and you can go crazy with kubelets, but we are not at the point of actually self-hosting kubelets. So how does this work? Well, it's an open source project. It's called bootkube.
36:01
And it helps you run Kubernetes inside Kubernetes. I want to go through the steps of how this actually works. So on your machine, you have your kubelet — the kubelet just being the daemon that can, in the end, spin up containers. And that's all that bootkube needs at the beginning.
36:21
You can think of bootkube as just a script that creates things. So first of all, we spin up a normal Kubernetes cluster. That might be difficult, but if you automate it, it's actually pretty quick. It spins up a temporary etcd cluster, a temporary API server, and a temporary scheduler.
36:41
These are just temporary components to bootstrap everything. Once we have that, on our bootstrapping Kubernetes cluster we start our actual production Kubernetes cluster. So we start an additional etcd cluster, an additional API server, an additional scheduler, and so on.
37:02
Now, this API server sees the temporary API server and says: oh, there are two of us, so I'm not going to do anything. It just idles around — we do leader election by default, so it doesn't really do anything; it's just sitting there. The same goes for etcd and the scheduler.
37:22
Now what we can do is move all the data — the brain of Kubernetes — from the temporary etcd over to the long-running etcd. And then we can kill the temporary components. The production API server sees: oh,
37:42
there is no other API server, I'd better take over from here. And then it takes over, and now you have your Kubernetes cluster running inside your Kubernetes cluster. You might be asking: why? I see a lot of confused faces here.
38:00
Well, first of all, we have small dependencies. We run our cluster components the same way we run our applications. We have deployment consistency, as I said: we can run everything the same way. We have easy introspection — Kubernetes brings a lot of tooling for introspecting your applications, and now, in addition, you can easily introspect
38:24
into your cluster itself — for example, metrics and logging and so on. Then you can easily do cluster upgrades. I don't know how many people here are running Kubernetes in production, but upgrading it is actually a huge pain. With self-hosting, cluster upgrades are just like application upgrades:
38:43
you change your deployment YAML and give that to Kubernetes, and Kubernetes takes care of the rest. And in addition, we have easier high-availability modes. For example, we can scale our API server now: just as we scale our applications, we can also scale our API server.
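In a self-hosted setup, upgrading or scaling a control-plane component can look like editing an ordinary workload. A rough, purely illustrative sketch — the real bootkube manifests use different kinds, flags, and images:

```yaml
# Upgrading a self-hosted control-plane component ≈ editing its manifest:
# bump the image tag (or replicas) and let Kubernetes roll it out.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  replicas: 2                    # scale the API server like any app
  selector:
    matchLabels:
      k8s-app: kube-apiserver
  template:
    metadata:
      labels:
        k8s-app: kube-apiserver
    spec:
      containers:
        - name: kube-apiserver
          image: example/hyperkube:v1.7.3   # illustrative image; bump to upgrade
          command: ["/hyperkube", "apiserver"]
```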
39:01
Now, this high availability we achieve by leader election, as I said: if one scheduler sees that the temporary scheduler is still there, it won't do anything. And we do checkpointing. So for example, we checkpoint the API server every now and then onto disk, so whenever a node, for example, restarts,
39:22
we can start up that same API server again. That is, for example, how you could update your Kubernetes cluster: you just use the same tooling you use for your applications and edit your API server deployment, and from there you go. Okay, that's the end of the more specific Kubernetes part.
39:42
I think there are a bunch of questions around that. I'll just finish my slides, and then you can ask questions around that as well. So we are hiring, if you want to get involved, in San Francisco, New York, and Berlin. We are hiring for interns as well, for Prometheus developers upstream, for automation engineers, for test automation engineers,
40:01
pretty much all over. And yeah, feel free to reach out. And I think now it's time for questions. I still have 20 minutes. Is that right? OK, I'm going to repeat the questions. All right, yes, please.
40:39
I would not call it war, but yeah.
40:54
OK, so why would I use the Tectonic Installer? Why don't I just use bootkube? Why don't I use kubeadm and all the other tools?
41:03
Well, they all have their ups and downs, and I can't really represent the other tools here. For sure, check them all out; don't just go with the Tectonic Installer. With the Tectonic Installer, as I said, we know how to run Kubernetes, so you get the entire self-hosting idea, and you get the whole story
41:21
of CoreOS Container Linux underneath, so you don't need to buy any other operating system. You get the whole CoreOS package in the Tectonic Installer. I think kubeadm is probably a more basic approach, and less opinionated than the entire package.
41:41
But there are probably a lot of blog posts comparing those tools; I don't think I'd do a good job comparing them here. OK, other questions? Yes? OK, so we are load testing this a whole lot. But as you know, we're a product company —
42:03
we're not actually hosting this, so I'm not able to talk about our clients here. I'm sorry. OK, yes?
42:22
So are we using Terraform in our Tectonic Installer? Yes, we are. And we chose that on purpose, so that we don't just give people the Tectonic Installer where they can only run our version; they can actually adjust all of this. It's all open source. You can play around with the Terraform.
42:41
In the end, it's just Terraform code. Do we generate Terraform scripts? No, we don't; we don't do any macro programming here.
43:02
We just leverage Terraform itself. So we use Terraform, and you can extend it; you can clone the GitHub repo and use it as well. We have a UI installer too, if you want to go through that — it's probably easier at the beginning. It helps you bring up a cluster on AWS, for example.
43:21
OK, other questions? Yes, please. Mm-hm. So this basically means how many replicas of a certain application you will run. Now the question is how they are distributed across the workers, because from a high-availability
43:41
perspective, if you put all the replicas on one worker and that worker goes down, then you are increasing your outage. Can you influence that somehow? So if I adjust my replica count, can I say that it's not all supposed to run on one node? Or can I say: my database pods,
44:00
please run on these high-storage nodes, and so on? Yes, for sure. There are taints and a lot of other concepts in Kubernetes that you can adjust to specify what should run where. For example, we don't want three API servers all running on the same node.
44:21
So we can specify: hey, if there's one API server on a node, don't schedule a second API server on the same node. So you can do a lot with taints and tolerations. All right. Any other questions? Yes, please.
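The constraint just described — never two of these replicas on the same node — can be written as pod anti-affinity. A sketch; the labels and image are assumptions:

```yaml
# Pod anti-affinity: the scheduler will refuse to place two replicas
# of this Deployment on the same node (same kubernetes.io/hostname).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: kubernetes.io/hostname   # "same node"
              labelSelector:
                matchLabels:
                  app: api
      containers:
        - name: api
          image: example/api:1.0   # hypothetical image
```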
44:45
Can Kubernetes handle applications that are clustered and synchronized via caches? What would be the problem? Why not? OK. In the end, there's no real restriction — it's just a virtual network
45:01
that you can use and leverage yourself. But you have to pay attention, of course. You have to talk to the service now, right? Not just to single pods, or you have to talk to a stateful set. We can go into detail afterwards as well, if you want. OK, other questions?
45:22
None. OK, all right. I'm around a little bit longer if you want to reach out. I think you got the basic stuff, and I hope I didn't confuse you too much with the complicated stuff and the self-hosting idea. All right. Thank you very much for your attention. And thanks to the FrOSCon team.