
Introduction to container orchestration with Kubernetes


Formal Metadata

Title
Introduction to container orchestration with Kubernetes
Subtitle
Everything you need to know for your next job interview
Number of Parts
95
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Containers are not new, and you can hardly find a job in IT nowadays which doesn't involve dealing with them one way or another. But once you get your hands on container technology, you inevitably run into container management and orchestration topics. Kubernetes is a more or less vendor-independent orchestration platform which provides out-of-the-box automation for many standard infrastructure tasks (scaling, load balancing, scheduling...).
Transcript: English (auto-generated)
So, let me start with who I am. My name is Aleksandra Fedorova, my IRC nickname is bookwar, and maybe you have seen me in some other places under this nickname.
I'm a long-term Fedora ambassador, and formerly I worked on the Mirantis OpenStack distribution, managing releases of the distribution itself and all of its components. Currently I'm working as a continuous integration engineer at Trivago, and as no one knows what
a continuous integration engineer is: that's the person who tries to make sure that the development workflows fit the deployment workflows you use in production. So I'm trying to help developers, and then we need to find a way to communicate and to improve these integration pipelines. If you have questions or comments, or want to discuss something, you can find me today and tomorrow at the Fedora booth. You can write to me, and you can comment on my small blog on Medium.
So, today we're going to talk about a lot of stuff, but first let me explain the subtitle. Obviously this talk is very entry level, and it is not nearly enough for managing a Kubernetes cluster or really working with one. What I meant by the subtitle is that whether you are a developer, a sysadmin, or even a product manager or product owner, you need a certain baseline understanding of how Kubernetes works, so that you can talk with people about Kubernetes, the work, and the services. This talk presents a safe minimum for anyone who is going to work around this topic, and I hope you will enjoy it.
So yes, we are going to talk about containers and Kubernetes, and as I'm a CI engineer I want to give a bit of context and start from the very beginning. First, a question to the audience: who works with Docker? Awesome. And who works with Kubernetes? Awesome as well. Cool.
So I'm going to start with very, very basic stuff. What are containers and what are VMs? There's a lot to talk about, but from the point of view of a developer who created an application, or just from a very generic high-level point of view: a virtual machine has its own kernel, while a container uses the kernel from the host. This is the main difference in how containerized and virtualized processes are organized. And this main difference leads us to a first note: containers are generally not secure. Since the container's user space has direct access to the host kernel, that's definitely not a secure situation, and generally you can execute code on the host system from within a container.
There is more to the topic, but this is observation number one. Observation number two is that containers are generally weird. Container evangelists promise that containers are cross-platform and easy to move from one system to another, but you always need to keep in mind that a container represents a very hybrid operating system: you take an arbitrary kernel, you add an arbitrary user space, and you hope that it will work.
It generally does, but not always. So the generic takeaways I like to point out in a general overview of containers: first of all, use only trusted sources, and still never trust the user inside the container, unless you invest a lot of resources into researching container security topics; I guess there will be a talk about container security right after lunch here. And since we have this kernel and user space discrepancy, we shouldn't really rely on kernel or system-level features in a container. Usually container applications are essentially user-space applications, like front ends or generic services. But the thing is, it's impossible not to rely
on at least some kernel features, even if you are a high-level application developer. There was a recent example with a PHP 7 container where everything broke because of a different kernel on the host system: PHP 7 relies on a certain implementation of random numbers, and random numbers lead us to the kernel's implementation, and there was a bug where a containerized PHP 7 application wasn't able to run on a recent Fedora, for example, just because the kernel was different. So even if you are an application developer, you are not safe, and you might run into issues with different kernels. So every time you work with containers, you should test the containers on the same host system which you will use in production. Develop on any system you want, do what you want, but before you push to the real live production system you should always test on the same kind of host system, even if you hope this is a cross-platform application.
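One quick way to see the shared-kernel point in practice is to compare kernel versions. This is a sketch and assumes Docker is installed on the host:

```shell
# A container reports the *host's* kernel version, because there is no
# guest kernel: the container's user space talks to the host kernel directly.
uname -r                          # kernel version of the host
docker run --rm alpine uname -r   # reports the same kernel version
```

A VM run on the same host would instead report the kernel of its own guest OS.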
Now, containers are all good and nice, and container technology has been around for ten years or more, but then comes Docker. What does Docker add to containerization? I think, first of all, Docker simply appeared at the right time, when container technology had become mature enough to be properly used. But Docker also adds a lot of things around the container technology itself, so Docker is an ecosystem to manage containers, to work with them, to share them, and so on.
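That ecosystem can be sketched with the everyday commands; the image name and registry here are illustrative assumptions, not from the talk:

```shell
docker build -t registry.example.com/myapp:1.0 .   # build an image from a Dockerfile
docker run --rm registry.example.com/myapp:1.0     # run a container from the image
docker push registry.example.com/myapp:1.0         # share the image via a registry
docker pull registry.example.com/myapp:1.0         # reuse it on another host
```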
Again, for our high-level overview, Docker can be considered a way of managing layered images for containers. Container technology is one thing, but Docker adds a sort of Git repository for containers. And the one thing which is useful for application developers is to understand the layering structure of Docker images, because people often consider Docker just an isolation and sharing mechanism and forget about the internal layering structure. With these layers it sometimes becomes really ugly, when you have huge images which contain basically nothing. In this example you can see a container which is used mostly for building Java apps: we have a Dockerfile, which is a recipe for a container image, and we start from a base layer, we add layer one, which is a layer with the Gradle binary, and we add layer two, which is a layer with the protobuf compiler. You can see that this Dockerfile looks a bit weird, because I run apt-get update and apt-get clean several times. The reason is, again, that every instruction creates a certain layer, and this layer you will keep with you whenever you move a container from one system to another. So you always want to keep your layers as minimal as possible; if you use apt-get, for example, you always need to clean the cache in the same instruction where you updated it, so you don't carry it with you, because you will never need it. And with these layered images, as I said, Docker kind of creates the Git for containers. There is of course more to it: Docker adds networking, Docker adds volumes, you can mount directories from the host system into a container, you can create shared volumes, and so on. But generally, for the purposes of this talk, you can safely think that Docker is layering and everything else can be added later. Now, once you have these containers, you have a way to store, share, and reuse them; containers become a way to package software and to deliver it. And I want
to put this a bit into context of how continuous integration with containers can look. Mainly, containers are used in continuous integration in two very distinct ways. One way to use a container in CI is as your build environment, the build tool. Generally you have a Git repository with your source code, an application artifact which you want to produce, and your build environment: dependencies, toolchain, everything related to the build process. The old-school way of managing this, by installing it all on worker slaves, is really hard to maintain, because every application nowadays requires its own environment, and we don't want to agree on a common baseline for dependencies and so on. A container image in CI is very helpful to solve this problem: you put all your dependencies, build cache, Git cache, dependency cache, all into a container image, and then you can safely use it to build this particular application, create the application artifact, push the artifact to storage, and then just discard your container and never use it again. But this is only one application of containers, very helpful on the infrastructure level, and it is not enough. As soon as we go further with containers, we want to use them in production, and this is a critically different kind of container, because here we use our application code as the source and we build container images as the artifact, the result of our continuous integration and continuous delivery process. This container image is our production artifact, so the pipeline which builds this kind of container image is completely different; it has a different way of approving changes. If your build tools carry more libraries than needed, or if your build tools fail for some reason, you don't care. If it's a production container, then there is much more to it: you need much better testing, much better storage, and so on. So the lifecycle of a certain application can look like this.
You have your source code, which you put in a Git repository, hopefully. Then you take the code from the Git repo, you do some building, and you produce an artifact; in my case this is a jar. You publish the jar to your repository, a Maven repository for example. Then you take your jar and build a Docker image containing it, you publish this Docker image to a Docker registry, and the image in that Docker registry is your final artifact, which goes to the production environment. And here are the two containers I was talking about: for the Gradle build you use a build container for CI, a disposable container with all your build tools; and for the final artifact there is a container for production, which is totally different from the one you used for building. Currently there are a lot of issues when people try to merge both of those containers into one, and this is why Docker, for example, invented a format for this, I cannot remember how it's properly called, but currently you can have basically two stages in a Dockerfile, the build stage and the production stage. But I prefer to have these as completely different Dockerfiles and manage them differently, build them differently, maintain them differently.
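The feature whose name escapes the speaker here is Docker's multi-stage build (available since Docker 17.05). A minimal, illustrative sketch, where image names, versions, and paths are assumptions rather than taken from the slides; it also shows the apt-get clean trick from the Gradle example:

```dockerfile
# Stage 1: disposable build container with the full toolchain.
FROM gradle:8-jdk17 AS build
WORKDIR /app
COPY . .
# Install an extra build dependency and clean the apt cache in the SAME
# instruction, so the cache never becomes part of a layer we carry around.
RUN apt-get update \
 && apt-get install -y --no-install-recommends protobuf-compiler \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
RUN gradle --no-daemon build

# Stage 2: slim production image; only the built artifact is copied over,
# none of the build tools end up in the final layers.
FROM eclipse-temurin:17-jre
COPY --from=build /app/build/libs/app.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]
```

As the speaker notes, keeping two separate Dockerfiles (one for the CI build container, one for the production container) is an equally valid design and is what she prefers.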
So now we have come to the registry with our Docker images, and we obviously want to run our application in production. And here is where the fun starts, because container images alone are not enough: you need to understand how many of those processes with these container images you run, how they interact, how to update them, how to roll out, how to roll back, how to reschedule to different hosts, and so on. And this is where Kubernetes comes into our discussion, because Kubernetes is an orchestration platform. This is taken from the main Kubernetes site, so it's kind of a mission statement for the Kubernetes project.
So the whole idea of the Kubernetes project is to orchestrate those containers and container images we produced in the previous step. And Kubernetes is a whole new thing, because it is a platform which provides a certain framework you need to dig into. Kubernetes has a lot of objects and abstractions which are helpful for you as an admin or a developer or anyone, but this also means it has a lot of new terms, a whole new language you need to learn before you start working with it. These are some of the main terms in a Kubernetes setup which everyone working with Kubernetes must learn and know, because this is the language you use to describe what you are doing with the orchestration cluster. Obviously we have an image with our application; the image is stored in a registry. Then we have a container; a container runs an image. Some containers run the same image, some run different ones. Then we have a new term, the pod, which is basically just a group of containers. And we have a node, which is a host system for pods. A node can be a bare-metal host or a virtual machine, it can be anything: a virtual machine in OpenStack, bare metal, an Amazon instance. And obviously you have the Kubernetes cluster as a set of nodes. Then there are three more abstractions which we will go into in detail later. So this is the layout of a Kubernetes cluster and, generically, how it looks: we have a cluster with multiple nodes, every node has a lot of pods, and every pod has a lot of containers. Generally there is a one-to-one mapping between pod and container, and we will probably see why that is. So this is the layout of these objects, and putting objects into this layout is one thing, but now you need to work with them, and this is where the management abstractions come in.
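As a concrete anchor for these terms: a pod is declared to the cluster as a small YAML manifest. A minimal sketch, where the name, label, and image tag are illustrative assumptions:

```yaml
# A single-container pod: the common one-to-one pod/container case.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx          # labels are how higher-level objects select pods
spec:
  containers:
  - name: nginx
    image: nginx:1.27   # the image, pulled from a registry
```

In practice you rarely create bare pods like this; the management abstractions below (replica sets, deployments) create them for you from a pod template.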
So how do you manage pods? From the Kubernetes point of view, Kubernetes never works with individual containers; it always works with predefined groups of containers. We are just not interested in managing a container alone; we always have a set of containers which we put in a pod. And in the most common case there is one container per pod, which means the pod is just a wrapper object around your one base application container. So when you start to work with pods and containers, the first notion you need to know is the replica set. When you define a certain application, you want to go microservices, obviously; everyone wants to. And you never want one container with one application, you want a set of them, and this is called a replica set. A replica set has a counter: the number of pods which belong to this replica set, and all of those pods are equal. This is just the scaling of one pod into multiple copies. In this example we have an nginx container which is put into an nginx pod, every pod is a member of the replica set for nginx, and the replica counter here is four. The main property of a replica set in a Kubernetes cluster is that it can be scaled up and scaled down: the replica counter is a parameter which can be changed during the lifetime of the replica set. And Kubernetes will deal with scheduling: if you increase the replica counter, the Kubernetes scheduler will find a way to run one more pod of the same kind on some of the nodes. Scheduling is done by Kubernetes, everything is done by Kubernetes; you basically just set the replica number to plus one. But while replica sets are cool, they manage the number of pods without touching the content of the pods. Once you define the replica set, you define the content of a pod, and then you just scale up and scale down; that's all you do with this object. So there's one more thing which
you need once you start working with replica sets. As we scale up and down, we obviously don't do it just for the fun of it: we want to load-balance a certain service across many instances of the application. That's the reason why we scale. And that's why one more concept comes in: the concept of a service. A service is a common endpoint for a replica set. It can be a common endpoint for one replica set, but you can also have several replica sets under the umbrella of a certain service. There is even more to it: services are based on selectors, so you can choose pods by some label and assign a service to them. It is a flexible tool, but mainly a service is a common endpoint for a group of pods which are in replica sets. In this example I have two replica sets with different versions of the nginx application inside, and I have a common service assigned to them. So now we have replica sets and we have a service: we can scale the replicas, and the service will be adjusted accordingly. Everything is done automatically by Kubernetes; once you add more pods to your replica set, the service will include them as well. But again, this is not enough: you need a way to update the content of your pods. And this is where the deployment object comes in, because a deployment object means you add an update strategy to your replica sets. Obviously you could do everything manually: set up your replica set, set up a second replica set with nginx version two, and migrate your workload from one to the other. But Kubernetes is good because it already does all of this for you, and you don't need to care. The deployment object manages updates for replica sets. In this example I have a deployment object for nginx, which was at first set up with a replica set running nginx version one. Then I want to roll out the new nginx version two: I set up the rolling update process in Kubernetes, and Kubernetes takes care of it. It creates a second replica set with the second version and starts to scale down the first replica set one by one: it cuts one pod from the old replica set and adds one pod to the new replica set, keeping the service working with both replica sets in between. So it steadily replaces one replica set with another, one by one, keeping everything in a working state in the meantime. The deployment object is a higher-level abstraction which provides this wrapping around replica set updates for you.
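The nginx example from the talk might be declared roughly like this. The names, labels, replica count, and image tags are illustrative assumptions; the deployment manages the replica set and the rolling update, and the service selects the pods by label:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 4                  # the replica counter
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate        # replace pods one by one, keeping the service up
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:                    # the pod content managed by the replica set
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.27      # changing this tag triggers the rollout
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx                 # the service picks up matching pods
  ports:
  - port: 80
    targetPort: 80
```

Scaling is then just changing `replicas` (or `kubectl scale deployment nginx --replicas=5`), and updating the image tag kicks off the one-by-one replica set replacement described above.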
So far it was all good and abstract, but now we come to more interesting tasks, namely networking. We have pods, we have services, we have applications; how do they talk to each other? The basic difference of Kubernetes, if you compare it with Docker Swarm and the usual Docker network, is that in Kubernetes you have one flat internal network, and every pod is connected to this network and has its own IP address there. So you don't have the problem of port numbers overlapping between different applications and services, because every pod has its own IP address assigned to it by Kubernetes, and thus as an application developer, as the creator of those pods and containers, you have the freedom to use any port you want.
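You can see this flat network for yourself; a sketch, assuming kubectl access to a running cluster:

```shell
# Every pod gets its own cluster-internal IP address, so two pods can
# both listen on port 80 without any conflict.
kubectl get pods -o wide     # shows a distinct IP per pod
kubectl get services         # shows the virtual IP assigned to each service
```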
It doesn't matter if the developer sitting next to you also wants to use port 80; they can, because obviously you will use different IP addresses, and everyone has the whole range of ports available. So there is the external network, which your nodes are connected to, and there is the flat internal network where the pods live. By default these networks are not connected. That's not exactly true, since pods can usually access the internet, but the question is how you access the pods from outside: if there is just an internal network, there are some IP ranges and IP addresses, but no one outside knows how to reach those IP addresses.
Before we get to that, we should think about services first: how does load balancing work with this services mechanism? In Kubernetes, every service has a virtual IP address assigned
to it. It is not a DNS entry, as is often the case elsewhere, basically because DNS is too slow at delivering updates. For Kubernetes, updating service endpoints in
DNS is not reliable enough for this kind of microservices operation, so it was decided that services in Kubernetes are represented
as virtual IP addresses, and routing is created at the iptables level to balance requests across the backend endpoints, which are the pods of the given service. So
again, internally every service has its own IP address from the service IP range, and every pod can access a service by IP address or by name. But now we want to reach the services from the outside, and there are different ways of doing that,
mainly by routing traffic from outside through some node interface into the internal network. In Kubernetes terms this is called exposing a service, and you can expose a service in different ways. Most basically, a service can be exposed
as a node port: you choose a port from a certain range and assign it to the service, and then every node in your cluster will have this port opened
and redirected to this internal service inside Kubernetes. So you don't really have an IP address here, just a port: the service is represented as a port on every node, and you can access any node on that port to reach the same service.
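A NodePort service of that kind might be declared like this; the names and port numbers are made up for illustration:

```yaml
# Expose the pods labelled app=myapp on port 30080 of every node.
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: NodePort        # on a cloud provider, `type: LoadBalancer` builds on this
  selector:
    app: myapp
  ports:
  - port: 80            # port on the service's virtual IP inside the cluster
    targetPort: 8080    # port the container actually listens on
    nodePort: 30080     # opened on every node; omit to let Kubernetes pick one
```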
Obviously, accessing services via node ports is not fun; you don't want to remember ports, you want names. So Kubernetes has connectors to different cloud systems. For example, if your Kubernetes platform is deployed on AWS or on Google Cloud, it can talk
to the AWS or Google API: once you create the service internally, the service is registered via the cloud API, and AWS will create a rule which routes the traffic to a certain port on the cluster and on to this service. So your client services don't
discover your services by port, they discover them by name, and Amazon or Google, or on a bare-metal setup something like Consul, will do that for you. But the underlying concept stays the same:
each node exposes a port, and the port is mapped to a service. Now, there is more to it, but I want to get to the more hands-on side here. These were the abstract concepts,
but how do you work with them in your daily life? Kubernetes has kubectl, a command line utility. It is extremely verbose, because it is essentially
a wrapper around the full REST API of a Kubernetes cluster: every object is represented in this command line utility, and you can get objects, list them, describe them,
update them and so on. So it is a kind of REST API handler. And while you can create all objects from the command line by running kubectl commands, you obviously don't want to type all those options manually all the time, so all your object descriptions can be stored in YAML files (JSON format is also possible), and these YAML files are
consumed by kubectl. Generally you have a repository with your YAML configurations, and then you
just apply the whole folder of configurations to a cluster to do anything there. There is also the Kubernetes dashboard, a graphical web interface which gives you an overview of what you have: which nodes, which pods, how it is all going. But this web interface is again a wrapper around
the same REST API with limited functionality, so kubectl should be your main tool once you really work with it, and the dashboard is an option for a general overview. Now, here are two examples of how you work with kubectl from the command line,
so for example, kubectl run by default creates one of those Deployment objects we talked about. I set the replicas number to the number of pods I want to start with,
and I set the image: as I said, the default is one container per pod, and that is what happens here. If I specify this one image, a pod is created for it and I get five instances of it.
Then I simply expose this Deployment object to the outside world. Here I specify the internal port of my application (my application happens to listen on port 5000), and I expose it without choosing which node port the service will use,
because Kubernetes takes care of that, so again I don't have the problem of overlapping ports. The idea is that you create your Deployment objects, your services, your service accounts and everything else you need as a user of this cluster,
and you shouldn't be disturbed by another user who already has some port busy. That's why you don't choose the node port yourself: Kubernetes will find one and register the service on it. Then you can discover this port, or your
Amazon or Google API will create a rule so that you access the service by name, with the port somewhere behind it; you don't care which port exactly it is. So again, the point is that you don't overlap with other people's work.
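For reference, roughly the same result as those two commands could be written declaratively; all names, images and ports here are hypothetical:

```yaml
# Roughly what `kubectl run` plus `kubectl expose` create behind the scenes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myregistry/myapp:latest   # hypothetical image
        ports:
        - containerPort: 5000            # the app listens on 5000
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: NodePort      # no nodePort given: Kubernetes picks a free one
  selector:
    app: myapp
  ports:
  - port: 5000
    targetPort: 5000
```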
So you get fully independent applications which don't clash with each other and don't take each other's ports and IP addresses. One more thing: this networking is not easy, and the networking you see from
outside is different from the networking you see from inside. This is a huge difference, because from outside you get this port mapping and a lot of translation
in between. So sometimes you just want to know what is going on in the internal network: can pods talk to each other without going outside? That's where a debugging pod is helpful. You can create a simple, temporary pod with just one container. It is not part of a big Deployment object or a replica
set; it is just one pod with one image which runs temporarily, and as soon as you close the process it gets killed and removed from the cluster. But this kind of debugging pod gives you a way to get into the internal network interactively. You run it
from your desktop, laptop or workstation and you get inside. In my example it is a BusyBox image, but BusyBox is actually a bad choice for a debugging pod, because it has none of the nice tools you need for debugging. So basically
you should build a debugging image with tools like tcpdump, nmap, curl, wget and so on, so you have your admin toolbox in this container image. Then you run this container image, get an interactive command line inside it, and
work on the internal network from there to investigate what is going on. Of course there is much more to the topic.
I covered only the most common object, the Deployment, because this is the first object you start working with. But there are also DaemonSets, which allow you, for example, to deploy at least one pod per node: if you want a service
which should be local to each node, you can set up this special scheduling so that every node in your cluster runs at least one instance of a pod of this type. You can have StatefulSets, you can
have volumes, jobs, ConfigMaps and Secrets. ConfigMaps are simply things you can store in the internal Kubernetes cluster storage and attach to your containers: you
can keep some configuration options as a file, store that file in the Kubernetes cluster itself, let containers use it on the fly, and when you update the file the containers get the update, and so on. Secrets are the same but with additional safety measures.
You can have service accounts, so your pod can use certain authentication credentials which are stored at the Kubernetes level, and so on. Kubernetes keeps adding more abstract objects to help you solve these typical tasks so you don't have to do this stuff on your own, but
you can start with the basics and dig in from there. So basically that was my content. If you have any questions, you can ask now. Anything interesting?
Yeah? Okay, the question was what I think about running databases in
Kubernetes. I don't think about it. The main reason: databases are obviously a completely different use case from common stateless microservices, and if you really need to run them in Kubernetes you maybe
should try, but I think this is well out of the scope of what Kubernetes provides currently. For me, as you probably saw from how I framed the talk, Kubernetes is a platform your deployment pipeline targets, and it is
very useful when you have those hundreds of applications with development teams working on them independently: everyone can manage their own stuff, you can test it properly, and so on. So it is a very nice platform for solving your integration and deployment tasks, but it doesn't add anything for
database management, at least at the moment, from my point of view. So no, I'm not thinking about it. Anything else? Yeah? What do you think about using Kubernetes for a different application
area, a classical one: classical containers such as LXC with Ansible or Puppet inside, a stateful approach rather than a stateless one like yours with Docker, and marrying both worlds? To be honest,
there is a lot happening in the Kubernetes world, so we cannot even imagine where we will end up in two years; there might be very many different applications of the Kubernetes approach. But from my side,
as a continuous integration engineer, not at the low level, I see a lot of benefit in using the Kubernetes API with different backends: keeping the Kubernetes concepts of replica sets,
Deployment objects, service accounts and management tools, but swapping the backend for a different kind of implementation, be it Docker, be it VMs, be it processes on the host. Why not? That could also be nice and
interesting, because we are really eager for this way of managing our deployments and giving developers a way to manage them, but we are flexible in how exactly this is implemented at the base level. So from my side, no. And for the recording:
there was one more part in that question, can you handle stateful images?
Yes, Kubernetes has this concept of StatefulSets, which is its way of handling stateful workloads, but from what I remember it is at an early stage of development; it appeared in the 1.6 release or
something like that, this year. So I don't know the current production readiness of this kind of setup, and I have never heard of it being used in production, but maybe that's just me. I'm not an expert in this particular topic, because for our purposes we have a very
nice, microservices-friendly internal project which is truly stateless and connects to a database that is remote and lives separately, so I haven't dug into this topic. I've heard StatefulSets exist, but I've never used them. The next question: do you have experience with the number of Kubernetes clusters needed in a company? We currently have a setup on AWS
where we create one Kubernetes cluster per project for the test and staging environments and one for production, so we have, for example, 20 to 30 Kubernetes clusters. Would it be better to create just
one or two really big Kubernetes clusters with a really huge number of nodes? From my point of view, splitting the workloads across different clusters is the better way to go. Even just for
development I would give different teams different clusters to play with, because they get flexibility and they don't break each other's stuff this way. So I believe in many clusters, and I believe this YAML approach to Kubernetes cluster configuration is very helpful
here, because you can migrate your configurations from one cluster to another easily. Unless you really need interaction between these projects, why put them in one place? Once you have this infrastructure-as-code approach, you just create your clusters as you need them and use them.
That's what I think about it. Let me clarify, maybe, because of the questions I've heard: currently we are working on a new greenfield project and digging in, so we are maybe not as experienced as you, for
example. This is how we look at it and how we plan to do it; maybe in one year you will ask me and I'll say it turned out completely different and we went a different way, but for now this is the way I would go. And what's the overhead of deploying a
Kubernetes cluster, how many resources does it need just to sit there and do nothing? That is a good question which I cannot answer right now; this is something we still need to investigate on our setup. So
just to give an understanding: no, I don't have numbers.
We have experienced that always creating a new cluster for some environment eats up a lot of resources. Yes, of course it will be more expensive. The thing is, what is more expensive: the cluster
resources, or the time of the developers who work on this setup? That is always the trade-off. You can be very efficient hardware-wise, but if the developers struggle, that efficiency doesn't bring you much. What our experience shows you
need: when you have a setup on AWS, for example, and you need a highly available setup, you have the master nodes which control the Kubernetes cluster, and you want to spread the nodes over the availability zones, so in our setup you have three master nodes per
cluster, and then the worker nodes, which are also split over the availability zones. So we always set up three master nodes and three worker nodes; that is our minimal setup. What you can play with, especially in such a cloud setup, is how you size
the nodes: for a development cluster you can take smaller EC2 instances than for production, and of course you can scale up and down, but for this high-availability setup you
need, at a minimum, the nodes spread over the availability zones. My question is also about scalability, because I have had some experience with Kubernetes, and
for my setup I also used three nodes, but I had a problem, so I don't like the scalability of Kubernetes: as soon as one node of the three breaks down, the containers are not rescheduled across the servers which are still running.
As soon as one etcd instance broke down, my whole cluster, or at least the services from that node, were down and were not scheduled back to the other nodes. That's not true; I mean, if this happens in your setup it is obviously a bug,
because this is the idea of a Kubernetes cluster: if one node goes down, the pods are rescheduled to the other available nodes. Well, as soon as I had four nodes it would of course go down to three, but as soon as it went
under three it didn't rescale, so I needed at least three running nodes to have high availability. That's an interesting topic to discuss. Are you sure it wasn't just because resources were not available on the remaining nodes? It happened as soon as one etcd
instance out of three broke down. Okay, etcd can also be clustered in different ways, so it might depend on your setup and you might want to dig into it, because it is obviously not designed to work this way, and obviously the idea is that with a
highly available cluster you can take out an etcd node. I had a Kubernetes deployment done by Kubespray (which was recently the Kargo project, but now it's Kubespray), and I tested
killing one of the master nodes and it recovered; it worked for me. So we can think about what happened there. And another question was: how can I set up Kubernetes on my local laptop? As a developer on the go, how can I set up an environment here and
can test deployments and things like that? That's an awesome question, because I almost forgot to tell you about it. One thing Docker added to the development environment generally is that people now start to care about the onboarding of new people rather than just
developing the technology itself. The Kubernetes project learned this lesson from Docker and created a lot of documentation and a lot of tooling to help people get on board, because it turns out to be very important. So now there is this Minikube tool; you can
download it from GitHub, and Minikube allows you to
set up your own developer sandbox with Kubernetes. It is a very easy setup. What it does under the hood: it goes to Google, fetches a VM image, downloads it to your
local machine, and runs this virtual image, which internally contains a Kubernetes cluster. It is a one-node cluster, so it is not a production-ready Kubernetes setup, but it is a full cluster with services which you as a
developer can use to test stuff, to run replica sets and pods, and to play with the Kubernetes cluster. I have it running here, and I think it even works sometimes. The Minikube utility automatically configures your
authentication locally, so that your kubectl tool starts to work with the Minikube cluster; you can see that I can access this cluster with the kubectl utility. I can also
look into the dashboard of this particular Minikube instance. You can see that I have only one node, which is the VM Minikube downloaded; I have
one Deployment object running here with five instances, and I can see the five pods running on this Kubernetes cluster.
So Minikube is full Kubernetes on your laptop: you work with the API, you store the things you do with this API in YAML files, then you go to your production cluster, apply the same YAML files and get the same result, obviously not at the multi-node level
because this is only one node, but you can test your deployment strategies, see how updates are rolled out one by one, and so on. So as a playground for developers this is an awesome tool. Anything else? What are Ingresses? An Ingress is a way to
route traffic not by ports alone but by rules. You can assign a set of rules, basically HTTP rules, and create a proxy at the virtual-host
level: if you specify different virtual hosts or different paths, they will be routed to different services. So Ingress is the next level above a node port: it wraps around the node port and does the routing with richer
rules. No, it's not a replacement for DNS; it is basically a proxy, a layer-seven proxy. Okay, anything else? So I hope you got interested and will try this
Minikube stuff at home and enjoy how easy it is to live with, because it is really
a very nice tool to work with. Of course it is not problem-free; there are complications, and this is just the tip of an iceberg. As I said, the content of this talk just helps to get you on board, and then there is a lot to learn;
for me as well, we have just started this journey. And as I said, there are discussions going on and we are obviously hiring, so if you're interested, come talk to me. Thanks.