
End-to-End Django on Kubernetes


Formal Metadata

Title
End-to-End Django on Kubernetes
Part Number
23
Number of Parts
48
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
Not only is Kubernetes a great way to deploy Django and all of its dependencies, it's actually the easiest way! Really! Deploying multi-layer applications with multiple dependencies is exactly what Kubernetes is designed to do. You can replace pages of Django and PostgreSQL configuration templates with a simple Kubernetes config, OpenShift template or Helm chart, and then stand up the entire stack for your application in a single command. In this presentation, we will walk you through the setup required to deploy and scale Django, including:
- Replicated PostgreSQL with persistent storage and automated failover
- Scalable Django application servers
- Front-ends and DNS routing
The templates covered in this presentation should be applicable to developing your own Kubernetes deployments, and the concepts will apply to anyone looking at any container orchestration platform.
Transcript: English (auto-generated)
Thanks for having me. This may come as a surprise to you, but I am not Josh Berkus.
But the more I thought about it, I realized that you could be easily confused, because Josh's name's on the program still, we both have beards, we both like Django and Postgres and Kubernetes. We even both have the same damn glasses. But there's some differences.
Over here on the left, we've got kind of the average user of Postgres's level of knowledge, and then I know a little bit more, and then Josh knows a whole lot more. But about now you're probably asking yourself what the hell this has to do with anything about the talk, and there's one final important difference that does pertain to this talk.
My back works pretty okay. Josh's, not so much. Which is why I'm up here to talk to you about Django and Kubernetes. Josh managed to hurt his back making these lovely speaker gifts for all of the DjangoCon attendees. Josh is an excellent potter as well as an open source geek, and last week he hurt his back finishing these up.
So he wasn't able to make it, and the DjangoCon team was like, hey, any chance you want to give a talk on Tuesday? So this will probably not be my most polished talk ever, but on to the topic that you are all here for, Kubernetes. Kubernetes is arguably the best and most popular container orchestration system out there today.
That doesn't mean that it won't get replaced in a year by something cooler and better, but right now this is kind of what everybody is playing with for the most part. But before we dive too deeply into Kubernetes, we have to get into some terminology. So the name, Kubernetes, what does it mean?
Is it Greek for ship captain? Or is it that Google learned its lesson naming things after Go? Or is it three all of the above? If you picked three, you are correct. It is Greek for ship captain, and I do think that they named it because they had so much trouble with Go.
Today, most everybody is looking at or moving toward containers in some form or fashion, and containers are great about being able to package up all of our dependencies for an app into a thing that we can then move around and share and use. But working with them on their own is not exactly the most user-friendly thing.
If you had to remember which ports am I using, what volumes do I mount where, that stuff gets complicated over time, and that's why container orchestration systems exist. Some examples, if you're not familiar: Docker Compose is really a container orchestration system, and Docker Swarm, Mesos, AWS Container Service, and Kubernetes all handle orchestrating these containers that you want to use together.
So what's container orchestration? Event loops are used by Kubernetes components to reconcile things between local machines and the desired cluster state. So what does that mean to us?
We basically tell Kubernetes, this is how I would like the world to look. And then Kubernetes sits there and spins in a loop and tries to make that happen for you. It can't always do it, but it will continually keep trying to make it happen. So the other term they use from the documentation here is a control loop,
which in robotics is kind of the main process that's just sitting there going, Am I standing up? If I hit a wall, do I need to back up? What's happening right now? And that's basically what Kubernetes is constantly doing. Is everything that's supposed to be running, running? Can everything talk to each other that's supposed to be talking to each other?
And I'm not going to lie to you and say that Kubernetes is super easy to learn. We are definitely not going to learn it all in 40 minutes today. But it's a big complicated system. It's a bear. It's a big scary bear. But my goal is to change your impression of it from this to more of this.
When I started playing with Kubernetes, the terminology is what tripped me up. There are a bunch of new terms and a whole bunch of new concepts that you don't tend to think about, but that we've all dealt with before, and the way they talk about them is sometimes confusing. So most of today is going to be getting comfortable
with what these different terms mean, and then we'll piece it all together and have a working Django app towards the end. These slides move slow when there's big images. So Kubernetes has a concept of masters and nodes, worker nodes.
The masters are where all the Kubernetes magic happens, of what should be running where, and the nodes are where your containers actually run. You can have differing numbers, they're not a one-to-one relationship, so this would be a simple three-master cluster with three worker nodes. A more real-world scenario would be something like this.
So inside of an AWS region, you have a master per availability zone, those three are clustered together, and then you have some number of worker nodes in each availability zone as well. That's because underneath, Kubernetes is really just an etcd cluster with a nice API over top of it.
So when you say, I want you to run these sets of containers, it puts that data into the etcd cluster, which then gets replicated between all of the masters, and the API server and scheduler, the little daemons that run on these nodes, are constantly looking at that and saying, am I running what I should be running?
Is there something out there that needs to be running that's not running? Where should we run it? One of the things that trips people up is authenticating to your Kubernetes cluster. Almost everything happens through this config file in your home directory, effectively known as kubeconfig.
Authentication with Kubernetes is fairly pluggable, but not as pluggable and easy to use as, say, Django authentication backends. Out of the box, you tend to have one user with a password and one set of SSL keys to talk to the cluster, and you share that amongst multiple sysadmins.
That's kind of the default configuration. It feels wrong and messy and sick, and it is kind of wrong and messy and sick, but it works. There are other systems you can authenticate against. Google, if you use Google Apps, so you can give just certain people access to the cluster. There are other schemes that you can employ.
I'm not going to get too far into that, but it is important that you have a kubeconfig and that it is properly set up pointing at your cluster, and you can have multiple of them so that you can switch between multiple clusters of which one you're talking to at any given time. And so you can access your cluster by proxy. So if you run kubectl proxy,
it looks for that default kubeconfig. It looks for the cluster that you're currently pointing at. So I have five or six clusters in my config, so I pick which one I'm going to be playing with at any given moment, and then if I run kubectl proxy, I am then able to access the API of the actual masters.
If your cluster is running the Kubernetes dashboard, you can then access it with this. I need to move this to a different...
That's not going to let me play with my browser, is it? Well, I guess we're going to skip the dashboard portion of the evening. The dashboard is a really decent web interface to the cluster as a whole.
You can see all of the various components. You can see all of the config. You can make edits to the config. You can see resource utilization across your nodes, which ones have high CPU or high memory usage, and you can just get a nice dashboard state of your cluster. All of that same information,
all of the information for the dashboard comes from the API, which is also all the same exact information that you get on command line tools that you can also then access from Python and Go and other languages. One of the ways that you keep things sane in a Kubernetes cluster is by using namespaces. A namespace in Kubernetes is a fence between other containers.
So pods in a namespace, containers in a namespace, can only easily access other containers in that same namespace. So you can use it as kind of a light mechanism for multi-tenancy.
You can also use those kinds of namespaces as a light way to kind of separate dev from stage from prod, all inside the same cluster. But it's not as hard and fast of a wall as true multi-tenant separation. Your containers could end up talking to each other. It's a balsa wood fence, not a brick wall.
In Kubernetes, you define resources using YAML. The little snippet at the top of the slide is all the YAML you need to create a namespace called revsys-rocks. You create it in the cluster by just running kubectl apply -f,
point it to the file you created, and that will come back and say I've created this namespace. If it's already created, it'll say I configured this namespace because it was already configured. And you can reapply these same files because all you're doing is adding state into the system. And so if the state is the same, nothing changes. But if there's a new state, actions get taken.
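The slide itself isn't reproduced in this transcript, but a minimal namespace definition along the lines described would look roughly like this (a sketch, not the exact slide):

```yaml
# namespace.yaml -- sketch of the namespace definition described above
apiVersion: v1
kind: Namespace
metadata:
  name: revsys-rocks
```

You would apply it with: kubectl apply -f namespace.yaml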
Deployments are a template of how you would like the world to look. So you say, I want to have this Django app, I want to run this particular container, I want to use these environment variables, and I want to run three copies of it.
The slide is kind of hard to read size-wise, but you'll see we just give it a kind of Deployment, we say we want two replicas, the template is this particular container, and we want to open that container's port 80 to the inside of the cluster.
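The slide isn't legible here, but a Deployment matching that description might look roughly like this; the image name and labels are placeholders, and apps/v1 also needs an explicit selector (clusters of the talk's era used older apiVersions such as extensions/v1beta1):

```yaml
# deployment.yaml -- sketch of the Deployment described above
apiVersion: apps/v1
kind: Deployment
metadata:
  name: revsys-rocks
  namespace: revsys-rocks
spec:
  replicas: 2
  selector:
    matchLabels:
      app: revsys-rocks
  template:
    metadata:
      labels:
        app: revsys-rocks
    spec:
      containers:
        - name: django
          image: registry.example.com/revsys-rocks:latest  # placeholder image
          ports:
            - containerPort: 80   # exposed to the rest of the cluster
```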
And just like with the namespace, we do the exact same apply command to actually put that into the cluster. Services in Kubernetes are what we usually think of as a service. I've just created a Django app running in those couple of containers.
Now I need to tell the rest of the cluster that there is this web service out there and I define that like this. We create a service. Notice all of these are in the same namespace, revsysrocks. I tend to use the same name for the namespace
and the app and the service and everything to just keep myself sane. You could call them things differently. I know that one of my coworkers, Stephen, would probably call this service HTTP or WSGI, where I would call it the name of the actual service that I'm thinking of as the website.
There's no hard and fast rule here, but all we're doing is saying, hey, for this service, I'm going to open up port 80 and it needs to go to the containers' port 80.
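A Service matching that description might look roughly like this (a sketch; the selector labels are assumed to match the Deployment above):

```yaml
# service.yaml -- sketch of the Service described above
apiVersion: v1
kind: Service
metadata:
  name: revsys-rocks
  namespace: revsys-rocks
spec:
  selector:
    app: revsys-rocks      # route to pods carrying this label
  ports:
    - port: 80             # port exposed on the service
      targetPort: 80       # port on the container
```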
Kubernetes has a concept called an ingress controller. So far, everything that we've done is available inside of our cluster to other things running in the cluster, but it is not available to the outside world in any way, shape, or form. Opening that port 80 did not open port 80 on an external IP address anywhere. Ingress controllers map outside-world things to the inside of the cluster. Now, where you host this changes what kind of ingress controller you can use. So if you're on AWS, it would use an ELB or an ALB
as your ingress controller, and it manages what points where when. So you just say I want one of these and it'll go out and create one and then you start pointing DNS at the CNAME. You don't have to actually go configure it at all. Which leads to a quick aside here.
We use a controller called kube-lego, which handles everything about Let's Encrypt certificates. You install this into your cluster, and then in these YAML definitions you can use what are called annotations, which are basically just keys in the YAML that Kubernetes itself is not particularly looking for.
So a controller is just something that is listening to this API, looking for state changes and taking some action. The default Kubernetes ones handle things like I need to be running this container over on this node but you can create your own annotations and take other actions. So in this case somebody created a system to handle Let's Encrypt certificates.
So you say, hey, I want a Let's Encrypt certificate for this particular host name, and it'll go out and register it. It hijacks the .well-known location, handles all the key management, stores the certificate inside the cluster, and presents that to the world as a Let's Encrypt SSL connection from then on.
And you literally have to do just a couple of lines of config. So this is an ingress definition, and you'll see we have a similar kind of pattern: name, namespace, right? Then there are the rules, and it's a host. The domain is actually revsys.rocks, one of the new top-level domains, so don't get confused; that could be .com or .org or whatever. But then we have a little part that I want to highlight, and this is the kube-lego part. We just have annotations there saying, hey, I want TLS via ACME, I want to use an nginx ingress controller, its host should be revsys.rocks, and I want you to store that as a secret named revsys-rocks-tls. And I don't have to do anything else. When it comes up it gets the cert, if it needs renewal it'll renew it, and I don't have to deal with that on a per-application basis or even a per-container basis.
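A sketch of an Ingress with those kube-lego annotations; extensions/v1beta1 was the Ingress apiVersion of that era, and newer clusters use networking.k8s.io/v1 with a slightly different schema:

```yaml
# ingress.yaml -- sketch of the kube-lego-annotated Ingress described above
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: revsys-rocks
  namespace: revsys-rocks
  annotations:
    kubernetes.io/tls-acme: "true"        # ask kube-lego to manage a certificate
    kubernetes.io/ingress.class: "nginx"  # use the nginx ingress controller
spec:
  tls:
    - hosts:
        - revsys.rocks
      secretName: revsys-rocks-tls        # where kube-lego stores the cert
  rules:
    - host: revsys.rocks
      http:
        paths:
          - path: /
            backend:
              serviceName: revsys-rocks
              servicePort: 80
```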
I could throw a couple of Rails projects and a Go project into this cluster and they'd have Let's Encrypt too, totally independent of whatever I'm doing in my containers. So one of the last pieces of terminology is a pod. When we create deployments, all the containers in a deployment form a pod.
Pods are sets of containers that are deployed together on a host. So if you have things in a pod and it has four containers to it all four of those are going to be deployed on host A. If for some reason it can't deploy on host A or host A dies they will all be picked up and run together on host B. So they are always a set together
like a pod of whales. This could be useful for lots of scenarios. In all the examples I have today we are only using one single container so the difference there does not become particularly apparent. But if you needed additional containers that only talk to each other and not necessarily the outside world this can be very efficient.
You can do things like share a Unix socket on a host that you wouldn't be able to do because you couldn't guarantee that they would be running on the same host. So if you have like a one container Django app with one memcache instance you could have that talk over a Unix domain socket instead of a TCP socket and get a little bit better performance.
And you can only do that because you know they are always running on the same hosts together. So at a high level view Kubernetes is the masters run this API and store this cluster state and the nodes run pods which provide services inside the cluster and ingress controllers map the outside world
to the inside world. So let's say we have three worker nodes, with some stuff running on host A and some stuff running on host B. If you shoot host A in the head, say AWS just terminates the instance, the other masters are going to go: wait a second, the pods that were scheduled on host A are no longer running on a host, because we can no longer see host A. I need to schedule them somewhere; where do they fit? Okay, host C is pretty empty, I'm going to run them over there. All of the pointers, the different proxy ports, the ingress controller, all that stuff gets changed over, and you're back up and running. So you can do things like upgrade your worker nodes from one AWS instance size to another and never have any of your stuff go down, and not have to change any of your configurations or IP addresses. It gets you away from pointing at IP addresses and temporary host names
and makes things move a little more smoothly. So how do you run Kubernetes in the real world? There are kind of three different things you might interact with. One is called kops, K-O-P-S. It's a utility for spinning up Kubernetes clusters in AWS, it works really well, and it handles all of the AWS-specific nature of Kubernetes clusters. One of the hardest things about Kubernetes is getting a cluster to start: it is not easy to turn on. It is really hard to kill once you've turned it on, and that's kind of its job, but getting it turned on is involved and prone to error, so people have created these wrapper utilities to make the process a little more turnkey for us mere mortals. The other option, and this is the option I would encourage you to play with first if you have an interest in Kubernetes, is Google Container Engine, which is a hosted version of Kubernetes with Google. The reason I suggest it is that you know you have a well-working, good-to-go Kubernetes cluster to play with; any problems you're having are your misunderstanding of how Kubernetes works or a configuration mistake, and not perhaps that you set up the cluster poorly. And then there's also Minikube, which runs a single-node Kubernetes cluster on your laptop using Vagrant or VirtualBox, and that's a great way to play with Kubernetes in the small for developer environments. You can use those same definitions to define which containers to run
and which services to expose on your laptop, and then just use them in your production clusters. So one of the things you've got to be able to do with containers is configure them. We're all 12-factor apps now, so we've got to be able to push configuration into these containers, and Kubernetes provides several different ways. Environment variables, of course: we can just define environment in the YAML there towards the bottom. You can see we've just defined an environment variable name and put in a value, and that gets injected into the container's environment. That's great and all, but a lot of times we don't want to expose all of that in our configuration, so we can also use what are called config maps. These let us map sets of variable-like things, whole files, or entire directories of files into our pods. So maybe we don't want to have to list every single environment variable in that deployment YAML; we can say, here's a config map of 25 environment variables, take these and apply them to this pod, and you can pick which map goes to which container and it just does them all for you. You can also do things like, I want to use this nginx configuration file, put it here on disk, and it will grab it from the Kubernetes configuration store and drop it into the pod at runtime.
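A sketch of those configuration styles as they might appear in a pod spec; the ConfigMap names, settings module, and image are made-up examples:

```yaml
# Fragment of a pod spec showing the configuration styles described above.
containers:
  - name: django
    image: registry.example.com/revsys-rocks:latest  # placeholder
    env:
      - name: DJANGO_SETTINGS_MODULE       # a plain environment variable
        value: config.settings.production
    envFrom:
      - configMapRef:
          name: revsys-rocks-env           # inject every key/value in this ConfigMap
    volumeMounts:
      - name: nginx-conf
        mountPath: /etc/nginx/conf.d       # drop a whole config file onto disk
volumes:
  - name: nginx-conf
    configMap:
      name: revsys-rocks-nginx             # the ConfigMap holding the nginx config
```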
Kubernetes also has a concept of secrets. These are great: we could obviously put our database password and our API keys into those environment variables in our deployment YAML, but that means that everybody gets to see them.
Perhaps we don't want our developers to know those, and just the ops people should hold on to them, so Kubernetes lets you define secrets. Secrets are available, like most things, only inside the namespace they're defined in, so you can't share secrets across that fence. But unfortunately, secrets are not particularly secure: right now Kubernetes stores them as base64-encoded text on the master, so they're not as secret as you might want. Now, to be fair, they are working towards real secrets, encrypted on the master, and this was just a stepping stone to getting there. But it does keep secrets that should not be on a node from getting to that node: until a pod that needs access to that secret is scheduled there, that secret won't exist on that node. So it does keep them off places where they have absolutely no business being; it's just that once they're there, they're not particularly secret.
And so this is how you use secrets in an environment variable: we say, hey, I want to have this database password environment variable, and get its value from the secret named revsys-project-db-password, using the key in there called password.
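In YAML, that looks roughly like this (the secret and key names follow the talk's wording, so treat the exact spelling as an assumption):

```yaml
# Fragment of a container spec: pull an env var's value from a Secret.
env:
  - name: DATABASE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: revsys-project-db-password   # the Secret's name
        key: password                      # the key inside that Secret
```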
You could also, since this is just a set of hosts, run a Vault cluster in your Kubernetes cluster and get your secrets from Vault or some other kind of truly secure secret storage. So, because we don't know where our containers are going to be
running, centralized logging becomes terribly important for being able to figure out what's going on. If you use Google Container Engine, the logs from your cluster go straight into Google's logging tools, the Stackdriver logging system, and that works fine. We've had good luck with the EFK stack, which is Elasticsearch, Fluentd (or Fluent Bit, a smaller C version of the Fluentd daemon), and Kibana. But you've got to have this, or you're not going to be able to tell what's going on. I don't even know which host revsys.rocks is running on; I'd have to go dig and find out where it's running to even get onto the host to look at logs. So having centralized logging is important.
And so, for part of that, we've lightly open sourced something we use; I don't know how useful it will be for you all, but it's jslog4kube, and it configures Gunicorn and your Python apps to use JSON logging to standard out and includes information that is Kubernetes-specific: what was the pod's IP address inside the cluster, what host was it running on, what was the name of the pod, what app is it in. Those kinds of Kubernetes-specific metadata get added into the JSON that's emitted by your logging. So,
data persistence is pretty important, and there are a couple of ways to handle it with Kubernetes. The hard way is with persistent volumes. This works, but it's kind of hard to manage and kind of hard to wrap your brain around; this is advanced Kubernetes here. What you're doing is saying, I have this volume and it provides a certain amount of space, and then your apps make a persistent volume claim for how much space they need. Kubernetes tries to match up the claims with the volumes as efficiently as it can, and then it will mount those volumes on the hosts where the pods with the claims run. If those pods get evicted for some reason, or the host dies, it then remounts the volume on the new host where those things run. In a perfect world that's exactly how it works, and it works that smoothly; I have yet to experience that perfect world.
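For reference, a claim is its own small YAML object; this is a generic sketch, not something from the talk:

```yaml
# pvc.yaml -- sketch of a PersistentVolumeClaim; a pod references the claim
# by name, and Kubernetes matches it against an available PersistentVolume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  namespace: revsys-rocks
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi        # how much space the app is claiming
```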
So the easier solution is to do off-cluster storage, and this is where I encourage people to start. All this is is the existing way you are doing storage: you have a database server somewhere that all of your containers connect to, and you manage that database server as bare metal, or you use Amazon RDS or something like that, for those kinds of persistent
data stores. One of the things that Josh was going to talk about is Patroni. Patroni is a template for highly available Postgres. The idea is that you could keep a master running in the cluster and slaves running in the cluster, replicate your data from one to another, and as containers were killed off or nodes died you could keep that replication chain working between the nodes to the point where you didn't lose data. I've heard good things about it, but I've never actually played with it, and so I wasn't comfortable showing you how to do it having never done it myself. But I do want to mention it in case you're interested in playing a little fast and loose with your data. Oh, and this slide is out of order, I'm sorry. The idea here is that your persistent data instance is just an instance sitting outside of your actual blue Kubernetes cluster, just inside the same
VPC, so that the cluster can access it, but it's not actually running on Kubernetes; it's just a bare metal node managed with Ansible or Puppet or whatever you want to use, or by hand. So one thing that I do not have a ton of experience with,
but I know is useful, is Helm. Helm is a package management system for Kubernetes. You can think of it as templating those YAML blocks, but it's useful in more complicated scenarios. So you can say, run me a Consul cluster and I want to have this many nodes, and it will figure out what all needs to be applied to the Kubernetes API to get you an up-and-working Consul cluster, with the federation and the leader election, and handle all that stuff for you. So you can build these templatable systems to the point where I should be able to take your system and helm install it, and I just have that running and working on my cluster, and I shouldn't really have to do anything else other than maybe a little bit of secrets management. So, one more thing: because Kubernetes is really just an API, we can use the API from Python.
If you've got kubectl proxy running on your local host, or you have a well-formed kubeconfig file in your home directory, this is all you have to write to get a list of all the pods running in your cluster; I'm just printing out each pod's IP address, the namespace, and the name of the pod. This is an API client generated off the Swagger docs from Kubernetes, and it's kept up to date with releases, so you should always have full access to the API from Python; you do not have to build your tooling in Go unless you want to.
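The code itself isn't in the transcript, but using the official Kubernetes Python client (pip install kubernetes), the script described would be roughly:

```python
# Sketch: list every pod in the cluster and print its IP, namespace, and name.
# Assumes a working kubeconfig (or a kubectl proxy) on the machine running it.
from kubernetes import client, config

config.load_kube_config()   # reads ~/.kube/config, honoring the current context
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    print(pod.status.pod_ip, pod.metadata.namespace, pod.metadata.name)
```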
So what does that look like? Here's the output for that. I ran this on our revsys production cluster, and you can see the various namespaces we've created there in the middle, the various pod IP addresses, and then the names of the actual pods. You'll see that it takes the name that I gave them, like revsys-rocks, and then appends a unique suffix to it, and that's that particular instance of that pod. So every time a new pod comes up it gets its own unique name, and if it gets killed, a new one comes up. So you can see a differentiation in the logs, even if it's the same container; one got evicted and a new one got created, and
you'll see that name change happen. So everything in Kubernetes works with an operator or a controller, and why would you want to create your own? Well, like kube-lego, you can create your own tooling that takes action when these things happen. You add a little bit of annotation of your own, and you can watch the cluster using that little bit of Python and say, ah, I'm seeing a new pod come up that's annotated "Frank needs to do something to it," and I can go take action, either inside the cluster or outside the cluster, however I need to;
whatever I want to have happen when that annotation shows up, I can make happen. So here are some examples of operators you could build. Pipe a message into Slack anytime somebody creates a new deployment: when somebody's launching a whole new thing, pop a note into Slack so we know that happened. Or maybe anytime pods come up and down, for whatever reason, we want to get that message in Slack. That would be, you know, 10 or 15 lines of Python, nothing particularly hard; package that up in a container and tell Kubernetes to run it. You could watch your Django apps, look for the database connection information, and automatically back up any databases that are running and being used by your cluster without having to go in and define each one. You can just say, oh, here comes a new one, Frank's test system 47 just came up, it's annotated as backup equals true, so I'm going to go back it up and put it in S3, and I have one centralized system for dealing
with that. Just like we have centralized logging, we can have centralized control, because we've abstracted out the whole concept of ops to this API we can watch. Maybe you have really complicated rollout scenarios where you have six hours of collectstatic runs before things finally come up; maybe you want to minimize downtime by spinning up an entirely new service and, once it's all ready, spinning down the old one and moving traffic over to the new one. You could orchestrate that with just a little bit of Python, even if it's something Kubernetes itself doesn't really support.
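As a rough idea of what such a "10 or 15 lines of Python" controller could look like with the same client library; the annotation name and the notify() action are placeholders, not anything from the talk:

```python
# Sketch: watch pod events and react when a pod carries a particular annotation.
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

def notify(message):
    print(message)  # placeholder: post to Slack, kick off a backup, etc.

w = watch.Watch()
for event in w.stream(v1.list_pod_for_all_namespaces):
    pod = event["object"]
    annotations = pod.metadata.annotations or {}
    if event["type"] == "ADDED" and annotations.get("example.com/backup") == "true":
        notify("New pod %s/%s wants a backup" % (pod.metadata.namespace, pod.metadata.name))
```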
Hopefully that is enough information to make you interested in Kubernetes, but I'm sure you probably have questions.

So how do you handle, like, CPU? Like, I have some service that's going to need a lot of CPU, and I don't want to run these two services on the same node because they're going to conflict with each other, that kind of stuff.

So in the effort of fitting things on the slides and not melting your brain too much, I left out resource quotas, which are just items in that same YAML. You say, this takes this much memory, it has a soft limit of this and a hard limit of this, and you can have that for CPU, memory, and storage, and Kubernetes will handle it much like any other kind of quota system. If it reaches its soft limit, information goes into the API saying I'm at my soft limit; at the hard limit it actually kills the pod and then recreates it. So you can tag pods with how much resources they should use, and where you want to stop them if they grow beyond that.
You can also then target nodes. You may want to have a cluster with some memory-heavy AWS instances and some CPU-heavy AWS instances, and you can say, this pod needs to run on one of my memory-heavy instances, and these should run only on my CPU-heavy instances. That's how you can do that.
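A sketch of what those resource and node-targeting settings look like in a pod spec; the numbers and the node label are made-up examples:

```yaml
# Fragment of a pod spec: resource requests/limits plus node targeting.
containers:
  - name: django
    image: registry.example.com/revsys-rocks:latest  # placeholder
    resources:
      requests:             # the "soft" numbers the scheduler packs against
        cpu: 250m
        memory: 256Mi
      limits:               # the hard caps; exceeding the memory limit kills the container
        cpu: "1"
        memory: 512Mi
nodeSelector:
  workload: cpu-heavy       # example label applied to the CPU-heavy instance group
```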
And so it will stack pods as best it can onto those nodes based on the values you've given it. It will overflow if you don't put any resource values in; like in my examples, it will just keep packing them onto nodes and you will eventually hit swap and things like that. But if it can't find a spot because there aren't enough resources to place a pod, it will continually try to find a spot for it, and you will see information in the dashboard and the logs saying, I can't run this pod, I do not have resources to do it. You add another node to your cluster and it immediately puts it right there.

Sorry, I'm totally new to Kubernetes:
on the ingress point, does it come to the nodes or to the pods?

So it comes to the ingress controller and then it's proxied inside the cluster from there. You would think, oh, this is going to add this extra hop and it's a pain, but in practice it only matters at a really, really huge scale; for your day-to-day use, no one's ever going to notice that that extra hop is in there. It's a little Go proxy and it's super efficient.

So considering the pods auto-replicate, I guess, the load, do I still need a load balancer in front of it or no?

So the ingress controller, from the outside, will be the load balancer, right? And then it comes to the cluster and it load-balances from there, and it handles all of the where-is-this-pod-running and just shoots traffic to where it needs to go inside the cluster, and you don't have to think about it.

So in thinking about your application, what are some ways to make a decision on, does this
solve more problems than it creates? In terms of, how do you decide at what point this is going to solve more problems than it creates for me?

It's a tough call, and that's like with any other tooling, right? On one level, I think it's easier sometimes to SSH into a box and install the thing and do it by hand, but that's not reproducible in any way, and, I mean, every tool has its pros and cons. The thing I like about this one is that I'm going to be doing stuff with containers, so I need some sort of container-based system for the most part, and most of the others, you know, Ansible and Puppet and Chef and things like that, are not really container-focused, so I don't see them as good tools for solving things around containers that much. Where you pick which one you use, or if you use one at all, is hard. The thing I like about this is it really does free me up to not think about the mundane things, like where is this going to run, what port is available to open on it, how do I proxy from this port to that port. I don't have to think about any of those details, and it does give me the power to listen on the API for when things change and take some sort of action, so I like it for that. If you have one app and you do a deploy once a week, this is probably overkill. If you're managing 50 microservices and you deploy 10 times a day, you probably are already building something like this or using something like this to manage all that.

Thanks very much for the introduction. I was wondering what the next step might be. How did you go
about learning this? Do you have resources that you thought were particularly helpful, and could you recommend them?

So, Kubernetes is a very fast-moving beast. I think I first started playing with it at like 1.2, and it's already at like 1.7, and releases come out about every six months. They actually have a fairly nice process: things come out and they are marked as alpha, then they are marked as beta, and then they eventually become stable. Once they get to beta, the YAML configuration for the most part doesn't change and you can pretty much just pick it up and move it over. The alpha stuff is pretty alpha, and good luck if it works. So the documentation often lags behind the version just a bit on the newer stuff, or the stuff that just recently changed a lot. The documentation should be an amazingly great resource, and it is, as long as you keep in mind that if a thing came out in the last version, the docs may be wrong, or if it just moved from alpha to beta, the docs may be slightly off. Right? And so really it's mostly blog posts and tutorials, where you're like, how did somebody else go about this? Let me go look at their Kubernetes configuration, and then some playing around. There is no really great "here's the book on Kubernetes and it solves all of your problems."

Thank you, Frank.