
Python, Docker, Kubernetes, and beyond?

Formal Metadata

Title
Python, Docker, Kubernetes, and beyond?
Title of Series
Number of Parts
132
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor, and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Have you ever tried to manage the deployment of multiple Python applications across various Linux distributions? If so, you must have heard of Docker, and maybe also Kubernetes. Distributing Python applications using Docker is simple and allows you to create static packages containing everything required for them to run. It also allows you to freeze everything: packages, available libraries, files on the filesystem. In my talk I would like to tell you about our brief journey of moving our trading platform from a standalone application directly on the host system, through deploying it in Docker, to later moving it to Kubernetes. I will explain our struggles with implementing stable and fast CI using GitLab CI and Docker, image (package) storage and cleanup of old images, and finally I will tell you how we are deploying our platform to Kubernetes with nothing more than YAMLs and templating.
Keywords
Transcript: English (auto-generated)
So, hi. Let me start this with a question: how many of you know about Docker and Kubernetes? Raise your hands. And how many of you are using it in production?
So let me first introduce Quantlane. We are a technology company; we are trading stocks on different stock exchanges around Europe. We are based in Prague, with a relatively small team of developers: there are 14 of us. As I said, we are developing a trading platform which we are also using ourselves; we are not selling it outside.
We use mostly open-source projects to run our infrastructure, and most of our own code uses open-source projects as well. Our trading platform is written in Python, obviously, and it uses React.js for the front-end. In our applications, even in the trading platform, we heavily use asyncio, because there's a lot of data and we want to process it concurrently. And we are storing data in Redis and TimescaleDB.
Also, we are using third-party libraries, integrating them with Cython. Those libraries are written in C, C++, Java, or even Scala.
There is a lot of data, so we have to split our application into multiple processes, and there are also applications around it which help us, like reporting, some graphs, and other tools we use. Also, we are using messaging to integrate all these applications together and pass data from one to another. When I first started working for Quantlane, it was kind of a chaos.
All the applications were deployed on physical servers, managed by circusd. It is a process management system similar to supervisord, if you know it. Packages were installed in virtualenvs, so each application had a separate virtualenv with pre-installed packages, which is probably the standard, and everything ran under a single user. The advantage of this approach was that it was simple.
It was simple in the sense that when a new developer came to the project, all he had to do was clone the project, create a virtualenv, and install the packages, and he could run the application; it was the same for deployment. There were some disadvantages, like a package versioning hell, which means that we had different packages, and whenever we updated one of our own packages, it might use some newer third-party libraries, and we had to somehow migrate those changes into our other applications. And there was no failover, because when a server died, everything on it died, and the workload was not automatically migrated somewhere else. So what happened next?
Docker. When I started, we were already looking into the use of Docker, as it had promising features: it was able to unify our environment, so that we could run local development with the same kind of package, let's say image, as in staging, CI, and then production. Deployment was also simple, because all you had to do was build an image, just one command, and you had a running application. Migrations were also simplified, as the image contained everything: you don't have to install anything else, just pull the image and you have it. It sped up our CI:
once the image is built, it contains everything, so we can use this image through all the other stages of CI. And we had atomic releases, as the built image has some tag and even a hash, so we have an image with a hash which is atomic and unique in our registry. There were also some challenges we had to overcome first to introduce
Docker into our infrastructure. Those were: how do we store the images? For this, we decided to use the GitLab registry, as we already had a GitLab instance in our infrastructure and it had this feature. The next thing was image caching, because, it's kind of sad, but we have a pretty slow uplink to the internet: our internal network is fast, it has a gigabit, but the uplink is around 20 megabits, maybe.
As I said, we are using GitLab, and GitLab has CI, which has its steps defined in the Git repository, so anybody who has access to the Git repository can update the pipeline definition and modify it to whatever he wants. We wanted to have a build stage in CI. The simple CI build is this: you just run docker build, and it should build the image and prepare everything. But since a user has access to the CI definition, he can update and modify it, maybe to something like this, and by this he can effectively get access to the server on which the Docker daemon is running. This means that you should have a dedicated build environment, which you can just take and throw out, replacing it with a new, clean one, so whenever somebody gets access to this environment and does something harmful to it, you can just clean it and go on without any problem. The next thing was CI pipeline design, because you want to have a fast CI; you don't want to spend 20 minutes on
building and then testing and then maybe integration tests, publishing, whatever. And the last thing was cleanup of old images. The standard Docker registry, which you can download from Docker Hub, does not have automatic cleanup of old images, and because our images are around 50 or 500 megabytes and we have hundreds of them, we had to implement some kind of cleanup of these images.
We are not running in AWS, we are not running in the cloud, so we don't have infinite storage. What Docker brought us was, as I said, a unified and stable environment, meaning that we had the same image for local development, CI, staging, and then production, and everything was baked in. So when a developer built the image, he could be sure about the packages which are in it, and I don't mean just those which are specified in the requirements, but also the third-party requirements; the full chain will have the specified versions, and this will be the same in CI, staging, and production, because of Docker's nature: when you build an image, it has everything packed in. The basic idea is to create one image and bake everything in: all the requirements, the development requirements, the application, some environment specifics you need, and
you can run everything in this image, as I will show you on the next slide. It also brought us isolated environments, meaning that when an application runs, it cannot access other applications running on the system and cannot take control over the other processes which are running there. That's some kind of security feature. And faster CI.
This is our pipeline. First we build the image; as I said, it contains everything: all the packages, the application, and some environment definitions. Next we run tests: code quality, unit tests, packaging. Those are run inside this image in parallel, so each of those tests may run for maybe two minutes, and this speeds up the entire process, because in that stage none of the jobs have to install packages, which was the bottleneck of our CI. Next, we optionally release a bleeding-edge version and deploy to staging, and then we run integration tests and publish the documentation. Note that the bleeding-edge release and the deployment to staging are optional, so we can run integration tests immediately after the unit tests are complete.
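A pipeline like the one described could be sketched in GitLab CI roughly as follows. This is a sketch, not Quantlane's actual configuration: the job names and the `deploy.sh` script are made up for illustration, while `$CI_REGISTRY_IMAGE` and `$CI_COMMIT_SHA` are real GitLab CI predefined variables.

```yaml
stages:
  - build
  - test
  - deploy

build-image:
  stage: build
  script:
    # Build once, tagged with the commit hash -> atomic, unique release.
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

# Test jobs reuse the built image, so none of them installs packages.
unit-tests:
  stage: test
  image: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  script:
    - pytest

code-quality:
  stage: test
  image: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  script:
    - flake8

deploy-staging:
  stage: deploy
  when: manual   # staging deployment is optional, as in the talk
  script:
    - ./deploy.sh staging
```

Because the test jobs run in parallel inside the already-built image, the package installation that used to be the bottleneck happens only once, in the build stage.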
Docker also has some disadvantages, like known bugs, and every day you can find a new one. For example, there are memory leaks; there are some race conditions which lead to deadlocks. It has no failover if you don't use Docker Swarm, in whatever state Docker Swarm is right now. Also, when the local daemon dies, you cannot manipulate the running containers: your containers may still be running, but you cannot stop them, restart them, or create new ones.
There are a few gotchas we found out when we started using Docker. For example, there's the PID 1 pitfall. Who knows about the PID 1 pitfall? Yeah, so something you don't know. The PID 1 pitfall is basically a problem, or maybe a feature, let's say. When you run an application in Docker, it is started with PID 1. PID 1 has a special meaning in Linux, because it is the init process, which starts everything: SSH, your UI, whatever. PID 1 doesn't get the default signal handlers, which means that you have to implement them by yourself. Who knows about signal handlers in Linux? Okay. So you have to implement those signal handlers, because
you usually want a graceful shutdown of your application. When you run docker run, the process is started with PID 1, and when you run docker stop on it, which should terminate the process, the first thing Docker does is send SIGTERM to it. If the application doesn't shut down within 10 seconds after this, it will send SIGKILL, effectively killing everything; you may lose your state with this. So it's a good idea to implement the signal handlers.
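A minimal sketch of such a handler in Python (the cleanup step is a placeholder for application-specific work): `docker stop` delivers SIGTERM, and without a handler a PID 1 process would simply be SIGKILLed after the grace period.

```python
import signal
import sys

def handle_sigterm(signum, frame):
    # docker stop sends SIGTERM first; handle it to shut down gracefully
    # instead of being SIGKILLed after the (default 10 s) grace period.
    print("SIGTERM received, shutting down gracefully")
    # ... flush state, close connections (application-specific) ...
    sys.exit(0)

# Install the handler; as PID 1 inside a container there is no default.
signal.signal(signal.SIGTERM, handle_sigterm)
```

Alternatively, running the container with an init shim (for example `docker run --init`) gives you a proper PID 1 that forwards signals.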
This also applies to a process which runs outside of Docker. But it has one other implication: when a process runs in Docker, you have to take care of the subprocesses you run. So if you are using the subprocess module and running other processes as child processes, you have to terminate them and clean up after them, because if you don't, they will remain there, and Docker will probably somehow take care of them, or will kill them eventually.
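For the subprocess point, a sketch of terminating and reaping a child process; the `sleep` command stands in for a real worker:

```python
import subprocess

# Start a child process (a short sleep as a stand-in for a worker).
child = subprocess.Popen(["sleep", "60"])

try:
    pass  # ... main work would happen here ...
finally:
    # Terminate and reap the child so it does not linger as a zombie
    # until the container itself is killed.
    child.terminate()            # sends SIGTERM to the child
    try:
        child.wait(timeout=10)   # reap it, allowing a clean exit
    except subprocess.TimeoutExpired:
        child.kill()             # escalate to SIGKILL if it hangs
        child.wait()
```

Calling `wait()` is what actually reaps the child; as PID 1 you may also inherit orphaned grandchildren, which ordinarily init would reap for you.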
Also, we have to keep in mind the user within the container, because when somebody gets access to your container and runs a shell in it, let's say some kind of attacker, he can get root access if you run your application as root, which means that he can modify the filesystem within it and even run other applications. You should avoid this, because you don't want him to modify the entire container; he can even run some kind of spam bots. You don't want that. After we migrated to Docker, it took maybe a month,
and we started looking where to move next, and we found Kubernetes. It has some kind of nautical theme: Docker has a whale, Kubernetes has a wheel, I don't know. What Kubernetes is, basically, is cluster orchestration. This means you have a bunch of servers, you install Kubernetes on them, Kubernetes somehow manages all of them, and you just tell Kubernetes to run the application somewhere in the cluster. You don't care where; you just want to have it up and running, and maybe accessible on some kind of address. Kubernetes was interesting for us because it solved some kind of failover when a server failed: when a server in Kubernetes dies, it migrates the workload from that server somewhere else. So you don't have to care about it, and you can sleep; 3 a.m. won't wake you up.
The configuration can be stored in namespaces. Namespaces are logical dividers: you can have a namespace for production, for staging, even for different applications, maybe a namespace for monitoring and logging. For each of these namespaces you can store a global configuration, which can then be propagated from the namespace to the services running in it. Also, it supports some kind of basic service discovery: you can access other services based on DNS. my-service.my-namespace.svc.cluster.local is the standard address; you just have to fill in my-service and my-namespace. It has an ingress controller.
Ingress is a way to expose the services and applications running in Kubernetes to the outside world, so the outside world can make requests and, for example, retrieve some website which is running in Kubernetes. The other way around, services running in Kubernetes can always access the outside world, but for the outside world to access the Kubernetes services, you have to have an ingress controller. Kubernetes also has one fancy feature, and that's deployment history, which means that when you deploy some new version of a service and it doesn't behave, you want to revert it to the previous version, so you can call kubectl rollout undo, and this will deploy the previous version. It's really handy when you want to simply revert something and you don't know which version of it was running before.
Right now we are in the process of migrating to Kubernetes. Our main trading platform was already migrated a week ago. There are still other services which are still running in Docker on other hosts, but we are planning to migrate them in maybe two weeks, and then we can join all of the other servers into the Kubernetes cluster. We have the environment configured by namespace variables: we have a production namespace in which we have configuration which specifies where you can find services like the messaging, some kind of data storage, access to databases, and so on. And we are deploying our services using
plain YAMLs with Jinja2: Jinja2 files containing YAML with variables. This allows us to have some conditionals in templates and have a single deployment file which is adaptable for different processes, or maybe profiles and configurations. In this example you can see that we have some profile specified, and if the profile has a certain value, other environment variables are added to the deployment.
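A sketch of what such a templated deployment file might look like. All names, variables, and the profile value below are made up for illustration; the actual Quantlane templates are not shown in the talk.

```yaml
# deployment.yaml.j2 -- one template, rendered per service and profile.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ service_name }}
spec:
  template:
    spec:
      containers:
        - name: {{ service_name }}
          image: {{ image }}
          env:
            - name: MESSAGING_URL
              value: {{ messaging_url }}
{% if profile == "backtesting" %}
            # Extra variables only for this (hypothetical) profile.
            - name: EXTRA_WORKERS
              value: "4"
{% endif %}
```

The rendered pure-YAML output is what actually gets sent to Kubernetes; the template itself is not valid input for kubectl.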
Some of the nice features of Kubernetes are probes. Imagine that you have a web application and deploy it to Kubernetes: you want to check that it is running, and if it takes maybe 30 seconds to start, you just don't want to care about it; you just deploy it and want to see it running as fast as possible. What probes do is check if the application is running, by accessing some port you specify or by running some internal command within the container. This allows Kubernetes to check if the service is running and healthy; if it's not, Kubernetes will automatically restart it.
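A probe configuration on a container might look like this; the path, port, delay values, and the health-check module name are assumptions for illustration:

```yaml
# Kubernetes restarts the container when the liveness probe fails.
livenessProbe:
  httpGet:
    path: /healthz        # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 30  # allow the ~30 s startup mentioned above
  periodSeconds: 10
# The readiness probe gates traffic rather than restarting the Pod.
readinessProbe:
  exec:
    command: ["python", "-m", "myapp.healthcheck"]  # hypothetical module
```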
Then there are update strategies. There are two major ones; one is rolling update. What this strategy does is that when you deploy a new version, the old version keeps running while a new deployment with the new version is started, and until the deployment with the new version is running, available, and stable, the previous deployment with the old version is still running. So when the second version, the new one, is available, the first one gets terminated, it's shut down, and all the traffic is forwarded to the new deployment. The other update strategy is Recreate. What it does is first shut down the previous version and then start the new one, so you can have some non-zero downtime. This update strategy is good when you have some kind of resource which needs a unique lock, maybe: you can have some data stored in files, and you don't want two applications to access the files simultaneously.
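The two strategies, as they would appear in a Deployment spec (the numeric values are illustrative):

```yaml
# Rolling update: new Pods come up before old ones are terminated.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0  # keep the old version serving until the new one is ready
    maxSurge: 1

# Recreate: old Pods are shut down first. Brief downtime, but never two
# versions touching the same exclusive resource (e.g. a file lock).
# strategy:
#   type: Recreate
```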
So those were the fancy, or interesting, features of Kubernetes. As I mentioned, Docker has a whale and Kubernetes has a wheel, and beyond that there are many wheels. We are looking into Kubernetes Federation next, and that's the "beyond".
So, thank you. Thank you, Peter. Please raise your hand if you have any questions. Are these Jinja templates integrated somehow with Kubernetes configs and secrets for deployment? You can send only pure YAML files and JSON files to Kubernetes; you can't send the YAML Jinja2 file. So you first have to fill in the variables, then you can send it, and the way our deployment is designed is that we take the Jinja file and fill in everything we have, so we also have secrets and config maps within the deployment. Can I have one more? Which deployment strategy should I use if I have database migrations between versions? Should I
use the Recreate strategy, or is there some other, better solution? If you don't have any way to take care of the difference between those two versions, you should use Recreate, but maybe there's a way, I don't know. How do you monitor all this? That's an excellent question. We are monitoring internal pods using Heapster, and we are monitoring the entire cluster with Prometheus. Earlier you mentioned the problem of PID 1 within Docker; how has Kubernetes helped you with that?
Kubernetes doesn't solve that; you have to solve it yourself, within your application. Okay, and have you solved it, and if so, how? We are solving this by registering a signal handler. As I mentioned, we are using asyncio, so what we basically do is stop the loop, which terminates all the running futures, I think they're called futures, and then we just have a finally block where we close all the handles, save the state, and clean up everything we need.
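A minimal sketch of this shutdown pattern with asyncio, registering the handler on the loop; the self-delivered SIGTERM only simulates what `docker stop` or Kubernetes would send, and the cleanup is a placeholder:

```python
import asyncio
import os
import signal

async def main():
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    # Register handlers directly on the event loop.
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop.set)
    # For demonstration only: deliver SIGTERM to ourselves after 0.1 s.
    # In production the signal comes from `docker stop` or Kubernetes.
    loop.call_later(0.1, os.kill, os.getpid(), signal.SIGTERM)
    try:
        await stop.wait()   # run until a termination signal arrives
    finally:
        # Close handles, save state, cancel remaining tasks here.
        pass
    return "clean shutdown"

result = asyncio.run(main())
print(result)
```

`loop.add_signal_handler` is Unix-only; on other platforms you would fall back to `signal.signal`.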
But you can also add the signal handler to the loop directly. Hi, do you have some persistent data on hard drives, and how do you manage this data between all the services? Yeah, we had persistence on the filesystem. We are right now migrating it to Redis, but until now, what we had to do when we migrated between hosts was shut down the service, migrate the data, and start the service again. We have no shared storage.
We have time for a last question. At the beginning of the talk you said that you were running into problems with the GitLab runner. How did you solve the problem of a clean environment for each pipeline? We have a dedicated Docker-in-Docker service running which handles the pipeline. We have a shared DinD, but we shut it down sometimes, so we clean it. Basically, when it gets compromised, it doesn't affect the entire host on which everything else is running. Actually, we have the same problem, and the problem with DinD is that you need to have a privileged container to run Docker in Docker, and essentially you have root if you do that. So I wonder if you had another solution. Yeah, maybe we didn't solve this the right way. Thank you. Yeah, we are planning to migrate it to different, separate hosts, so maybe this will solve it. Okay, fantastic. Thank you, Peter. Let's thank him.