
Getting Into the Zero Downtime Deployment World


Formal Metadata

Title
Getting Into the Zero Downtime Deployment World
Title of Series
Number of Parts
96
Author
Tugberk Ugurlu
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers
Publisher
Release Date
Language
English

Content Metadata

Subject Area
Genre
Abstract
Continuous delivery is a huge step forward in our ability to rapidly deliver features and value to the users of distributed applications, but it comes with a cost and a responsibility. Most modern web applications need to be highly available, and this also means that they need to stay up during deployments. Dealing with zero-downtime deployments is a challenge, and there is no easy solution. Moreover, the solutions available vary based on the number of integrated clients, which parts of the world the application addresses, how many active users it has... Isn't there a simple way to figure out how to get there? Join me to get into the details of the key steps on your path to zero-downtime deployments. Learn about the patterns, practices and techniques that make it easier, such as semantic versioning and blue/green deployments. We'll also walk through an end-to-end demo of how a high-traffic web application can survive the challenge of deployments. What seemed insurmountable at the start of the session will be practical and applicable by the time we're finished, and you should be able to see how to start moving your production application close to the zero-downtime gold standard.
Transcript: English (auto-generated)
All right, I think it's time. Hello everyone, thanks for coming to my talk. I'm going to be talking about getting into the zero-downtime deployment world today. A little bit about me.
My name is Tugberk Ugurlu; it's a weird name to pronounce. By the way, how many of you guys are still drunk from last night? A bit, okay. I'm halfway there. And the stage is very weird; I hope I'm not going to fall down from here.
And it's being recorded, so it's going to be a shame. I work at a company called Redgate Software. We are based in Cambridge, and I live in Cambridge as well, in the UK. I'm actually from Turkey, so I've been trying to get used to British culture for one and a half years.
It's going well, but it's a hard culture to get used to. I'm on Twitter as well, and I also have a blog, so you can follow me there. See my GIFs, cat pictures, some random rants there if you want to.
My blog is a bit more useful than my Twitter account, so you can check it out. Okay, let's set the stage. What is this about? Why are you here? Am I going to waste your time, or am I going to give you something useful? In a very simplistic case, I think most of you have an architecture like this.
It's not really an architecture, actually: you have an API and a bunch of clients. In this case, you have two Android clients, one web client, and an API which is version one. But now you've made some changes, you improved it or fixed some bugs, whatever.
You want to deploy the new version. So what's going to happen is that we take down version one, and while we are doing that, none of our clients can connect to anything, because we are not there. We deployed the new version, but during that period
we lost all of the requests, because there was nowhere to connect to, right? No one can serve you anything. So this is what I'm going to be talking about today: some specific techniques that make it possible to keep serving requests to your clients during your deployments,
so you don't show them embarrassing under-maintenance pages or, in the worst-case scenario, some sort of weird errors.
So I'm going to give you these kinds of rules, and this is going to be based on my real-world experiences, so I will talk about those as well. But why should you care?
I mean, deployments are fairly fast, you can say, right? And maybe you say, I'm working on an internal application, and we don't have any requests during midnight or something like that. That's perfectly fine; this may not fit your case,
but there are some specific cases, which I'm going to be talking about here, that you need to cover to not give any downtime to your clients. So first of all, if you don't want to show these kinds of pages in your applications,
you care. And there is a more bizarre example, which is actually from my phone. I saw this; it was saying that the app is performing a database migration, please be patient. On my iPhone, it is telling me that it is performing a database migration.
I'm a tech person, so I know what they mean. But if my wife saw this, she would go blank; like, my phone is broken or something. So that's another case. Let me give you a sample case from my real-world experience.
I work at a company called Redgate, I told you that. But at nights I also do some personal projects, and this is one of them. I love this thing. We haven't launched anything yet (famous last words), but Zlick is basically going to be your personal assistant in terms of eating.
So we will have a bunch of clients, and we hope that we will have lots of requests to serve. But basically, we don't want to give any downtime to our users; users should be able to use our platform without any downtime.
And we want to be fast as well, right? We want to be agile: add features, put them out there, maybe test them with some users but not show them to other users. So we want to cover all of those cases. And we have an architecture like this, in a very simple manner. We use polyglot persistence,
so we have lots of database storage technologies that we are working with, and we embrace eventual consistency. And we have a bunch of nodes for the API applications to serve the requests,
and a bunch of workers that run some business logic behind the scenes to get stuff into shape. So this is the very simplistic setup that we have. And in terms of this, there are lots of cases to cover for deployment: your database is one thing, your APIs are another thing,
you have workers, right, and you have RabbitMQ to serve messages. So there are lots of cases to cover. We have been looking into this, and we have practiced some of the stuff that I'm going to be talking about today; some of it we only know in theory for now.
But it has been a good experience so far. And the most important thing it gives you is that you don't have the psychic weight of deployment anymore. Deployment becomes a simple thing from then on, because you spent a fair bit of time beforehand
practicing the deployments. After that, you can just ship new features like this, because it will be very easy; you will not have the psychic weight anymore on that part. And as I've been saying, continuous delivery is one thing to keep in mind here.
Because there is a psychic weight in terms of deployments: you will say, okay, I'm going to deploy it, we will have downtime, people will be calling me, telling me your app is broken or something like that. Even if you show maintenance pages, people will still tell you,
okay, your page is down or something like that. But it's a hard problem to solve as well, with lots of cases to cover, and we will see some of them here; I'm not going to be able to cover every case, so hopefully we will set a scope here. But I think this picture covers it very well. This is, I think, what we are trying to solve:
keep the thing spinning, but replace the engine inside it. It's a bit hard in those terms. So let's set a scope here. These are the things we will care about today.
You may argue that websites are HTTP applications as well, but they have a different domain than the plain HTTP APIs, the REST APIs, that we are talking about. So these are the two things I'm going to cover today. But there are a few other things you need to worry about, for example your internal messages.
Maybe you send them through a message bus, so you need to care about those as well when you are doing deployments; you need to version them, and so on and so forth. But we are not going to cover that. Okay, so what I'm going to do now is give you some guidance and show you a path into this.
I think the first thing, and it's very important, is to write down your deployment strategy. There is a saying, I don't know who said it, but it's a very clever thing: you can't automate something that you haven't done manually.
Amazing thing. I think you should first do it manually and see every case, and then try to automate it later. Writing it down also makes it really easy to communicate with your peers: you can say, okay, we are going to be doing the deployment, and when they ask you questions you can tell them,
okay, this is how the deployment works, in English, not in code. So this is what we have done. This is not even half of it, but it tells you the scope of the thing and why we are doing it, and then it has a bunch more information in there.
So basically write it down and say, okay, this is how the deployment is going to go, and in these cases these are the things that are going to happen; if we have this type of case, this is what happens, and so on and so forth. Try to cover as much as possible; that's why we are writing it down, and you will see that it is helpful. To me this is the first thing, and it's very important.
The second thing is: script it out. There are lots of technologies out there which help you in terms of release management; they are amazing, and I will show you one of them today. But I think the important thing is that before going into any of them,
just script it out in your favorite scripting language. It could be Bash, it could be Python, it could be PowerShell; it could even be C# or F#, whatever you use. It doesn't really matter. What matters is that when you script it out, you can run it in a repeatable, idempotent manner:
every time you run the thing, you get the same result for the same input. And your input here would be your source code or whatever. You will see some things you didn't expect, and you will really absorb the knowledge of deployment there.
And maybe you are working against a technology that you haven't worked with so far. For example, if you are working against Azure or AWS, maybe you haven't deployed there before. Trying to deploy there through a release management tool right away is going to be painful, and if you get errors or some weird behavior,
you will not know if you are getting that error because you are doing something wrong against the platform, or because the tooling is wrong. I think scripting helps you there: if you later get an error in a tool, you will understand, okay, I was not getting this from the script, but I'm getting it now;
There must be something wrong in my release management environment or somewhere else. So this is, for example, again, this is an example from Zlick. And then we script out the deployment scripts there. What we do is we work with Docker images. So mostly the stuff that we push is Docker images.
So we just build them, commit them, and then put them in a registry somewhere. And then some other system will pull that and then put it on the right place and then run the thing. And after you have the script, and then you know most of the stuff now,
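As a sketch, with image name, registry and tag all made up, that step might be no more than:

    # Build and push an immutable, versioned image (hypothetical names).
    docker build -t registry.example.com/zlick/api:1.4.2 .
    docker push registry.example.com/zlick/api:1.4.2

    # Some other system then pulls that exact tag and runs it:
    #   docker pull registry.example.com/zlick/api:1.4.2
    #   docker run -d -p 5000:5000 registry.example.com/zlick/api:1.4.2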
And after you have the script, and you know most of the stuff now, then you can get into release management software, I think. Because you can do everything with scripting alone, but at the end of the day you will end up building a dashboard: you will see, okay, I have lots of environments,
I have lots of servers, and I want to see all of them, right? I want to see, for each stage, which version they have; maybe a deployment is failing, and why it's failing. You want a central place where you can look at your deployment process. So there are lots of tools out there,
and you can go and check them out. But the one that I know is Octopus Deploy; amazing software, I think. It gives you lots of features.
It gives you the basics you would expect: you have environments, you have instances, and you deploy a Tentacle. They call it a Tentacle; the naming is amazing. So Octopus is the main product, right, they have Tentacles, and they also have an open source project called Calamari.
The naming is amazing; we need to get this type of naming into every product. And it has some new features which are very exciting: for example, it honors the semantic versioning you put on your artifact and guides the release into a different pipeline based on which type of change you are making.
So if you are changing the major version (we will also see a sample of this today), this helps you really, really well. As I said, there are others; I think Visual Studio Team Services has one.
So there are lots of others out there. Again, the benefit is that you will have a dashboard, a central place where you are going to see your stuff. So these are the basic guides to get into this. But there are also side things that you need to cover.
For example, a staging environment is really important; but a true staging environment. This is a term that is being used inconsistently.
So how many of you here have a staging environment? Okay, keep your hands up, keep your hands up. How many of you, in your staging environment, hit a different database than the production database?
Okay, cool. Maybe that's the correct thing for you. But the way I see a staging environment is that your application is deployed there but still looking at the same data, because you want to see the application working against the production data. And that will also help you in terms of blue-green deployment.
The other thing, where you take everything, put it somewhere, and tell your QA people, go and look at this and try to break it, is a QA or user acceptance testing environment. So a staging environment, in my opinion (and I think this is the correct framing),
is that you have some sort of database, you have a server, which is the green one here, with a process inside it, and you have some sort of a router looking at it.
Any request that comes to the router is directed to the server. And then you have a staging environment here as well, which is the blue one. I put it there in that color intentionally, to get into this topic: blue-green deployment.
How many of you have heard of blue-green deployment before? Awesome, nearly half. So blue-green deployments; let's see a sample. Here we have a server, a bunch of databases that the server connects to, and the router connecting to the server.
And then what we want to do is deploy a new version of our application. So we spin up another server and connect it to the databases; now that server is our staging environment, so you can have a very obfuscated URL
that you can hit internally or whatever, but the public is not seeing it. What you are going to do now is simply make the router look at the blue server rather than the green one. This is also called a VIP swap:
in Azure, for example, they call it VIP swapping, virtual IP address swapping. Nearly every cloud environment has this type of thing, and other load balancers have it too. So you are basically going to swap the router to look at the other server rather than your production one.
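For instance (a sketch only, assuming Azure App Service deployment slots and made-up resource names), the swap itself can be a single CLI call:

    # Promote the staging slot to production (names are hypothetical).
    az webapp deployment slot swap \
        --resource-group my-group \
        --name my-api-app \
        --slot staging \
        --target-slot production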
So this happens now, and your clients are looking at the new server, the new version of the application, and this is a very instant process. Deployment itself is sometimes a very lengthy process;
maybe you are also doing some extra stuff on the server. But once you have the server set up and everything running, the VIP swap is like a millisecond process. But there is a problem here.
Anyone spot the problem? Anyone? Schema changes: wrong. Maybe, but that was not my intention. Yeah, half point, half point. I don't have candy, sorry; some people throw candy, but I don't. Anyone?
That would be another problem; that's not my intention either. Exactly, that's the problem. I would give you a candy, but I don't have any.
Exactly. So what we have done here is that when we redirected the requests, we instantly killed the old server and cut the connections, right? That's not what we want to do, because we may have requests in flight. Maybe you are serving millions of requests,
or it doesn't matter, maybe there is only one request that you are still serving, right? Or you have lengthy APIs, like streaming APIs; or maybe someone is uploading a picture, some sort of big blob, which takes time, and your server is still processing it. Cutting the connection and killing the server is not what we want to do.
So there is a term called draining the requests. Again, let's see what we have done: we have a new server and an old server, which is production; staging becomes production, and the old production gets killed, right? This is what we did before.
We cut the connection and killed the server, so in-flight requests are basically dead. What we want to do instead is stand up a new server, put it there, make the router see the new server, but not cut the connection or kill the old server
until our requests are drained. So you've got to say: okay, we had some requests going to the old server, which was our production, and those requests are still in flight; those need to be drained before we actually kill the server. You basically need a system in place
where you can see the in-flight requests, and some load balancers have this built in; like Elasticsearch... sorry, not Elasticsearch, the AWS Elastic Load Balancer has this concept:
you basically tick a box or something like that saying drain all of my requests before killing the connection. So this is what we want to do, basically. This is another thing which is very important to keep in mind.
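For example, as a sketch against the classic AWS ELB CLI, with a hypothetical load balancer name, enabling that behavior looks roughly like this:

    # Give servers leaving the rotation up to 300 seconds to finish
    # their in-flight requests before connections are cut.
    aws elb modify-load-balancer-attributes \
        --load-balancer-name my-api-lb \
        --load-balancer-attributes '{"ConnectionDraining":{"Enabled":true,"Timeout":300}}'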
And the other thing: some people mentioned schema changes and incompatibility; maybe the new version is broken and I want to roll back. I think flowing the context of your changes through the system is very important here, and semantic versioning is something you should flow into your release process.
How many of you have heard of semantic versioning? Lots of people; okay, cool, that's very cool. So semantic versioning basically says you have different levels of changes,
called patch, minor and major changes, and it has a website; you can go and check it out, it is a nice spec. And what's important to us here is that
we are basically going to be setting a policy in place for our releases. So assume that you are Twitter, right? You put APIs out there and you version them, saying this is version one of the API. And now you go and change a bunch of stuff,
break the version one API, and what you want to do is put a version two of the API out there, saying this was a major change; our version one API is still in place, but we will kill it in six months. This is the policy I'm talking about. You need to flow this internally in your system as well,
and we will show how we can do that. As I mentioned, there are types of changes. For example, patch: patch means no user-facing changes; all internal, it doesn't affect anyone else. This is what you see on your iPhone when you look at the release notes
and it says "fixed some bugs", right? Okay, yeah, that's very helpful. So that's what you can put there when you do patch releases. Minor means you add some functionality to your software, but you haven't broken backwards compatibility:
you add new functionality to your system, but your existing clients can still connect to you and still get what they want from your system. Major is the actually important one in our case,
which is: you are changing something, and you are breaking your consumers with that change. For example, assume you have an HTTP API and you serve JSON payloads,
but now you say, okay, I'm going to remove this field from my JSON payload. Then you are going to break your consumers, because they expect that field to be there;
they parse it when they get the response. Those are the types of things you should care about, and when you make such changes, you need to version them and bump the major version.
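To make the three levels concrete, here is a tiny sketch; the version numbers are made up, and the pipeline decision is the kind of thing a release tool like Octopus can automate for you:

    #!/usr/bin/env bash
    # Route a release differently when the major version changes.
    old_version="1.4.2"
    new_version="2.0.0"   # 1.4.3 would be a patch, 1.5.0 a minor

    old_major="${old_version%%.*}"
    new_major="${new_version%%.*}"

    if [ "$new_major" -gt "$old_major" ]; then
      echo "major change: deploy side by side, keep v$old_major running"
    else
      echo "minor/patch change: safe to swap servers in place"
    fi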
So why do we care about semantic versioning in our case? Let's see the minor and patch releases first, and then I think we will understand the difference. Minor and patch releases, in terms of our system: we have a bunch of servers, right?
This is v0 of the API; let's assume an HTTP API. We have a router, a load balancer, and three servers, and our load balancer is balancing the load between the three. But now we are going to release a new version,
and this is going to be a minor or patch release, which means the current clients, the current consumers of the API, are still okay to connect to the new one. So there are a few assumptions here:
your current clients are okay to connect to the new version of the API, and you are okay with your clients connecting to the new one and the old one at the same time, maybe.
There are a few cases where you may need to solve that problem; we will get to that later, but let's assume this for now. So basically what we can do is replace the old thing with the new thing, right? But keeping in mind that we want zero downtime, we drain requests and don't give any errors to anyone
because of the deployment. So we put the new stuff on the servers, and now we have a bunch of new servers. We are going to do the same thing: wire the load balancer to the new servers and drain the requests on the old ones,
and at the end of the day we can just get rid of them. Or maybe you keep them as your pool and replace the applications inside them, whatever you do; but essentially they are gone now, as far as the load balancer is concerned.
As this was a minor or patch release, this was okay, right? Let's see a demo of this, which will show us how it is actually done. Here I have a sample project; I don't have anything running yet.
I will use Docker here. This sample is also available on GitHub, as you can see in the URL here, and there is a nice readme there as well, so you can read how to get it up and running yourself.
So I will just get my sample up and running with docker-compose up. What I'm doing here is basically simulating the same thing that I have just showed you: there are going to be six servers running, but my load balancer is only looking at three of them.
I also have a client which connects to the load balancer endpoint; my client sends a request and gets a response, and inside that response the servers send a server ID, a unique ID for each server.
My client logs them and prints to standard out whenever it sees a response from a new server. So I have four clients here running at the same time, constantly sending requests to the load balancer endpoint.
Now it logs here, saying I have seen a response from this server, and then there is one more; client zero is logging that, and client one is logging this.
So you get the point. And what we have in place here is HAProxy running as the load balancer and a bunch of APIs running in Docker containers; that's our architecture.
So we have HAProxy running here, and what I can do is connect to HAProxy through a Unix socket and look at things. So this is our server cluster now,
and HAProxy gives me information for each server. The output is a bit cryptic, very hard to understand; you'd have to go and read the documentation, and luckily I did that. You see a two here: two means running, and zero means disabled. So my nodes four, five and six are disabled, while nodes one, two and three are wired to the load balancer.
but three, two and one are wired to the load balancer. So there is one more thing that my client does which is if it fails to get a response from the servers it logs that as well.
So it's logging it like that. It says fail, fail, fail, fail, fail. So in our case here if we break the deployment we should see those type of messages here but luckily we will not. So what I want to do here is that
I have these three servers, currently disabled, that have version 1.1 of my APIs, and these three servers have version 1. So I have made a minor change,
and I want to deploy the new version. So I'm going to simulate; actually, I am going to do the deployment, but what I'm going to do is tell my load balancer to look at these three servers
and not route any requests to the old servers, while still keeping them there for the in-flight requests. So hopefully when I run this (this is basically talking to the Unix socket, which is one way to connect to HAProxy; it's HAProxy-specific,
and it will depend on the load balancer you are using), it will say: okay, disable these three servers and enable the others. So hopefully when I do this, there should be new logs on the left-hand side, and those logs should be saying "first request from" such-and-such server
rather than fail, fail, fail. So I have done that; boom. We are only seeing logs saying I have got responses from the new servers. So this is basically a zero-downtime deployment, and it works in our case here, amazingly.
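The swap step I just ran boils down to something like this; the backend and server names are hypothetical, but "enable server" and "disable server" are real HAProxy runtime commands:

    #!/usr/bin/env bash
    # Swap traffic from the v1 servers to the v1.1 servers via the
    # HAProxy admin socket (names and socket path are made up).
    sock=/var/run/haproxy.sock

    # Wire the new servers into the rotation...
    for s in node4 node5 node6; do
      echo "enable server api/$s" | socat stdio UNIX-CONNECT:"$sock"
    done

    # ...and stop sending NEW requests to the old ones. Sessions already
    # in flight on the old servers are left to finish.
    for s in node1 node2 node3; do
      echo "disable server api/$s" | socat stdio UNIX-CONNECT:"$sock"
    done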
Again, this is available on GitHub, so you can see the sample. There is even a GIF there showing what it does; it explains things, you can see the commands that you need to run, and there are some resources as well.
So I think it's a helpful resource that you can look at and try yourself, to try to understand it. This is all running locally, and I actually lied to you saying servers, servers, servers: those are not servers, those are just Docker containers running in isolation. So you can simulate this very easily with a load balancer locally.
There is actually one problem; I lied to you again. That problem is HAProxy. I used HAProxy here, and HAProxy is an amazing load balancer:
you can use it, it's free, it runs on Linux, you can put it in Docker, run it everywhere, and it scales amazingly. But HAProxy doesn't do a good job on config reloads. When you want to replace the config
and tell HAProxy, okay, this is your new config now with new servers inside it, then during that reload (which is a very small amount of time) you are basically cutting off new requests
coming into the load balancer and saying: I'm sorry, I can't give you anything because I'm reloading now. Which is not a nice situation, considering the title of this talk, right? But there are ways you can cover that as well.
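For reference, the "soft" reload people usually reach for looks like this; paths are hypothetical. The -sf flag tells the new process to let the old one finish its connections and exit, yet there is still a tiny window where new connections can be dropped, which is exactly the problem I just described:

    haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
            -sf "$(cat /var/run/haproxy.pid)"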
Also, the Yelp engineering team have done research on this, an amazing test, and there's a link there; you can check it out. It explains what happens during those periods and how you can solve it. They kind of hacked around it with iptables and all that stuff, but you can solve it if you want to.
And one important thing they mention is that you get very, very small failure rates on the requests. Why am I telling you this? Maybe you are going to be okay with those. Maybe you are going to design your clients
to tolerate that level of request failure, and you don't want to go through these big changes. But there are also other ways to solve it. For example, AWS's load balancer has register and deregister
commands, basically: you can register a server and deregister another one, which is what you actually want to do in HAProxy, but it doesn't give it to you out of the box.
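With the classic AWS ELB CLI that swap is just two calls; the load balancer name and instance IDs here are made up:

    # Put the new servers into the rotation...
    aws elb register-instances-with-load-balancer \
        --load-balancer-name my-api-lb --instances i-0new1 i-0new2

    # ...and take the old ones out (draining applies if enabled).
    aws elb deregister-instances-from-load-balancer \
        --load-balancer-name my-api-lb --instances i-0old1 i-0old2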
The other thing you can do is employ internal load balancers, which is a bit better than the approach I have just showed you. In this case we have one public-facing load balancer, shown in blue; two internal load balancers, shown in orange; a bunch of servers wired to the first internal load balancer; and a bunch of clients. Clients connect to the public endpoint,
and the public load balancer directs requests between the internal load balancers. But right now it only looks at the first one; the second one is disabled, because the second one doesn't have anything behind it. So it sends the requests to the first internal one, which directs them to the servers.
So we can do a clever thing here. Our public load balancer knows about two internal load balancers, and as far as the public load balancer is concerned, the internal load balancers are just servers. So we can spin up new servers behind internal load balancer two,
make internal load balancer two look at the new servers, and now we have a new version of the product running somewhere, with an endpoint that you can hit to look at all of the servers. And then we can do the same thing on the public load balancer: basically swap the connection
over to the new internal load balancer, drain the requests on the old one, and cut the connection there, so no new requests go to the old one; they go to the new one. Why is this important? It's important because now you don't have to change the configuration
on the public load balancer that much, because you are doing it on the internal ones. So you work around the reload issue, and the old servers can go away now because you don't need them, or you can keep them if you want to spend money. So this is the setup; but again, there is a problem here.
Who is going to spot the problem? Anyone? There are a bunch of clever people here. One person. No one. Sorry? Yes!
Yes, that's amazing. We still have a single point of failure, which is the public load balancer, and there is a way to solve that as well. Again, I'm going to talk in terms of HAProxy here, but this applies to any load balancer; if you use a hosted one,
they possibly do the same thing, I think. There is nice documentation on this if you want to go and read about it; it is a bit of a complicated topic. You basically need the same redundant architecture for the load balancer itself: you connect two load balancers together, in this case HAProxy again.
So you have two load balancers sitting there, tied together through a shared IP address. Your clients connect to that IP, and that IP decides which load balancer gets the request. What this gives you is that you can maintain each of the two load balancers separately:
take this one down, reload the config or whatever, bring it up; take the old one, reload the config, bring it up again. So you can maintain them, and you can update the software inside the load balancers as well.
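One common way to build that pair (an assumption on my part, since the documentation describes several setups) is keepalived with VRRP, where a floating IP moves between the two boxes; all names and addresses here are hypothetical:

    # /etc/keepalived/keepalived.conf on the primary load balancer.
    cat > /etc/keepalived/keepalived.conf <<'EOF'
    vrrp_instance VI_1 {
        state MASTER            # the standby box would say BACKUP
        interface eth0
        virtual_router_id 51
        priority 101            # standby gets a lower priority, e.g. 100
        virtual_ipaddress {
            10.0.0.100          # the IP your clients actually connect to
        }
    }
    EOF
    systemctl restart keepalived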
By doing this type of thing you are again making your chances of giving errors a little slimmer. Again, this is based on your needs. Maybe you don't need this, right? Maybe your application doesn't need this type of care in terms of downtime,
but there are cases where you do care about it. So those are the minor and patch versions, and we also covered a bit about load balancers. Now, I think, comes the very important bit: major version changes.
This is the pain point, because you are going to break your existing consumers, so you can't just swap the servers, right? You have existing clients, and those may be mobile clients as well as web applications; and mobile clients are the ones you can't update instantly
and simultaneously everywhere, because people update them manually themselves. So you've got to have a way of maintaining the old version for a while. There are also several cases to cover here:
you have your first major release, which is one concern, and then non-first major releases; you need to cover those as well. There is a special case for the second major version release, and a special case for the rest.
So we will see some of them. Again, we have similar stuff here: a load balancer, wired to a domain name through DNS, looking at the servers, balancing the load, doing its job.
And now we want to release a new major version. So we stand up new servers and put the application on them, and as this is a major version, we can't just make the load balancer look at these new servers,
because we don't want to break the clients. So we stand up a new load balancer, wire the new servers to it, and now we have another load balancer looking at the new application, the new servers, right? And then we wire a domain name to it.
It could be v1.example.com or something like that. Or you can do some other clever stuff, like keeping the domain name the same and doing the versioning in terms of header values, and keeping another load balancer looking at that; see the sketch below.
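As a sketch of those two styles (the host names and the header are hypothetical; some APIs use a custom header like this, others do content negotiation through Accept):

    curl https://v1.example.com/meals                    # version in the subdomain
    curl -H "Api-Version: 1" https://example.com/meals   # version in a header

A fronting load balancer can route on that header value the same way it routes on host names.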
For the simple case, let's say it is another domain name, a subdomain. So you have v1.example.com now, a new version of the API. Possibly no one connects to it yet, because you haven't advertised it, right?
No one knows about it, no one connects to it; people still connect to v0. But then you advertise v1, saying: okay, we shipped v1, v1 is out there, and we will still keep v0 for a while, but we will kill v0 in six months, for example.
This can be your policy. And during those six months, people will be updating their applications to v1, and the clients will possibly be updating as well. So by the time
the six-month period with both v0 and v1 is over, you will probably see that no one is connecting to v0 anymore, right? At that point you can kill v0, or you can say, okay, let's kill v0 when we have v3. You can do this type of thing, but what's important here is that
you keep the two of them running at the same time; you don't just kill the old one and put the new one in its place.
So those are the things that are important in terms of your application and the requests. But you also have data, right? We haven't mentioned data yet. Databases are a special case, because there is data in them: you can't just wipe the thing and put the new one there.
And in nearly every database system you have a schema. It doesn't need to be an explicit schema: in some NoSQL systems you have an implicit schema, because your application needs to look at specific data somewhere
and pull it out, and if you break something there, your application cannot read it. So you need to be doing something called fast-forward database changes only: you don't make changes which would break the existing applications out there.
Let's see an example of that. Assume this is our table on the left-hand side: id, make, model and color. And this is a piece of C# code that looks at the database and reads the row. Let's assume that in v2 we decide that we don't care about color anymore, right?
We have done the same deployment strategy that we have just seen for major versions, and we now have a v2 of the system. But in the database, as we don't care about color now,
can we just drop the color column? No, because v1 is still connecting to the same database that v2 is connecting to. So we don't want to drop the color column; we still want to keep it there, even if it is a waste of space maybe,
because v1 is still using it. By the time we have v3, if your policy also says that you don't maintain three versions at the same time, you can kill v1 when you get v3 up,
and then you can say: now that v1 is gone, no one is looking at the color column, so we can just drop it and delete everything there. That's the point where you do the housekeeping job on your database. So that was a fast-forward change; something like renaming a column, on the other hand, is not, and
you don't want to do that type of thing in these cases. So you have a concern in your data. This was just an example in terms of relational databases; there are lots of cases to cover here, and there is no single recipe that will fit everyone. It depends on your case.
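As a sketch of that phased change (the table and column names are the ones from the example; everything else, connection details and timing, is hypothetical):

    # While v1 and v2 run side by side: change nothing in the schema.
    # v2 simply stops reading the color column; v1 keeps using it.

    # Only once v1 is retired (say, when v3 ships) do the housekeeping:
    mysql -e "ALTER TABLE cars DROP COLUMN color;" mydb

    # A rename, by contrast, would break whichever version still reads
    # the old name, so it is NOT a fast-forward change.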
So you need to be careful about it, and you need to consider every possible aspect of your system. Maybe you want an integration testing environment somewhere, or end-to-end testing, where you test the old system and the new system against the data structures.
That's a perfectly fine thing to do, I think; a very safe thing to do. So let's jump to websites now. We have websites, right? You hit www.cnn.com and you get a nice shiny-looking page.
But that page has some specific stuff in it, like assets: images, CSS files, JavaScript files. Those are specific to the HTML that references them, most of the time.
So let's see a case. Here we have a CSS file and a JavaScript file, and you can see the links there. This is version 1 of the website.
Now we want to release version 1.1, but we have changed some things internally in the JavaScript file; maybe we changed some of the DOM elements as well. What we want to do in terms of the deployment is push the assets first and then push the web app later, right, if you are putting them on a CDN or somewhere.
But when you do it this way, you have a transition point: you can still be serving version 1 of the HTML page, and that page connects to the CDN expecting to get the same CSS and JavaScript files back.
If you have overwritten them, you will basically break your web application, right? This is something that gets overlooked most of the time, and it's something you definitely don't want to do. So what's better is to version the assets as well.
You put the version in the URL, saying this is v1 of the asset, and so on. Or some people put commit hashes there: you internally maintain the commit hash in the web application, the web application puts it in the URL and connects to the CDN that way. That is also a valid way of doing it. And then when you deploy version 1.1,
you put the version 1.1 assets on the CDN as well. So now you have separate assets living in separate places, and you still maintain the old ones, because some pages can still be looking at them. This is a better way of doing it, and it's very important, I think.
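A sketch of that publish step (the bucket and paths are made up):

    # Publish each release's assets under a versioned prefix instead of
    # overwriting the previous ones.
    version="v1.1"   # taken from your release, e.g. a tag or commit hash
    aws s3 sync ./dist/assets "s3://my-cdn-bucket/$version/"

    # v1.0 HTML keeps referencing /v1.0/site.css; the new HTML references
    # /v1.1/site.css, so neither version's assets get clobbered.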
The other important bit is that you need to be aware of the context in your applications; this could be an API application or a website. What I mean is that a request comes into your server with context, like cookies, right?
And sessions, authentication tokens. When you deploy a new change to your application, it may change the way you parse this information. For example, cookies: you parse the cookie in a different way in the new version, but you still keep receiving old cookies, because some cookies live longer than you expect them to.
And now you can't parse them, you fail miserably, you give errors to the client, and nobody knows what's going on. This is the worst-case scenario, I think, because it's also hard to debug, hard to understand.
So just be careful about those as well. Know your context: know what you are getting from the request and what information you are processing. One other common case is authentication tokens: some people change the way they encrypt them on the server
without considering that the old authentication tokens are still out there. This may not be that much of a problem, because the worst-case scenario is that you just tell people: okay, sorry, you are logged out now, log in again. Or: I'm sorry, your request is not authorized; get a new token and come back later.
That's the worst-case scenario, and maybe in your case you don't want even that. So roll your keys nicely for the authentication tokens as well. And there is one more concern; this is the last one I'm going to talk about.
Again, this is not covering all of the cases; it's just going to give you a starting point, I think. There are lots of cases to cover. One of them is rollback, which we haven't covered because it's a very hard problem to solve; we will get to that later. Okay, let's look at sticky sessions.
In some cases you don't want your client to see two versions of the thing at the same time during the deployment period, right? You want it to always connect to the same server, no matter what; you don't want that transition happening on the client.
In those cases you can employ sticky-session logic in your load balancer. What this means is that you put a cookie or something on your request, saying: this is my session ID.
Your load balancer sees that session ID and directs your request to a server; and from then on, any time the load balancer sees that session ID, you are connected to that same server, no matter what. So in our case, if we do a minor or patch version deployment,
then during the request-draining period we will still be directing those requests to the old servers, because there is a sticky-session binding there that we want to keep. You can have expiry dates on the sessions as well,
and that might become your policy for when requests are drained and the old servers can be killed. This is a helpful concept to keep in mind if you have those types of structures in place.
It is especially important for gaming, I think: you want to connect to the same server every time, maybe so the same people in the same room hit the same server, for some reason. But this depends on your case; it's a helpful concept. Again, there are some helpful tools here, but we haven't covered everything.
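In HAProxy, cookie-based stickiness is a couple of config lines; the backend layout here is hypothetical, but the cookie directives are standard:

    cat >> /etc/haproxy/haproxy.cfg <<'EOF'
    backend api
        balance roundrobin
        cookie SERVERID insert indirect nocache
        server node1 10.0.0.11:5000 check cookie node1
        server node2 10.0.0.12:5000 check cookie node2
    EOF

    # HAProxy sets a SERVERID cookie on the first response; every later
    # request carrying that cookie is pinned to the same server.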
Rollbacks are the one thing that's very important, but they could make a talk of their own. Just be careful doing rollbacks; I think you don't want to roll back your database at all. Again, I told you that I work at Redgate,
and there's a tool there called DLM Automation. I was part of that team (I just joined another team), and we got a lot of requests saying: provide us a rollback script, a rollback mechanism, rollback APIs, so we can roll back our databases during deployments. But that's not actually what you want to do,
because rolling back a database can mean that you are losing data at that point. So just be careful about those. I think we have about four minutes for questions; these are my contact details, and I will also be hanging around here today if you have any questions.
You can just contact me through those. I think we have about four minutes, right? Four minutes? Okay, yeah. I can take some questions if you have any. Any questions? No questions.
Awesome, I think I have done a good job then. Just make sure to push the green button; for anyone who is color blind, the button on the right-hand side is the green one you want to pick. Thank you very much for coming. I hope it was useful.