Reliability of distributed systems
Formal Metadata
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported
DOI: 10.5446/44938
Transcript: English (auto-generated)
00:06
Thanks. My name is Iri Benes, I work at Kiwi.com, and I would like to tell you a few words about this topic. The problem is that over the years our systems are getting more and more complex. We develop more
00:22
complicated systems with higher requirements and more features, and especially in the past years we usually try to break up big monolithic applications into smaller chunks. So the apps have a lot more dependencies and are
00:47
getting more complex in general. And the problem is that anything can break down. Any of your dependencies can fail, the network can go down, a data center can go down, anybody can make a human mistake,
01:03
and the system can just crash. For example, in the past two months you might have noticed that one credit card scheme had a big outage in Europe and a lot of people couldn't pay with credit cards, two weeks later it happened to a different credit card company, and a few weeks after that
01:25
there was a huge storm in Germany that caused an outage of one of the big data centers. These companies have really big budgets and they do their best to stay available, but things can happen and
01:41
they can just break down. So the goal of this talk is to present a few techniques that can help prevent your app from failing in case a dependency fails. We will see some examples in Python, and the other goal is to have visibility over your app, to know what exactly is happening
02:02
inside, not just to deploy some app somewhere without knowing what is happening there. For example, in my previous company we had been using a data center. They were more expensive, but they promised
02:21
100% reliability, so we paid something extra to have something reliable. But what happened: once, suddenly, all of our apps running there just went down. So we called them and asked when they were going to fix it. They asked: fix what? We explained: the servers. They asked: which ones? We told
02:45
them: all of them. They didn't believe us, so we advised them to go to their own website. Not even that was running, and they realized that they had a problem. What happened: they found out that some maintenance worker, some plumber, had actually broken a pipe in the building and the whole data center was
03:04
flooded. So they were repairing the infrastructure for over six days, loading the backups, and they hadn't even known that something had actually happened. They claimed to have 100% reliability and backups. One of the backups was in the same building. So basically you
03:26
can't trust anybody and you should know what is happening with your infrastructure. So we'll go through some techniques that you can apply in Python to make your app more stable. We'll go through some monitoring and
03:43
other things, and later some tips you can aim for when you are deploying the infrastructure. So when you start actually developing the app, you should think about what you are actually using in
04:02
your app: whether you have some third-party APIs, databases, Redis, what servers you use, where they are. And you should realize the importance of your resources, like this database: is it okay if it goes down
04:21
for one minute? What happens if it goes down for one hour? Do they have any SLAs? Can you trust them? Do you have backups in different locations? How do you use these resources, and how often? When you start building, for example, some Django app, it has only one
04:42
database, but over time, over the years, there will be more and more people working on it and suddenly you will have dozens of dependencies, a lot of databases, third-party APIs, and it can easily turn into a mess. I don't think anybody originally designs it
05:02
like this, but somehow it can naturally happen that you end up with a big spaghetti infrastructure. In one of my previous companies, again, there was a period where DevOps found a server about once a week and nobody knew what was running there
05:22
or why we even had it. So once you know what you are actually using, the next thing is to think about how you use it. Let's say we have some service and we have some dependency, which can be
05:41
some REST API. Now we see healthy communication: we handle requests on time, we send a request, receive the response immediately, and everything is smooth. But the problems begin when the dependency starts timing out. You have some delay before you get the
06:02
response, and your requests start piling up and being handled in parallel. So what are the consequences? The users of your service will get slower responses. Another
06:23
thing is that your app will start eating resources. For example, if you have some Python app which is not fully async, you may deplete your application workers, which will be busy waiting and blocking other resources, and it
06:43
may end up that new requests your app receives will just be thrown away because you won't have anything to handle them. So what can you do? Check any communication with other systems and decide how long it is worth
07:03
waiting for the response. Let's say we have, for example, some e-shop, and in the corner you have information about the weather in Edinburgh, which is nice to know, but it's not worth it for the user to wait one minute for the e-shop to load just to see this
07:20
information. Let's say that the weather API is timing out, so we will just cut the request, for example after one second or three seconds, and just not show the user that information. Most of the Python libraries usually have a timeout parameter with which we can set the number of seconds you
07:42
want to actually wait for the response before cutting the request. When you apply it, the communication can look like this: even though the dependency starts piling up requests, you are returning the response on time, but you lose the information because you don't know the response.
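A minimal sketch of this, assuming the weather widget is fetched over HTTP with the requests library (the URL and function name are made up for illustration):

```python
import requests

WEATHER_API_URL = "https://weather.example.com/v1/edinburgh"  # hypothetical endpoint

def get_weather():
    """Fetch the optional weather widget, but never block the page for long."""
    try:
        # timeout=(connect, read): give up after 1 s connecting, 2 s waiting for data
        response = requests.get(WEATHER_API_URL, timeout=(1, 2))
        response.raise_for_status()
        return response.json()
    except requests.RequestException:
        # The widget is optional: render the page without it
        return None
```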
08:04
A few things to be aware of: if you cut off a request that actually manipulates some data, for example storing something in a database, it may be executed anyway. So you may have stored something in the database or modified data
08:23
and you don't know whether you did or not. And the second thing: if you have some Python component, for example one that handles the work with Redis, you need to take care that you set the timeout not just for querying the result from Redis or the database but also for the initialization of
08:43
the communication, because some of the libraries allow you to set the timeout only for the query itself. So even though you set up the timeouts, you might end up breaking your app because the library doesn't let you set it properly.
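A minimal sketch with the redis-py client, which exposes separate timeouts for establishing the connection and for individual commands (the host and key are made up for illustration):

```python
import redis

# socket_connect_timeout bounds the initial TCP connection,
# socket_timeout bounds every subsequent command (the query itself).
client = redis.Redis(
    host="cache.internal.example.com",  # hypothetical host
    socket_connect_timeout=1,
    socket_timeout=2,
)

try:
    value = client.get("weather:edinburgh")
except (redis.exceptions.TimeoutError, redis.exceptions.ConnectionError):
    value = None  # treat the cache as unavailable rather than blocking the worker
```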
09:02
It's a really basic technique that should, in my opinion, be applied to any communication. One more advanced technique is applying circuit breakers. The philosophy behind it is: when you have some dependency that starts failing,
09:22
why should you even communicate with it? For example, you have this e-shop with the weather information: if several users end up with an error loading the weather, you stop displaying it at all and users just won't get the
09:41
information at all. So if you know the dependency is failing, you won't even wait for any timeouts or any seconds, you will just immediately return instant errors, so you won't fire any requests at all. It's also better for the third-party system's recovery. Let's say that the
10:02
other system is failing because it's overloaded: if you actually stop calling it, you may give it time to recover on its own. In Python there are some libraries, usually implemented as a decorator: you just set up the decorator, you specify how many errors in a row
10:23
you will allow for this resource and an interval for how long you will wait until you try to communicate with it again. You apply the decorator to a function; if the function raises errors, say, five times in a row, the decorator will stop calling the function, for 60 seconds in this example.
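A minimal sketch with the pybreaker library, one of the Python implementations (the weather call is the same hypothetical one as above):

```python
import pybreaker
import requests

# Open the breaker after 5 consecutive failures, try again after 60 seconds.
weather_breaker = pybreaker.CircuitBreaker(fail_max=5, reset_timeout=60)

@weather_breaker
def get_weather():
    response = requests.get("https://weather.example.com/v1/edinburgh", timeout=(1, 2))
    response.raise_for_status()
    return response.json()

try:
    weather = get_weather()
except pybreaker.CircuitBreakerError:
    weather = None  # breaker is open: fail instantly, don't even send the request
except requests.RequestException:
    weather = None  # this failure counts towards opening the breaker
```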
10:41
How it works inside: the breaker has three states. One is closed, which means the communication is okay, everything is smooth, and you call the dependency
11:02
normally as usual. The second state is open, which means something is wrong and you stop calling the dependency at all. And half-open is the state when you were open and you try to send out a few
11:22
queries, a few requests, and you see whether it's healthy or not, and based on that you decide whether you keep waiting or close the breaker. So the communication can look like this: you have the first request that gives you a 200 or some success response, so the breaker is closed;
11:43
the second request is still okay, so you keep it closed; on the next request you get some error or timeout or just some error response, but you check that it's only the first error, so you keep the breaker closed; then you receive another
12:01
request, another error, and you see it's already one error too many, so you open the breaker. Then there is this communication window where you stop calling the dependency, and after that you get to the half-open state and
12:21
try to send out some requests; if they are okay, you close it and continue, or you add some time window again. So the communication looks like this: once you start receiving errors or timeouts, you just stop communicating with it. So in general
12:41
you return instant errors, you don't deplete any resources, and you give the system time to recover. There are two Python implementations of this. So far we have been using breakers and timeouts just to cut off some dependency
13:05
that is misbehaving, but in case you have something really critical to perform, some action, for example in an e-shop if you are handling the payment or something, in case the other system, for example the payment gateway, starts failing,
13:23
you may really want the response, so you duplicate the request again and again till you get the response. From this perspective, handling your requests will take longer because you are internally calling the service more times, but you will get
13:41
your response. Again, implementations: it can be done as a decorator, you just set up how many times you want to repeat the request (it shouldn't be infinite), and you can specify how long to wait between
14:02
the requests, and you can probably also set up some jitter. Applied to a function: if the function throws an exception, the function will be called again and again, however you configure it.
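A minimal sketch with the tenacity library, one common way to express this as a decorator (the payment-gateway call is hypothetical):

```python
import requests
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),                    # never retry forever
    wait=wait_exponential(multiplier=1, max=10),   # back off between attempts (jitter is also available)
    reraise=True,
)
def capture_payment(order_id):
    # Hypothetical payment-gateway call that we really want to succeed.
    response = requests.post(
        "https://payments.example.com/v1/capture",
        json={"order_id": order_id},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()
```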
14:25
But there are also problems with it. The first thing is, if you actually call some API more times, you might end up changing data more times: you can store some object in the database more times if the request actually got executed internally. And also a bigger problem is when you
14:42
start piling up repeaters, either in your application internally on several logical levels or through some other dependency, you may end up smashing some resource. Here we have
15:00
one request to our service, but we will make three requests to some dependency, that dependency will again make three requests to another dependency, and in general from one request you may end up with nine requests. So this is really dangerous in case the
15:21
final dependency starts failing because, for example, it's overloaded, and with this mechanism you can totally smash it. How to prevent it? For example, your API or the dependencies could return specific error codes which tell you: okay, I'm overloaded,
15:43
don't repeat the request at all, it won't help; or: okay, I have some different kind of error, you can repeat the request and we will see how it goes. You can also set up some budgets: let's say that you allow
16:01
some dependency 10 repeated calls per minute, so if you repeat it 10 times in the first 10 seconds, you just stop repeating and give it more time to recover. Or you can set up some idempotency mechanism: for example, you add to
16:21
your request a header with information which tells the dependency that it's a duplicated request, and the app can handle it properly. It can help, for example, when you are manipulating data: the dependency will see that you are sending the same request that has already
16:40
been handled somehow, so it will just return a response saying it's already resolved.
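A minimal sketch of the client side of such a mechanism (the header name and endpoint are illustrative; the dependency has to honor the key for this to work):

```python
import uuid
import requests

def place_order(order, idempotency_key=None):
    # Reuse the same key for every retry of the same logical request,
    # so the dependency can detect and deduplicate repeats.
    key = idempotency_key or str(uuid.uuid4())
    response = requests.post(
        "https://orders.example.com/v1/orders",   # hypothetical endpoint
        json=order,
        headers={"Idempotency-Key": key},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()
```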
17:00
And a quick thing: if you don't have to do something synchronously, for example, again, you are receiving orders and you want to send some confirmation email or something, you can just put the request on some queue and schedule some task which will be handled asynchronously, and the system will be faster and smoother because it doesn't have to perform the action synchronously.
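A minimal sketch with Celery, one common task queue (broker URL and helpers are illustrative):

```python
from celery import Celery

app = Celery("shop", broker="redis://localhost:6379/0")  # hypothetical broker URL

def deliver_email(order_id):
    """Hypothetical helper that talks to the mail service."""

@app.task(bind=True, max_retries=3, default_retry_delay=30)
def send_confirmation_email(self, order_id):
    try:
        deliver_email(order_id)
    except Exception as exc:
        # If the mail service is down, let the queue retry the task later.
        raise self.retry(exc=exc)

# In the request handler, enqueue and return to the user immediately:
# send_confirmation_email.delay(order_id=42)
```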
17:21
Also, what is important: on the other hand, don't over-complicate the system. If you apply several of these mechanisms on several layers, you may end up with some layer swallowing part of the traceback, or some exception may not bubble up for some reason, so you may have hidden
17:43
errors or partial information, or you can receive the same error more times if you don't handle it properly. Okay, and one more technique: in our company we have developed a
18:01
system of diagnostics, or repairs as we call it. It's for the case where you have some communication that you actually want to do synchronously because you need the information, but if something happens, it's okay-ish to do it asynchronously after a few minutes.
18:22
So we have a system that periodically checks the system for inconsistencies and, based on that, automatically fixes them. For example, you are calling some dependency synchronously and it starts timing out, so you can have some side job that will actually poll
18:42
the dependency for the results of those actions and handle the information properly in the system. It also helped us a few times with some buggy releases: we released some bug, but the system discovered the inconsistencies and automatically fixed them,
19:02
so we actually saw the errors, we could handle them, and there was no damage done because the system auto-healed. Yeah. So when we know what dependencies we have and how to handle them, it's also really
19:22
important to monitor them. I was once working in a company where we had developed some e-shop, and somebody contacted us saying that we were selling something really cheap. At first we were happy that
19:41
yeah, we are beating our competition, but then we dug in and found we were selling it too cheap. We found out that there was some service fetching the currency rates from a bank, but it had broken at some point and the currency rates had not been
20:02
updated for a few weeks. There was a bad political situation in one country and the currency rates changed a bit, so we were selling cheaper, under price, in some market. It was a little bit painful at the time.
20:24
So it's important to know what is actually happening in your system. Let's say you have a system that has a dozen dependencies, something crashes, and you have to go to the system and see what is happening, what exactly went down. So you can monitor all
20:43
of these resources: measure response times, the errors they return, connection counts, throughput, reads and writes, SQL queries. You can go deeper and debug slow queries in PostgreSQL; if you don't know the EXPLAIN
21:02
command, you should definitely check it out. It gives you the details about the execution plans of your SQL queries; you may find out that something is really inefficient and under a bit of heavy load it can actually smash your database, deplete the CPU, or whatever.
21:25
You can install in your Python app some APM which, for very little work, will give you a really nice overview of what is happening in your app. You have a
21:42
dashboard or some alerting by default about the calls your app is making to the database and so on, and it can give you great insight into your application. You will usually be surprised at what is
22:00
happening inside and how you can debug it. The next level of monitoring is to define some ping endpoint, also a really simple thing. You use some third-party service that periodically checks whether your application is alive, so it's good that it's not
22:22
part of your infrastructure; it's something the guys from the first story could have used. And when you are doing the ping endpoint, you should also define what it even means for your application to be healthy. The ping endpoint could
22:40
query the database, Redis, or check some other dependencies to tell that your application is actually alive, not just return some dummy response.
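A minimal sketch of such an endpoint, assuming a Flask web app with a PostgreSQL database and a Redis cache (connection strings are illustrative):

```python
import redis
from flask import Flask, jsonify
from sqlalchemy import create_engine, text

app = Flask(__name__)
engine = create_engine("postgresql://localhost/shop")            # hypothetical database
redis_client = redis.Redis(host="localhost", socket_timeout=1)   # hypothetical cache

@app.route("/ping")
def ping():
    """Report healthy only if the critical dependencies answer."""
    checks = {}
    try:
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        checks["database"] = "ok"
    except Exception:
        checks["database"] = "error"
    try:
        redis_client.ping()
        checks["redis"] = "ok"
    except Exception:
        checks["redis"] = "error"

    healthy = all(v == "ok" for v in checks.values())
    return jsonify(checks), 200 if healthy else 503
```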
23:00
The next level of monitoring: let's say that your application is okay, but you can go a little bit further. For example, you may monitor the functionality, not just that your API call is healthy but that something that was supposed to happen actually happened. For example, if you process an order, you send some statistic: okay, somebody purchased a TV
23:23
in Germany, and you send the statistics somewhere. It will not tell you the reason in case we stop selling televisions, what exactly happened, but you will have some impulse, something that prompts you to check for issues.
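A minimal sketch of emitting such a functional metric, here with the statsd client as one example backend (host and metric names are illustrative):

```python
import statsd

metrics = statsd.StatsClient("metrics.internal.example.com", 8125)  # hypothetical host

def record_purchase(product: str, country: str) -> None:
    # Count what the business actually did, per product and country, so an anomaly
    # ("we stopped selling TVs in Germany") shows up even when every API looks green.
    metrics.incr(f"orders.completed.{product}.{country}")

# e.g. after an order is successfully processed:
# record_purchase("tv", "de")
```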
23:41
Let's say your application is 100 percent okay, but some firewall or something can block the traffic, or you can see that some data center just went down and you were cut off from the traffic or something. Oh yeah, what happened, for
24:06
example, in our company we are selling flight tickets, and one evening we had this monitoring and we were alerted
24:20
that we were selling maybe too many tickets. We investigated it for hours, we couldn't find anything, we were selling more and more and we couldn't see the reason. But what actually happened: one bank, I think it was in Indonesia,
24:40
was actually charging the customer only one percent of the price. We had correct information about pricing, but somewhere the bank had some buggy release and actually started charging the customer less, so everyone started buying this product
25:03
and we couldn't find the reason in time. But if we hadn't had this monitoring, the damage could have been much bigger, because the bank eventually charged the customers and we handled the refund process.
25:22
Also, when you have a set of these metrics (we currently have hundreds of such metrics in the company), you can build on top of them; we have a team of analysts who are building some apps on top of it for detecting anomalies, which can
25:42
give you some impulse or insight that something is actually going wrong. And if none of these catches it, there is usually someone who will report the error to you. So when you have every
26:01
dependency checked and you send information about it, if you have a monitoring setup, the next step is actually to set up proper alerting. It's nice to know that somewhere you have information that this is timing out, but you should be alerted so you can actually
26:24
take action. So if you have any monitors, you can set up proper alerting, set up responsible people for the alerts, and for each of those pick
26:41
the appropriate channel for the alert. If it's something really important, set up a pager or a phone call, and also escalation policies in case somebody is not able to respond to the alert. Also do the really basic
27:03
stuff: probably all of you know Rollbar or Sentry or some error reporting tool that will basically just wrap your Python application and report the errors to some system, and based on that you can be alerted again. Yeah.
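A minimal sketch of wiring up such a tool, here with the sentry-sdk client (the DSN is a placeholder you get from your own Sentry project):

```python
import sentry_sdk

# One call at startup is usually enough: unhandled exceptions in the app
# (and in supported frameworks like Django, Flask, or Celery via their integrations)
# are reported to Sentry, which can then alert the responsible people.
sentry_sdk.init(dsn="https://examplePublicKey@o0.ingest.sentry.io/0")
```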
27:24
So when you have monitoring and proper alerting, the next step is to check whether you have proper logging. Let's say you are alerted; you should then always be able to get the
27:41
details of what is actually happening in your app. If you are handling communication with the resources we already went through above, you should think: okay, what happens if the communication with this API fails, will I have some information
28:00
to get the details instantly or not? So everything should be logged properly where it makes sense. In one of the companies we were building, for example, a CRM, and one of the colleagues was working on the core, and there was
28:22
a really core database table, and he was using as the primary key the count of the entries plus one, so he didn't use a sequence or auto-increment. It was working well until
28:41
somebody removed a user from the database and a cascade delete removed a bunch of records, and there was a huge mess in the database. So the first thing we wanted to do when we noticed was to restore a backup of the database. We couldn't do that because our
29:02
server provider had had some outage and they hadn't stored our backups for a week or two. So we spent several days parsing syslogs and reconstructing the database from them, which was really hellish to do; we didn't have good enough logging
29:20
to do it a better way, but at least we had something, because otherwise it would have been really screwed up. Yeah, so proper logging will make your debugging much easier. You may consider storing important info in the database
29:41
to get better statistics over it, to get more into detail, and also consider adding a request ID to your logs so you can group together logs from different layers of the logging.
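A minimal sketch of attaching a request ID to every log line, using the standard logging module and a context variable (how the ID is propagated, e.g. via an X-Request-ID header, is up to the application):

```python
import logging
import uuid
from contextvars import ContextVar

request_id: ContextVar[str] = ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id.get()
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(asctime)s %(request_id)s %(levelname)s %(message)s"))
handler.addFilter(RequestIdFilter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

# At the start of each incoming request (or reuse the ID from an X-Request-ID header):
request_id.set(str(uuid.uuid4()))
logging.info("calling payment gateway")  # every line now carries the request ID
```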
30:01
So the next step: if you have everything handled properly in your Python app and your infrastructure, you need to put it somewhere. It's usually about the price, or laziness, and the features provided. You can go
30:22
with your own servers, you can use some hosting or cloud services, and recently some serverless solutions are becoming more popular. For example, on AWS Lambda you can deploy a hello-world app just by writing
30:41
these two lines and defining the API endpoint in the GUI, and everything is handled for you: auto-scaling, servers, basically everything you need, with just two lines of code, which is pretty convenient for some smaller services.
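The two lines are presumably a plain Lambda handler along these lines (a sketch; the function name and the API Gateway wiring are configured in the AWS console):

```python
def handler(event, context):
    return {"statusCode": 200, "body": "Hello, world!"}
```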
31:01
Okay, and also consider auto-scaling: what happens when the load of your application grows a little bit, your CPU or something is getting depleted? Usually in cloud services you can set an auto-scaling policy to boot up more servers or containers.
31:22
Also consider: if one of the servers goes down, what will happen? You can auto-restore them, boot up some other servers, for example in a different data center. A few things that you can do to
31:41
help the reliability of your app you can also do on the web server: for example, set up a repeater at the nginx level, so if your request fails it will be automatically retried and you don't have to touch anything in your app; set up caching to help with the load.
32:01
You can put a web application firewall in front, which gives you more features that can make your app a little bit more stable, so you don't have to take care of some things at the app level. Also, when you're developing, you may consider separate environments for development,
32:21
staging, production, testing. You also have more layers of testing you can put in your continuous integration: smoke tests, integration tests, unit tests. You can try to send some part of the traffic to
32:44
some instance that is basically a release candidate, so in case you are not 100% sure that the new release will be okay, you can test it partially on some part of the traffic. You can apply performance tests.
33:03
Also, when it comes to monitoring, so far we've been talking just about the application itself, but you also need to monitor the infrastructure itself. Yeah, what else can you do with
33:21
monitoring: for example, set up monitors on top of nginx, so you can be alerted when you receive a bunch of non-200 responses in a row. You have logs on the web server, so you can again define a request ID propagated to your app to join all the
33:42
logs together across the different levels of logging. At runtime, what bigger companies actually do is simulate outages: you can turn off some data center or simulate that some
34:01
dependency is down and see how your app actually behaves. And if, whatever you do for prevention, your app actually has some outage,
34:20
it's usually good practice to write a post-mortem after everything is resolved, to actually get some record of what happened. So you can learn from your failures, you will think deeply about what actually happened, you can apply the
34:43
fixes for the current outage to other parts of the system, because they can be vulnerable to the same issue, and the key is to keep track and get to the real cause of the problems. So
35:08
we went through some techniques to prevent failures that can be applied at different levels of your application, mainly what you can do in Python code,
35:22
like timeouts, breakers, repeaters. And yeah, we went through monitoring: you should know what you are actually using, and monitor your servers and application from outside, from
35:40
different levels of your application; you can use APMs and some custom functional monitoring and play with it more deeply. We have a lot of different ways to test that your application is actually stable. You should think about
36:00
architecture; there is a lot you can do about reliability at different levels of your stack. Also, which is really important, do proper logging; again, you can apply it on different layers and
36:22
join them together to have a really good overview of what is happening inside, not just randomly poking your application when there is some outage. And based on that, also proper alerting: think of all the use cases
36:41
for the alerts you set up; know in advance what you will actually do with an alert you receive and how to reach somebody who can handle it. And I think it went a little bit faster than I expected, so if you have some
37:02
questions... Okay, so anyone having questions, please come to the microphones. By the way, one more important thing: our company is
37:21
hosting a party tomorrow in a pub nearby, so if you want to get free drinks or hang out with people, you can stop by the Kiwi.com booth and get the details. Just a second. Hi,
37:46
thanks for the talk. How do you know that your monitoring system works? You monitor it, I suppose? Oh yeah, usually we have several different
38:02
levels of monitoring, so when we think about monitoring the application, all of them would have to go down at the same time, and the probability of that is really
38:21
low, because it's usually, for example, some of our self-hosted services plus different third parties with different data centers or whatsoever. Yeah.
38:42
Yeah, it's also about the consequences of your app not working, so it's several different levels. You mentioned serverless
39:00
there. Do you have any kind of metric, for example on AWS, of how many servers would be needed to get the same thing running in comparison to, for example, Lambda on serverless? Sorry, I don't have
39:22
that; I just recently started playing with it, so I don't have any deep insights to help you with this, but if you want we can talk after the talk and I can ask some colleagues who have deep
39:40
experience with it. Cool, thanks. Well, before doing Python I worked 28 years in a bank, in IT. When you have a real incident, I see
40:02
that, for the communication channels, when you have a real incident you really want to have a human in charge. So I would suggest that as long as you have not talked to a human, you're not sure that the alert has been taken into account, especially when you're escalating. Oh, so, well,
40:21
my experience is that if there's no phone call, you're not sure that somebody is on it. Yeah, usually it probably depends on how important it is and what response time you need. If it's something critical, definitely go with a phone call; sometimes it's okay just to
40:42
alert on some secondary system, and it's okay if it's handled in 30 minutes because it's not becoming critical yet. But I agree: if it's critical, just go with the phone, alert anybody you can, and do your best to reach the appropriate people. Hi, I've got a
41:03
question about mapping out dependencies, because documentation is all great, but do you have any automated tools that allow you to make a graph of dependencies with some calculations and so on? Oh yeah, it's a little bit tricky. Currently I think some of our teams are
41:22
working with CloudFormation and stuff like this to get all of this together in one place, and I think that there is some solution being developed that could potentially be open source, but it's definitely not at the stage where
41:40
we could provide anything, and I don't know about anything else that could help with this. It's a bit tricky, often, if you have services on different hostings. Thank you very much,
42:01
everyone