Amanda: A New Generation of Distributed Services Framework
Formal Metadata
Title: Amanda: A New Generation of Distributed Services Framework
Title of Series: EuroPython 2014
Part Number: 70
Number of Parts: 119
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/19987 (DOI)
Production Place: Berlin
Transcript: English (auto-generated)
00:15
So yeah, good morning everybody. I'll be talking about Amanda, our distributed services platform.
00:21
I won't be showing as many pretty pictures; I just hope that the talk can live up to the amazing work everybody else has been doing. A couple of things about myself first. I've been a software developer at MPC since about 2010. Been working with Python since 2009. Love services and everything that's plug-in based. Slightly obsessed by monitoring
00:41
after various phone calls at three in the morning. And I had the great opportunity to actually hold an Oscar, for Life of Pi; that was a great experience. So I'm part of the infrastructure team at MPC. I've been working there since, like I said, 2010. And we create visual effects for advertising and feature films. These are a couple of the movies
01:00
that we have been working on recently. And we actually do this across eight sites with what we call a fully integrated cross-site pipeline, which makes sure that our data flows from one site to the other, depending on what the departments are and where they work. So I guess not everybody here might be specifically familiar with what visual effects are, so this is a quick quote from Wikipedia.
01:23
It pretty much comes down to everything that is either expensive, dangerous, or would hurt an actor during filming, so we're trying to avoid it. But I guess a couple of actual images of the work that we do are probably gonna be a bit better. So this is a shot from World War Z as we got it in from the clients. And this is the actual work that we did to it.
01:41
So everything that you see there in the background is absolutely fake. Same thing here: this is a shot that we got in from Godzilla, one of our latest movies. Same thing here, this is what we got in, and this is the actual work that was done to it. If you look closely, you can even see that the guys in the tank got replaced with CG characters. That's how far we push things these days.
02:00
So to do this, we work, of course, with a lot of assets, where an asset is something like a creature or a texture or whatever else that we need that is fake. And we actually make sure that it flows through the whole system. To do that, of course, the artist first does a bit of his magic. Once that is done, he creates what we call a daily, which is a short movie to actually show the work that he has been doing
02:20
and that can then be reviewed by the supervisors. Once that is done, he can approve the asset and he can, of course, add some comments and things from there. Once it's approved, we actually go through a releasing stage where a lot of things happen. We actually create directories where we store our data. We add in some actual metadata about the assets as well and make sure that everything flows
02:41
into the next department. So here, for example, we've got our modeling team, which, for example, creates an actual character and then some textures. And while we release, we actually make sure that we update all of the dependencies. We make sure that we notify all of the different artists that new things have been released. And we actually make sure as well that we sync any data that we have
03:00
to all of the different sites. So of course, we have to keep a couple of things in mind doing this. There's not one artist working. There's about 1,600 working. And we release thousands of versions of assets a day. So we have to keep that in mind, but also an ever-changing schedule. So one day, it might actually be quiet and the next day, we might have
03:20
a completely different schedule with a trailer that needs to be delivered in a couple of weeks. It also means that we have a whole lot of different sources that we use, coming from a database, third-party APIs, storage, a whole lot of different locations. And they're used by in-house tools that we have been writing and by third-party applications that artists tend to work with. And of course, we work with multiple environments.
03:41
So we don't work with one single environment. We've got a whole range of different environments, which means that for every single show, we can have a subset of tools that they use with a specific version where a different show might be using completely different ones. Other things to keep in mind are users themselves. The artists themselves want something that's quick
04:00
and easy to use, something that's consistent. They don't want to have to worry, oh, I'm using this API, so I'm gonna have to use this way of doing it, or I'm using that API, I'm gonna have to use that way of doing it. We do also have to keep in mind that these artists are not necessarily trained developers, but they do write code. They hack around quite a bit, and we need to make sure that we can present them data in a safe way for us, in a safe way for them.
04:22
So we want to expose only certain parts of our data to them in a nice and consistent way. Similar for developers: we have developers of any level coming in. Some of them are trained more on the visual effects side of things. Others are trained in asset management. But they're not necessarily trained in anything that's distributed or, you know,
04:40
scaling across eight different sites around the world. So to do this, we developed a service-based architecture called Amanda. We provide that as a platform, as a service, to all of our different artists and developers. And it's a multi-protocol setup with multiple transports and multiple concurrency models. I'll be going into every single bit throughout the different slides, but this is just a small introduction to what it is.
05:01
And we try and provide an ecosystem where developers of any level can write a service. So anybody that comes in on the first day should be able to write a service during that day and get it into production by the end of the day. So we're currently running our second generation, which was written in 2012 and went live in 2013. And it replaced our first generation, which was a push model, and that caused a lot of problems.
05:23
So as soon as a request would come in, it would actually start scaling with extra threads and start running and running and running, and there was no way for us to limit that in any sort of way. So we have now moved to a queue-based model, which just allows us to limit things a bit more nicely and actually make sure that we have a specific flow and can control that flow in a much, much nicer way.
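A minimal sketch of why the queue-based model is easier to control than the push model; nothing here is Amanda's actual code, just the general pattern:

```python
import queue
import threading

# A bounded queue gives back-pressure instead of unbounded thread growth.
requests = queue.Queue(maxsize=1000)

def worker():
    while True:
        job = requests.get()
        try:
            job()  # handle one request at a time
        finally:
            requests.task_done()

# Concurrency is capped by how many workers we choose to start,
# unlike a push model that spawns threads per incoming request.
for _ in range(8):
    threading.Thread(target=worker, daemon=True).start()
```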
05:42
So just some stats. Godzilla, which is one of the latest movies, like I said before. We have a render farm which has thousands of CPUs, but if Godzilla had been rendered, which means creating those final images, on one single machine, it would have taken 444 years to actually render, which I guess is a fair amount of time. And we've got 650 terabytes worth of data
06:03
that went through the system as well. And that generated, during our peak times, about 250,000 Amanda requests a minute, which is 120 million requests in eight hours. And for those of you, I guess since we're in Germany, most of you have seen the Brazil-Germany game. It's about four times the amount of tweets
06:21
that were about that specific game. And congrats, Germany, on winning, by the way. So I'm now gonna step into how we have actually been setting up the whole system from the ground up. I'm gonna be starting with the actual service. And the way we have done that is that a service is nothing but a class. So we're gonna make here a make-movie service.
06:42
We've got 20 minutes to make a movie, which is probably gonna be a bit short, but let's try anyway. So we're gonna start with greeting the director because we need to get some work in, which is your typical hello world scenario. And the important bit here is that it's a class. It's absolutely standalone. It's completely testable. And you don't depend on any of the tools
07:01
or any of the scaling features of Amanda, which is very important for us because we don't want people to have to worry about any of these things. We have these little decorators here called @public. We also have an @protected and an @private, which allow us to actually expose what methods are available throughout the system for other people to use.
07:20
So @public would mean that an artist and a developer can use it from outside. @protected would mean that you can mainly call it from a different service.
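A minimal sketch of what such a service class might look like. The @public and @protected decorators are named in the talk; everything else here (names, signatures, the marker mechanism) is assumed:

```python
def public(method):
    # Marks a method as callable from outside, by artists and developers.
    method.visibility = "public"
    return method

def protected(method):
    # Marks a method as mainly callable from other services.
    method.visibility = "protected"
    return method

class MakeMovieService:
    """A plain class: standalone, testable, no Amanda imports required."""

    @public
    def greet_director(self, name):
        return "Hello, %s! Let's make a movie." % name
```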
07:42
So cool, we have a service now, but it's not actually doing anything useful, and it's definitely not gonna help us get the kind of ratings that we've been having on Rotten Tomatoes. So let's actually make it do something. To do that, we provide what we call inter-service calls, which actually allow us to call different services. And the way we do this is by declaring a dependency inside that class. So I can say I have a dependency on the storage service. And here I'm using the storage service to actually check if the data is on disk, and I can do that with self.storage, check if it exists, and pass in the parameters that it needs.
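A sketch of the inter-service call idea, reusing the @public marker from the previous sketch; the dependency declaration syntax is guessed from the description:

```python
def public(method):
    method.visibility = "public"
    return method

class MakeMovieService:
    # Hypothetical dependency declaration; the framework would inject
    # self.storage (a real service locally, a proxy when distributed).
    dependencies = ["storage"]

    @public
    def is_on_disk(self, path):
        return self.storage.exists(path)
```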
08:02
At that point, of course, we also need some information about our database itself. Oh, from my show itself, sorry. And we can do that with what we call infrastructures. An infrastructure is really a way for us to formalize our access to the backend, such as databases, logging, configuration, sessions, any of those things.
08:20
And in here, you see the _db, which is an actual infrastructure. And it just provides the users with a nice, clean, and consistent way to actually access databases. Infrastructures are in themselves services, but they're stateful services, so that we can do things like pooling and caching and those kinds of things. And they are local to the service.
08:41
So these services are actually not spread across the system. They are inside that same Python module, which allows us to do the pooling, of course. And the really, really nice bit about this, and I'll be hammering on this quite a bit throughout the whole talk, is that we can swap any of those services with other services. So it means that, for example, in this case, here at the bottom, you've got our config, where we're getting something out of the configuration.
09:03
In a development environment, this could be a dictionary. In production, this could be an XML file, a YAML file, whatever file. And we can swap that in and out with different services without, once again, the actual developer of the service having to change anything in his code.
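A sketch of that swap with illustrative names: the service only ever sees one config interface, and which backend it gets is a deployment decision:

```python
import json

class DictConfig:
    """Development: configuration backed by a plain dictionary."""
    def __init__(self, values):
        self._values = values
    def get(self, key):
        return self._values[key]

class FileConfig:
    """Production: the same interface, backed by a file on disk."""
    def __init__(self, path):
        with open(path) as handle:
            self._values = json.load(handle)
    def get(self, key):
        return self._values[key]

# Swapping one for the other changes nothing in the service's code:
config = DictConfig({"show": "godzilla"})
print(config.get("show"))
```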
09:22
So now we've got something that does something, but it's not very useful in any sort of way. It's not scaling. It's still local on one person's machine. And we also don't have that bit where we can actually provide a consistent interface. But we did create all of the abstractions that we need, so that we can change any of the parts that we already have with other parts that we might wanna use in the future. Let me introduce you to the service provider. And this is how you actually create
09:41
one of these service providers. And this actually allows us to get that consistent interface. It hosts the services for us. And at the bottom, you can see, you know, we create a make-movie service here and our storage service, we pass them in, and we can then call them with services.make_movie, you know, make my movie magic happen, or services.logging and actually change the logging level.
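Roughly what that registration and call pattern could look like; this is a hypothetical API reconstructed from the description, not Amanda's real one:

```python
class GreeterService:
    """Stand-in service for the sketch."""
    def greet_director(self, name):
        return "Hello, %s!" % name

class ServiceProvider:
    """Hosts services and exposes them behind one consistent interface."""
    def __init__(self):
        self._services = {}

    def register(self, name, service):
        self._services[name] = service

    def __getattr__(self, name):
        try:
            return self._services[name]
        except KeyError:
            raise AttributeError(name)

services = ServiceProvider()
services.register("make_movie", GreeterService())
print(services.make_movie.greet_director("Gareth"))
```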
10:00
So that's the kind of thing that we actually allow them to do. But we're still not able to scale in any sort of way. And we came up with the idea of proxies. Proxies are stand-in services for the requested service. So they pretend they are the service that you want, but they're not really the service that you want. And underneath the hood, they just stick your data into a queue,
10:21
and that queue can pass it on to whatever. We also tend to call these queues transports. And once again, they're completely transparent to the user. The user doesn't have to care. The service developer doesn't have to care where his data is coming from. So queues or transports, they allow us to abstract away technologies like RabbitMQ, ZeroMQ, UDP, any of those things.
10:42
We can abstract them all away. And it allows us to transparently swap out things like adapters. So if one day I wanna use RabbitMQ, and the next day I wanna use py-amqp, I can, without, once again, having to change my service, my service provider, or anything else. I just have to swap in the transport bit, which is configuration.
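A sketch of a proxy in front of a swappable transport. The in-memory queue stands in for RabbitMQ, ZeroMQ, UDP or anything else; all names are illustrative:

```python
import queue

class InMemoryTransport:
    """Development stand-in; the interface is what matters."""
    def __init__(self):
        self._queue = queue.Queue()
    def send(self, message):
        self._queue.put(message)
    def receive(self):
        return self._queue.get()

class ServiceProxy:
    """Pretends to be the requested service; calls become queued messages."""
    def __init__(self, service_name, transport):
        self._service_name = service_name
        self._transport = transport
    def __getattr__(self, method):
        def call(*args, **kwargs):
            self._transport.send({
                "service": self._service_name,
                "method": method,
                "args": args,
                "kwargs": kwargs,
            })
        return call

transport = InMemoryTransport()  # swap for a RabbitMQ adapter via config
storage = ServiceProxy("storage", transport)
storage.exists("/jobs/godzilla/tank_shot")  # ends up on the queue
```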
11:02
So at this point, we can scale a bit, but it's still gonna be expensive to run 250,000 requests simultaneously, because we need a whole lot of these services running. So of course you wanna do some parallel processing and some concurrency kind of things. Service developers, for us, shouldn't have to worry about how they're doing concurrency and how that works.
11:21
They do need to know if something is gonna be CPU intensive or IO intensive; that is something that we do want them to think about. But we don't want them to think about, ooh, I'm gonna have to pull a thread there and do this that way and that way. We think that we should accommodate for both, because some tasks can be CPU intensive and other tasks can be IO intensive, and we don't want them to worry about that. We wanna be able to use threading in one way
11:41
and greenlets in another way, or multiprocessing, even if we wanted to. So, so far we have been building this little block here, which we have been seeing. And what we did is actually stick a worker pool in front of it. The worker pool provides a simple interface across various concurrency models. And the pool is fed with requests from internal queues
12:00
that are filled by consuming from our queues in there. Once again, workers can be changed, can be extended, and they can actually be chained, just like you would do with middleware. So you can just build a whole nice setup here; at this point we've got a nice little building block that we can reuse everywhere.
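A sketch of the worker pool plus middleware-style chaining, with assumed names; the thread pool is just one interchangeable concurrency strategy:

```python
import time
from concurrent.futures import ThreadPoolExecutor  # processes would also fit

class ExecuteWorker:
    """Innermost worker: actually runs the request against its service."""
    def __call__(self, request):
        service, method = request["service"], request["method"]
        return getattr(service, method)(*request.get("args", ()))

class TimingWorker:
    """Wraps another worker, middleware-style, to time each request."""
    def __init__(self, inner):
        self._inner = inner
    def __call__(self, request):
        start = time.time()
        try:
            return self._inner(request)
        finally:
            print("request took %.3fs" % (time.time() - start))

class Echo:
    def ping(self):
        return "pong"

pool = ThreadPoolExecutor(max_workers=4)
worker = TimingWorker(ExecuteWorker())
print(pool.submit(worker, {"service": Echo(), "method": "ping"}).result())
```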
12:22
So at this point, we really have something: we have all of the building blocks that we need to start building a slightly larger system. And the nice thing about it is that we can actually start chaining these blocks together. And that's what we did in production. So in production, we have a cross-language pipeline, as in, you know, we don't just have Python. We've got 95% Python at MPC for most of our tools.
12:42
But of course, we need some C++ for anything that is really, really heavy; at that point, you might just wanna use C++ for any graphics. We have some JavaScript lying around for some of the web tools, we have Lua, we have a whole bunch of other ones, and we actually wanna be able to present all of the
13:02
data that Amanda has, and all of the services, to all of these different languages in a nice and consistent way. So what we did is, our first worker pool, we replaced it with uWSGI and Flask. Nice and lightweight and simple. Just a little zoom-in so you can actually see what changed. And that allows us to actually use HTTP
13:21
quite effectively. It allows for simple clients in every single language. I mean, any language these days should be able to make an HTTP call. And it's a nice, simple client that people can use, and people don't have to worry, ooh, I'm gonna have to do threading to use this transport or that transport. We take care of that and we take that away. It does limit us to native types,
13:41
because our HTTP transports transport either JSON or XML, since JSON and XML are pretty much available across all of those languages as well. So we need to start extending the encoders and decoders to actually start dealing with those issues. So our front end here is a uWSGI and Flask worker, and actually we don't really do any work in Flask
14:00
except for session handling, which is itself an actual service. The rest is just being proxied across to RabbitMQ, where RabbitMQ takes care of the distribution across all of the different services that we might have running. So at this point we've got a system that can be distributed and that is available to all of these different languages.
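A toy version of that front end: a Flask app, which would run under uWSGI, that does no real work itself and only forwards requests onto a queue. The route shape and message format are invented for the sketch:

```python
import queue
from flask import Flask, jsonify, request

app = Flask(__name__)
outgoing = queue.Queue()  # stand-in for RabbitMQ

@app.route("/services/<service>/<method>", methods=["POST"])
def dispatch(service, method):
    # Sticking to JSON keeps us to native types, as described in the talk.
    outgoing.put({
        "service": service,
        "method": method,
        "payload": request.get_json(),
    })
    return jsonify({"status": "queued"})

if __name__ == "__main__":
    app.run()
```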
14:20
Of course we wanna make sure that it's fault tolerant as well. So what we did is we run two instances of those uWSGI and Flask workers and we stick NGINX in front of them to do load balancing and failover. Nice and easy. And we actually run a non-clustered RabbitMQ setup. So rather than actually clustering RabbitMQ, for those who are familiar with RabbitMQ, we run multiple instances of RabbitMQ. And what that gives us is that we can actually
14:44
use our proxies to consume from multiple queues and transports at the same time. So like I said before, we can swap any of these transports with a different transport, and we can go as far as running RabbitMQ and another RabbitMQ, but we can also run RabbitMQ, ZeroMQ, and Redis at the same time, and we can start consuming with one single proxy
15:00
from all of these different transports at the same time. So if in the future something nicer comes along, or something better comes along, or a whole set of changes, we don't have to rewrite the services. We don't have to rewrite anything else. We can just swap all of these bits in and out.
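A sketch of one consumer draining several interchangeable transports at once, with in-memory queues standing in for RabbitMQ, ZeroMQ and Redis:

```python
import queue

class InMemoryTransport:
    def __init__(self, name):
        self.name = name
        self._queue = queue.Queue()
    def send(self, message):
        self._queue.put(message)
    def receive_nowait(self):
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            return None

def consume(transports):
    """One proxy consuming from several transports at the same time."""
    for transport in transports:
        message = transport.receive_nowait()
        if message is not None:
            yield transport.name, message

transports = [InMemoryTransport("rabbitmq-a"), InMemoryTransport("zeromq")]
transports[0].send({"method": "exists"})
for name, message in consume(transports):
    print(name, message)
```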
15:24
So at that point, with all that going, the last bit that is left is monitoring, which I'm quite keen on and which is something that needs to be done. So what we did is we assign an actual ID to every single request. As soon as it comes in, we make sure that it has an ID, and the ID is being followed throughout the system. So if I go from service A to service B to service C in Vancouver, and it blows up in Vancouver, I have a trace that it blew up in Vancouver, because every single request is logged and I can actually start searching
15:41
on those request IDs throughout the system and find the whole trace of all the different requests.
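A sketch of the request-ID idea: assigned once at the edge, copied on every hop, logged everywhere; the field names are assumptions:

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO)

def new_request(payload):
    # Assigned once, as soon as the request enters the system.
    return {"request_id": str(uuid.uuid4()), "payload": payload}

def handle(message, service_name):
    # Every service logs with the same ID, so one search on that ID
    # reconstructs the whole trace, even across sites.
    logging.info("[%s] handled by %s", message["request_id"], service_name)
    return message  # the ID travels onward unchanged

message = new_request({"asset": "tank_rig"})
for hop in ("service_a", "service_b", "service_c_vancouver"):
    message = handle(message, hop)
```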
16:00
So since we really love our services and service-based architectures, we actually made sure that we have a statistics service and a logging service. So the data that we had in here, for example, at the bottom, where we have a calculation of how long it takes to get from the front end, so from uWSGI to the end of RabbitMQ, or the amount of time it actually took to execute the request: we can map these onto the system itself, and we actually send those to a statistics service or a logging service. And what that allows us to do, once again,
16:21
is that if, say, we're now using Carbon and at some point we wanna use StatsD, we can change the statistics service without, once again, having to change everything else. The nice thing that we did with our workers, since they can be wrapped and a whole lot of things can be done with them, is that we have one single worker that executes the requests, and that worker is wrapped in a statistics worker. So as soon as the request has been done handling,
16:42
and, since we have transports and queues, the result is already going back to the client, that is the point where we actually start doing our stats calculation. So we don't have the overhead of actually doing our stats while we're still executing the request; it's done afterwards. Of course, there's a bit of calculation upfront, but other than that, it all happens afterwards.
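A sketch of that wrapping trick; by the time the wrapper records its timing, the reply is already on its way back through the transport. Names are illustrative:

```python
import time

class StatsWorker:
    """Wraps the executing worker and reports timings to a stats service."""
    def __init__(self, inner, stats_service):
        self._inner = inner
        self._stats = stats_service

    def __call__(self, request):
        start = time.time()
        result = self._inner(request)  # reply returns via the queue in here
        # Only after the client already has its answer do we spend time
        # on statistics, so the request itself carries no stats overhead.
        self._stats.record(request["method"], time.time() - start)
        return result
```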
17:00
Same thing for logging: all of our logs are going through a logging service, which allows us to dynamically change our logging levels via an Amanda request. I can say, change my logging level to debug for this specific service, and we can make those changes on the fly as we need them.
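A sketch of a logging service whose level can be flipped at runtime by a request; the method name and arguments are assumed:

```python
import logging

class LoggingService:
    """All logs route through here, so levels can change on the fly."""
    def set_level(self, service_name, level_name):
        # e.g. an Amanda request asking for debug logging on one service
        logging.getLogger(service_name).setLevel(getattr(logging, level_name))

LoggingService().set_level("make_movie", "DEBUG")
```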
17:21
So maintenance-wise, we use Salt. For those who don't know Salt, check it out; it's a really cool tool, similar to Puppet and Chef for those who know those. It's in Python, and we actually extended it with an Amanda module, so Salt can now run up Amanda services and can run up a whole framework for us. And we actually wrapped the Salt client itself in a service, so we can use Salt to investigate the system on the fly, via actual Amanda requests, to know what's going on in the system
17:41
without really having to log into the master node. And what it really, really gives us is that predefined, repeatable configuration that we need, because we've got eight sites to look after. We want to be able to make sure that what's running inside site A is gonna be the same as inside site B and inside site C. We wanna make sure that it's all the same. So we've got an adaptable, extendable,
18:01
configurable system at this point. We can change services, swap them in and out like we want to. We can swap our transports with whatever tools we need. By the way, a big thank you to everybody who has been writing a lot of these modules, like librabbitmq or simplejson or whatever. We use them a lot, and thank you for that.
18:20
It's very extendable and configurable, and it's all configuration based. We can abstract the whole system from system level all the way down to service level. And we really have a best-of-breed system at that point, where we might build a system for a particular show or pipeline or for any of our specific use cases.
18:41
So there's a couple of things that we're still looking at. Containerization is one: we don't want service A, if the CPU is going crazy on it, to actually take out service B. So we're looking at containerization. We're looking at autoscaling as well; if you have done investigations just like us and you wanna have a chat about it, that would be great. And we're also looking at the possibility of actually open sourcing the whole system.
19:03
So that's pretty much it for the whole technical thing. Sorry, I didn't go a lot into actual Python code itself. It's 20 minutes so actually digging into it would be quite tricky. Just a couple of slides. We are actually looking for people and we've got a lot of things in production at the moment. The Jungle Book one, keep an eye out for that one. This should be a really, really cool movie.
19:21
And of course, we are hiring as well, across all our studios, across everything. So yeah, either have a look on the website or have a look at recruitment, and/or come and talk to me after the talk, of course, as well. Thank you, and yeah, any questions really.
19:48
The microphone for questions is over there. Just stand up and go there if you have any questions. Do you deal with versioning of these services in any way?
20:02
Do we do it, sorry? Versioning of these services? Ah, versioning of the services. Yeah, so every single service, as it's deployed, gets assigned a version, and that's why we use Salt as well, so that we have a configuration set of those services. And if a service changes, we actually run up a different version of it, and we can actually have a staging and a development mesh where we can push those changes
20:21
out first, run a bunch of tests against them, and spread them out to a couple of users to start using before actually pushing them into production. So every single service is versioned, yeah. Hi, I have a question. How does Amanda differ from, let's say,
20:43
a standard enterprise service bus? Because, I don't get it, I don't understand why you have rolled the code from scratch and not used, for example, let's say, a service bus where you can plug in different services and so on. You mentioned Celery, right?
21:01
I'm saying ESB, enterprise service bus, because when you do want to do, let's say, service-oriented architecture, you just use an ESB, and I don't know why you haven't done that. I don't know, to be honest.
21:20
I'm not too familiar with ESBs, to be honest. It's a technology, not a tool. You use enterprise service buses when you want to integrate a lot of different environments and so on; you just use an ESB with multiple protocols and so on, and this looks quite the same to me,
21:43
and maybe we can chat about it. Yeah, let's do it, let's do it, yeah, of course. Interested to learn about that one, yeah. Hello, it was a great talk. I have a question about load balancing. What do you use to do it? Have you got any algorithms and metrics? Sorry, I couldn't hear you.
22:00
What about load balancing? What technology do you use to do it? To do load balancing? Yes, yeah, HAProxy or LVS or something like that. So at the front end, we've got NGINX, which we use for load balancing. So we've got multiple uWSGI and Flask instances set up there and NGINX load balances between them, and on the other side, in production, we use RabbitMQ to actually do load balancing.
22:22
So we have our proxies set up, and they have a certain amount of requests that they can handle simultaneously, and if we see that queues are getting too long, we just start spinning up more services. That's why we're looking at auto-scaling as well, to actually deal with those issues. Hey, what do you do about large amounts of data?
22:42
About, you know, you have a service which operates on data like source images, and they're not available at your other locations around the world. How do you make sure that the data is available, and how do you get it pushed around the world? So we've got various things that we do.
23:00
One of the infrastructures that we have is what we call a cross-mesh infrastructure. Of course we cannot always check locally if something is on, say, storage; we might not have it on storage in Vancouver. So we can actually make what we call a cross-mesh call: you can do that with self cross-mesh with this site, and you can then use the same service interface to actually go and call that specific method,
23:22
say in Vancouver, to go and check with the storage service in Vancouver whether it is available down there.
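A sketch of the cross-mesh call: the same service interface, answered by another site. The crossmesh() helper and everything around it is hypothetical:

```python
class RemoteSite:
    """Stand-in for another site's mesh: looks up services running there."""
    def __init__(self, name, services):
        self.name = name
        self._services = services
    def __getattr__(self, service_name):
        return self._services[service_name]

class FakeStorage:
    def exists(self, path):
        return False  # pretend Vancouver doesn't have the data yet

def crossmesh(site_name):
    # In Amanda this would presumably return proxies whose transport
    # points at the other site's mesh; here it is faked locally.
    return RemoteSite(site_name, {"storage": FakeStorage()})

# Same interface as a local storage call, different site:
print(crossmesh("vancouver").storage.exists("/assets/godzilla/tank_rig"))
```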
23:40
And then we've got, of course, our syncing queue, which takes care of actually syncing all of the data across all of the different sites, and which happens at release time. We have specific rules set up as part of a service that say, okay, this asset has been released; does it need to be synced to any of the other sites? So that's all service based. And the same for generating data: how do you prevent some artists from generating terabytes of data? Do you just do monitoring and look at the operations, and...
24:01
so we wouldn't be sending terabytes of data through there. We just use our sync service as we call it to actually detect and make sure that the data that needs to be synced is going to be synced. We have large dependency trees in these assets where we can say, oh, this asset has this texture and this texture and this texture and these kind of rigs. Go and check if we need them
24:21
in the other sites as well, or are they just doing something like lighting, where they just need to render frames, for example. Thanks. Any more questions? If not, thank you very much. Give him a hand. Thank you.