Ray: Scalability from a Laptop to a cluster
Formal Metadata
Title: Ray: Scalability from a Laptop to a Cluster
Number of Parts: 130
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/49929 (DOI)
EuroPython 2020, 74 / 130
Transcript: English (auto-generated)
00:06
So what I'm going to talk about is Ray, which is a Python library, mostly, although it's got a little C++ core for making it really easy to distribute Python applications across the cluster. And it was kind of inspired by the challenges of doing distributed machine
00:22
learning, you know, training neural networks and so forth. You can reach me at the contact information on the left, including my email address. I'm on Twitter at Dean Wampler. Ray.io is the site for Ray. And I work for Anyscale, the company developing Ray. And we're having a Ray Summit this fall, September 30 and October 1. If you're interested, you can find out more at our events page on our website.
00:43
So this will be a really fast talk because I have 30 minutes. I'll cover a lot. What I hope you'll get out of it is sort of the gist of what's going on with Ray and why you might be interested in it. And we can certainly take questions in the Discord channel afterwards. So let me start with a demo. I'll exit from this presentation and then switch to my browser where I have
01:03
an application running. So what I'm going to do is just walk through first what it's like to work with the core API. Whether you're doing machine learning or not, this is the API you might use for distributing your applications. And then I'll run through an example using reinforcement learning with a library for Ray called RLlib.
01:21
So you can get a sense of, you know, what Ray is doing behind the scenes, as it were. So I went ahead and evaluated a few cells already. I did some imports. I initialized Ray, which here is just running on my laptop. I could also tell it to connect to a cluster. And then there's this what we call the Ray dashboard that comes up
01:40
so I can actually see what's going on if I'm trying to understand performance issues or, you know, see what's going on and so forth. We won't look at that again for time's sake, but just so you know it's there. So the example I'm going to simulate is the case where maybe I need to call an expensive data store, expensive in the sense that it takes a lot of time to compute relative to
02:00
just regular computing. And just to make it a simple case, I've defined two dictionaries, one of which takes the keywords like reinforcement learning, hyperparameter tuning, and so forth, and returns the corresponding high-level library in Ray for those particular tasks. And then there's a second one, Ray URLs, that takes the values returned from the first dictionary
02:21
and gives you the links to the documentation for them. And you'll see why we have a second one in just a moment. I'll mostly start with the first one. So let me define just a regular Python function that will, you know, take in one of these phrases like reinforcement learning and look up the value in the dictionary and return both that key and the value when it's done.
02:40
And to kind of simulate doing something that's compute-intensive, I'm going to sleep for a period of time equal to the length of the phrase divided by 10. So it turns out reinforcement learning, I believe, is 22 characters, so it'll sleep for 2.2 seconds when we call it. And then if I define this function that just iterates through all of the phrases in the dictionary
03:02
and then calls them and time how long this takes, we'll find that it takes about 7.1 seconds because it'll do one at a time and sleep for each one, and it turns out there are 71 characters in the keywords, so that's why we slept for 7.1 seconds. Well, this kind of query could be done in parallel.
03:22
There's nothing that's related between the queries that I did, so let's see how we could use Ray to turn something into an asynchronous process that we can then do in parallel where possible, but without doing a lot of low-level parallel programming. And the way we do that is we define something called a Ray task.
03:41
We annotate a regular function with the @ray.remote decorator, and I can just turn around and call the other one, so I can have it both ways. I can have the original Python function or this Ray task and use whichever I want. And then when we actually call it, we add this .remote to the invocation. It's a visual cue to us that we're actually doing something with Ray
04:02
and not just making a regular call, and this thing that it's returning is actually a future, so we've fired off this asynchronous call somewhere in our cluster, and at some point we can retrieve the value using Ray.get, and this Ray.get will block until the value is ready. I waited long enough so it came back immediately,
04:20
and here's what we got. So let's actually do a similar search iteration that we did before. Notice what I'm going to do. I'm going to fire off all four queries at once, save those IDs, and then call Ray.get to get them all at once, and this will take 2.2 seconds. Why 2.2? Because the longest key is reinforcement learning,
04:42
so even though some of them finish sooner because they have shorter keys and we slept for a shorter period of time, the way I called Ray.get actually means that we'll wait for all of them. I'll show you a workaround for that in just a second, and you can see that, yes, reinforcement learning is 22 characters long. Right, now here's why I have the second dictionary,
05:01
and it's a really nice feature of Ray that lets us do tasks that actually depend on each other in a reasonably intuitive way. So what I'm going to do is define a function that is going to get the doc URL. So recall that when I queried the first one, I got something like Ray RLlib.
05:21
Now I can use that to query the second dictionary to get the URL for the documentation. The way I've written this function is it's actually going to take the tuple that was returned from the first one, and that will actually be very convenient in what I'm about to do, but otherwise it's kind of identical to the first one, and also I'll create a task as well as a function,
05:41
and then here this method will actually first query the original dictionary, you know, with all four keys. Then it will turn around and take those results and query for the doc URLs, and then it will finally return the results there. Let me go ahead and run this. It will take like four seconds or something, but I want you to notice something crucial that's happening in line two.
06:03
We have task dependencies now. I can't schedule the second, these doc URL tasks, until the value from the first task is completed that corresponds to that lookup. So, you know, in a regular distributed system library, I'd have to poll or something, wait for the first result, you know, unpack it, and then pass it to the next one.
06:22
Ray is doing all that for me. This argument is actually one of these IDs. It's not even a regular tuple. It's actually a future ID, but Ray knows that it needs to unpack that for me, and it also waits to schedule this task until the thing is ready. So I can have a graph of dependencies, and Ray will handle that for me,
06:40
but otherwise this code kind of looks like regular Python, you know, synchronous code where I'm not even thinking about distributed systems. So I think the reason I got excited about Ray when I joined AnyScale was I just loved the way that it kind of took normal concepts we like, like functions, added extensions that let us still kind of pretend
07:00
like we're working with synchronous code, but now we have the magic of distributed concurrency across a cluster and the whole bit, so it's pretty nice for that reason. And sure enough, it took about 4.4 seconds to do that whole thing. I did mention that we were blocking for everything. There is an idiom with a function called ray.wait where we can loop getting the results that have already finished
07:23
while we wait for more to go on. I won't take the time to walk through this code because it's a little complicated. I don't have a lot of time, but what we can see happening is as results finish, I go ahead and process them. You know, the first one came back in two seconds while other things are still running, and then eventually I finished the loop
07:42
as I basically drained the queue, if you will. So that's a nice way to do work with things that are already finished while other work is running. And then lastly, and this covers almost everything about the core API, you can also put objects into the distributed object store that Ray is using.
08:01
So really, in a real application, those dictionaries I wrote are reference data. I probably put them in the object store so tasks all over my cluster can query, you know, pull out those objects and use them as needed. So this is a nice little sort of, let's call it the reverse, the dual, if you will, of ray.get.
08:22
All right, one thing I have not done yet is dealt with distributed state. These tasks that I've been writing have all been asynchronous, or sorry, well, they've been asynchronous, but they've all been stateless. What if I want to keep some moving state and I want to have that distributed over the cluster? Well, once again, we start with a familiar idea, something that we know already is a good way to park state,
08:41
which is a class. So I'm going to declare just a regular Python class that now is called a search service. It's going to have these dictionaries just for convenience that we mentioned. But one thing it's going to do differently now is that it's actually going to remember how many times I asked for each key, as well as keys that don't exist in the dictionary. So it's actually keeping a state of, you know,
09:02
sort of metadata about how this service was used. And then we have a query method that does basically what we've done before, where we query for a given phrase. This one also handles the case that it'll determine which dictionary it needs to read and that kind of stuff. And then we'll have a convenience method for getting all of the keys from both dictionaries.
09:21
But otherwise, it's kind of the same code we've already had. As usual, you create a service. You know, we can then do queries over it. And if we run this, we see that, yeah, it returns all of the, here's the known keys, an unknown one that I've appended to it because I'm now going to try timing this thing. And this will take about 11 seconds
09:40
because it's going to, you know, one at a time, go through each of these, you know, sleep for each call and so forth. It's important to note that I've actually gone back to synchronous execution in the sense that I have one actor, as we call it, it's going to process one request at a time. It's not going to do it in parallel the way we were doing it. The way you would get back to parallelism
10:01
is have multiple instances of these, like a farm of them or something. Or you can also do a concurrent invocation. So by default, you'll do one at a time, which is usually what you want because that means I'm not likely to corrupt the state in this class, which is the number of queries. So that's usually the way you want to go.
10:20
And you can see the results we got back, and it took about 11 and a half seconds. And sure enough, we queried each of those keys once, including this one that it had never seen before. Well, we can very much turn this into an actor now, just very much like we did before. We'll subclass our search service. We'll annotate it with ray.remote.
10:41
The one other thing we have to do here is in an actor, you can't reach in and read objects or fields inside the actor. So we have to have getter methods or accessors to get at them. So I've added those to get the dictionaries and also to get that query object that's keeping track of how often we've called something.
11:00
Notice how we construct one. It's service actor.remote. We'll use the same sort of function we used before, but now what we're going to do is fire off all the queries we want. And again, that's the setup here for a list. And then we'll use our wait loop to pull them off as they're done and print out the results.
11:21
So if we time this and watch what happens, we can see that some of them are starting to come back, but this is synchronous again because of the way we've now designed it, but at least it's completely thread safe and robust. So it did take 11 and a half seconds, but it worked as before. And we can see that once again,
11:40
we called all of those keys once. All right, to finish this, let me just rip through very quickly an example of the high-level library in Ray called RLlib for reinforcement learning. If you don't know what reinforcement learning is, it's the thing you may have heard about that beat the world's best Go player. It's the technology used to beat Atari games. Essentially, you have an agent.
12:01
It's looking at an environment. It's making observations about the state of the environment. It's trying to guess what the best action to take is, and then it observes the reward it receives. And then it tries to learn and get better and better at choosing actions based on the state. And what I'm going to do actually is use a popular example
12:20
of a reinforcement learning environment, which is called cartpole. And that's basically where I have a one-dimensional cart moving back and forth, and I want to keep this vertical pole balanced as long as possible. And this is very easy to set up, and Ray, for time's sake, I can't go through all the details we're seeing, but I will say this. We are going to train a two-layer,
12:42
fully connected neural network, with two hidden layers of 50 units each, and we'll actually see how it gets smarter and better as it goes. And while I'm waiting for... There it goes. That's finished. So this is just a loop. It's going to do the training, and it'll print out results as it goes, and now we'll watch it happen. This will take a few seconds,
13:02
but what we'll see is we're going to print out... As it runs, it can do up to 500 points, and that's when it just stops. But what we want is for most of these so-called episodes to get as close to 500 as possible. So the number we really care about is the middle number. That's the mean so that, on average,
13:21
it will do this well at this level of training. And as you can see, this number's getting higher. Excuse me. So it's the maximum score that it got when it was doing a training run, and that means it's getting smarter as it goes. Now, the iterations, I think we're going to do 20, and it turns out that it'll get good but not great.
13:42
If we let it go longer, it could actually get really good, close to 500 every time. So we'll let this continue. I'll go ahead and evaluate the next cells so that they can load when we're ready, and what I'm going to do is just show you the data, which kind of reproduces in a nicer format what we're seeing printed, and then we'll actually plot it just to see what it looks like.
14:03
And then we'll do one last step, and I'll be done with the demo. So it's up to roughly half of a maximum score. I think it gets up to about 350 or something when it's done for this particular size network and for 20 training steps.
14:24
And then the last step we'll actually do when it's finished is we'll take that network that we've trained that's saved as a checkpoint, and we'll actually try running it, and we'll see how it works when it runs like five or six cycles. So there's the data. You can see that the reward mean, which is the center column, got up to about 305,
14:44
and here's what it looks like when we plot. The max, we very quickly got to the point where we could get a maximum score, but the mean score still rose during that whole time. And now we'll try this rollout, and you'll see a window pop up where it's actually running these so-called episodes where it's trying to balance the cart,
15:02
balance the pole on the cart, rather. All right, there we go. So this does about five, I think, episodes. This one's not too bad. It's holding up pretty well. We'll also print out the score.
15:22
I scroll this down a bit. So that first one, it actually got all the way to 500. This one, it looks like it'll stop almost at 500. So this is actually doing pretty well, even though our mean score isn't that high. So hopefully you can get a sense that it's actually pretty easy to use a high-level domain-specific library like this
15:43
to do the work we want to do, but at the same time, it's actually using this distributed compute framework under the hood, and the example we've shown would actually go much faster if we were using a real cluster. And then when you're all done, you can shut down if you want. So let me go back to the talk.
16:03
All right, so why Ray? Well, it's kind of emerged out of two big trends. One is that the size of neural networks is growing enormously, which also translates to how much compute is required to train them. We're far outstripping Moore's law in terms of growth. At the same time, Python, as you all know,
16:21
has been seeing enormous growth of interest in the last decade or so, driven in a large part by interest in machine learning and all the great libraries available in machine learning, data science, and so forth, written in Python. So that kind of means that we really need an easy way to distribute Python over a cluster if we're going to get past the limits of Moore's law,
16:42
but we want something that people who don't really want to care about distributed computing can do so easily. So there's a whole bunch of icons here about different steps that you have to do when you're doing like a typical machine learning pipeline, all of which typically require some sort of distributed computation to meet the scalability requirements.
17:01
And the vision of Ray is that we could have this sort of very generic framework. If you think about the original part of the demo there, it was really nothing about machine learning. It was just about scheduling tasks of arbitrary size, managing distributed state. And then on top of that, we can build these libraries that handle specific domains. These are four of them right now that are part of Ray,
17:21
that are part of the machine learning space. There's others that are being written in natural language processing. And a lot of people are starting to write generic applications with Ray as well. So let me just talk briefly about these libraries. We actually just saw an example of RLlib. I'll see how I'm doing for time, pretty good.
17:42
Here's a different icon or image than the one I showed you. But it's one of these huge spaces that's starting to see a lot of interest for a wide variety of reasons. The first big successes were in gameplay, like beating the world's best Go player. They've been used a lot in robotics. Actually, this one in the middle,
18:00
this bipedal robot is actually implemented with Ray. There's some interesting work being done, getting closer to regular business problems. Industrial automation, workflow management, that sort of thing is becoming a hot area. As well as optimizing systems like network topologies, HVAC systems.
18:21
Netflix and YouTube have published some interesting papers about using reinforcement learning to improve their recommendation systems, which has been an old problem, but now they're finding new ways to do it. And then, of course, in the finance world, they always leverage everything. Finance is a time-oriented problem, and that's what our reinforcement learning is really good at.
18:44
So peeling the onion a bit, what's going on in AlphaGo, the Go player, is that the observations in this case, so the state of the board, the actions are where you're going to play stones, and the rewards are really, in this case, instead of a reward at each step, it's only win or lose. That's all. So you don't really know how you did until the end.
19:01
And behind those things, they built this huge neural network that, at various layers, can identify different patterns in Go play. Reinforcement learning is a very broad field, and there's also lots of different ways that people are using it, and building algorithms that do reinforcement learning.
19:20
So what Ray RLlib tries to do is to give you support for all of these different ways of doing things. OpenAI Gym is where we got that cartpole example, for example, and then give you lots of different algorithms that are built in, or the easy ability to define your own, lots of abstractions that can be glued together in different ways, and then all of this running in a distributed way
19:41
at arbitrary scales using Ray. You can actually try it out in SageMaker. If you're already using SageMaker, there's Ray in the middle here of this picture, and Azure just rolled out support for reinforcement learning with Ray as well. Back to why Ray was created,
20:01
if you think about all of the different compute requirements that you need to be able to support in reinforcement learning, you've got these things like simulators and game engines that look a lot more like regular applications with complex distributed graphs of objects in memory, and the agent itself may be fairly complex, which are much different
20:21
than what we're used to doing in data science and neural network training, but we have to do that too, and we have to do this over and over again as efficiently as possible because the only way we can train in reinforcement learning is to just play over and over again and learn as we play. So the kind of diversity of CPU requirements,
20:41
memory access patterns, and so forth is what drove the need to build something that's very flexible like Ray, but very efficient also so that you can build on top of it tools like RLLib. The second one I'll mention quickly is Tune, and this is for hyperparameter tuning. What is hyperparameter tuning? Well, it's become a research area in its own right
21:02
because of problems like when you need to decide what's the best neural network architecture for my problem. Every number you see here is a hyperparameter, meaning before I even train anything, I have to decide on the structure of this neural network. So like how many layers am I going to have? How big are they?
21:20
If I'm doing convolution, what's the size of the convolution? What about pooling and so forth? So the space of possible neural networks is enormous, and you don't want to naively just search all possibilities. You want some intelligence that makes it easy, hopefully, to get to a good architecture relatively quickly
21:41
without a lot of expensive compute time. So Tune is really great for being very concise in how you declare what you want, and then it integrates in lots of frameworks like our favorite neural network and machine learning libraries, as well as intelligent algorithms that try to optimize the tuning process. In this case, I'm using Bayesian optimization.
22:02
It's also optimized for neural network training because that's where the real problem of hyperparameter tuning exists, and it's designed to be easy to plug in new frameworks and new algorithms. Last thing I want to talk about for those of you who aren't interested in machine learning is what about using Ray to build applications or microservices?
22:24
I won't go into all the reasons we create microservices. I just want to focus on one part, which is the need to separately manage things, and if you think about it, we often have separate microservices because we might need lots of different instances of some things more than others for scalability reasons.
22:40
Some microservices might be evolving much more quickly than others, so we need to swap them out very frequently, but the downside is that we have all of the stuff to manage ourselves. We have to have a lot of instances because no one machine is big enough, probably, and we don't want to rely on one machine that, if it fails, could bring down our application, so we have all of this manual stuff we have to do to deploy our applications.
23:04
Well, the vision of Ray is that we could actually go back to thinking about one logical instance for each microservice, but behind the scenes, Ray is scheduling all of these actors and tasks across our cluster sort of transparently for us, and we don't have to do nearly as much of that management of instances
23:20
that we had to do before. There's a lot more that needs to be done here to make this as fully powerful as we might want, but this, to me, is one of the exciting aspects of Ray. It's really what we've been doing with RLlib and these other tools, but it applies equally well to other general-purpose applications as well,
23:40
and it also nicely integrates with systems like Kubernetes because Ray is working at a very fine grain, a finer level of granularity than Kubernetes, so it integrates nicely with these other frameworks. So if you're interested in adopting Ray, one thing you might look at is,
24:03
if you're already doing multiprocessing with these libraries, joblib or multiprocessing pool, Ray actually provides some drop-in replacements that break the single-node boundaries, so with just changing the import statements, you can now do scheduling across a cluster rather than just across the cores in a single machine,
24:23
and it also integrates nicely with asyncio if you like programming with coroutines. There's a very nice way to use Ray that way as well. So check out ray.io for more information. I've been writing tutorials that you can find at the Anyscale Academy at this URL.
24:42
You're welcome to join the Ray Slack. That's the best place to find out about Ray, to ask questions and so forth, and there's even a Google group for it. Once again, please check out our Ray Summit conference this fall. It's free, it's online, and we hope to see you there.
25:01
Thanks for listening, and I'll be happy to take your questions in the Discord channel. Hey, thank you so much. That's amazing. Actually, I really love it because, you know, I really hate setting up all these, like, pipeline, you know, like, distributed thing for everything. I was doing data science, but that really bugs me, and this is actually really good that you have one thing for everything.
25:23
So let me have a look at the... So because there's no Q&A in Zoom, so I'll have a look at the Parachat to see if anybody's firing any, like, questions in. If not, then I got to... Maybe I would ask one question myself because I'm really interested. Sure, sure. Is it, like, so you said that because, like, it's for everything.
25:44
So, like, is it, like, so is it difficult to set it up? Like, I really, I'm not good at deploying anything. So, like, would you say that it's good for, you know, people to, like, independent, you know, for example, like, my pet project, I just do it myself. I don't have my colleagues. Is it okay for people to do it that way,
26:02
or is it usually for more, like, a corporate solution for that? So... Yeah, that's a really interesting question. Ray clusters, when you go, like, when your laptop isn't big enough and you need to go to a cluster, it can be as easy as just standing up some instances in Amazon and then running the initialization script on each one of them.
26:21
And then it does a pretty good job bootstrapping itself from there. When you submit code to that cluster, and it's transparent, you just do ray.init with the address of the cluster, then it will actually upload libraries and so forth that you need. So it's generally, you know, I've used a lot of distributed system tools, you know, Spark and Hadoop and all this stuff.
26:42
And it's about as easy as it can be in that way. It's not completely seamless. Some of the technology we're working on at any scale will hopefully make it even easier where you don't have to think very much about it at all. But right now, there's a little bit of setup, but it's not too difficult and it can be as lightweight or as big as you want.
27:00
Yeah, that's good. That's good. I love the flexibility. So thank you so much. I think it's really interesting. So if people want to get in touch, you know, this information is there in the slides. And thank you so much. And I hope you enjoy the rest of the conference and the show afterwards as well.
27:20
Thank you. Thanks for having me.