We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Follow Your Celery Tasks with Director

00:00

Formal Metadata

Title
Follow Your Celery Tasks with Director
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
All Python developer who want to run asynchronous tasks should know Celery. If you have already used it, you know how great it is ! But you also discovered how it can be complicated to follow the state of a complex workflow. Celery Director is a tool we created at OVH to fix this problem : using some concepts of Event Sourcing, Celery Director helps us to follow the whole lifecycle of our workflows. It allows us to check when a problem occurred and relaunch the whole DAG (or just a subpart if tasks are not completely idempotent). During this talk we will introduce you the different concepts of Celery Director then we'll make a demonstration of it. All Python developer who want to run asynchronous tasks should know Celery. If you have already used it, you know how great it is ! But you also discovered how it can be complicated to follow the state of a complex workflow. Celery Director is a tool we created at OVH to fix this problem : using some concepts of Event Sourcing, Celery Director helps us to follow the whole lifecycle of our workflows. It allows us to check when a problem occurred and relaunch the whole DAG (or just a subpart if tasks are not completely idempotent). During this talk we will introduce you the different concepts of Celery Director then we'll make a demonstration of it.
33
35
Thumbnail
23:38
52
Thumbnail
30:38
53
Thumbnail
16:18
65
71
Thumbnail
14:24
72
Thumbnail
18:02
75
Thumbnail
19:35
101
Thumbnail
12:59
106
123
Thumbnail
25:58
146
Thumbnail
47:36
157
Thumbnail
51:32
166
172
Thumbnail
22:49
182
Thumbnail
25:44
186
Thumbnail
40:18
190
195
225
Thumbnail
23:41
273
281
284
Thumbnail
09:08
285
289
Thumbnail
26:03
290
297
Thumbnail
19:29
328
Thumbnail
24:11
379
Thumbnail
20:10
385
Thumbnail
28:37
393
Thumbnail
09:10
430
438
Task (computing)Task (computing)Server (computing)DataflowComputer animation
Task (computing)Demo (music)Software developerTask (computing)Value-added networkOpen sourceComputer animation
Web pageTask (computing)Message passingScheduling (computing)Military operationQueue (abstract data type)Thread (computing)Distribution (mathematics)Function (mathematics)Block (periodic table)Queue (abstract data type)Task (computing)Message passingFerry CorstenPhysical systemProcess (computing)CASE <Informatik>Disk read-and-write head1 (number)Dependent and independent variablesTouchscreenUsabilityReal-time operating systemDescriptive statisticsVirtual machineRight angleThread (computing)BefehlsprozessorArithmetic meanMechanism designWordMatching (graph theory)CodeWeb serviceComputer animation
Task (computing)Service-oriented architectureChainChord (peer-to-peer)Local GroupTask (computing)Message passingChainMereologyGroup actionPrimitive (album)Demo (music)Level (video gaming)MetreReal numberInternet service providerQuicksortQueue (abstract data type)Computer animation
Web pageExecution unitDemo (music)Task (computing)ModemMenu (computing)MIDIDemo (music)Physical systemInstance (computer science)Service-oriented architectureComputer fileComputer animation
PiWindowExecution unitConvex hullMotion blurTask (computing)Graph coloringFamilyComputer animation
RadiusTask (computing)Binary fileSystem calloutputComputer animation
LaptopUniform resource locatorServer (computing)Control flowToken ringDirectory serviceLocal ringKernel (computing)Web browserCloud computingLimit (category theory)Random numberoutputService-oriented architectureCopula (linguistics)SynchronizationTask (computing)Computer fileCartesian coordinate systemNumberSingle-precision floating-point formatFunctional (mathematics)Flash memorySoftware testingConnected spaceSocial classMobile appSoftware frameworkCodeService-oriented architectureWeb 2.0LaptopComputer animation
Motion blurTask (computing)Random numberLocal ringConnected spaceTask (computing)Functional (mathematics)Computer animation
Random numberCopula (linguistics)Software testingProjective planeFunctional (mathematics)Normal (geometry)Task (computing)Computer animation
Random numberTask (computing)Connected spaceLocal ringBuffer solutionKernel (computing)Message passingMotion blurInversion (music)QuiltSynchronizationChainCollatz conjectureInformation securityNormal (geometry)Random number generationComputer animationSource code
Copula (linguistics)Convex hullState of matterTask (computing)Object (grammar)Functional (mathematics)Service-oriented architectureSoftware testingResultantComputer animation
Task (computing)Kernel (computing)Buffer solutionMessage passingObject (grammar)Task (computing)State of matterMeta elementComputer animation
Inversion (music)Task (computing)Kernel (computing)Message passingMotion blurChainWindowResultantNumberGoodness of fitTask (computing)Set (mathematics)ChainFunctional (mathematics)DataflowSystem callView (database)Domain nameService-oriented architectureElectronic signaturePrimitive (album)Computer animation
Task (computing)Message passingChainGrand Unified TheoryLink (knot theory)Copula (linguistics)Motion blurFunctional (mathematics)Task (computing)ResultantComputer animation
ChainLocal GroupComa BerenicesTask (computing)ResultantGroup actionSummierbarkeitComputer programmingFunctional (mathematics)Ferry CorstenSoftware testingComputer animation
Task (computing)Motion blurLocal GroupChainExecution unitLine (geometry)ResultantTask (computing)NumberMachine visionFunctional (mathematics)TupleSummierbarkeitAdditionComputer animation
Local GroupCopula (linguistics)Data miningChainComputer fileComputer animation
Task (computing)Service-oriented architectureChainLocal GroupChord (peer-to-peer)Demo (music)LeakInfinityLink (knot theory)Denial-of-service attackSineCodeChainGroup actionMessage passingComputer animationLecture/Conference
Demo (music)MIDILink (knot theory)Software developerResultantTask (computing)Different (Kate Ryan album)EvoluteComputer programmingComputer animation
Software developerTime evolutionTask (computing)TrailCommon Language InfrastructureLatent heatSystem callTask (computing)Electronic mailing listRevision controlOcean currentFlow separationResultantEvoluteSoftware testingDataflowYouTubeLatent heatComputer animation
Task (computing)Motion blurComputer configurationInstance (computer science)Line (geometry)Interface (computing)Fluid staticsMessage passingServer (computing)DatabaseVariable (mathematics)Integrated development environmentStructural loadLetterpress printingSinguläres IntegralContext awarenessInformationDatabase transactionInclusion mapRun time (program lifecycle phase)FrequencyError messageDiscrete element methodTask (computing)Computer wormSource codeIntegrated development environmentDefault (computer science)Queue (abstract data type)Computer programmingStructural loadCollisionRow (database)SpacetimePoint (geometry)Computer fileDatabaseChief information officerObject-oriented programmingInformationProjective planeVariable (mathematics)Letterpress printingSoftware frameworkElectronic visual displayFunctional (mathematics)ResultantRadical (chemistry)Power (physics)Computer animation
Task (computing)BootingInformationServer (computing)SynchronizationTask (computing)Letterpress printingSource codeComputer animation
Menu (computing)Web 2.0TouchscreenRow (database)Task (computing)Source codeComputer animation
Task (computing)Server (computing)InformationSynchronizationBootingMotion blurError messageElectronic signatureDefault (computer science)Task (computing)Message passingOptical disc driveTunisRow (database)Computer animation
Structural loadKernel (computing)LaptopTask (computing)Local GroupSinguläres IntegralError messageDivision (mathematics)Computer fileLaptopDecision theorySoftware testingGroup actionWeb 2.0Functional (mathematics)Task (computing)Flow separationDivision (mathematics)Computer animation
Error messageConvex hullView (database)Group actionTask (computing)Multiplication signFerry CorstenElectronic visual displayError messageSource codeComputer animation
Error messageComputer fileTask (computing)BootingInformationState observerSynchronizationView (database)Execution unitWeb serviceTask (computing)Computer animationSource code
Demo (music)Uniform boundedness principleFerry CorstenKey (cryptography)DataflowRow (database)Web 2.0System callComputer animation
Disk read-and-write headFeedbackQuicksortInsertion lossComputer animation
Open sourcePoint cloud
Transcript: English(auto-generated)
Hello everybody. So I'm Nicolas Rochfer. I'm here to talk about Celery, which is a task engine to orchestrate, to combine some tasks. We will talk also about Director, a tool
we made in Jovietch to easily create this workflow. Okay? Why we need to work with Celery? We will see that together. To be on the same base, we will see what is Celery, and I will make a quick demonstration, a very basic demonstration.
And then we will see that in our team we have some custom needs. We need to execute some background tasks, but some future provided by Vania Celery was not enough for us. So we created a tool, Director, which is now open source this week, so you can try it right now. What is Celery? This is the
official description in the Celery documentation telling Celery is an asynchronous task queue based on distributed message passage. Celery
means. In fact, the important words here are task queue. What is a task queue? In fact, it's really simple. It's just a mechanism used to execute some tasks in other matching threads. How to do that? When we are
talking about task queue, we are in fact talking about producers and consumers. On the middle of the screen, you have the queue. In Celery, a queue is named a broker, and the most common ones are RabbitMQ or Redis. The idea is to
pass message from the left of the screen, so from the producers to the consumers. The idea is producer does not want to execute themselves the task. They want to make execute it by another system, another consumer. Okay?
So, just in summary, what is Celery? It's just a mechanism, a Python library, used to execute tasks. A task, in fact, it's just Python code somewhere else. And why to do that? Just some use cases. For example, to not block the user if you
are working in a web service and your user makes a request on it. You don't want to block it. You want to execute a long-running task somewhere else and return a quick response right now. Your producer cannot have enough
resources to execute the task because it's a complicated task to do involving some CPU bound, and you don't have, as a producer, the resources to do it, but your workers can do it. You have big machines to do. We can also talk about network accesses. There are a lot of use cases when you want to use some
Celery tasks. And here, I will show you how to create tasks and workflows using Celery. Just for that, this is what we will use in this demonstration. Just remember there is two parts, the producers and the consumers.
To produce message, we are using the .delay method. There is some of the delimiters to send the message in the queue, but here we will go to use the .delay method. And on the other side, we have the Celery command providing some subcommands, and one of these subcommands is the worker command. So, we
will produce message with this, and we will, I don't know, and we will consume message with that. First part of the demo will be just create simple tasks, and now we will see how to combine these tasks using some Celery
primitives, and we will see the chain and the group primitives. And this is a demo part, and I hope it will work. Remember, I need to have a broker running, so I already installed a Redis instance. You can have a
RabbitMQ. You can also have a file system, so no running instance in it, but I prefer to use Redis. Oh, yeah. Indeed. So, I think it's over there. Colors. And this one, no? Okay. This is Celery. Thank you. Do you have
any questions? Okay. Okay. So, let's start. I already installed my requirements, so I'm in the virtual environment, as you can see. Okay. And
I will open Jupyter, just here. Okay. Do you see? Okay. It's okay. I
can undo that. Okay. So, I already created a task file named task.py containing all my Python code to execute somewhere else. How to do
that? First thing to do is to import the Celery class, of course, and create a Celery application. If you already use a web framework like Flask, for example, you know you need to create an app application, a Flask application, and use it somewhere in your code. It's the same thing using Celery. I created my application. This is the name of this
application, and I need to give it the connection to the broker. I also give it the connection to the backend, which is just here to store the task results. Okay? And here, I have two simple
functions. These functions are really simple. It's not the important here to make some complicated code. I just wanted to show you how to send a task somewhere else. So, we have the first function to use to return a random number, and we have another function used to get first parameters, a list of
numbers, and return the addition. Okay? This will be my producer. The notebook will be the producer, and I have to launch the consumer like this. As you can see, there is my task file.
Okay? And I can launch it. Okay? As you can see, Celery is launched and discovered my task because we transformed this Python
function into Celery task using the decorator. Okay? So, first thing to do is to, of course, in the producer, import the function, and it's not because we transform a Python function into a Celery task that we cannot execute it
normally like normal functions, Python functions. So, here, okay, I already have my results. Sorry. I can execute it normally. As you can see, a random number has been
executed. Oh, this is just a notebook, Python notebook, because I send it in the background. Okay. So, if I did not... Okay. Now, I can execute it using the delay
function. Using it, I will, in fact, send a task in the broker. I will not execute the task in the producer. Instead of it, I will send it in the broker, and as I have a rocker running, it will execute it. What I have in return is an async result object. This is, in fact, a
Celery object telling me maybe the task is finished, maybe it is not, but this object allows us to have the state of the task. Is it finished? Is it pending? And so on. As you can see, my rocker really executes the
task. It was not my producer. It was my rocker. And, as I said, we can use some async result method, like the .get method, to really have the result. Here we have nine, and it was the good number. Okay. It's
just a simple task showing you how to send a task and how to execute it somewhere else. Now, we can use Celery Primitive to combine this task and create some workflows. One simple workflow will be this one, a chain workflow. This workflow, in fact, will execute
some task in the right order, one after the other. Okay. We will see that. I have to import the chain canvas. Here I call one first task, the random task, and another task. As you can see, I'm using .si
function. I'm not using .delay. Why? Because here I want to create a signature. I don't want to really send the task in the broker. And, as you can see, I have my two tasks created. Nothing has happened
here. It's still my notebook. Sorry. Nothing has happened. But now I can really execute the delay function used to apply this task. And, as you can see, my two tasks have been created. Okay. I can
get the result. Now, we will use the group canvas. This canvas allows me to launch some task in parallel. As you can see now, I want to first
one execute two tasks in parallel, and then retrieve this result, the result. How to do that? Using the .s function. .s will take the result of my previous task. .si does not take care about the result of the previous task. Here we can see
that first we execute getRun in parallel, and then we take the sum of this task. Okay. Still my book logs. We have an async result again, and
our task has been executed in parallel using tuple worker, and the sum of these numbers has been taking, has been addition by the getSum function. And now, I can take the result of this canvas. Okay. This is how to use
Celery. As you can see, it's really simple, but I think it's powerful to use it because of this simple API. This is just in the file to have
the code of the demonstration. Okay. We will pass on that. Here we created the chain. We execute it, have the result. Here we created the
group, execute it, and have the result. And okay. As you can see, Celery is powerful, but suffers some problems for us in our team. One of them, it's really difficult to see the dependencies between the tasks in the workflow.
So, we wanted a tool that allows us to track the evolution of this task. Maybe one task failed, but what task? And what other tasks cannot be handled because of this failure? It was complicated. We also wanted to execute them using some API call and not directly in the
producers like this notebook. We wanted to create workflow using YAML format. So, task has been created, of course, in Python, but in a folder. And in another way, in another file, YAML file, we combine this task to create workflows. Okay. We wanted to periodically
execute some workflows. Celery allows you, by default, to periodically execute some tasks, but not a whole workflow. And this is in our to-do list. It's not yet provided in the actual, the current version of GitHub, but we want to retry
a failed workflow at a specific task. For example, if several dozens of tasks succeeded and the twelfth task failed, we didn't want to relaunch the whole workflow because we know that our tasks are idempotent. So, we
need to store the result of this task and then relaunch the workflow at the failed task because we fixed the problem. Okay. So, how to use director, and I think this is it. The demo. So, again, the installation
is just pip install, Celery director. Here, pip install that. And doing it, you will have a new command named director. First thing
to do, I will check if I, okay. First thing to do is to create a workspace. So, just remember, this tool will allow you to create, to easily create new Celery project. It's the kind of framework above Celery.
Director in it, and I will create a workflow workspace, for example. Okay. I have here, first thing to do is to, as it's written, to set this environment variable. Okay. I can now go to it, and when you
are doing it, you will see that there is a task folder containing an example. Okay. So, here, I just have to import the task
decorator, create a Python function, and decorate it, giving it some name. There is a simple example given to you by default, which is an extract transform load example, just using some print function. And you have the YAML file telling it if you
want to do this kind of simple workflow, you can do it using this syntax. Okay. Celery needs to store the result of the task in its own database to make the dependencies easier to display.
So, for that, first thing to do is to create a database. Here, I have a file. By default, it's using SQLite, but of course, we recommend to use a more powerful database like PostgreSQL. We are
using SQL. And now, I can list this example. Right now, I have this one. And I can execute it. So, remember, I don't have to open a Python terminal. I don't have to open a producer and
import the source code. I just have to use this command. I give it some default payload. It will be an empty payload. In the next version, we will remove this useless information. But if you want to say FUBAR, you can. All right.
Yes. It's director workflow one. Sorry. And now, the task has been sent in the queue, so in the relays. I can
now open a worker to consume them. Oops. We didn't see anything, but okay. Task has been executed here. Okay. Adjust print task. How to display them, because it's still not
useful. We can now open this. Okay. Okay. Okay. I will open it using this one. We have the web UI telling you.
Okay. We are in a small format, so screen is not really beautiful here. But we have our workflow, and we can display the task executed. Okay. It's
done. Yes, I have it. I can show you a failed execution. So, the ID is, for example, to create an error file. So, from director, import task. I need
to create my task, giving some hours. This is the default signature of our hours. And I can make an error. Okay.
I transform this one into task, give it a name. I think it's okay. And I can now open this one. So here, I just copy past, because I have my notebook
used to help me. Okay. I think it's okay. It's okay. Yes. So here, I created a new workflow name will
fail, containing several tasks. The first one will be the head start one, and then we created a group containing several tasks. This task, a transform task, will succeed. And here, we have our failed task will be failed because of the zero by, division by zero. And
this task will not be executed, of course. But we will see that in the web UI. I can now execute it. Name is will fail. No. I think. I'm not sure. Yeah. I can. Okay. The
worker. Of course, the idea is to have several terminals, several chairs, or several containers launching all of this function. And now, we can see that in the web server, which is just here, we
have an error using, showing you the workflow. And we can see that our task failed here. I don't have time to execute Flour and show you the trust back. But if you are also using Flour, which is a well-known
tool in Celery community, you can check here the complete trust back display. Today, in OVH, we are using this tool to manage our workflows and our tasks. And it's really easier to make it because we can use some web services to call
our task. I show you here how to do that using, using some cli like this. But it's so important to note that we can now execute a workflow using some API call, using some post call. We also work to provide
you a way to execute a workflow directly in the web UI. And this is a result. And here, if you are using Flour, you can display the complete trust back. More useful to make some investigation. And
it's also, it's finished for me. The tool has been open since this week. It's really fresh. You can try it and give us some feedbacks, if you want. Installation is really easy, as you can see. So, thank you for your attention. And
that's from picture to picture.