
Inside ActiveJob


Formal Metadata

Title
Inside ActiveJob
Series Title
Part
30
Number of Parts
89
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You may use, change, and reproduce the work or its content in unchanged or changed form for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or content, including in changed form, only under the terms of this license.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
ActiveJob made a huge impact when it landed in Rails 4.2. Most job processors support it and many developers use it. But few ever need to dig into the internals. How exactly does ActiveJob allow us to execute performant, thread-safe, asynchronous jobs in a language not known for concurrency? This talk will answer that question. We'll build our own asynchronous job processor from scratch, and along the way we'll take a deep dive into queues, job serialization, scheduled tasks, and Ruby's memory model.
Transcript: English (auto-generated)
So it is time to get started. So this is Inside ActiveJob. This is the Beyond the Magic track. My name is Jerry D'Antonio. And let's get started. So first, a tiny bit about me.
I live and work in Akron, Ohio. If you're an NBA fan, you probably have heard of Akron. There's a local kid who's a basketball player, has done pretty well for himself in the NBA. Went to school about 10 minutes from where I live. I work at Test Double. You may have heard of Test Double. Justin Searls, one of our founders, was on the program committee for RailsConf, and he's
speaking tomorrow. Test Double, our mission is to improve the way the world builds software. And I know that sounds very audacious, but we truly believe that every programmer has it in themselves to do that, and I believe that every person here has it in themselves to do that, and that's why you're here. So definitely it's been a great company to work for, and I'm very proud to represent Test Double here.
Personally, one thing I've done, my biggest claim to fame lately is I created a Ruby gem called Concurrent Ruby. You may have heard of Concurrent Ruby because it's starting to be used in some very well-known projects like, for example, Rails. Concurrent Ruby is a dependency of Action Cable. In Rails 4 and Rails 5, it's used by Sprockets, also used by some gems like Sidekiq and Sucker Punch. It's used by Elasticsearch and Logstash utilities, and by the Microsoft Azure Ruby tools. So much of what I'm going to be talking about today draws on my experience from that, but this is not going to be a sales pitch for that. This is going to be about ActiveJob and Rails itself. So because this is the beyond the magic track, this is not going to be an introductory
topic. This is going to be a deep dive into the internals of ActiveJob. So I've had to make a couple of assumptions in doing this. I'm basically assuming that if you're here, you've used ActiveJob, probably in production. You've used one of the supported job processors.
You have some understanding of concurrency and parallelism. If you need a better introduction to ActiveJob itself, I highly recommend the Rails guides. The Rails guides are very excellent at this and provide a lot of great information. If you need an introduction into concurrency within Ruby itself, a shameless plug, I did
give a presentation last fall at RubyConf called Everything You Know About the GIL is Wrong. That video is available on YouTube, and that could be an introduction into that. So with that, let's jump into what is ActiveJob. So I need to briefly, in order to get into the internals of this, I need to briefly remind us of what it is and where it came from.
So ActiveJob, according to Rails guides, the definition is this. ActiveJob is a framework for declaring jobs and making them run on a variety of queuing back ends. Jobs can be everything from regularly scheduled cleanups to billing charges, mailings, anything that can be chopped up in small units of work and run in parallel. A couple key terms there.
It's a framework. We're going to talk more about this, but asynchronous job processing pre-existed ActiveJob. There were things like Backburner, Delayed Job, Que, Resque, Sidekiq, Sneakers, and Sucker Punch. Many of these things existed before ActiveJob was created, and ActiveJob came along as
a way of unifying those. ActiveJob helps us schedule tasks to be run later. That was mentioned briefly this morning in the keynote, that when you don't want to block the currently running web request and you want something to happen later, you use ActiveJob in order to make that happen.
That can happen through what we call ASAP processing, which is where you get to this as soon as you can, or by scheduling it at a later date and time, potentially. This also allows us to support full parallelism. Some of the job processors are multi-threaded. Many of them, however, actually are forked, I'll talk about that more, and can run
multiple processes on a machine and scale across multiple processors, and in some cases across multiple machines. The impetus for ActiveJob is that background job processors exist to solve a problem. We have these long-running tasks that we don't want to block the web request.
We want to be able to send a response back to the user and get the page rendered for them, and some of these tasks then occur after that. For example, if I'm sending an email, this email takes time. It's asynchronous to begin with. Why should I block the web request to make sure that email posts when I can send the
response back and have that post shortly thereafter? So ActiveJob supports that, and the processors behind that support that. So like I said, ActiveJob, this will be important when we get into the internals, ActiveJob came later. There were all of these job processors. Each one was unique, and each one, they all did virtually the same thing.
They had slightly different capabilities and went about it differently, but they all solved the same problem. So ActiveJob was created to provide a common abstraction layer over those processors that allowed the Rails developer to not worry about the specific implementation. If this sounds familiar, this is not dissimilar to what ActiveRecord does.
Relational databases existed. ActiveRecord created an abstraction layer over that that allows us to run, use different databases to switch between different databases if necessary, and most importantly, run different databases in tests, prod, and dev. ActiveJob does the same thing.
It's an abstraction layer that will allow us to choose different processors, change different processors as our needs occur, and run different processors in tests, development, and production. And so ActiveJob had to do that while supporting the existing tools that people were already using.
So according to the Rails Guide, it says, picking your queuing backend becomes more of an operational concern. You as a developer don't care which backend is being used. You simply write the jobs and then use whichever backend in whatever environment makes the most sense. So because we're going to be looking at some code, I want to real briefly remind us
what the code looks like for ActiveJob before we jump into the internals. So this is a simple job class. This should look familiar to everybody. The important things are that this class extends ActiveJob::Base and that it has a method called perform. Most of what ActiveJob does is encapsulated in the ActiveJob::Base class, which goes
and will eventually, as we look through the details, call this perform method on an object of this class when the job actually runs. And we'll look at those details. And as a reminder, the way we configure our backend is we use this active job queue
adapter configuration option within our application.rb. Now, InsideJob is what I'm going to call the adapter we're going to build here, because we're actually going to build in here a real adapter that is functional. So all of the adapters that are supported by Rails have a symbol that follows normal
Rails inflections that maps the adapter name to what you set the config value for. So if InsideJob existed as a supported adapter in Rails, this would be how you would set that. So that's how you configure which backend you want to use. And then later when you want to actually do something later, you call the perform later
method on your class, passing it one or more parameters. That should look familiar to everybody. And if you want to schedule the job for a certain time, then you can use the set function to specify when, and there's a number of different ways you can do that. So that's just a reminder of what we see on the front of ActiveJob.
All of that should look familiar to everybody. What we're going to talk about is what goes on behind that when you make this perform later call, okay? So like I said, we're going to build an asynchronous backend here, right up here, during this presentation, one that actually works and is functional and will meet a...
It's minimal, but it will meet the requirements of ActiveJob and show us how this works. So a couple things just to give a sense of where we're coming from. Like I mentioned, there are multi-threaded adapters and there are forked adapters. Multi-threaded adapters run your job in the same process as the Rails app itself, okay?
There are a couple that do that. The advantage of that is those can be very fast and you don't have to spawn separate processes and manage separate processes, all right? We all know that MRI Ruby does have some constraints with regard to concurrency, but it's not as bad as most people think.
That's what I talked about at RubyConf last fall. MRI Ruby is very good at multi-threaded operations when you're doing blocking IO. And most of the tasks we make these background jobs for are doing blocking IO. They're sending emails, they're posting things to other APIs. And so since they tend to do blocking IO, they tend to work very well with Ruby's concurrency
model, so a threaded backend is simpler because we don't have to manage separate processes. Many of them, however, do use or they do spawn forked separate processes where you have to run separate worker processes. Those give you full parallelism, but they require active management of those processes.
So for what we're going to build here, we're just going to do a multi-threaded one because I can do that very easily and it will demonstrate all the things we're going to do. And we're going to use thread pools for that. Most job processors will also persist the job data into some sort of data store. Redis is very popular for this. The reason for doing that is that if your Rails process exits, either on purpose or by crashing,
if all of your job data is in memory, you're going to lose it and those jobs will never run. So generally speaking for production, you want to have a job processor that does store the job data in some sort of external data store to allow it to persist beyond restarts. We're not going to do that here, mainly for simplicity; to demonstrate
what goes on in ActiveJob we don't have to go to that level of effort. So our job processor will not persist to a data store, which makes it good for testing and development, but we wouldn't use what I'm going to build here today for production. So in order to do this, we're going to need three pieces.
The first one is active job core. This is provided by active job itself, and it is the job metadata. I want to talk about this more, but it is the thing that defines the job that you need to perform later on. It is probably the, I'd say, the most important piece of all this because it's the glue that binds everything else together.
The two pieces we're going to build today are the queue adapter and the job runner. Remember, ActiveJob came about after the job runners. So the job runner is independent, and it provides the asynchronous behavior. The job runner actually exists as a separate thing. Sidekiq is a separate thing. Sucker Punch is a separate thing. You install those separately. The queue adapter's only responsibility is to marshal the job data
into the asynchronous job processor. So the job processor provides the asynchronous behavior, and the queue adapter marshals between your Rails app and that job processor. And those are the two pieces we're going to build here today, the queue adapter and the job runner.
For all of the job runners supported by Rails, the queue adapter is actually in the Rails code base. If you go into GitHub, go into the Rails code base, and look in active job, you will see that there is a folder of queue adapters, and there's one queue adapter in there for each of the processors that Rails supports.
There is also a set of unit tests as part of the Rails code base that are run against every one of these job processors on every commit. They ensure that all of the supported job processors meet the minimum requirements of active job. The one we're going to build today actually will pass that test suite.
Strictly speaking, the Rails core team has responsibility for the queue adapters and for that test suite. But knowing from experience, the people who create the job runners themselves work very closely with Rails to make sure that those adapters are up to date and work well with the processors.
So let's jump in and talk about the active job core class. Like I said, this is the glue that ties it all together, right? It's not obvious. So this is the job metadata. It is an object that represents all of the information about the job you post. It carries with it the proc that needs to be run. It carries with it things like the queue and the priority,
which I'll talk about in a minute. And it carries with it all of that metadata, okay? It provides two very important methods, which I'll talk more in a minute. But they are the serialized and deserialized methods, okay? These are very, very critical, but I'll talk about them in a minute. The job metadata itself, there are several attributes on this object,
which we will look at and use internally within ActiveJob. These are not things that you as a Rails developer have to know about, but these are things that, internally to ActiveJob, are very important. One of them is the queue name. Most of us should be familiar with that. You can specify, when you create and post your jobs, what queue they should run against, right?
And if you don't specify, it's the default queue. Priority, some job processors support prioritization, where higher priority jobs run first. We're not going to support prioritization in ours. That's optional, but the priority would be attached to this as well. If you schedule a job to run at a specific time, you get an attribute called scheduled at, which tells you when. And we'll look at that because we are going to do scheduled jobs.
The job ID is internal to Rails, and there's a unique ID within the Rails instance itself that identifies each job. Rails uses that within active job to track each one of these, all right? The provider job ID is one that you can provide within your job processor.
So if we wanted to within our job process, if we wanted to have our own kind of ID system that made sense for us and worked, we could then attach it to the job metadata under the provider job ID, okay? So Rails does not create that. We would create that ourselves. We're not going to use the provider job ID today because it's not essential, but it is available, and it's something we would add.
All right, so let's actually build a queue adapter. We're going to go outside in, right? So like I said, the queue adapter is responsible for marshaling data into the job processor. The job processor is the more interesting piece. We'll look at that in a minute. But we're going to start with the queue adapter, and we're sort of going to pseudo TDD this, right? The queue adapter, most of the queue adapters were written
when ActiveJob was created because the job processors already existed, and they had to handle that marshaling. In our case, because we don't have a queue adapter, or excuse me, we don't have a processor yet, we can decide what the API is going to look like. So within our queue adapter, we only need two methods. It's very simple. One is enqueue, and the other is enqueue_at.
The enqueue method takes that job object we looked at a minute ago, and it marshals that into our processor. And the enqueue_at takes the job and a timestamp and marshals that into our job processor. So notice in this case, I've decided to make the API very simple. We're going to create a thing called InsideJob.
We're going to have class methods enqueue and enqueue_at. We're going to pass the serialized job. We're going to pass the queue name. And in the case of enqueue_at, we're going to pass the timestamp. So a couple things to note. One, this is not very OO. These are class level methods that we're calling on this class.
And I did that because I want to emphasize the stateless nature of this. This is very critical to understand, okay? ActiveJob is by its nature stateless. The state for your job is encapsulated in that job object.
All of the metadata about the job, everything related to that job, all of your state is in that object that we're passing through. The actual queue adapter itself is inherently stateless. And you'll notice we even call a class-level method when we post the job, because we're sending this thing off to happen later on. It's a fire and forget.
We're not creating anything that's going to be persisted. And in fact, any kind of stateful behavior here would be potentially thread unsafe. So we're just going to call these class methods and throw this data at it. And then we'll build those class methods in a minute. And that's all it really takes to build a queue adapter, right? Now, one thing that's very important in here is the serialized method.
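A sketch of that adapter might look like the following. InsideJob is the hypothetical processor built later in the talk, and the `queue:` keyword is simply the API chosen here; it is not a Rails convention. Only `job.serialize` and `job.queue_name` come from ActiveJob itself:

```ruby
# The adapter's only responsibility: marshal the serialized job
# data into the (hypothetical) InsideJob processor. It holds no
# state of its own; everything it needs rides on the job object.
class InsideJobAdapter
  def enqueue(job)
    InsideJob.enqueue(job.serialize, queue: job.queue_name)
  end

  def enqueue_at(job, timestamp)
    InsideJob.enqueue_at(job.serialize, timestamp, queue: job.queue_name)
  end
end
```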
And I have to go into this in a little detail. The reason why we call the serialized method is twofold. First off, and less importantly, is thread safety. Remember, Ruby is a shared memory language that has object references. So if we have maintained a reference to anything that was passed into that,
and you hold on to that reference, when this thing finally goes and gets processed later on, if it's processed in the same process on another thread, we run into potentially thread-unsafe behavior. Now, the normal usage pattern makes that not really a big deal. But if we serialize the job into a representation of that, we then let go of those references and make it thread safe.
There's one more important reason, though. And the most important reason is for consistency. Remember, we want to be able to work with multiple job processors. In prod and dev and even test, we want to be able to. So when those job processors are going to persist into a data store,
such as Redis, they must serialize somehow. I mean, you can't take a Ruby object and throw it into Redis, or throw it into a relational database, without serializing it somehow. So if every job processor created its own serialization method,
we could potentially run into problems when we switch between these. So we don't want to have hidden errors where we run this in test, and we run it in dev, and all the serialization works, and then we run it in production with a different processor, and the serialization fails or does something different.
So ActiveJob provides one common serialization routine method and one deserialization method, so all of the job processors can choose to serialize the same way. And in so doing, that will reduce one potential set of errors when we move between job processors.
So we are going to serialize here, even though this is the simplified version and we're not storing this in a data store, we want to do that serialization to make sure we get that consistency across processors. So internally, like we said, we need to do two things. We need to provide, inside the queue adapt...
I'm sorry, I've moved on to the job processor now. So we have the queue adapter, now we need the job processor. The job processor's responsibility is to provide the asynchronous behavior, and that asynchronous behavior is queue dependent. So we want to have multiple queues and have each queue process a different set of jobs. So for this, we're just going to use a... What we need is we need to be able to post jobs into different queues
and have them behave asynchronously. We're going to use a simple thread pool for this, right? Because within the context of this simplified application, a thread pool works fine. A thread pool has its own queue, and therefore, by creating a separate thread pool for each queue, we do get a separate queue for these different jobs. We just have to map the thread pool to the queue name,
which we'll see in a minute. And then, obviously, a thread pool has one or more threads and therefore provides asynchronous behavior, so we can very simply deal with these needs of the queuing and asynchronous behavior by just creating a thread pool. So what we're going to do here is we're going to create the thread pool, but because this is all very multi-threaded
and therefore needs to be thread safe, not only are we creating these threads within our job processor itself, but because Rails can be run under multi-threaded web servers, we need to jump through a couple hoops in order to get some thread safety here. So we're going to use the Concurrent::Map class. This is similar to a Ruby hash
and supports similar APIs, but it has some additional behaviors. One, it's thread safe, but it also has some additional behaviors to make that work. Hopefully most of you know that with Ruby's hash, when you create a new hash, you can pass a block to the initializer, and that block will be called if the key does not exist,
and that block will be used to initialize that key. So what we're going to do is, whenever we try and retrieve a thread pool from our map of queues, if it doesn't exist, we're going to create a new thread pool at that time. So we'll lazily create our thread pools as new queues are needed. We'll put this together in a minute and see one way that you might make that work.
So this compute_if_absent is just a necessary thing in order to provide the atomicity and synchronization that we need to have in order to create this new thread pool in a thread safe manner. So there's some concurrency needs there, but the end result is that's basically like creating a hash and passing a block in to the constructor.
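The plain-Ruby Hash behavior being compared to looks like this:

```ruby
# A block passed to Hash.new runs whenever a missing key is read;
# here it initializes that key with an empty array.
queues = Hash.new { |hash, key| hash[key] = [] }

queues["default"] << "job-1"
queues["default"]   # => ["job-1"]
queues["mailers"]   # => [] (created lazily on first access)
```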
And then we're going to have a create thread pool method. In this case, we're just going to create a cached thread pool. A cached thread pool is the simplest kind of thread pool we can create, rather than getting into the details of all the different configuration we could do. Basically a cached thread pool has an unlimited queue size. It will grow and add more threads as needed. And as threads become idle, it will shut them down and remove them.
So over time, you'll get an optimal number of threads, which for our simplified processor is fine. So now I mentioned that we need an enqueue method inside our job processor. It's going to look like this. Basically, when the job is enqueued, we're going to simply post the job to the thread pool.
And when the thread pool pulls it off, we're going to call ActiveJob::Base.execute. That's the important part right there, ActiveJob::Base.execute. The first line, the one with the post, is just getting the thread pool, creating a new one if necessary, then posting that to be run by the thread pool
whenever the thread pool has an available thread. ActiveJob::Base.execute is responsible for actually interrogating the job, looking up our specific class to process that job, creating an instance of it, and calling the perform method
on that instance, passing in the arguments. So early on we saw that in your class you create that perform method and it takes a set of arguments; ActiveJob::Base handles the interrogation of the job, creating an instance, and calling that method. All we need to do is call execute
when our thread pool picks the job up and runs it later on. And that's all it takes. ActiveJob handles, like I said, the internals of that. And that right there is enough for us to actually post asynchronous jobs that perform in an ASAP way in a real environment. Now, for enqueuing a job at a later time,
it's a little more complicated. We have to deal with time. So fortunately, we do have at our disposal a high-level abstraction that handles these kinds of scheduled tasks. And coincidentally, it's called ScheduledTask. So the internals of ScheduledTask are sort of beyond the scope of this, but the idea is a scheduled task will take a number of seconds in the future
that something is supposed to occur, and it will queue it up, and at roughly that time it will then pass it on into a thread pool to make it work. So notice that when we actually use our perform later with that set method, Rails provides a lot of convenience things
for allowing us to specify when the job happens in the future. Rails gives us all those great time helpers that we like, one day from now and one week from now and at certain times and so forth. Active job is responsible for taking all of those convenience things that we use as the Rails developers and converting them into a number of seconds
in the future when this runs. So by the time we get this, we already have the number of seconds in the future. So in our job process, we don't have to worry about all of those wonderful date utilities that Rails has. Rails does that for us. So in this case, it's really convenient for us because scheduled task, not coincidentally, takes just a list,
a number of seconds in the future when the thing should run. And in this case, normally within concurrent Ruby, all of the high level abstractions run on global thread pools so you don't have to worry about managing your thread pools. In fact, most developers should never use a thread pool directly. Most libraries that provide thread pools provide them internally and provide high level abstractions
that use those thread pools. So under normal circumstances, a scheduled task or a future or a promise or any of these things, use the global thread pool. But in this case, we need a specific thread pool because that thread pool represents our queue. So all of the high level abstractions in concurrent Ruby support the tendency injection of a thread pool.
So in this case, this executor option, which is very common, is a way of saying, when you do run this thing, run it on this specific thread pool. So what we're doing here is saying, look, we know how many seconds in the future this thing needs to run. We know which thread pool we want it to run on. Just go and handle that. Scheduled task handles that. And at the time the thing needs to run,
it'll then grab that job and it will run this block and we're gonna do the same thing we did before. It's just called active job base. Dot execute. That execute method doesn't know anything about the asynchronous behavior. It just knows now's the time to execute that. It's the same thing we saw a minute ago. And just in case we somehow get a time value
that's not in the future, we're gonna just check that delay and we're gonna post it directly if somebody's not in the future. And again, that's all it takes in order to post a task later on. Rails handles all the time sensitive stuff. We just need to make sure that we can do it at that time in the future. And believe it or not, that in its entirety is a functional
And believe it or not, that in its entirety is a functional asynchronous job processor. So the next slide has more code on it than we should normally put on a slide, and I'm putting it all on one slide because I want you to see just how simple this can actually be: a real, functioning asynchronous job processor can in fact fit on one slide. And this is basically it. We have a class called InsideJob. We have our QUEUES constant, a thread-safe map where we keep track of all of our thread pools. We have that create-thread-pool method, which just returns the pool we want. Then we have our enqueue behavior, which just throws the job onto the thread pool. And then we've got enqueue_at, which looks at that delay, that timestamp, and gives it to a scheduled task. And that right there is a fully functioning asynchronous job processor that can work with ActiveJob. Like I said, the other part was the queue adapter, and the queue adapter just looks like this: when ActiveJob calls enqueue or enqueue_at, it simply posts the job off to my job processor. So that's it. Believe it or not, that is a fully functional asynchronous job processor that will work with ActiveJob, and it could be used in test or development to get real asynchronous behavior without extra dependencies.
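The queue adapter glue looks roughly like this. This is a sketch only: the class name follows the talk's illustrative InsideJob naming, InsideJob itself is assumed to be defined elsewhere, and the real Rails adapter differs in details.

```ruby
# Sketch of the queue adapter described above. ActiveJob calls these two
# methods on the configured adapter; each one just serializes the job
# and hands it off to our processor.
class InsideJobAdapter
  def enqueue(job)
    InsideJob.enqueue(job.serialize, queue: job.queue_name)
  end

  def enqueue_at(job, timestamp)
    InsideJob.enqueue_at(job.serialize, timestamp, queue: job.queue_name)
  end
end
```

That two-method surface is the entire contract between ActiveJob and a backend, which is why the adapter can be this small.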
So the next question you're probably going to ask is: all right, Jerry, are you going to put this code up online so we can look at it later? And the answer is yes. If you want to see this code, you can find it in a very convenient place, and that's Rails itself. The genesis of this presentation was that last fall I went to the Rails team and said: you know what would be really useful? A simple asynchronous job processor in Rails 5. As you all know, we can specify the inline adapter in our config. The inline adapter runs the job synchronously, so we don't have to deal with those extra dependencies, but the problem is that it's not real asynchronous behavior, and if we're using the inline adapter in test or development, we can sometimes mask problems by not having real asynchronous behavior. I said, why don't we just build a simple one? We'll call it AsyncJob instead of InsideJob, we'll make the symbol just :async, and we'll allow people in test and dev to run these jobs really asynchronously, to potentially find bugs in them. And the Rails team said that's a really good idea; they worked with me, and we got this merged into Rails 5 last fall. So if you use Rails 5 and the async adapter, this is basically what you're going to get.
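Opting in is a one-line config change (shown here for the test environment; development works the same way):

```ruby
# config/environments/test.rb — use the in-process async adapter that
# shipped with Rails 5 instead of the synchronous inline adapter.
Rails.application.configure do
  config.active_job.queue_adapter = :async
end
```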
This code was lifted almost line by line from the original implementation. Since then, the Rails team has done some refactoring, so if you look at the implementation now it will look a little different. Just to give you some context on the differences: they decided to collapse things into one file. When I originally wrote this, I had two files, one for the queue adapter and one for the job processor, to mirror the normal arrangement of the queue adapter and the processor being separate; they collapsed them into one file because both are very short and didn't need to be two. They renamed some things to better follow Rails conventions. They assign that provider job ID; in this case we don't really need it, but having it provides greater consistency with the production processors. And they decided to use one thread pool for everything, at the expense of having multiple queues, because in test all we care about is that these things happen asynchronously; we don't particularly care about configuring the queues for different behaviors. So if you go look at it now, you'll still see the async adapter, and it will do exactly what we showed before,
and it's available right now in Rails 5. So if I've piqued your interest and you want to learn more and see what else you can do with this, the two projects I'd suggest you look at more deeply are Sucker Punch and Sidekiq. Sucker Punch is a threaded, in-memory asynchronous job processor. It does a lot of what this does, but does it way better and more fully. Its creator tells me the main use case is sending emails from a one-click hosting provider like Heroku: you can fire off those emails because there's not a high cost of failure if the thing goes down, since those jobs are just retained in memory, not persisted to a datastore. Sucker Punch does use thread pools just like this does; it maps queue names to thread pools and provides some configuration of them. It also does some really cool things: it decorates every job with a job runner class that tracks the number of successful jobs, tracks the number of failed jobs, handles errors, and things of that nature. So it's a really good example of how you can decorate a job when you push it into a thread pool. Like I said, most of us shouldn't use thread pools directly, because the high-level abstractions in the concurrency libraries provide those capabilities, but this is a really good example of how to do that.
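That decoration idea can be sketched in plain Ruby. The JobRunner name and its counters here are hypothetical illustrations of the pattern, not Sucker Punch's actual classes.

```ruby
# Illustrative job decorator: wrap each job so successes and failures
# are counted and an exception never escapes to kill the worker thread.
class JobRunner
  attr_reader :successes, :failures

  def initialize
    @lock = Mutex.new   # counters are shared across worker threads
    @successes = 0
    @failures = 0
  end

  def run(job)
    job.call
    @lock.synchronize { @successes += 1 }
  rescue StandardError
    # Record (or log) the error instead of letting it bubble up.
    @lock.synchronize { @failures += 1 }
  end
end

runner = JobRunner.new
runner.run(-> { :ok })            # succeeds
runner.run(-> { raise 'boom' })   # fails, but the failure is captured
```

A processor would post `-> { runner.run(job) }` to the pool instead of the bare job, which is all "decoration" means here.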
Sucker Punch also has some really cool shutdown behavior: if the Rails app is shut down for some reason, it will look at the jobs that are still running and try to let them execute completely before shutting down, among other things. So there's some cool stuff in there. And again, Sucker Punch uses a lot of the tools we saw here: concurrent-ruby thread pools and concurrent-ruby's ScheduledTask. Another great one, of course, is Sidekiq. Sidekiq is also a multi-threaded job processor, but unlike our example it persists all your jobs to a datastore, so your job data survives a restart of the application. It does not use thread pools the way we saw here; Sidekiq actually spawns and manages its own threads, but it still deals with all those same internals of ActiveJob. And of course Sidekiq has a whole bunch of additional features. Like I said, Sidekiq doesn't use concurrent-ruby thread pools, but it does use concurrent-ruby
for some of the low-level synchronization, atomicity, and thread-safety work you saw here. So those are two great examples; if you want to look into this further, go read those code bases and see beyond what we've done here. With that, I just want to say again: I work for Test Double, we are hiring, and we are also for hire. We love talking with people about software development and how we can all improve software, so if you'd like to chat with us, by all means reach out; you can find us by email and on social media. Justin and I will be here for the rest of the conference; in fact, Justin will be speaking Thursday afternoon at 3:30, and he's going to be talking about RSpec and Rails 5. I've got stickers up here and stickers in my bag, and I hope to get a chance to talk with you sometime before the conference is over. With that, again, my name is Jerry D'Antonio. Thank you for having me.
So I do have five minutes if anybody has questions. I see everybody's getting ready to run out; that's cool, I'm hungry too.

So the question was about resource contention within the job itself, when you have multiple threads running simultaneously and trying to do things. All of the asynchronous behavior is provided by the job processor itself; all ActiveJob does is provide the compatibility layer. It's important that the job processors themselves handle all the concurrency, any locking or synchronization that's necessary. But generally speaking, if you follow the best practices, a lot of that contention goes away. For example, you're not passing an ActiveRecord object, you're passing an ID, which you can then use to pull up the record later on, and we're serializing the jobs so that we're not storing references. But ultimately it's up to the job processor itself to be thread safe. Yes?

So the question was: can you use multiple job processors simultaneously? The answer is yes, but not through ActiveJob; ActiveJob only allows you to specify one handler. However, as far as I know, all of the job processors can be used outside the context of ActiveJob. So, for example, you might specify Sidekiq as your main job processor, but for certain things you want to use Sucker Punch; you would then just instantiate Sucker Punch directly, and you can do it that way. I can't imagine why Rails would change that, but it's possible.

Yeah, so the question is: could we subclass ActiveJob and have two different runners? Again, it's Ruby, so we could probably do anything we want, but there's that one configuration value within the application config.
I suppose we could create our own configuration values, grab those, and do something of that nature. I'm sure it would be possible, but it's not something that would be directly or easily supported by Rails and ActiveJob itself.

So the question was the difference between multi-threading in general and a thread pool. A thread pool is a managed thing, where the queue and all the threads are managed by the object itself. I could spawn my own threads just by calling Thread.new, but what happens if those crash? What happens if I want new threads? What happens if I have idle threads? How do I enqueue work for them? There's a lot of plumbing involved. We can always spawn multiple threads, but managing them takes a lot of extra work. How do we handle exceptions? If you throw an exception out of a thread, it will crash the thread; how do you handle that? A thread pool takes all of that, puts it in one object with some very well-known, common cross-language algorithms, and manages those things. You create a thread pool and give it a set of configuration parameters: how many threads to run at a minimum, how many at maximum, how many things can be enqueued, what to do if the queue gets full, what to do if the operating system won't give you more threads. It handles all of that for you. All you do is create this one abstraction, the thread pool, and throw stuff at it, and it manages all of the enqueuing and dequeuing; if threads crash, it handles that, and so forth. There's some overhead in the thread pool itself because of all that, but just like anything else, that overhead buys you not having to worry about those things.
So generally speaking, you start with the high-level abstractions that use thread pools, so you don't have to worry about any of that. Then maybe later on you specify your own thread pools and inject them in, so you get better control. And then maybe, if you're this guy over there, you write your own threading yourself. But that's sort of the progression. And that is a fantastic question.
The question is: how does it handle exceptions? What we did here does not handle exceptions well at all. The thread pool will protect itself from any exception raised on a thread, so it won't allow its threads to die because of exceptions, but it doesn't do much more with them. This is one of the reasons to use a high-level abstraction: things like future and promise have consistent, idiomatic ways of handling return values, errors, and so forth. If you look at the job decorator class in Sucker Punch, it actually handles the exceptions, capturing the exception on the thread itself before it bubbles up and then doing things with it. So again, the high-level abstractions all give you better, more consistent and idiomatic error handling, ways of dealing with return values, ways of waiting on things, and so on; that's why you should always start with the high-level abstractions and only drop down to the thread pool when you need to. The thread pool is meant to be the very lowest level; it just provides the engine. The actual job processors handle things like errors: whichever of those job processors you use, they are doing the error handling, because this one is the bare-bones minimum and I'm not handling errors at all. Your job will just die and you'll never know about it. Again, this is meant to be minimal and trivial, but if you use any of the full-blown, production-ready job processors, they will decorate the job and handle errors, each in its own way. That's one thing ActiveJob does not include: a consistent way of handling errors.
So of course you could always put your own error handling in your perform method and handle it there, but ActiveJob doesn't really do that for you.

Oh, I wouldn't write one, in terms of production; the ones that are out there are fantastic, very mature, and heavily used. The reason for this one was to provide a simple, low-cost option in Rails itself, so that for development and testing you can run your jobs asynchronously and get a better understanding of how they're going to work in production. In fact, if you go back and look at that commit and the discussion around the PR, it was actually DHH himself who said: hey, I really like this idea; I generally install Sucker Punch for dev and test, and it would be nice if I could just do that within Rails without the extra dependency. Sucker Punch is great; I've worked with Brandon, its creator, and he's fantastic, but it's an extra dependency for just dev and test. Rails had done a great job with the inline adapter for dev and test, so it makes sense for Rails to provide this simple async one; it minimizes what we actually need and puts it in Rails itself. So now, for dev and test, you can do that and get a better sense of what the real behavior in production might be.

I'm sorry? It's in Rails 5 now, yeah. So all you need to do in Rails 5 is say :async in your config and it's there. Anything else? All right, thank you very much.