Processes and Threads
Formal Metadata
Title: Processes and Threads
Series: RailsConf 2015, Part 34 of 94
License: CC Attribution - ShareAlike 3.0 Unported
Identifier: 10.5446/30690 (DOI)
Transcript: English (auto-generated)
00:12
So, yeah, thanks for taking a look at Resque and Sidekiq with me. I am James Dabbs, you
00:21
can follow me on Twitter, or talk to me afterwards, or any of those sorts of things. Clearly we'll be digging into the topics shortly, but really, if I have one point, if there's one thing you walk away with today, I want it to be this. I think we're all here because we want to improve our craft; we want to write better code. And so I want to start with a quote from one of the great writers in all
00:42
of the South's history, William Faulkner, with a little advice on how to be a better writer. It's reading. Read, read, read. Read everything, trash and classics, good and bad, and see how they do it. So my goal today really is, I want to understand a little bit more about processes and threads, some of the general introduction to those. But I want to do that by reading a couple of what I think are
01:02
the classics, so we'll take a look at some of the source code from Resque and Sidekiq, and see how they do what they do, and why they do it the way they do. So, let's jump right in. How many people in the room have used Resque or Sidekiq for something like this? Most people, okay, good. Of those of you that have, have you actually taken a look at the source code sometime?
01:23
Relatively few? Awesome, okay, good, you're my target audience. Great. So really quickly, for folks that aren't familiar with Resque or Sidekiq, here's the rough idea. They're both background worker systems, and what background worker systems are all about is offloading some work from your Rails app, or what have you, into a background process that can run it separately.
01:45
So, sending an email: your user signs up on your website, and you don't want them to have to wait around for you to actually send a welcome email before they see their welcome page. So send the welcome page back, throw an item of work representing "hey, you need to send an email to this person" onto a queue, and let your workers handle it.
02:02
So roughly the architecture is: you've got a Rails app, or a cron job, or a one-off script, or whatever, that's throwing some work into the queue system somewhere, and some workers will pull from that queue and actually perform it. Resque and Sidekiq are two very popular solutions in this space. The thing they have in common is that they use Redis for their queuing system.
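The architecture just described fits in a few lines of Ruby. The sketch below is a toy, not code from either library: an in-process Thread::Queue stands in for Redis, and WelcomeEmailJob is a made-up job name. The payload shape, a class name plus args serialized as JSON, does mirror how both libraries store jobs.

```ruby
require "json"

# An in-process Thread::Queue stands in for Redis here.
queue = Thread::Queue.new

# The "Rails app" side: enqueue a unit of work describing what to do.
# WelcomeEmailJob is a hypothetical job name for illustration.
queue << JSON.dump("class" => "WelcomeEmailJob", "args" => ["user@example.com"])

# The "worker" side: pop a payload off the queue and perform it.
payload = JSON.parse(queue.pop)
result = "#{payload["class"]} performed with #{payload["args"].inspect}"
puts result
```

The app and the worker never talk to each other directly; the queue is the only shared contract.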
02:22
In fact they use the exact same, well, not exact, but compatible Redis interfaces, so you can queue jobs from Resque and run them with Sidekiq. The thing that really differentiates them is the way they use processes and threads. So that's what we'll be looking into today. That's the overview. Let's start by talking about Resque, since it came first.
02:42
It was developed at GitHub around 2009. Basically the problem they were facing was that they had a ton of background jobs doing any number of things, and were pushing every single solution they tried to its failure points. So it wound up being, well, we're just going to have to roll our own; but what's out there, what can we use? Essentially there are a couple of hard problems here.
03:02
One is handling queuing. Redis is a thing that exists, it is awesome, and it does queuing pretty well. So let's use that to solve all of our hard queuing problems, and let's write some Ruby code to focus on the other things that we care about, which are, and this is a more or less direct quote, reliability and responsiveness especially.
03:23
So that sounds great; those are great goals. The question is, how are we going to do that? And the answer to that question is: using forks. That's the pun on the slide there, y'all. So please laugh at that so I feel better. Thank you. Awesome.
03:40
Okay, forks. Fork is a system call: you've got a currently running process, and it's just a way to split off a new process. It's really easy to understand. Can y'all see that in the back there? Hopefully. I'm seeing some nods.
04:00
Oh yeah, can we turn the lights off or down or something? Is that a thing we can do? You don't know? I'm going to assume no, and apologize. However, all of these slides will be up online later; feel free to take a look at them afterwards. So basically, what's going on here: there's a function to do some logging,
04:20
just to grab what process we're in. Our script starts off, we assign a variable, we assign A to be 1, and then we fork. And fork is sort of interesting in that it creates a second process: fork is called once and yet returns twice. Once in the pre-existing process, called the parent process, and once in the created process, called the child process.
04:42
It returns a little bit differently in each: in the parent process, it returns the ID of the newly created child process, and in the child process, it returns nil. Now that's not the only way to use fork in Ruby, but that's the pattern that Resque uses, so that's the one I'm going to expose you to. So the way to read this then is: in that first block, we fork, we look at the return value from fork, and assign that to pid.
05:05
If we got something back, then we're in the parent, and we have the ID of the child process. So what the parent process is going to do is just print out "I'm waiting for the child", wait for the child, and then, only once the child is done, print out "done with the fork" with the value of A, the variable we assigned before the fork.
05:20
On the other hand, if we're in the other branch of the fork, we're in the child process: we're going to print out that the child is in the fork, sleep for one second, just to wait, increment A, print out, and then exit. So think through that for a second and imagine what that might print out. Turns out when I ran it, it looked something like this. I guess we are not guaranteed the order necessarily there, but the parent process does wait for the child process to finish.
05:47
So it's going to wait for the child process to sleep and increment and print out. And really, I think the key point here is that when the parent process exits, it prints out A as 1. So even though the child process could see the A that we created beforehand, and incremented it, the parent does not see that.
06:02
The child gets its own copy of A. When the process forks off, it has the entire memory state the parent had; however, it maintains its own separate, isolated copy of that memory. So updates in the child are not propagated back to the parent. That's an important distinction. Does that make sense to everybody?
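That example is easy to reconstruct. The code below is my reconstruction of the slide, not the speaker's exact code, but it shows both return values of fork and the memory isolation:

```ruby
# A small helper to show which process is printing.
def log(msg)
  puts "[#{Process.pid}] #{msg}"
end

a = 1

if (pid = fork)
  # Parent branch: fork returned the child's PID.
  log "waiting on child #{pid}"
  Process.wait(pid)
  log "done with the fork, a = #{a}"  # still 1; the child's increment is invisible here
else
  # Child branch: fork returned nil. This process got its own copy of a.
  log "child is in the fork"
  sleep 1
  a += 1
  exit
end
```

Run this on a Unix-like system (fork is unavailable on Windows); the parent's final line reports a = 1 even though the child incremented its own copy.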
06:22
Feel free to ask questions. So yeah, in this model, and this is not the only way to do the forking, but in this model, the child process springs into existence, does its one little thing real quick, and then dies. And again, we'll see how Resque uses it. So let's actually dig in; let's read some code. I have prepared a quick little example to dig in here.
06:43
When I say read, I don't really mean read; I mean explore. We have much richer tools for reading code than literally cracking open a file and reading line by line from the start. So let's jump in and pry around and look at some of the source. What I've got here are a couple of tools that I wrote just to make this a little easier.
07:01
I've got a PryJob, and all it does is binding.pry. If you don't know pry, it's like a debugger sort of thing. So I think the big question here is: we'd like to understand, once we get a job put onto the queue, what happens? How does that get executed? So let's just do it and see what happens. I've got a couple of scripts here.
07:22
There's a queue script where I can just quickly queue jobs into Resque and Sidekiq, and PryJob just does a pry. So I'll queue a PryJob into Resque, and here I will start Resque. I swear I can type even with people watching.
07:44
If you can't even see that, I can't think why I thought this was a good idea. Okay, moving on. Cool, thank you. All right, so here we go. At some point in our Resque process, my job got pulled off the queue, is running, and has hit a pry point. And so we're paused, and we can inspect things. So let's see how we got there.
08:02
I've got pry-stack_explorer, so I can see the backtrace. Looks good. So again, we've got some setup there. Reading from the bottom, there's a bunch of things that involve rake somehow; those are presumably the rake tasks running. The first thing that looks relevant to my eye would be this first mention of Resque here.
08:24
I guess we can start one frame up from that. So let's go to frame six and look around. Here's frame six. Anytime you're in pry, you can type whereami to give yourself some context, whereami -f for the whole file that you're in,
08:42
and whereami -m for the method. This is a rake task, so it's a little weird. You can also use the shorthand @, which is short for whereami, just to explain that. So anyways, whereami? I'm here in this rake file. If I look for a little bit more context, this is the resque:work task: it news up a
09:00
Resque worker, sets some stuff from the environment, and then eventually calls worker.work, passing in an interval. So let's step down into that call. Here's the work function. This is the centerpiece of Resque: if you understand this method, I think you pretty much have the whole thing in your head. Let's see the whole thing.
09:22
Well, not the whole thing, because zoom, but close enough. Okay, so the work function. When a worker starts working with some sort of interval, what does it do? It starts with some setup-y things, which we could go look at, but the main part is one big long loop: check if we should shut down, and break out of the loop if so; presumably proceed if we're not doing shutdown sorts of things.
09:41
If not, we are going to reserve a job off a queue. Reserve will just pop something from a Redis queue, and it gives us a job object. And then it will proceed down into a fork. So, we fork.
10:00
If we got a child PID back, we're in the parent process, and we're going to do a few things like update the procline, the process status line, which we'll see in a second. But the heart of it is the same thing we saw earlier: we're going to wait on the child to exit, and really not do much else; the parent process just sits there. The child is the one that sets up some signal handlers, reconnects to Redis, and then actually performs the work.
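The shape of that loop can be sketched as follows. This is a pared-down imitation of the work method, not Resque's real code: a Thread::Queue stands in for Redis, and the shutdown check is simplified to "stop when the queue is empty".

```ruby
require "json"

queue = Thread::Queue.new
queue << JSON.dump("class" => "PryJob", "args" => [])
shutdown = false

until shutdown
  # Reserve: pop a job off the queue, if there is one.
  job = JSON.parse(queue.pop) unless queue.empty?
  if job
    if (pid = fork)
      Process.wait(pid)  # parent: just sit there until the child is done
    else
      puts "performing #{job["class"]}"  # child: do the actual work...
      exit                               # ...then die, taking any leaked memory with it
    end
  end
  shutdown = queue.empty?  # stand-in for Resque's real shutdown check
end
```

One iteration per job, one fork per iteration: that is the whole engine.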
10:23
So we step down, which should put us right there in the perform call. Down again; I should see job here. Job is a Resque job, which represents the thing we got off the Redis queue.
10:40
We step down a little bit farther, and we'll see where we are. This is the perform method on the job class. So again, the job is just a wrapper around the thing that was in Redis. It unpacks it: it grabs the payload class, grabs the payload args; in this case I didn't have any args, so it's just nothing.
11:02
And then job here is the literal job class that I defined. It calls perform on the job class, which is the way Resque asks you to implement jobs, passing in the args, which here are empty. And then stepping down into a little hook that I wrote so I wouldn't have
11:22
to do separate things for Resque and Sidekiq, which was probably a really terrible idea. And then into the perform call. So that's the whole thing. Any questions about that in general? So one of the things I want to point out here is that we fork one child process per job.
11:44
There's one big long loop that goes: grab a job, fork to perform the job, and essentially that's it. So while this job is in motion, the parent is essentially hung; we're waiting on the child process to exit. So if I try to queue other jobs... I have a tick job that just counts down.
12:03
If I try to queue that, nothing's going to happen, because that one Resque worker is busy. I have no concurrency at all right now. But one of the nice things is that the parent process can still take signals. If I use pstree and grep for things containing resque...
12:21
pstree shows the process tree, so I can see my full process listing. Here are the processes I have running that involve resque in the title somehow. If you're not familiar with this stuff, I've learned far too much about it. The first one is just the grep command itself, but these last two are the interesting ones. This is the parent process; this is the Resque worker that I booted up to begin with.
12:41
It's set its procline to indicate what's happening. The parent process is saying, hey, I forked off a child with this PID at this timestamp. And the child process is telling me, all right, well, I'm here processing, specifically a PryJob, since this timestamp. And one of the really nice things is that Resque supports signaling.
13:03
So there are several signals that it'll respond to, and I can use kill. Kill is really for sending any sort of signal. I can send a USR1 signal to this parent process, and Resque will receive that signal and interpret USR1 to mean: all right, worker process,
13:22
your child is off doing something, or is hung; kill that process and then go back to the queue and start consuming again. You can also send signals for graceful shutdown, hard shutdown, all sorts of things. But if I send it a USR1 signal using kill, it should process it.
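That kill-the-child-and-carry-on pattern can be sketched with Ruby's Signal.trap. This is a toy, not Resque's actual signal handler; to keep it self-contained, the parent sends USR1 to itself rather than receiving it from a shell.

```ruby
child_killed = false

if (pid = fork)
  # Parent: stay responsive to signals while the child works (or hangs).
  Signal.trap("USR1") do
    Process.kill("KILL", pid)  # kill the possibly-hung child; a real worker
                               # would then go back to the queue and consume again
  end
  Process.kill("USR1", Process.pid)  # simulate `kill -USR1 <parent pid>` from a shell
  Process.wait(pid)
  child_killed = !$?.success?  # the child was terminated, not exited cleanly
  puts "child killed: #{child_killed}"
else
  sleep 60  # a "hung" child job
end
```

Because the parent never does real work, it is essentially always free to receive and act on signals like this.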
13:48
It kills the PryJob and then picks up the next job. So we get two nice things there: nice signal handling to coordinate the workers, and also the realization that Resque does not give us any concurrency.
14:02
We can only process one job at a time. If you do want to process more than one job at a time, which you often do, you'll need something else. There are lots of options: you can use something like monit or god to spin up several workers, and there's also resque-pool, which I have here.
14:23
I've got resque-pool configured to start up three workers. Sorry, that's a little hard to see; I keep switching screens, apologies. But again, you only get one active worker per process that you've spun up. So if I now enqueue a bunch of tick jobs into Resque...
14:47
I have three workers running, and those three will grab work and start performing it. The queue's a little backed up, so there's one job remaining; those three should finish up shortly, and then one of them will grab more work. The process listing reflects what's happening there.
15:06
resque-pool itself has forked off three Resque workers. Each of those Resque workers is the thing that we saw when we ran rake resque:work. And each of those, every time it consumes a job, will fork off a child to perform that work.
15:21
Does that all make sense to people? The big question then, if that's how Resque is working... oh, sorry, I'll recap it all right here. So: one big long loop. It starts up, initializes, registers signal handlers, grabs units of work, forks off a child process. The child process is the thing that always does the work; the parent process is always there, just waiting.
15:43
So, the question then is: why? And this really is an honest question. Does anyone have an idea of why we would want that sort of architecture? Yeah, we saw a little bit of a sketch of that.
16:01
We had that A equals 1 variable, and then we forked off and the child read it. So, right: when you fork a process, it gets a copy of all of the existing memory in the parent process. And you might be a little concerned that if that's a lot of memory, you're making a big copy. But most modern systems implement copy-on-write, where you don't actually have to copy the memory unless you are changing something.
16:21
So usually it does the right thing; but if you are changing things, you will have multiple isolated copies of the memory. So, another question was: how do you queue all this stuff? Oh, that's the nice thing: you're decoupling doing the work from queueing the work. As long as you can contact Redis, you don't have to have any workers running whatsoever.
16:42
As long as you can throw a JSON string into Redis, you're queueing up work. So it can be consumed at any point in time; as long as Redis is up, you can do it.
17:06
Yeah, very much so; that's definitely one of the big ones. Really, I think Resque says it well. Resque, again, was born out of chaos, that real workload at GitHub, and they really are assuming the worst, because they had seen a lot of things failing. So, really, forking gets you three big things here.
17:22
It mitigates the risk of leaking memory: if your jobs do something bad, that's okay, because they're going to run in a child process that, when it's done, exits and cleans up after itself regardless. You avoid the cost of booting Rails for your job, right? Because the forked process gets a copy of the parent's memory. And because that parent process isn't doing much other than sitting there, it should stay responsive; it shouldn't hang.
17:45
And you should always be able to reliably send signals to the parent process to do things like shut down the child if the child isn't behaving. So, that's nice. That's all well and good. Resque's great. My talk's over. We can all have fun. But wait!
18:01
Sidekiq's a thing, right? Why is Sidekiq a thing? I'll use direct quotes. And in fact, I should tip my hand: the guy who wrote Sidekiq is here. So if anyone wants to ask hard Sidekiq questions, direct them his way.
18:25
Reading the Sidekiq source, it sounded to me as though it was explicitly a reaction to Resque. With Resque, you have some safety because you are sandboxing all the processes and not sharing memory. But that means if you're running 20 workers, you might need your Rails app size times 20 amount of memory.
18:46
That can get pretty expensive pretty quick. So, among Sidekiq's goals: you have a Redis-based system that's high-performing, well-supported, and includes some of the patterns that exist around Resque in the ecosystem, in terms of gems, but aren't quite baked in.
19:01
Like automatic failure retry, scheduling jobs, things like that. So, again, noble goals. How are we going to do it? And the answer to that one is: threads. Here's my other bad joke, in picture form. It's a little better. Thank you, appreciate that. Yeah, so there will be an underlying layer of threads.
19:20
But we probably won't be thinking about them too much. Here's the thing about threads: they share memory. Which is cool, until it isn't. That's the idea, right? You want 20 workers performing work. If you're on Resque, that means 20 processes and 20x the memory. It would be great if you could have 20 threads doing the work instead.
19:41
A thread is very similar to a process, but threads don't have their own isolated memory space, which can lead to some really subtle, hard to reproduce, hard to track down bugs. Here's a rough sketch of one of those, for anyone who hasn't hit this particular bug before: a race condition sort of thing. This is a pared-down example.
20:02
Pretend you're a bank, and your clients have wallets, and you're tracking money for them. And you decide all of your banking software is too slow, so you're going to run everything multithreaded. And it'll be fine, right? Because those are nice atomic operations and nothing's going to go wrong. And it'll be fine. So, the thing about threads is that the OS scheduler is required to make sure that all of the threads running on the machine,
20:24
the ones you write and all of the system threads, all have their fair share of access to processor time. So it is free to pause and resume threads more or less as it sees fit. Which, you know, is fine for processes, because they have their own memory,
20:40
but with threads you can really get bitten by this sort of thing. Really, because of the scheduler, any time you see multi-threaded code, you should just imagine randomly inserting sleeps of random lengths everywhere in your code, and ask yourself if there's any possible way that it could go wrong. And the hint is: probably there is. So imagine something like this.
21:01
Depending on how long things get paused for, and what runs in what order, and who gets to the finish line first (hence the name, race condition), you might end up with one thread reading that there's $100 in the wallet and going to sleep, the other thread waking up and decrementing it to $90, and then the first thread waking back up and going: okay, $100 plus $10 is $110; good, you have $110. Now you're a bank that's minting money,
21:23
and that's not a good thing; you ain't in that business, bank. So that's a very pared-down version of the whole example. And the worst thing is, it's totally nondeterministic, right? It's entirely at the whims of the OS scheduler, so it might happen very infrequently, and it's just a nightmare to debug.
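Taking that advice literally, here is the wallet race with sleeps inserted deliberately so the bad interleaving happens every time. The dollar amounts are the ones from the example; the thread bodies are mine.

```ruby
balance = 100  # shared between threads; no locking anywhere

depositor = Thread.new do
  read = balance       # read $100...
  sleep 0.2            # ...get paused, as the scheduler is free to do...
  balance = read + 10  # ...and write back $110, clobbering the withdrawal
end

withdrawer = Thread.new do
  sleep 0.05
  balance -= 10        # runs while the depositor is paused: $100 -> $90
end

[depositor, withdrawer].each(&:join)
puts balance  # 110, not the correct 100; the bank just minted $10
```

Wrapping both read-modify-write sections in a shared Mutex#synchronize block makes the result come out to the correct $100.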
21:41
So we'd like to do something better. And one answer to that is something called the actor pattern, which is available to us in Ruby using a library called Celluloid, very much influenced by Erlang and some of its ideas. I think the key idea here is: what makes a lot of this stuff really hard
22:02
is that our primitives for talking about objects and our primitives for talking about concurrency are unrelated, and so we have to manage both, and we always mess up doing that. The actor pattern is a way to bring those together and make that a little bit easier. So here's what this might look like with something like Celluloid.
22:22
If you notice, it looks alarmingly like just a plain old Ruby object. There's a wallet class, it's got an amount that it keeps up with, some functions for adjusting that, an attr_reader to read the amount, and it just includes Celluloid. And that gets us remarkably close to everything just working. So we just new up a wallet, and we can asynchronously adjust it.
22:42
This is how Celluloid handles asynchronous method calls. And we still get synchronous method calls, like calling wallet.amount. Under the covers, what that's doing is actually kind of a lot. When we new up a new wallet, we in fact do not get back a wallet.
23:02
We get back a Celluloid-wrapped wallet object, and really it turns this wallet into an actor. What that means is it's going to be an object that's running in its own thread, and where all of the method calls, all of the messages sent to that object, are mediated through a Celluloid mailbox.
23:22
So when we make a call, what that's actually going to do is throw that message onto a mailbox that Celluloid will handle the synchronization of. So when we asynchronously say to adjust by some amount, that's just sending a piece of mail that the wallet will eventually get and process.
23:43
So I think that's really lovely. There's a bit of a question then: okay, you've got this wallet running in a separate thread; how do you then do a synchronous call? How do you get data back out of it? These async calls are fire-and-forget: they will immediately return nil, and often you lose insight into whether an error was raised somewhere in there.
24:02
So for Celluloid to make a synchronous call, it sets up kind of a return-to-sender mailbox. You make the call, it goes to the wallet's mailbox with a reply-to address, and the wallet, when it's done, gets the amount and sends that back to the async proxy.
24:20
This is, again, just an object that Celluloid will take care of for us. And the caller will sleep, block, and wait for a response to come back in. So, that's kind of the underlying machinery. I think, hopefully, that makes it comparatively nice to write multithreaded code, because it really is very much just
24:43
sort of like writing plain old Ruby objects over and over again. So let's dig into Sidekiq a little bit and see exactly how it does it. So, same strategy here. I'll start up a Sidekiq instance. And I added a little bit of a tweak for consistency.
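The mailbox machinery described above can be sketched with nothing but stdlib threads and queues. Celluloid does far more (proxies, supervision, error handling), and ActorWallet with its message format is my invention, not Celluloid's API; it just shows the fire-and-forget mailbox and the return-to-sender reply queue.

```ruby
class ActorWallet
  def initialize(amount)
    @amount = amount
    @mailbox = Thread::Queue.new
    # The actor: a dedicated thread that is the only thing touching @amount.
    @thread = Thread.new do
      loop do
        msg = @mailbox.pop
        case msg[:method]
        when :adjust then @amount += msg[:by]        # fire-and-forget
        when :amount then msg[:reply_to] << @amount  # return to sender
        when :stop   then break
        end
      end
    end
  end

  # Asynchronous call: drop a message in the mailbox and return immediately.
  def async_adjust(by)
    @mailbox << { method: :adjust, by: by }
  end

  # Synchronous call: include a private reply queue and block until it answers.
  def amount
    reply_to = Thread::Queue.new
    @mailbox << { method: :amount, reply_to: reply_to }
    reply_to.pop
  end

  def stop
    @mailbox << { method: :stop }
    @thread.join
  end
end

wallet = ActorWallet.new(100)
10.times { wallet.async_adjust(10) }
final = wallet.amount
puts final  # 200; no race, because only the actor thread ever touches @amount
wallet.stop
```

Because the mailbox is processed in order by a single thread, the earlier bank race simply cannot happen here, which is the whole point of the pattern.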
25:02
So, same idea; queue that up. So here we go. At some point, that job got pushed onto a Redis queue, Sidekiq pulled it off, and we are now in that perform call somehow. So, let's see how we got there.
25:27
Okay. Again, there's some Celluloid stuff in the backtrace here that I probably don't care about immediately, but we'll want to get back to. So, I might start from the first mention of Sidekiq, or maybe one frame up from that again. So, let's check out frame 14.
25:42
This would be Celluloid dispatching that method. So, the variables I have here: object is a processor. Processor is a Sidekiq processor object, and I'm sending the method process to it asynchronously. And here's what that does.
26:01
If I step down, that ends up here. Let's see, whereami? And look at that method. So, here's the processor's process method. It takes some work. Work here is a Sidekiq BasicFetch unit of work. I think it pretty clearly represents, you know,
26:21
it's an object wrapping the JSON data we got off of Redis. So, I'll take that work, do some stuff, and end up here calling stats. Step into that; that's just updating some statistics.
26:42
So, continue down. Here, I'm now at the middleware chain, and it's pretty cool to see how this is implemented. Yeah, we'll just step through. So, if you're familiar with rack middleware, it's essentially the same sort of idea, where I can implement lightweight wrappers and compose them together. So, if I step down through this chain,
27:01
I end up at one piece of middleware, the Sidekiq logging middleware. And here's the entire file. It's one call method, and it's responsible for logging around jobs. It just, you know, gives the logger some context, sets a start time, logs the start, and then yields to the inner middleware.
27:21
And then it prints out when you're done, or if we failed. So, continue down, continue down. Next piece of middleware: this is the retry jobs middleware.
27:40
Taking a look at the call function here, it's going to yield again to the next layer down, but rescue from exceptions and attempt to retry the job; it's something that will retry the job after x seconds, and then 2x seconds, and then 4x seconds, and so on, up to some configurable limit. So, on down, on down.
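That chain of yields is easy to model. The sketch below imitates the structure, not Sidekiq's actual classes: each middleware's call takes the job and yields to the next layer, so logging wraps retry, which wraps the job itself. The retry here re-runs inline; the real middleware re-enqueues with exponential backoff as described.

```ruby
class Logging
  def call(job)
    puts "start #{job[:class]}"
    yield
    puts "done #{job[:class]}"
  end
end

class RetryJobs
  def call(job)
    attempts = 0
    begin
      attempts += 1
      yield
    rescue
      # A real implementation would re-enqueue with exponential backoff
      # (x seconds, then 2x, then 4x...) up to a configurable limit.
      retry if attempts < 3
      raise
    end
  end
end

# Invoke the chain: each layer yields to the next, the job block runs innermost.
def invoke(chain, job, &block)
  if chain.empty?
    block.call
  else
    chain.first.call(job) { invoke(chain[1..], job, &block) }
  end
end

runs = 0
invoke([Logging.new, RetryJobs.new], { class: "TickJob" }) do
  runs += 1
  raise "flaky" if runs <= 2  # fail the first two attempts
end
puts runs  # 3: failed twice, retried, then succeeded
```

Note that because logging sits outside retry, the two failed attempts never log "done"; ordering in the chain matters.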
28:01
And finally, we get to execute_job. We're passing the worker, which is the literal class that we defined, and it calls perform on the worker. So, that's interesting. You know, it's good to see the architecture.
28:20
I don't think we saw much of the multi-threaded stuff, though, because if we look at the stack, we started inside the actor for the processor. Presumably, this work got to the processor from something else. It's not quite as introspectable from the stack trace.
28:41
I'm just curious. Any thoughts about, if you wanted to trace down, how did that work get to the processor? Where would you all look next? Here's what I would do.
29:03
Yeah, so we saw that whole thing started off with an async call to a processor object, sending in the message process. So, given what we know about the Celluloid API, presumably somebody called .async.process.
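What `.async.process` means mechanically can be sketched without Celluloid, using a plain Thread and Queue (everything here is an illustrative stand-in, not Celluloid's implementation): `async` returns a proxy that converts method calls into mailbox messages, and the actor's own thread pops and dispatches them.

```ruby
# Async-proxy sketch: a toy actor with a mailbox thread.
# Names are illustrative; Celluloid's real internals differ.

class AsyncProxy
  def initialize(mailbox)
    @mailbox = mailbox
  end

  # Turn any method call into a message in the actor's mailbox
  def method_missing(name, *args)
    @mailbox << [name, args]
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

class Actor
  def initialize
    @mailbox = Queue.new
    @thread = Thread.new do
      # Dispatch messages one at a time on the actor's own thread
      while (msg = @mailbox.pop)
        send(msg[0], *msg[1])
      end
    end
  end

  def async
    AsyncProxy.new(@mailbox)
  end

  def shutdown
    @mailbox << false # falsy message ends the dispatch loop
    @thread.join
  end
end

class Processor < Actor
  attr_reader :results

  def initialize
    super
    @results = Queue.new
  end

  def process(work)
    @results << "processed #{work}"
  end
end

processor = Processor.new
processor.async.process("job-1") # returns immediately; runs on the actor thread
```

The caller never blocks, which is why the stack trace inside the processor gives no hint of who sent the message: the sender only ever touched the mailbox.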
29:20
So, I'm going to open up the gem, bundle open, and search for that string. Two places match, and one's processor_done, and that's probably not it. So, I think we get pointed to this file.
29:43
This is the Sidekiq manager. Might be interesting to see how that boots up. When we initialize a manager, it's going to figure out how concurrent it should be. If concurrency is 25, it will boot up a new processor 25 times; new_link starts a new actor,
30:00
and then monitors that actor to see if it crashes. If it crashes, it can trap the exit of the linked actor and call processor_died, which will restart the processor. So, we spin up those processors, and then the actual async process call comes in assign,
30:25
which I think you can trace back through here. Oh, sorry, let's do the trap first. So, when a processor exits, we'll call processor_died.
30:41
When that happens, we'll remove it from our internal list of available processors and, unless the manager itself has stopped, create a new one, add it to the list of ready processors, and then dispatch. If you look at dispatch, it will immediately make an asynchronous call to tell the fetcher: hey, go fetch.
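The supervision idea here (notice a crashed worker, replace it, keep going) can be sketched in plain Ruby without Celluloid's link/trap_exit machinery. All names below are mine, and this collapses the monitor-and-restart dance into a simple retry loop:

```ruby
# Supervision sketch: run a unit of work, and if it dies with an
# exception, "restart" it, up to a configurable limit.
# Illustrative stand-in for Celluloid's linked-actor restarts.
class Supervisor
  attr_reader :restarts

  def initialize(max_restarts: 3, &work)
    @work = work
    @max_restarts = max_restarts
    @restarts = 0
  end

  def run
    @work.call
  rescue StandardError
    @restarts += 1
    retry if @restarts <= @max_restarts
    raise
  end
end

attempts = 0
sup = Supervisor.new(max_restarts: 5) do
  attempts += 1
  raise "crash" if attempts < 3 # fail twice, then succeed

  "ok"
end
sup.run
```

The real manager does the same thing one level up: the dead processor is discarded, a fresh one is created and linked, and dispatch resumes, so one bad job can't permanently shrink the pool.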
31:00
The fetcher's over in fetch. I know this is kind of quick. So, the fetcher's going to fetch, which is a nice strategy pattern here. When we call fetcher.fetch, it's going to look at the strategy and retrieve some work from there. That's the thing that got us our BasicFetch unit of work from Redis, and then it asynchronously calls back to the manager to assign that work,
31:21
and then that closes the loop, right? The assign function was the thing that called processor.process and got us to where we were. So, that's how jobs get run in Sidekiq. So, that's the how. Sorry, yeah.
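The strategy-pattern shape of the fetcher can be sketched like this. `ListFetch` and the returned strings are illustrative stand-ins; the real BasicFetch strategy pops work off a Redis list instead:

```ruby
# Strategy-pattern sketch: the fetcher delegates queue retrieval to a
# pluggable strategy object, so the fetching policy can be swapped out.

class ListFetch
  def initialize(jobs)
    @jobs = jobs
  end

  # Stand-in for a retrieve_work call; BasicFetch would block-pop
  # (BRPOP) from a Redis list here.
  def retrieve_work
    @jobs.shift
  end
end

class Fetcher
  def initialize(strategy)
    @strategy = strategy
  end

  def fetch
    work = @strategy.retrieve_work
    # The real fetcher would async-call back to the manager with the work;
    # here we just return a string describing what would happen.
    work && "assigning #{work}"
  end
end

fetcher = Fetcher.new(ListFetch.new(["job-1", "job-2"]))
fetcher.fetch
```

Separating "how to get work" from "what to do with it" is what lets Sidekiq support alternative fetch strategies (e.g. reliable fetch) without touching the manager or processors.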
31:46
Yeah, so, MRI has a global interpreter lock. So, depending on exactly what you're doing, you may not see much of a speedup from threads there. If you are processor bound,
32:00
then MRI threads are not going to be able to utilize extra processors, so you won't get much speedup. But I think a lot of jobs are normally IO-bound, so you can see some extra throughput even with the GIL. If you want truly parallel threads, you should be on JRuby or Rubinius or something, but it totally works with MRI.
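The IO-bound point is easy to demonstrate: MRI releases the GIL while a thread is blocked (here simulated with `sleep`, standing in for waiting on a network call), so ten concurrent "jobs" that each wait 50ms finish in roughly 50ms instead of 500ms:

```ruby
require "benchmark"

# Run `count` fake IO-bound jobs, either serially or on threads.
# sleep stands in for waiting on Redis, HTTP, a database, etc.
def run_jobs(count, threads:)
  if threads
    count.times.map { Thread.new { sleep 0.05 } }.each(&:join)
  else
    count.times { sleep 0.05 }
  end
end

serial   = Benchmark.realtime { run_jobs(10, threads: false) }
threaded = Benchmark.realtime { run_jobs(10, threads: true) }
puts format("serial: %.2fs, threaded: %.2fs", serial, threaded)
```

Swap the `sleep` for a CPU-bound loop and the threaded version stops winning on MRI, which is exactly the processor-bound caveat above.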
32:23
Correct, yes. Sorry, yes. Yeah, so, some takeaways. If you are memory constrained and don't want to use a lot of memory, you should use Sidekiq. If you're not thread safe, you should use Resque. If you're both, sorry; I don't mean to be glib. That just means
32:41
you're gonna have to spend some time and/or money somewhere, and it's up to you whether that's scaling out extra boxes for Resque or spending some of your time making things thread safe. If you're worried about thread safety, though, there are good places to look. Other considerations, if you're weighing Sidekiq versus Resque: how important is job isolation? Do you need to be able to signal jobs individually?
33:03
Well, what really is your bottleneck? Especially if you're on MRI and processor bound, you won't get much from threads. Also, how important is support? Mike's pretty awesome in that regard for Sidekiq. Resque, I'm not so sure.
33:21
The current code is still from the 1.3 branch. There's been a 2.0 branch around forever, and I don't know if it's actually going anywhere; I haven't seen a lot of activity on it lately. Someone correct me if I'm wrong. Other things, and again, maybe more important than the specifics about processes or threads:
33:40
don't be afraid to jump into some code. We have lots of great tools for doing things that are better than just reading, like pry. Does everybody know about wtf? in pry? It prints the backtrace of the last error, and the more punctuation you add, the more of the backtrace you get.
34:02
But yeah, we have lots of great tools, so lean on them. And also, crack open some of these gems sometime. If you're interested in these sorts of things, Unicorn does some great Unix-y magic, stuff with forks and self-pipes and whatnot. resque-pool does pretty much the same thing as Unicorn, if you look at the code. Also, Adhearsion is a big telephony thing built on top of Celluloid.
34:24
I gather Ben Langfeld is speaking tomorrow morning, so you might ask him about it, because he works on that. Also, if you want some further reading, these slides will be up on GitHub, along with some links to blog posts. There's one diving into Unicorn's structure that I think is quite interesting, and then a few more on specifics about, like,
34:40
thread safety in Rails, and a bunch of other things along those lines. That's all I got.