Distributed locks with Python and Redis
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Part Number | 103 | |
Number of Parts | 173 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/20200 (DOI) | |
Publisher | ||
Release Date | ||
Language | ||
Production Place | Bilbao, Euskadi, Spain |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
| |
Keywords |
EuroPython 2015103 / 173
5
6
7
9
21
27
30
32
36
37
41
43
44
45
47
51
52
54
55
58
63
66
67
68
69
72
74
75
77
79
82
89
92
93
96
97
98
99
101
104
108
111
112
119
121
122
123
131
134
137
138
139
150
160
165
167
173
00:00
Red HatGoogolInformation systemsTabu searchBoom (sailing)Water vaporMultiplication signRoboticsProcess (computing)WordJSONLecture/Conference
00:17
Context awarenessMessage passingSoftware maintenanceProduct (business)Parameter (computer programming)Token ringSoftware developerTelecommunicationFocus (optics)Basis <Mathematik>QuicksortHypermediaSystem callJSONXML
01:14
GoogolBoom (sailing)Message passingData structureDatabaseLoginString (computer science)Cartesian coordinate systemCASE <Informatik>Key (cryptography)Video gameObservational studyData storage deviceImplementationAreaDuality (mathematics)Group actionLecture/Conference
02:17
Metropolitan area networkData storage deviceMessage passingElectronic mailing listMultiplication signBinary codeSemaphore lineMultilateration
02:34
GoogolBoom (sailing)CASE <Informatik>Thread (computing)Connected spaceDatabaseObservational studyDatabase transactionRight angleContext awarenessLecture/Conference
03:20
Task (computing)Context awarenessEmailTask (computing)1 (number)System callData conversionDependent and independent variablesMessage passingFamilyTap (transformer)Carry (arithmetic)Data miningComputer animationProgram flowchart
04:47
GoogolBoom (sailing)Task (computing)Port scannerMusical ensembleSimilarity (geometry)Water vaporTask (computing)Physical systemData structureForcing (mathematics)Client (computing)Lecture/ConferenceJSON
05:30
GoogolBoom (sailing)Client (computing)Queue (abstract data type)Task (computing)Different (Kate Ryan album)Lecture/Conference
05:52
Port scannerTask (computing)Musical ensembleData managementView (database)Point (geometry)Task (computing)Parallel port
06:13
GoogolBoom (sailing)Server (computing)Event horizonTask (computing)String (computer science)System callBinary fileSemaphore lineData conversionMultiplication signEmailThread (computing)Centralizer and normalizerMehrprozessorsystemStandard deviationTask (computing)Library (computing)Semaphore lineLoginRadiusState of matterDifferent (Kate Ryan album)Data managementServer (computing)Lecture/Conference
07:16
GoogolBoom (sailing)Maxima and minimaState of matterTask (computing)Set (mathematics)ImplementationString (computer science)Total S.A.LogarithmMereologyNewton's law of universal gravitationSystem callCASE <Informatik>State of matterDatabase transactionThread (computing)Group actionTask (computing)MereologyStaff (military)Phase transitionRight angleStreaming mediaInternational Date LineConnected spaceDatabaseSound effectData managementMultiplicationString (computer science)Control flowMultiplication signWeightCondition numberNoise (electronics)Message passingKey (cryptography)Event-driven programmingSemaphore lineGraphical user interfaceSingle-precision floating-point formatArithmetic meanLecture/ConferenceJSON
11:46
GoogolBoom (sailing)Block (periodic table)Curve fittingElectronic mailing listSemaphore lineSystem callTask (computing)Information systemsImplementationMereologyMathematicsMultiplicationScripting languageMultitier architectureComputer iconEuler anglesState of matterPort scannerSemaphore lineSoftwareOffice suiteElement (mathematics)Scripting languageInstance (computer science)CASE <Informatik>WeightControl flowState of matterMultiplication signMereologyVariable (mathematics)Right angleBlock (periodic table)BlogProcedural programmingElectronic mailing listNumberCartesian coordinate systemTask (computing)System callOperator (mathematics)Maxima and minimaExterior algebraBit rateBinary codeClient (computing)LengthWordLoginAtomic numberCondition numberObservational studyPointer (computer programming)Lecture/Conference
15:51
GoogolBoom (sailing)State of matterDatabaseMultiplication signWater vaporLecture/Conference
16:20
Set (mathematics)Right angleArithmetic meanJSONComputer animation
16:43
GoogolBoom (sailing)Condition numberLecture/Conference
17:05
GoogolBoom (sailing)ImplementationJSONComputer animationLecture/Conference
17:26
GoogolBoom (sailing)ImplementationReduction of orderClient (computing)LoginCodeCASE <Informatik>Social classServer (computing)Right angleMereologyJSONComputer animationLecture/Conference
17:58
Semaphore lineForcing (mathematics)Endliche ModelltheorieInstance (computer science)Uniform resource locatorJSONComputer animation
18:24
GoogolBoom (sailing)Limit (category theory)Windows RegistryReplication (computing)Instance (computer science)Different (Kate Ryan album)Point (geometry)Distribution (mathematics)Process (computing)Single-precision floating-point formatGene clusterLecture/Conference
19:14
Instance (computer science)JSONComputer animation
20:00
GoogolBoom (sailing)CASE <Informatik>Goodness of fitPhysical systemLecture/Conference
20:57
EmailMultiplication signRight angleJSONComputer animation
21:14
GoogolBoom (sailing)Observational studyMultiplication signTask (computing)Pairwise comparisonEndliche ModelltheorieArithmetic meanVideo gameLecture/Conference
22:23
Red HatRobotComputer animation
Transcript: English(auto-generated)
00:05
So, well, thank you for coming, hola, everyone. Did you notice that there are three talks at the same time with word distributed in name? The EuroPython folks have some sense of humor, for sure. Well, first of all, I would like to introduce myself,
00:23
as it was already did, so I'm a software developer and a maintenance guy at Focused Telecom Poland. My main toys are Geovent, Twisted, and Celery. What we are doing at the company is to providing a business-to-business
00:40
telecommunication solutions, right? We host teleconferences, we sell web access, but our main product is a contact center. So, nowadays you have multiple ways to communicate with each other. There are social media, there are SMS text messages, well, there are emails, and there are phone calls.
01:04
So, a contact center is a way to unify all this, to gain a much better contact with your customer. So, throughout this talk, I will show you the principles of implementing distributed logs
01:21
in Redis using Python, and show the usage in a real-life application. So, does anyone know what Redis is? Okay, there are some hands, so I will
01:43
quickly refer to that. The Redis is an OSQL database. It's basic key-value storage, but the key can be only a string, but the value can be a more complicated data structure. For the sake of this talk, most important are string and list.
02:01
They are pretty much like Python counterparts. And also, Redis has a simple support for a message passing. So, first of all, I will introduce some case study. Then I will show you how to implement correctly
02:20
distributed binary semaphore, so-called mutex, and later on, semaphore. And finally, I will give you some tips, and there will be some time for questions. So, first of all, this is a loose definition that I came up with a few days ago. The gido on his talk yesterday
02:41
has gave a great example of locking and keeping safe the reference counters. The other case would be that we have a shared database connection between threads in Python, and we would not like to other thread
03:02
sending some SQL comments while the other thread was trying to carry out a transaction. So this is basically what lock is. It's to protect a shared resource. So, the case study will concern a contact center.
03:21
Hope nobody expected that. So, in the contact center, work people, of course, and we refer to them as agents. So, an agent can answer on originate phone calls,
03:42
email messages, text messages, or carry out chat conversations, and this kind of stuff. Well, and my advantage of contact center is that it automates all of this kind of work. So, it keeps the agents busy. So, they are not looking at the floor or sailing.
04:06
Okay, but to be this concept more familiar, I have replaced agent with a giant W that stands for a worker, and the E are tasks.
04:22
So, you could say that this looks like a task QE. Who doesn't know any task QE solutions, like Celery, RQ, Germans? Are there anyone? Hasn't heard any one of them? No, there are, it's good.
04:41
So, basically, you know the concept, and this is the problem we are trying to solve. Yeah, it looks similar, you know the stuff. You can apply a working solution, as it will be great, right? Wrong. There are a couple of things that you have to worry about. First of all, a worker is an autonomous creature.
05:04
And it can choose whether to work or not. He can simply log out of the system, and he will not be available anymore. And this is a normal situation. This is not a failure of system or some kind. You cannot force somebody to stay logged in and work,
05:21
at least not in Poland. Well, the tasks must be prioritized, and this is a heavily desired feature from our clients. So, yeah, I don't know if any other distributed task queues like Celery support this.
05:42
As far as I know, in Celery, you can partially achieve this by routing to different queues, and that's all. So it's also not sufficient. And the last, mostly mind-blowing, is that one worker can handle multiple tasks simultaneously.
06:00
And from, I don't mean concurrently, but simultaneously, so in parallel, at least from a point of view of task manager. So this will be viewed that way. Of course, Q1 cannot carry out two phone conversations at the same time. But why not write an email and speak to somebody?
06:22
That's fine. So our solution depicted as a whole looks for some centralized task manager, and it's written using Twisted. And the Redis will serve us a logs server, and we'll hold them all.
06:41
So the first thing I would like to say is about binary semaphore. It has only two states. It can be locked or unlocked. To lock it, I will refer to this as acquiring, and to unlocking as releasing. This is the same as the, in standard Python library.
07:03
Who has background with multi-processing or multi-threading library? Yeah, that's quite a lot. And who was on the Monday talk on this? Yeah, that's quite a few things. So, and it will be stored as a Redis string,
07:23
and all calls to acquire it or release will be non-blocking from Python. So the first use case is about keeping state of a worker, whether he's busy or idle. So, meaning that the lock is released,
07:45
it's the same as idle, and so on. And the lock will be acquired when a task manager will sign a task, and it will be released afterwards, so it's over. And the second case is about preventing race conditions.
08:04
The one I mentioned that a worker would like to log out sometimes, and I don't know, go for a break or something, that he can do it using this mechanism, because it can acquire a lock before a task manager do. Or it will simply do not release it
08:22
after finishing its previous task. Okay, so, since this is storing this Redis string, we would use two Redis commands, a set to store a lock, to set the string, and a get to retrieve it.
08:41
What we need is atomicity. Every single command of Redis is an atomic one, so no one else can disrupt your data. So, the first draft, we have a Chrome disk, it was something like that. We're getting the log,
09:01
checking if we can lock it actually, and then do it. But there's a problem, because between get and set, there is some instruction running, and this as a whole is not atomic,
09:21
so someone else might have changed this in the meantime, and this is a problem. And it's not acceptable then. So we introduce some kind of Redis transactions. This is not a real transaction in a sense of SQL databases, but it allows you to execute a bunch of commands
09:40
in an atomic way. So, we issue the command watch, and giving the key name. So this will be carefully watched during the transaction. And finally, if we cannot do anything, then we unwatch it. So this connection will be free of side effects later.
10:05
The next step will be to introduce the transaction denotions, multi and exec. These actually will denote a transaction, or the things that should be done conditionally,
10:21
only if a watched key has not been changed. So I guess that's quite simple stuff. Okay, the important part that task manager sometimes would like to know whether the state changed. Well, when we log the worker, he will find it eventually.
10:44
But when we do not, when we are freeing it, it must know when the worker has freed, so we can actually perform some action. Like for example, look for the next task for it.
11:01
And to receive the notifications, we issue a subscribe command. But to be more precise, on the Python side, so this is a task manager. As I stated, it's written in Twisted. There is some kind of magic like ready subscriber
11:20
will handle the subscribe and actions for them for you. And this is the event driven. You see the very ugly method name message received. Which is written camel case. And by overloading this method, you actually define the behavior for this.
11:46
So the second part will be about distributed semaphore. So the semaphore is different from locking that way, that it maintains an internal counter. So it means that it can be released or acquired many times.
12:03
And if we define a semaphore with counter two, we can acquire it twice, and also release it twice. But the original counter cannot be exceeded by further calls. And this will be stored as ready's list, and the calls to it will be blocking.
12:24
So the use case is that we can handle multiple tasks at the same time. As I stated before, the usual lock is not sufficient, and the use of a bunch of binary semaphores would not be sufficient to.
12:40
This is similar to string operations, but now we are using rpush, which stands for write push, so it will append an element to a list. And the rpop, which stands for blocking write pop, so it will pop an element from a list.
13:02
And rpush will serve as a release operation, and the blocking write pop as the acquiring operation. So when there's nothing there, and it means that semaphore has been acquired a maximum number of times,
13:20
the command will block until there's something there. So in the Python part, we did not change anything, because it reacts only on events. Eventually you can implement your checking, whether the semaphore is free or not.
13:40
And there are cases when you might need a state of semaphore after changing it. You could use previously showed a multi-exec block, but that's not always the case. An alternative is to write a simple Lua script. You see that it's very much like Python in this case,
14:01
except this word local, so it means we are allocating a local variable, so it will be garbage collected after this block. And this script only releases our log. And return a length that we have left.
14:23
Okay, but there's a warning. Multi-exec blocks and Lua scripts all are atomic in Redis. The problem is that blocking invocations in atomic way has not sense this time.
14:42
So it would mean that the Redis instance has to be blocked forever to serve this one. So making barepop inside multi-exec will return you nil, so it's none in Python client. And invoking such a Lua script,
15:02
Redis will respond with an error. Alternatively, you can use an rpop, so it's non-blocking feature. It will return you either nil if log cannot be acquired, or a value if it has. Some final remarks before you rush
15:22
and implement these solutions in your software. Please care about starting conditions. Make sure that the logs are properly instantiated before using them. And study carefully control flow in your application to not introduce too many logs or too less.
15:42
And also, please work up a restoring state procedure. When something goes wrong, there might be cases when the state within Redis is not finally consistent. So you might want to fix that some way. For example, we persist a state in SQL database
16:01
as some kind of history of activity of given worker, so we can restore a state properly after some kind of failure. So, well, that's all for me. And now it's time for some questions, if you have some, of course. Thank you.
16:34
Hello, have you considered using setnx for the log, the setnx command?
16:42
You mean setnx, right? Yes, yes. For log? Yes. Yeah, we have read about this, and this is described also on Redis documentation as far as I know. But, well, this is an expiring command as far as I know, yeah?
17:01
You can use it as a non-expiring command. You don't need to set an expiring. Yeah, but somehow it didn't fit our needs. I can't remember now why, but no, it wasn't sufficient for this kind of locking. Yeah, but that's an alternative for sure.
17:23
Okay. Have you tried to look at other implementations besides the built-in lock implementation in a Redis client? You mean besides Redis, right? Yeah, besides the official Redis client. Well, ideally it would be to write
17:41
your own logs server in Python, but the case is that some code base we have, and the most is written in PHP, so it has to be something outside. Ah, I see, okay.
18:06
Hi. When you're talking about distributed semaphores and locks, are we talking about, so are the locks in one location, so in one Redis instance, and the workers can be distributed? Or are the locks in semaphores?
18:23
Actually distributed? Yeah, that's a good question, and I hope that someone asked that. Well, basically we use a single Redis instance for serving our locks, and due to different limitations, we do not need more resources, but distribution you can achieve
18:42
is by either sharding Redis instances, and this is done automatically by using Redis clusters starting from Redis 3.0, or replication, which is also in this kind, but when I prepared this, I meant distribution by,
19:02
yeah, the workers are distributed, and these are different kind of resources that might achieve a lock, sorry, acquire. Does this answer your question? Okay, thanks. Any more questions?
19:34
Have you tried, have you thought about using Zookeeper? Because I'm not entirely familiar with it,
19:42
but my colleagues at work use it, and it's supposedly that, which is synchronizing few distributed instances of anything to only get one resource at a time. Well, we didn't,
20:02
and I have read about something like Zookeeper just yesterday on O'Reilly stand, but I will surely consider that, so thanks for a suggestion, but no, I don't have any experience with that. Me too, but it seems like a good idea, so. Yeah, if it would be simpler or better in some case,
20:24
why not replace? Yeah, because it's used supposedly to, you know, select the name node in a Hadoop cluster, for example, so this is also a really distributed system, and only, you know, you have to think about it
20:41
and synchronize it, so Zookeeper is supposedly for that. Yeah, I will investigate this for sure, and if I came up with something interesting, I'm sure I will give a talk at the next occasion, not this year, I think. Okay. Thank you.
21:01
Any more questions? Well, I do have one though. Have you actually ever seen anyone talking on a phone and writing an email at the same time? Yeah, myself. Do you manage to actually make sense? Well, you know, there are some studies
21:21
that shows that a human that is carrying out a few simultaneous tasks is completely reduced by its brain capabilities, so there was a comparison in the book I read that he behaves as if he had a brain of a six-year-old child.
21:46
So, you know, I don't find personally that multiple tasks at the same time are a super killing feature, but that's what the business wants.
22:01
Yeah. Yeah, because they respond with some delay, so yeah. Maybe. So, any more questions? I guess we're good then. Thank you again, Sebastian.