Leveraging consistent hashing in your python applications
Video in TIB AVPortal:
Leveraging consistent hashing in your python applications
Formal Metadata
Title 
Leveraging consistent hashing in your python applications

Title of Series  
Author 

License 
CC Attribution  NonCommercial  ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
Identifiers 

Publisher 

Release Date 
2017

Language 
English

Content Metadata
Subject Area  
Abstract 
Leveraging consistent hashing in your python applications [EuroPython 2017  Talk  20170711  Anfiteatro 2] [Rimini, Italy] While consistent hashing is largely known and adopted in the NoSQL database clusters to solve data distribution and data access reliability, it is less known and used by the typical developers. This talk will introduce you to consistent hashing and the problems it solves while going through a practical use case in a python application. We will start from its standalone design and scale it out to an optimized clustered version thanks to consistent hashing

00:00
Chord (peertopeer)
Intel
Focus (optics)
Demon
Key (cryptography)
Software developer
Distribution (mathematics)
Software developer
Coma Berenices
Database
Bit
Computer programming
Arithmetic mean
Semiconductor memory
Gentoo Linux
Software
Personal digital assistant
01:02
Point (geometry)
Trail
Group action
Functional (mathematics)
Mapping
Information
State of matter
Texture mapping
Software developer
Memory management
Database
Bit
Database
Mereology
Cache (computing)
Cache (computing)
Software
Personal digital assistant
Core dump
Information
Physical system
02:37
Uniform resource locator
Process (computing)
Mapping
Equals sign
Operator (mathematics)
Logic
Data storage device
Selectivity (electronic)
Information
03:16
Implementation
Functional (mathematics)
Mapping
Table (information)
Mapping
Key (cryptography)
Software developer
Instance (computer science)
Uniform resource locator
Hash function
Operator (mathematics)
Hash function
Logic
Right angle
Table (information)
04:37
Point (geometry)
Functional (mathematics)
Distribution (mathematics)
Dependent and independent variables
Implementation
Scaling (geometry)
Key (cryptography)
Divisor
State of matter
Chemical equation
Data storage device
Number
Subject indexing
Process (computing)
Hash function
Semiconductor memory
Personal digital assistant
Operator (mathematics)
Faktorenanalyse
Right angle
Implementation
Reading (process)
06:27
Implementation
Table (information)
Information
Key (cryptography)
Scaling (geometry)
Subject indexing
Hash function
Readonly memory
Semiconductor memory
Hash function
Right angle
Table (information)
Embargo
07:36
Server (computing)
Implementation
Random number generation
Link (knot theory)
Distribution (mathematics)
Ferry Corsten
Virtual machine
Set (mathematics)
Number
Methodenbank
Different (Kate Ryan album)
Singleprecision floatingpoint format
Operator (mathematics)
Spacetime
Implementation
Computerassisted translation
Modulo (jargon)
Physical system
Scaling (geometry)
Key (cryptography)
Server (computing)
Cellular automaton
Moment (mathematics)
Fitness function
Operator (mathematics)
Flow separation
Inclusion map
Uniform resource locator
Hash function
Personal digital assistant
Network topology
Mixed reality
Right angle
Speicheradresse
12:05
Subject indexing
Uniform resource locator
Scaling (geometry)
Key (cryptography)
Hash function
Consistency
Network topology
Operator (mathematics)
Consistency
Electronic mailing list
Right angle
13:03
Continuum hypothesis
Service (economics)
Key (cryptography)
Server (computing)
Continuum hypothesis
Number
Hash function
Ring (mathematics)
Personal digital assistant
Network topology
Ring (mathematics)
Hash function
Right angle
Circle
Key (cryptography)
14:31
Point (geometry)
Distribution (mathematics)
Implementation
Server (computing)
Service (economics)
Key (cryptography)
Structural load
Neuroinformatik
Partition (number theory)
Hash function
Ring (mathematics)
Network topology
Energy level
Right angle
Partition (number theory)
16:22
Point (geometry)
Standard deviation
Slide rule
Asynchronous Transfer Mode
Functional (mathematics)
Implementation
Greatest element
Server (computing)
Weight
Multiplication sign
Range (statistics)
1 (number)
Function (mathematics)
Neuroinformatik
Number
Moore's law
Estimator
Cryptography
Different (Kate Ryan album)
Hash function
Ring (mathematics)
Cuboid
Circle
MiniDisc
Mathematical optimization
Partition (number theory)
Pairwise comparison
Algorithm
Weight
Chemical equation
Structural load
Keyboard shortcut
Expert system
Electronic mailing list
Variance
Cryptography
Hash function
Ring (mathematics)
Function (mathematics)
Website
Data conversion
Right angle
Library (computing)
Row (database)
20:02
Implementation
Code
Real number
1 (number)
Cartesian coordinate system
Library (computing)
20:54
Slide rule
Implementation
Functional (mathematics)
1 (number)
21:32
Scripting language
Area
Functional (mathematics)
Computer file
Key (cryptography)
Multiplication sign
Range (statistics)
Shape (magazine)
Instance (computer science)
Disk readandwrite head
Declarative programming
Ring (mathematics)
Hash function
Hessian matrix
Right angle
Object (grammar)
Physical system
24:21
Laptop
Mapping
Personal digital assistant
Military base
Network topology
Consistency
Database
Right angle
Maxima and minima
Instance (computer science)
Client (computing)
25:43
Point (geometry)
Server (computing)
Key (cryptography)
State of matter
Database
Cursor (computers)
Computer network
Client (computing)
Instance (computer science)
Connected space
Number
Category of being
Ring (mathematics)
Personal digital assistant
Right angle
Summierbarkeit
MiniDisc
Partition (number theory)
26:56
Workload
Key (cryptography)
Software
Distribution (mathematics)
Consistency
Code
File system
Computer network
Mass
MiniDisc
Mereology
27:51
Slide rule
Implementation
Server (computing)
Presentation of a group
Service (economics)
Code
Patch (Unix)
Workstation <Musikinstrument>
Source code
Virtual machine
Division (mathematics)
Client (computing)
Login
Dimensional analysis
Goodness of fit
Mathematics
Cache (computing)
Selectivity (electronic)
Information security
Cellular automaton
Cache (computing)
Process (computing)
Personal digital assistant
Order (biology)
Inverter (logic gate)
Library (computing)
30:48
Implementation
Randomization
Consistency
Multiplication sign
Demo (music)
Electronic mailing list
Electronic mailing list
Likelihood function
Methodenbank
Operator (mathematics)
Network topology
Octave
Right angle
Endliche Modelltheorie
Game theory
Modulo (jargon)
Installable File System
Computer architecture
33:13
Asynchronous Transfer Mode
Mathematics
Touchscreen
Consistency
Planning
Right angle
Vertex (graph theory)
Figurate number
Game theory
Affine space
Connected space
34:53
Implementation
Code
Multiplication sign
Consistency
Electronic mailing list
Bit
Student's ttest
Likelihood function
Mathematics
Methodenbank
Ring (mathematics)
Selectivity (electronic)
Game theory
Figurate number
Modulo (jargon)
36:46
Type theory
Cellular automaton
Library (computing)
37:25
Virtual memory
Process (computing)
Arm
Semiconductor memory
Physicalism
Data dictionary
Physical system
38:53
Slide rule
Distribution (mathematics)
Ring (mathematics)
Distance
Likelihood function
39:44
Point (geometry)
Server (computing)
Weight
Physicalism
Partition (number theory)
Cryptography
Graphical user interface
Ring (mathematics)
Logic
Function (mathematics)
Ring (mathematics)
Hash function
Program slicing
Data conversion
Right angle
MiniDisc
Resultant
Partition (number theory)
40:43
Implementation
Estimator
Chemical equation
Physical law
Energy level
Coma Berenices
Library (computing)
41:20
Point (geometry)
Structural load
Range (statistics)
Continuum hypothesis
Letterpress printing
Line (geometry)
Mereology
Frame problem
Hash function
Ring (mathematics)
Personal digital assistant
Video game
Right angle
Position operator
Partition (number theory)
42:35
Mathematics
Ring (mathematics)
Range (statistics)
Representation (politics)
Line (geometry)
Partition (number theory)
00:05
we spend the next hour I will do my best to explain to other consistent session he is and what can you do with it
00:13
and quick quick key that
00:17
about myself I you can shiny on
00:20
than on those which like a magenta Linux developers away our focus on meaning on clustering and no Sequeira disabilities stuff on and I'm so Szityu & memory for the the goal of this as sole key is to really introduce you to consistent hashing so I will have to go into the before we do we do what I would have to introduce you to the half so we start with the basics of what led to consistent hashing and what kind of programming souls and but 1st let's go into a bit of history so
01:05
this is the concept of consistent hashing is 20 years old action it came from a paper from the guys from can I when they had some problems with the distributed caching and then it's the got into the radar of P 2 p networks and like system makes sure that I use consistent hashing to keep track of where some part of the fight is and then he got into the database world and mainly after on his own and use it in said they knew would he be fall memory management so it's strange to me that it's still fairly unknown to the casual developer and maybe it would be a good chance for me to to to make it even a bit more known as well and so let me tell you a story 1st on the best so let me tell you a story on mapping and that is the
02:09
way to take something that 10 point 2 under the item and and return that certain items States pheromones the basic concepts and the idea behind it is that you have a referential that you will use to and to get some information this idea of looking at the is an the core of a new mapping functions so it's
02:39
like a phone book where you have a name and you want to find out the number of of the people that you're looking at the basic
02:51
of the map you that you have the restaurant shows selection step then you apply a logical operations on it and then that points to the location where the information that you're looking for the store the whole process in the Jews 1st steps we to have a great impact on your look efficiency so
03:19
mapping is just a map and you know as developers most of us know this as a key relating to the value when you think about this you usually think about the date right and widened it for instance is a map you have a key and you have the value associated to the key right so let's get to today's all QuickDEX sorry about how is it always the fight Dick to implemented the true from the bite is that it's what we call the hash table and hash to learn you apply the hash
04:07
function on the key and then you do and other logical operations that we point to the value the role of the hash function is very important because it has and the a direct relationship with the spread of your key on the location that is storing and all the values and the 2 steps represent the actual implementation so how is the implementation of the bite well it applies this function
04:41
well the hash of the key these as a binary operation on the size of the uh the memory array that stores all the value so if you do a national state you get a number that's the key point here the hash function since you we do a logical operation on it has to return the number sold you apply a banana operation on the size of in in this case is 11 and you get index 0 so that means that you will get on your in memory at the index 0 is the value for key that's his hashed have if I do the same for PC you see that it points to high index to if I do the same for can be points to index 3 OK pretty easy and pretty solid the such function is a reading of items
05:45
so there are some key factors to consider here because you expect this to be an efficient process right looking up key must return the value fairly easily and quickly so the distribution and balancing is the responsibility of the hash function and the operation will lead to some accuracy and performance on the location and on the whole implementation process when you add the storing of this data and then you start talking about scaling so does the bytes the actually scale good the answer is I guess
06:30
for some people who tried to but a lot of information in the fight and the null and the reason is that the hash function the basic hash function is meant for vary in the implementation itself is meant for very fast and consistently fast key lookups at the expense of memory or why because the hash function does not distribute the keys and the values of the keys even in the in memory array that means that if you if we take the exemplary here you see that index 1 you consume some memory was no information in it so long How do you scale out from the right and all 4 of the national that is not very easy spread this is where our distributed
07:27
hash tables coming so that its enemies DHT I guess most of you have heard of them but the idea is is the stupidest embargo
07:38
and use complete your keys into Beckett's so the hash of your key we applied to be onetoone operator or but instead of pointing to a direct location in the let's say in memory location it we just reference the bucket idea these Beckett itself is a hash so when you point to the but get you would have to get the cats get to the cats and then do another look up over there then you take those gets and you spread it on machines on several right so in this at this moment the disability dashed ever allows you to spread your keys and the data that comes with it around different machines which is how you will be able to scale if you wanted to have widened links insight inserted into buckets you could have a spike and it's on different machines and then scaleout what we wouldn't fit on a single machine right so the key question now is what's the best operator for the accuracy and the efficiency of this implementation what's the best operator you could have to point to the right server in the most efficient way so let's think with with this approach this problem with a naive DHT implementation the native 1 uses the modulo operator on the size of the number on the number of packets so the idea is that I do the hash on my key in this case I can't I would choose the hash function of items because it doesn't spread and doesn't makes the keys easily and even if that means that he asked of 8 is me a number that's is close or very close to the hash of the that doesn't mix well the idea is to have a random out that the random number coming out of the hash function so the best and easily known hash function that comes to mind is and if i which seeks to mix is well the keys atlas and it's pretty easy to be implemented been pretty standard as well so in this case I would do it and the 5 of the I would apply and it's just to get an exit the smell of it and I would transform it in base 16 to get an actual number on which I would apply a module on the size on the number of cells of kids that I have sold in doing so is the would give me and would you know on 3 and a set of trees of those would be OK it is on several keys on 7 0 and you could connect them and we get the value of what you're looking for so it seems pretty simple or pretty sorted and that's what we had actually uh in the nineties we can say that but before consistent hashing came into play and that's then the problem that we see is what's and got Akamai uh and think about consistent fashion
11:05
because it won't scale if your if the number of your it's change because the modulo operator or in the size of the the case we have in applied the modulo operator or the indexed by the same key was you referencing to we change all have a very high probability of changing in this case the D is on module 4 boys now to 7 1 and before it was pointing to 7 0 so that means that on distributed system or on systems where you can have and you need to tolerate the failure of the set up all you need to scale that and add new server if you were implementing the naive DHT implementation on the modulo you with some serious problems
12:06
basically the number of if we were using the hash function of a bite and the number of key remapping so that means the keys that would their index would change would follow this uh operation so let's say that for 2000 and cheese spread on 107 this you would get thousand keys that would point to a bad location should only changed or added or removed 1 several from your where topology it would be a disaster even more scale right so I guess we
12:51
need help on this we need consistency well at least the best consistency or better consistency list of the trough rights OK handy here I
13:05
introduce you to the hash uh the hash tree is the concept behind consistent hashing on the idea of this is pretty simple actually when you think about it all when you hear about it tonight yes and I get I do this Angela doing this and
13:27
the idea is that you place yourself on that's also called the continuum so you can picture it's quite easy you have an array of new bandits to form uh and the circle right and then you hash the name of the of of which give you a number and you place it on your simple enough for now so in this case have trees service those i.e. hash then name and I plays down number of on my ring looking then how do you look it up I would you look at the key and what how would you know that a key belongs to said Sarah well you do the hash of the key as well you place it on the ring and then you go clockwise until you hit a sour and that will be the of responsible for the key kind of take if you
14:33
don't use it on any 1 over N of your keys will be remapped or relocated or change if you either remove several wondering that's fairly and clearly better who is this is the more services you had on your topology the more abuse you gets OK but there is still a
15:03
problem the problem is that the implementation itself can be nice if the hash function is not nice use you get the problem of distribution even with the hash ring consistent hashing does not so late so the problem of distribution of your several so around the city of the hash function that's so you could end up with is because is pretty random at some point and if you think about it you could end up with this kind of distribution where several 0 as a very large a partition of the CA belonging to him because when we placed our tree points on the server and the hash function didn't distribute them even right so that means that do servers and several 0 would become what we call hot what it would get the higher load than the others here several 1 would do almost nothing right so the hash function is very important and the choosing of the hash function is very important on the performance level because of its request computation but most importantly on distribution and so they are not perfect so the question
16:24
you asked now is OK so now it's time you tell me which hash function to use that 2 kinds of hash functions and on the 1st example I went for the most common and maybe the 1st that comes to mind which is and 5 which is belongs to the cryptographic hash functions and you know other like 1 in injury experts that say those ones are pretty standard so that a wide adoption and but they need to like you said before right like so before they need a commission to integer so that we take some computation as well on the other hand you have noncryptographic and algorithm that exists and that are optimized for key those 1 don't including their and middle goes around a cryptographic 90 and onto breaking or whatever and does the need to do is actually and on the ducats so that I guess the most famous of them are Mumia and needs to be treated implementation that's the most recent 1 and CG hash from Google that fast since the optimize for this so that that means and their their direct output is number right and the only drawback I see or about them is that they need such CEC bindings C libraries to work so this don't work like out of the box so you have to something and on the bottom of the slide I gave you a row estimation and the speed the comparision about all of these actuations mere merely is pretty it on on on the top of the list where city hash for 2 2 is the fastest along so this
18:28
reduce and help you in the balancing and now you have to reduce the load variance on your site on your circle on your ring right for this there's this concept of the nodes and the nodes is just the fact that you would take your server and duplicated multiple times on the ring just to augment your or the number of points on the on your ring and to reduce the size of all the partitions on your ring data In the consistent hashing more do the best or the the most acknowledge number of of the nodes for our host on the rain is 160 and so we we give you the pretty so the number of a pretty exciting idea on the number of times that you have to duplicate your your your host on the on the brink the 2nd 1 and the 2nd aspect i want to to I like is this concept of hate weights you can have different servers and with different cavities all computing power on a range and you want your ring and 2 and you would want to adapt to your based on this as well so you can say OK this ever has a weight of 2 so it really be duplicated more on the each we get more Venus on the on the on the ring to have more old spread the fine sold know that
20:05
you will want to do consistent hashing how do you doing quite so let's see all the implementations all the most famous implementation of I don't know if it's famous but anyway those ones that
20:20
are existing and done so far
20:22
so when I went to hooking into this I need it at work and I said OK I'm the 1 consistent hashing understand the basics and I 1 I want to get my hands in it and on the really code and on on real applications right I was kind of of a disappointed I must say that by the utterance and and the libraries that present on on on by and so I detest I decided to go from home and do my own
20:54
which is you're shrink and which is the 1 that we have that we propose and on the rest of the
21:00
slides and the 1st ones are mostly akademije and implementation so that you only the hashing but if you would don't have easy functions to work with them I actual cope so but and my
21:23
websites like and again but here it is
21:43
instead would be a problem is really tough received
21:51
is go like this and OK so the idea on on on on on on you ushering is that you are able to create some nodes and apply some node to the each node that you will have to your range you can have an instance associated to it which is that any kind of object actually and and then here I demonstrate that you can getting the Hessian of the node for the hash key uh for the key coconuts but then if I wanted to use it and I would like to be able to use this in a straightforward manner right so I would like do they actually instances associated to the and NTDs for and node is an open on this on the system and I would and using the H R and coconut key . right actually points to a file descriptor so I can use it directly like this I don't have to get the node and and get some other object I can use my ring in a straightforward manner that's just what I wanted to to to to light here here you see the nose declaration was the open so it still doesn't opened fire this script all within this declaration and then I and that these on my ring so then I can use it directly with the key that I'm looking for OK so finally in the area to switch now no OK the notion of time to to see you no I will leave it once again so a chart of all at now here you see that it points to a file descriptor and then I can use it directly and use the file descriptor methods and functions like this and then I can open the size and shape that the Hague Agreement has been written in its head I was OK so I would switch was
24:24
some example use cases before is this issue of a laptop or a telephone of something bringing up I would have have a free working I don't know of uh a city like all and isolated rough using consistent hashing so it has question about it you can and I had some Razzberry 0 to win so you we get a chance to win something out of this style so uh if you have it prevails over we give you a neural tube connected and anyway so on Peru would
25:02
like talking fortunately anyway so on some examples you have the way a way to distribute has some data and on that these instances you could use consistent hashing to do it properly and to be able to either remove uh that the bases from your topology and don't so you have the minimum of the mapping of data relocation to do when you when you when you do that and so here I have some client has some data and I wanted to 2 . 2 random that that basically in a consistent way right and so here I would
25:45
import your actually I would trade my nodes on each node at the instance of the property would point to an actual uh my cyclic connections right and I would point to it every every several that I have and reference them on fold usage on my ring afterwards then I would trade my ring using those nodes and then I have some data and I will use to keep key of the state addict to select my note and to write these data into that of its and for this I would use and of for question Q which you would be client a plan B. Clancy kind the and then what's interesting is this 1 this H are partition he is points directly to the right server and to the right connections of the rights of the fall the key that we're evaluating and that would executing some my data and then do my committed right
26:49
the sum for use case number 2 the Bezier case to and here that we would be able so to to spread some disconnect
26:57
or Cairo let's say that you have a different mass and you will they don't do clustering stuff just suffers from some mass and you want to make the most out of them so you want to to to your workload to those mass on network mess you would mount them on a file system that's a little or anything else you could open the fires like this just like the example i gave earlier and then distribute your they taverns on your key on the right mass and and on the uh and the little code you could have the re the part of the code which is implemented as well here where based on key if we find the right mass whether that is written the
27:44
taking the military is also kind of interesting let's say that you have some piece of evidence of of
27:53
they test is coming in and you want to make sure that the user should session usually is a good example so you could use a station the use of genera some data based on the user ID you would like to make sure that a given worker or Selva um is responsible for processing all these of the data of the the the trenches and you could do it to uh have a greater performance so you can actually do some clever stuff with with just making sure that they do in the order comes in a consistent way on the data from the same user will be process like this and more care and twist it also works for adults so let's say that you have 10 machines every machine logs uh do use idea on their logs if you were to want to trace all what the user did and you will have to get the notes from all the machines using consistent fashion like this you would not have to do it and it would be quite the 0 and at least in a consistent way so it would do your is best for your logs not to be spread out so if you don't have a failure or you don't have added new node on to produce it can be the pretty good use case as well OK orders exemplars and the source code of this is is online so don't really bother Brittany like this on on on the slide and the How couldn't do this
29:29
presentation wasn't mentioning caching right and so I don't know how much about you maybe you can raise your hand used a bite and then catch and library OK pretty and pretty a lot of guys well actually the vitamin cash cell selection when you have you mind you create your own memcached client you can specify multiple nodes to distributed caching right well that Python cash implementation uses the for naïve DHT uh implementation so whenever 1 of your work cache server we fail or you we had another on almost all your cache would be invalidated and it will generate some node if you load these data from that of his or something in between did I guess that and so you ushering I did I did a inverted monkey patching on the dimension department as the library to change on the change in the service selection used to use consistent fashion is the so you don't have to modify the rest of your code so you just do this and you get this new security
30:49
OK so let's try to finish with the cities Rafael so the idea of the Raphael and usually when you do a all you wanna win right and so I did 2 implementations of this rough on the basic concept is OK I have a list of gives 1 of the gifts is a winner octave which you can see here and whenever you every time 1 of you will connect to the game we will simulate the adding of a node to the topology right and I have implemented this Raphael using modulo or consistent sold if we we're using the modulo operator so that would mean that every time someone would connect so a new player joins the Raphael the likelihood of the we never to change will be higher right the so in doing so we would favor all the people who don't know 1 utterance the winning if we were using consistent hashing every time a positive and would join the I am would have less chance of changing the winner so all we we play the game but you will get to choose if we use the model 1 so to favor or most of you know let's say all randomness all you want to use the consistent 7 and favor or where is gonna win so young lecture so it as to handle uh erase and follow the module who for the consistency well I I would for must have to ground but you just consistency so the architecture and I'm surprised and on OK so the concept of the game is this clear it's OK all right the so
33:03
this is the URL so you can connect our k p 17 . nb NY that co the can you have it
33:18
on top of the screen you see over that because the quite yeah I know and need to run either module consistent so it's gonna work now right but it was properly I had you had to choose from thanks we can I guess I lost connection that's why you've stuff a before anyway a lot of it OK inconsistencies here the nodes only 10 past already 10 participants 30 the idea of the winner is uh shown over there 59 really going
34:14
it so here for those who don't have the phone no on the plane right now you can see where the other people really are seeing sole basically here and that winning right I don't have the winner and a gift that displayed so most of you don't have affinities uh playing so even if you're losing your it's nothing that anyway and in 93 I wasn't expecting 93 uh people playing the game actually so it's a it's a nice figure 99 and I don't know what was the winner was the clear we yeah the the changes that did you win the
34:57
1st time and the didn't change not
35:02
OK so you 101 so now people are
35:07
trying to and figure you out a way to reduce this I guess I thought of it already so I had in the code there some 60 museo anyway OK so I wanted just to show a quickie on the 2 implementations and on on the top is the is the module once the module 1 s use is this is a simple list to do this and the other 1 using the dashing obviously and you when you add a note on a consistent on the consistent implementation it's pretty easy you see that it comes because it becomes a bit more complex when using the modulo 1 on but students but it's not the same as I have another cleaner for when people or are leaving the game QC that people are leaving the game so the likelihood is the changing so that we know may be changing and and the node that selection as well here you can see the winner selection and when it's on consistency it's pretty easy I passed the winning neuron to the ring so the renewed a as the you ahead of the we and I get the winning uh node it's a it's it's as simple as that the hand the so for me thanks for opening have things for for being that if you have any question I'd be happy to to to answer it so you're the winners did yep no it just
36:52
changed but that that it's OK if you're will
36:59
theory of OK this was good and of this the rest of the stuff is he is there it's
37:04
the libraries on pipe I
37:06
a I will in contribution of disease so uh I hope that you understand that there are the history of those uh even bite uh and distributed as and now where all what you could do is consistent hashing and the type of problem cells thank you it
37:34
of that thank you very much and any questions I think it's so it I just have a small comment on so this is all be equal and I do want to discourage you from doing this but actually Python and memory mapping is improved recently and they cut the 1st scuffle show that how Pitons addiction dictionaries internally stored in the memo is no longer for by the 6 an idea of year common for it in the 1 if you like this talk what's the talk by women heading there are from Pike on building something about dictionaries because python physics is really improve the situation and you addiction is going to be sliced by half fundamentally so that's a really good improvement if you don't want to have a distributed system and just want to think of your own arms performance memory performance of your own single process from this insightful comment yeah you're right
38:46
but like and that's what that's what you
38:55
mentioned replicating notes on the ring to kind of probability that you get an an even distribution having distribute notes on the ring because of seems to me that if you do this in predictable way with with the same distance everywhere you don't improve the changed distribution at all can you elaborate on this how this works and you know that the likelihood of a coordinated go back to the slide where you distribute a repeat the notes on the ring yeah this 1 yes the explained how this improves
39:37
the situation because it seems to me that you like just replicate the same thing 5 times yeah but if you compare this to this
39:48
this improve the situation doesn't it like the if you take the slice
39:57
of each node so it's light on now and then it was like this
40:03
so if here I took several 0 and several 1 and 7 2 and for each of them I would add to their name dash 0 bash 1 Dutch to the street and then I would hash this again and place the results that would physics that with logic any point to the same server but still it would be a different point on the ring I would reduce the role the violence of on my ring by reducing the partitions on the on the right right that thanks for letting me clarify this so
40:47
that it is how do you manage
40:52
they but if we given by its
40:57
mn even using around us
41:00
do you have an estimator of imbalance between the master law that no world and that least not that a lot of no using this level so you let that it's not dependent on the library implementation is dependent on the
41:22
hash function so if you it depends on the hash function that you will use and basically the hash function as and in the range rights and the range of your hash function would be the range of your range of the to the positions of ring so if you know all the data points that you have on your ring and your ushering provides a way to to have this with the print continuum that case you can calculate the parts of the the the partitions of each node and then you can know for sure based on your hash function how what what you will be the load of each frame of each 7 OK I actually 2 questions regarding the ring at 1st I mean why do I
42:11
need to use a hash function to distribute the mold kind of justice to them you really awful that I splits the thought of a chair for most even add new 1 so what the benefit of having a hash function that of just giving them evenly distributed points and the 2nd 1 is life the during more line or whatever or so I understood the 7 question than the first one
42:36
was vehicle I'm sorry I didn't understand it at all but a wider the ring and instead of a line I guess it's for a representation of the 4 s and let alone on the mathematics behind it it makes sense on on recon ring because it's it's a it's calculates an ongoing to represent partitions so some the paper on the Akamai paper uh so it's it would be better like it's like a trigonometric and uh range surfer sorry sold it makes much more sense all on this and on the 1st question answering maybe you can catch me late at the end because With the according of innocence the the entrance of but the Pentagon elections movements that of theft