Consensus Potpourri: Optimal Resilience in Systems that Mix Shared Memory and Message Passing
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 30 | |
Author | ||
License | CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/52877 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
00:00
Shared memorySystem programmingMixed realityPartition (number theory)Process modelingLinear mapMessage passingRead-only memoryTopologyKolmogorov complexityGroup actionMereologyCausalityInitial value problemComputer clusterCasting (performing arts)Single-precision floating-point formatComplex (psychology)Independence (probability theory)Symbol tableProcess modelingUniformer RaumWebsiteGastropod shellInterpreter (computing)Local ringContext awarenessRow (database)CASE <Informatik>Cellular automatonTask (computing)INTEGRALQueue (abstract data type)AdditionSheaf (mathematics)Field (computer science)System callGroup actionGraph coloringSpacetimeCoefficient of determinationTrailContent (media)Grass (card game)Connected spaceSet (mathematics)Incidence algebraPosition operatorBoss CorporationOvalCondition numberResultantMessage passingStandard deviationWordPrice indexLatent heatNumberMixed realityExecution unitLink (knot theory)Discrete element methodNumbering schemeMoment (mathematics)Term (mathematics)System programmingDigital photographyMaxima and minimaAngleBlogArithmetic meanProcess (computing)Office suiteDecision theoryAugmented realityMedical imagingDigital electronicsInstance (computer science)Semiconductor memoryPhysical systemSoftwareEndliche ModelltheorieAlgorithmSynchronizationStreaming mediaShared memoryMathematical optimizationTheory of relativityDifferent (Kate Ryan album)Gemischtes ModellF-VerteilungNetwork topologyLinearizationApproximationArrow of timeNamespaceReading (process)outputVector spacePrime idealSinc functionImage resolutionOrder (biology)SubsetAtomic numberCrash (computing)Binary fileSolvable groupComplete metric spaceImplementationCompact spaceBound stateProof theoryAdaptive behaviorObject (grammar)InterprozesskommunikationPolynomialForceConfiguration spaceFlow separationCategory of beingStudent's t-testRadical (chemistry)NeuroinformatikPredictabilityGene clusterDeterminismStapeldateiAuthorizationLimit of a functionGame controllerGraph (mathematics)Alpha (investment)View (database)IterationTupleVertex (graph theory)Randomization1 (number)Computer animation
Transcript: English(auto-generated)
00:01
Hi everyone, my name is Norshida and I'm a student setter-technion and I'm happy to present to you our paper Optimal Resilience in Systems that Mix Shared Memory and Message Passing This is a joint work with Raghita Deyar and Svetaka Two widely researched models in the distributed computing community are the shared memory model and the message passing model
00:24
In the shared memory model, other processors can access a shared memory and communicate by reading and writing to shared registers In the message passing model, processors can send messages to each other using a complete bi-directional links network We are interested in the asynchronous crash-failure setting for both models
00:42
While the processes and links are asynchronous, the processes may crash and stop taking steps The resilience of a problem is the maximum number of processes that may crash in any algorithm solving this problem We use F to denote this number Many problems can be solved in the shared memory model when all processes but one fail
01:04
On the other hand, in the message passing model, these problems can be solved only if a majority of the processes are correct This is due to partitioning where two disjoint sets of processes execute the algorithm separately We are interested in mixed models where all the processes can send messages over a complete network
01:25
and subsets of processes may share memory among them At one extreme, when there are no shared memory connections, we have the message passing model On the other extreme, when all the processes can access a single shared memory, there is no use in sending messages and we have the shared memory model
01:42
In such mixed systems, we can obtain better resilience than in the message passing model The resilience was in the intermediate between the resilience of these two models depending on the shared memory connection topology Two examples for such mixed models are the M&M model and the hybrid cluster-based model
02:04
We model mixed systems using a general scheme that can portray any such system We have N asynchronous processes and up to F crash prelates We assume a complete asynchronous message passing network with reliable links meaning any process can send messages to any other process
02:23
In addition, we have M shared memories Each memory has reading and writing access restrictions R mu is the set of processes that can read from memory mu and W mu is the set of processes that can write from mu We define the foreign relation between two processes
02:43
Processes P and Q are in the relation denoted P arrow Q if Q can write to some shared memory mu and P can read from it This extends to set of processes where two sets of processes P and Q are in the relation if it holds for some process P in capital P and some process Q in capital Q
03:05
If the relation holds for both ways, use a bi-directional arrow We say that the system is F partitionable if there are two sets of processes of size N minus F each that are not in the relation Intuitively, in the message passing model
03:21
we can partition the system when there is no common process in any two sets of N minus F processes In a mixed model, we also need to disconnect the shared memory connection between any two sets of N minus F processes in order to partition the system F opt is the largest F, for which for every two sets of processes
03:43
P and Q, both of size N minus F, the relation holds both ways So we can see that the system is F partitionable If and only if, F is strictly bigger than F opt Since every process can read from its local memory the relation is reflexive
04:01
and if two groups intersect, they are in the relation Therefore, F opt is at least this number The altered memory connection F opt is exactly this number which is exactly the resilience in the message passing model We show that F opt is the optimal number of failures
04:21
Of course, any algorithm implementing a register in a mixed model can tolerate The algorithm extends A-B-D register implementation in the message passing model and is an adaptation of a delicious U-N-2 register implementation in the M-N-M model Using the register implementation we can take known shared memory algorithms
04:41
and simulate them to obtain algorithms in a mixed system For example, we can obtain mixed model algorithms for renaming an approximate agreement with resilience F opt This algorithm has optimal resilience because of the lower bounds we present next
05:01
Now, I will present a proof for renaming bin and solvable value with F failures if the system is F partitionable Assume towards a contradiction that this is not the case Therefore, there are two sets of processes of size N minus F P and P prime, that are not in the relation Note that P and P prime are disjoint
05:21
and therefore we can partition the processes into three sets P, P prime and Q Processes in Q do not take steps in all the executions I'm about to present Also note that any process in P prime cannot read what processes in P write to the shared memory We define two executions
05:43
In the first one, only processes in P take steps and in the second execution, only processes in P prime can take steps Since in both executions, exactly F processes do not participate and the algorithm can tolerate F failures in both executions, all the processes will eventually decide on a new name and terminate
06:05
The size of the original namespace can be big enough to ensure there are two disjoint input vectors for processes in P and P prime such that there is a process P that decides the new name R in the execution alpha 1 and a process P prime that decides on the same name in the execution alpha 2
06:23
In the final execution, we combine the two previous executions by concatenating the execution prefixes in which processes decide In addition, we delay all the messages sent between the processes in P and P prime The first part of the execution is indistinguishable from alpha 1 to processes in P
06:42
since they are the only processes taking steps The second part of the execution is indistinguishable from alpha 2 to processes in P prime This is because processes in P prime do not receive messages from processes in P and they also cannot read what processes in P or to the shared memory
07:01
So, we got an execution where two distinct processes have decided on the same name but this cannot happen And this ends the proof renaming being unsolvable in an F partitionable system We can show that approximate agreement cannot be served in the presence of F failures if the system is F partitionable in a similar manner to renaming
07:25
We define two executions In the first one, only processes in P take steps and they all start with the initial value 0 In the second execution, only processes in P prime take steps and they all start with the initial value 2 epsilon By combining these two executions, we can obtain an execution where processes in P decide 0
07:46
and processes in P prime decide 2 epsilon But the difference between these two values is bigger than epsilon, in control prediction Due to FLP result, deterministic consensus cannot be solved in both the message passing model and the shared memory model
08:04
and this result also extends to mixed models Therefore, we settled on randomized consensus We define the following termination property, non-deterministic F termination, as follows For every configuration C and a set of at most F processes
08:21
there is some execution where only these F processes may fail in which some non-faulty processes terminate This property extends non-deterministic solar termination We show that non-deterministic F terminating consensus cannot be solved in an F partitionable system
08:40
We do this in a similar manner to a proximate agreement and this implies the same result for randomized consensus Unlike the previous deterministic algorithm we used, substituting an inarizable object in a randomized shared memory algorithm requires it to be strongly inarizable
09:01
Unfortunately, it was shown that AB-Dil-Gester implementation is not strongly inarizable Since the mixed model register implementation extends the message passing implementation this result also applies for it and we cannot likely use it in a randomized consensus shared memory algorithm in order to obtain an algorithm in a mixed model
09:25
Adilates Hu and Tued showed that the simple shared memory randomized consensus algorithm of Aspens and Hurley works even with using regular registers instead of atomic ones Therefore, we can substitute the register implementation in this algorithm
09:41
in order to obtain a randomized consensus algorithm in mixed model The simple A-H algorithm uses independent coinfics and has an exponential complexity By plugging a weak shared coin in the A-H algorithm we can obtain randomized consensus algorithm with polynomial complexity
10:02
The weak shared coin via Aspens and Hurley mimic a random org with plus one and minus one steps Log-free iteration of tuple collects ensure that processes' views of the accumulated encounter differ by at most n values We show this also holds when reads and writes are not atomic
10:21
and therefore we obtain a shared coin algorithm in a mixed system I will now discuss the complexity of the algorithms we presented We show a single-writer multi-reader register implementation with linear message passing complexities The shared memory complexities depend on the topology of the shared memory connections
10:42
and namely on the number of the shared memories in the system We use batching of reading requests, the simple optimization technique to obtain more efficient algorithm where we read registers of several processes at once instead of reading each one individually Simulating a polynomial shared memory algorithm using our register implementation
11:03
yields a polynomial mixed model algorithm when looking at the message passing complexities Using compact shared memory topologies where the number of shared memories is polynomial the simulated algorithm also has polynomial shared memory complexities I am about to present two such models, the uniform M-N-M model and the cluster-based model
11:25
In the M-N-M model subsets of processes can share any number of registers between them Without access restrictions our model is dual to the M-N-M model meaning they both capture the same mixed systems In our model we take a flipped view where we consider for each process the memories it can access
11:47
In the uniform M-N-M model the shared memory connections are described as a graph The sense of vertices is exactly the processes in the system There is a shared memory associated to each process
12:00
and processes P and Q can access each other's memory if there is an edge between them in the graph Here you can see a uniform M-N-M model induced by a graph and for each process the processes that can access its memory Here we say that the process represents itself and all its neighbors in the graph
12:21
The author introducing the M-N-M model has given a randomized consensus algorithm that is correct assuming that the majority of the processes are represented In the paper we show that this algorithm does not have optimal resilience In addition, they present a lower bound for the resilience of randomized consensus in terms of the shared memory graph topology
12:43
Unlike the upper bound, we show that the lower bound is indeed optimal Another work about the M-N-M model gives a single-writer multi-reader resistor implementation with optimal resilience in the M-N-M model In addition, they show a randomized consensus algorithm with optimal resilience
13:05
Another model is a hybrid model which we call cluster-based model In this model the processes are partitioned into M-disjoint clusters Each cluster has an associated shared memory which all the processes in this cluster can access
13:21
Here we show how we can represent the cluster-based model in our general model P and Q are in the relation if and only if they are in the same cluster It is easy to see that this relation is targeted We define F's cluster to be the largest F such that any two sets of processes P and Q of size n-f each
13:43
there is some cluster that contains a process in P and a process in Q From this definition, we immediately get that Fopt is equal to Fcluster and therefore this is the optimal resilience in the cluster-based model for the various problems we discussed
14:01
In this model, we say that the process represents all the processes in its cluster Note that if a process P represents a process Q, then they are in the relation We are looking at two processes that represent the same process As this relation is transitive, all the three processes are in the same cluster
14:22
This is to intuition why Fopt is equivalent to the maximal F such that every set of size n-f represents a majority of the processes Therefore, we can wait for a majority of the processes to be represented and not to at least n-f processes An interesting question to ask is what problem can we solve when the system is partitionable
14:45
In the k-set consensus problem, processes can decide on its most k different values This problem can be solved in both the message passing model and the shared memory model if and only if F is smaller than k A message passing algorithm is also a mixed model algorithm
15:03
In addition, we cannot have better resilience than in the shared memory model Therefore, k-set consensus can also be solved in a mixed model if and only if F is smaller than k For F of stream smaller than k-1, a system can be F partitionable but still offer an algorithm for k-set consensus tolerating F value
15:24
To summarize, we present a general model capturing mixed systems with access restrictions We give a general scheme for designing algorithm in mixed models and solve various problems efficiently with optimal resilience
15:40
In addition, we looked at specific mixed models and show how they are integrated in our model Future work may include incorporating synchrony and failure detectors through this model and looking at a more general model where the message passing network is not a click Thank you for your attention and I will be happy to take questions in the live session