
Blockchain and BFT: Heterogeneous Paxos


Formal Metadata

Title: Blockchain and BFT: Heterogeneous Paxos
Author: Isaac Sheff
License: CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
In distributed systems, a group of *learners* achieve *consensus* when, by observing the output of some *acceptors*, they all arrive at the same value. Consensus is crucial for ordering transactions in failure-tolerant systems. Traditional consensus algorithms are homogeneous in three ways: * all learners are treated equally, * all acceptors are treated equally, and * all failures are treated equally. These assumptions, however, are unsuitable for cross-domain applications, including blockchains, where not all acceptors are equally trustworthy, and not all learners have the same assumptions and priorities. We present the first consensus algorithm to be heterogeneous in all three respects. Learners set their own mixed failure tolerances over differently trusted sets of acceptors. We express these assumptions in a novel Learner Graph, and demonstrate sufficient conditions for consensus. We present Heterogeneous Paxos: an extension of Byzantine Paxos. Heterogeneous Paxos achieves consensus for any viable Learner Graph in best-case three message sends, which is optimal. We present a proof-of-concept implementation, and demonstrate how tailoring for heterogeneous scenarios can save resources and latency.
Transcript: English (auto-generated)
Hi, I'm Isaac. I'm a postdoc at the Max Planck Institute for Software Systems, and I'm here to talk about Heterogeneous Paxos, which is a project I worked on mostly at Cornell, with Xinwen Wang, Robbert van Renesse, and Andrew Myers. Most of you know that consensus is a key part of maintaining a replicated database,
blockchain, anything else with consistent state. A consensus protocol is the part that decides which transaction goes next. Traditional consensus algorithms are built for traditional scenarios, usually one system owner running a few machines, trying to tolerate one or a small number of independent failures.
Crucially, most distributed algorithms are what we call homogeneous, as opposed to heterogeneous. And by that I mean three things. First of all, all participants are created equal. There might be different sets of participants with different roles, but within each role, any participant is just as good as the next. Usually, a system is characterized by having some number n of participants.
Secondly, all failures are created equal. Usually, when we're analyzing a system, we talk about tolerating some number f independent failures, all of one type, so f crash failures or f Byzantine failures. Third, all learners are created equal.
Learners are the people who observe the output of the consensus. They make assumptions about which failures need to be tolerated, and which results need to be guaranteed. Traditionally, these assumptions are something like, there must be at most f failures out of the n participants.
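To make that concrete, here is a minimal Python sketch of what a homogeneous assumption amounts to; the function and names are illustrative, not taken from any implementation:

```python
from itertools import combinations

# A homogeneous failure assumption is just "at most f of the n
# interchangeable participants fail" -- and every learner shares it.
def homogeneous_scenarios(participants, f):
    """Every failure set the system promises to tolerate."""
    return [set(c) for k in range(f + 1)
            for c in combinations(sorted(participants), k)]

acceptors = {"a1", "a2", "a3", "a4"}
print(len(homogeneous_scenarios(acceptors, 1)))  # 5: no failure, or any one
```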
Previous projects have managed to break each of these assumptions independently. Upright and XFT, for example, mix crash and Byzantine failures. Byzantine quorum systems and refined quorum systems use quorums to encode heterogeneous participants. Ripple was a consensus protocol designed with heterogeneous learners.
Even pairs of these homogeneity assumptions have been broken. Stellar has heterogeneous learners and participants. Flexible BFT has heterogeneous failures and learners. And I'm here to argue that not only could Byzantine Paxos be phrased with heterogeneous failures and participants,
but we can adapt it to form a new protocol, heterogeneous Paxos, which is heterogeneous in all three ways. And I think that's really important because a lot of new systems, including most permissioned blockchains, are trying to connect lots of very different people with very different assumptions.
For example, in 2016, all of these companies were trying to run a blockchain together with R3. And of course, the actual machines running the consensus are different too. Cloud services, for instance, can be used to run consensus participants in addition to servers owned by stakeholders in whatever system you have.
People are going to have different opinions about the trustworthiness of participants in each of these services. And not all failures are independent. Machines sharing an implementation are more likely to be hacked and go Byzantine together. Machines in the same data center or on the same power grid are more likely to crash together. Machines owned by the same people are more likely to fail together. Correlated failures happen all the time, so it's not always accurate to tolerate any f independent failures. To build a system that works with many different parties, we should use a consensus that takes everyone's specific failure tolerances into account.
For our project, we're working with some fairly standard distributed systems assumptions. And we want our distributed system to keep working even when some participants fail. Crash failures are when a node simply stops functioning without warning, without detectability.
Byzantine failures can behave however they want: they can send messages that aren't part of the protocol, send messages out of order, or act arbitrarily badly.
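As a small illustrative sketch (this enum is mine, not from the talk's code), the distinction that will matter later is that a crashed acceptor never lies, while a Byzantine one might:

```python
from enum import Enum

class Failure(Enum):
    NONE = "follows the protocol"
    CRASH = "stops silently, but everything it did send was correct"
    BYZANTINE = "may send arbitrary, out-of-order, or conflicting messages"

# A crashed acceptor is still "safe" in the sense used later in this
# talk (it never sent a wrong message); a Byzantine acceptor is not.
def is_safe(failure: Failure) -> bool:
    return failure is not Failure.BYZANTINE
```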
There's one other critical component of distributed systems: someone observes the output of the distributed system, and someone makes these demands about failure tolerances. In consensus, we call that entity a learner. Usually, we think of learners as people interested in the results of the system. In a consensus problem, we have some known participants who, for historical reasons, are called acceptors.
And there are also some learners. And each learner wants to decide or learn a value such as what transaction happens next. The key property of consensus is agreement. We want learners to decide on the same value. And we want this even in the presence of failures,
either acceptors crashing or becoming Byzantine. The most famous consensus algorithm is Paxos, by Leslie Lamport. The original Paxos tolerates only crash failures, but there are several variants designed to deal with Byzantine failures as well. The most famous of these is arguably Practical Byzantine Fault Tolerance, but our protocol builds on Lamport's Byzantizing Paxos by Refinement,
mostly because it's easier to prove stuff about. We usually think of Paxos as a homogeneous algorithm. As a reminder, that means all acceptors are created equal, all failures are created equal, and all learners are created equal. But I'm here to tell you that we don't always want to operate under homogeneous assumptions. Sometimes we want heterogeneous systems. These are systems with many different, not necessarily symmetric, tolerated failure conditions, where some crash failures and some Byzantine failures can be tolerated at the same time, and where different learners make different assumptions about what may happen and what guarantees they want under which conditions. Heterogeneity means that we can tailor the system to the specific constraints of the learners and the potential failures of the acceptors. In the case where everybody really is the same, we want our heterogeneous system to reduce to the homogeneous version, but otherwise we should be able to tolerate more failures and be more efficient with our resources. So what does it take to have a heterogeneous consensus algorithm? Well, it has to be heterogeneous: that means many different failure scenarios, mixed Byzantine and crash failures,
different learners make different assumptions. And it has to be consensus. Formally, it needs three properties, non-triviality, agreement, and termination. Let's take a closer look at what these consensus properties mean in a heterogeneous setting. Non-triviality requires that if a learner decides something,
that thing must have been proposed by someone who's allowed to propose it. You can't always decide three because three may not have been proposed. It doesn't matter whether you're in a homogenous or heterogeneous setting. What we're going to do is have our learners check that a value was signed by an authorized party before they decided.
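A minimal sketch of that learner-side check, with an HMAC standing in for the real digital signatures an implementation would use; the key table and names are hypothetical:

```python
import hashlib
import hmac

# Hypothetical key table: which proposers are authorized to propose values.
AUTHORIZED_KEYS = {"proposer-1": b"secret-key-1"}

def sign(proposer: str, value: bytes) -> bytes:
    """Proposer-side: tag a value (an HMAC standing in for a signature)."""
    return hmac.new(AUTHORIZED_KEYS[proposer], value, hashlib.sha256).digest()

def may_decide(proposer: str, value: bytes, signature: bytes) -> bool:
    """Learner-side non-triviality check before deciding a value."""
    key = AUTHORIZED_KEYS.get(proposer)
    if key is None:
        return False  # not an authorized proposer
    expected = hmac.new(key, value, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

value = b"tx-42"
assert may_decide("proposer-1", value, sign("proposer-1", value))
```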
That takes care of non-triviality. Agreement is trickier: if two learners decide, they decide the same thing, as long as failure assumptions are met. In the homogeneous case, this is clear: either there were too many failures or there weren't. In the heterogeneous case, not so much.
Whose failure assumptions are we talking about? Who should agree under which assumptions? And termination is tricky too. Obviously, each observer can't decide if, say, all the participants fail. In the homogeneous case, if there aren't too many failures, we want all the learners to decide,
but we haven't yet defined "aren't too many failures" in the heterogeneous case. A warning: it is technically impossible to guarantee non-triviality, agreement, and termination, even in the homogeneous setting, if you want to tolerate failures under all the assumptions we've made so far. Protocols get around this either by using randomness cleverly, in which case they may offer only probabilistic termination, or by guaranteeing termination only when there exists some unknown time
after which message delays have some unknown bound. Paxos uses this weakened, partially synchronous termination property, and it's what we use as well. Anyway, we're left with two questions. How do we express heterogeneous failure assumptions, as opposed to homogeneous ones? And can we make a consensus algorithm that works under those assumptions?
But before we can answer those, we have to go back and learn a little more about Lamport's Paxos. Everyone remembers Lamport's Paxos as the optimal homogeneous consensus protocol. That means, among other things, that it can tolerate strictly less than half of the acceptors crashing,
or a different version can tolerate strictly less than a third of the acceptors being Byzantine. But it's usually set up exactly like that. Tolerate f out of n Byzantine failures, or tolerate f out of n crash failures. Pretty much every implementation I've come across does it this way.
But Paxos can easily be defined in a much more general way, in terms of quorums, which are subsets of participants. When your quorums are just any n − f acceptors, it has homogeneous properties. But the actual requirements are more general.
Intuitively, a quorum is a subset of acceptors which is sufficient to make a learner decide. For a system to have termination, at least one quorum must survive. For a system to have agreement, you can't have two quorums with an intersection that is entirely Byzantine. So, if there are some crash failures, but a quorum survives, composed of non-crashed participants,
they can allow a learner to decide. But if there are two quorums whose intersection is wholly Byzantine, then each Byzantine acceptor can bisimulate, meaning it acts one way for the purposes of one learner, and another way for the purposes of another learner.
And that means that each quorum can make a learner decide, and they don't necessarily decide the same thing: we don't have agreement. If there are safe acceptors in the quorum intersection, that can't happen, because a safe acceptor won't bisimulate. Now, it so happens that with six acceptors, you can make a Paxos where any four are a quorum.
And this Paxos will tolerate one Byzantine failure plus an additional crash failure, but not two Byzantine failures. So this quorum-based definition is already a little bit heterogeneous; the sketch below checks exactly that six-acceptor example.
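Here is a small Python check of those two quorum requirements; the function names are mine, for illustration:

```python
from itertools import combinations

# Termination: at least one quorum must be fully live.
def survives(quorums, crashed, byzantine):
    failed = crashed | byzantine
    return any(not (q & failed) for q in quorums)

# Agreement: no two quorums may intersect in an entirely Byzantine set.
def agreement_safe(quorums, byzantine):
    return all((q1 & q2) - byzantine  # a safe acceptor remains in every pair
               for q1 in quorums for q2 in quorums)

acceptors = set(range(6))
quorums = [set(q) for q in combinations(acceptors, 4)]  # any 4 of 6

# One Byzantine acceptor (0) plus one crash (1): both conditions hold.
assert survives(quorums, {1}, {0}) and agreement_safe(quorums, {0})
# Two Byzantine acceptors: quorums {0,1,2,3} and {0,1,4,5} meet only in {0,1}.
assert not agreement_safe(quorums, {0, 1})
```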
So, returning to our questions: how do we express heterogeneous failure assumptions? Well, with a quorum-based Paxos, they're embedded somewhere in the definition of those quorums. And can we make a consensus algorithm that works? Well, Paxos already kind of is one, for two out of three kinds of heterogeneity. Specifically, not all acceptors are created equal, and not all failures are created equal.
But what we really want is to be able to express our learners' assumptions in advance, and then see if we can get a consensus protocol from them. So, we're going to do it using a thing we call a learner graph. In the learner graph, the nodes are learners.
We'll label each learner with the conditions under which it demands termination. And each edge is labeled with the conditions under which that pair of learners wish to agree. So, suppose we have a pair of learners, I'll call them red, and five acceptors. Red learners want to terminate even if one acceptor fails.
But they also want to agree even when one acceptor is Byzantine. In practice, a traditional consensus tolerating one Byzantine failure will work for them. But suppose that there are two more learners, who I'll call yellow. And yellow learners want to terminate even when there are two crash failures.
But yellow learners accept that they may disagree if there are any Byzantine failures. I'm labeling the edge between them blank to represent no Byzantine failures. This is simple enough if red and yellow learners don't want to agree. Red learners run a one Byzantine tolerant consensus,
and yellow learners run an entirely separate two crash tolerant consensus. But what if the bottom yellow and red learners also want to agree? At least when there are no failures. In fact, agreement is transitive. When there are no failures, the bottom two learners agree.
And the red learners agree, so the top red learner agrees with the bottom yellow learner. And for this graph, the red-yellow pairs will in fact have to agree when there are no failures. Now, there is a consensus that can satisfy this learner graph, but first I want to explain exactly how we implement these labels.
Edge labels express when two learners want to agree. If we're worried specifically about Byzantine and crash failures, then it turns out that what we really have to express here are which acceptors are safe. That is to say, which acceptors won't send wrong or out of order messages.
Even a crashed acceptor is safe, but a Byzantine acceptor is not. So, if we want to tolerate any one Byzantine failure out of five acceptors, then any four acceptors are a safe set. The learners demand agreement whenever the elements of one safe set on the edge label are non-Byzantine.
I've only drawn one safe set here, but this label actually should have five safe sets, one for each possible tolerated failure. When we're only considering Byzantine and crash failures, it turns out that the conditions that matter to a learner for whether it can terminate concern who can crash. Usually we assume that Byzantine nodes can crash, but you can also represent alive-but-corrupt failure types. So, if yellow learners want to terminate when any two out of five acceptors have crashed,
then they must be able to terminate with any three live acceptors. By tradition, these live sets are called quorums. They are sets of acceptors sufficient to make a learner decide. So, on our learner graph, red learners have a quorum of any four acceptors, and all edges of the yellow learners have one safe set consisting of all the acceptors.
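One plausible way to encode this red/yellow graph in Python; the representation here is assumed for illustration, and the paper defines the learner graph more carefully:

```python
from itertools import combinations

acceptors = {"a1", "a2", "a3", "a4", "a5"}

def any_k(k):
    return [set(q) for q in combinations(sorted(acceptors), k)]

# Node labels: each learner's quorums (live sets that suffice to decide).
quorums = {
    "red1": any_k(4), "red2": any_k(4),        # terminate despite 1 failure
    "yellow1": any_k(3), "yellow2": any_k(3),  # terminate despite 2 crashes
}

# Edge labels: safe sets. A pair demands agreement whenever every member
# of at least one listed safe set is non-Byzantine.
safe_sets = {
    ("red1", "red2"): any_k(4),               # one Byzantine tolerated
    ("yellow1", "yellow2"): [set(acceptors)], # no Byzantine tolerated
    ("red2", "yellow2"): [set(acceptors)],    # the extra red-yellow edge
}
```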
So, now we're ready to build a consensus for these learners. But first, we have to take one more look at Lamport's Paxos. Paxos involves a step called 1b, in which acceptors respond to the question,
is there any reason we can't choose this value? And one possible reason is: I've previously accepted something else. Now, I should note that accepting is a thing that acceptors do in Paxos; it is not the same as when learners decide. However, this is the step that ensures that intersecting quorums
can't make learners decide different things. The acceptors in the intersection will have a reason to put in their 1bs. So, we're going to make two changes to Byzantine Paxos. First, we're going to run a different instance of Paxos for each learner, concurrently.
So, if we're talking about these two learners, we run two different Paxoses, one red, where any four acceptors are a quorum, and one yellow, where any three acceptors are a quorum. And second, in these Paxoses, 1bs can include, I've previously accepted something else in another Paxos.
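As a rough sketch of that second change (the message shape here is assumed, not the paper's wire format), an acceptor's 1b for one learner's Paxos can report everything it has accepted in any learner's Paxos:

```python
from dataclasses import dataclass, field

@dataclass
class OneB:
    learner: str   # whose Paxos instance this 1b belongs to
    ballot: int
    accepted: list = field(default_factory=list)  # (learner, ballot, value)

class Acceptor:
    def __init__(self):
        self.accepted = []  # every value accepted, across all learners

    def on_accept(self, learner, ballot, value):
        self.accepted.append((learner, ballot, value))

    def on_one_a(self, learner, ballot):
        # Report all prior accepts, including other learners' Paxoses:
        # this is what keeps entangled learners from deciding differently.
        return OneB(learner, ballot, list(self.accepted))

a = Acceptor()
a.on_accept("red", 1, "tx1")
print(a.on_one_a("yellow", 2).accepted)  # [('red', 1, 'tx1')]
```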
That prevents the two learners from deciding different things, under some conditions. Now, I know what you're thinking: Isaac, that sounds terribly inefficient. You're running a different Paxos for every single observer. Well, in reality, the vast majority of conceptual messages for each conceptual Paxos share an instantiation on the wire with messages from other Paxoses. Most notably, if the learners' edges all have the same label, this is exactly regular Paxos. For this graph, the Paxoses for the red learners have quorums of any four acceptors, and for the yellow learners, any three acceptors.
So long as there is at most one crash failure, there will be one live red and one live yellow quorum, and they all intersect. If there's a Byzantine failure, red learners will agree, but yellow learners may not. And if there are two crashes, the yellow learners will agree and terminate,
but the red learners may never terminate. So at last, we have a consensus with that third kind of heterogeneity: not all learners are created equal. So, returning to our questions. How do we express heterogeneous failure assumptions? With a learner graph: each learner expresses their termination conditions,
and each pair expresses their agreement conditions. Can we make a consensus algorithm that works for them? Well, clearly sometimes yes, because we just did one. And sometimes no. If two learners want to agree even when all the acceptors fail, that's not possible. We need to generalize Paxos's quorum requirements.
Recall that Paxos requires that one quorum survives, and that no two quorums have an intersection that is entirely Byzantine. And this is what our generalization looks like. To summarize: under the conditions on some edge of the learner graph, any quorum of one learner must have a non-Byzantine intersection with any quorum of the other learner.
In order to guarantee termination, just like in the homogeneous case, a learner must have a live quorum. So yes, we know exactly when we can make a consensus algorithm that works with a given graph: we run our transitivity process to condense the graph, and then we check whether heterogeneous consensus is possible. A miniature version of that check appears below.
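In miniature, the agreement side of that check can look like this for the red/yellow example; the encoding is assumed, as before:

```python
from itertools import combinations

acceptors = set(range(5))
def any_k(k): return [set(q) for q in combinations(sorted(acceptors), k)]

# Generalized agreement requirement for one learner-graph edge: for every
# safe set on the edge's label, every quorum of one learner must intersect
# every quorum of the other in at least one safe acceptor.
def edge_ok(quorums_a, quorums_b, safe_sets):
    return all(qa & qb & s
               for s in safe_sets
               for qa in quorums_a for qb in quorums_b)

print(edge_ok(any_k(4), any_k(4), any_k(4)))     # red-red: True
print(edge_ok(any_k(3), any_k(3), [acceptors]))  # yellow-yellow: True
print(edge_ok(any_k(4), any_k(3), [acceptors]))  # red-yellow: True
print(edge_ok(any_k(3), any_k(3), any_k(4)))     # yellow + 1 Byzantine: False
```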
We've implemented Heterogeneous Paxos, mostly to show that we can, and also to show that there are performance benefits from heterogeneity. We used the Charlotte framework for authenticated distributed data structures to build blockchains,
where each block requires a proof of consensus referencing another, to prove that the referenced block belongs in the chain. This was implemented in about 1700 lines of Java with fairly standard building blocks. So for one experiment, we have an organization we call Red.
Red has two learners and three acceptors, and red learners want to terminate even if an acceptor has crashed. But they accept that they may not agree if an acceptor is Byzantine. There is also another organization we call Blue. Blue also has two learners and three acceptors, and red learners do not care about blue acceptors.
Red learners want to terminate even with all the blue acceptors crashed, and red learners want to agree even with all the blue acceptors Byzantine. And blue learners, likewise, don't care about red acceptors. They want to terminate even when all the red acceptors have crashed, and at most one blue acceptor has crashed. And they want to agree as long as none of the blue acceptors are Byzantine, regardless
of the behavior of the red acceptors. Now I can tell you right now that no consensus is going to use only these participants and get the red and blue learners to agree. They're going to need some at least slightly trustworthy third parties. So suppose that all the learners want to terminate as long as no more than one third party has crashed.
And red learners want to agree among themselves when a third party is Byzantine. Blue learners want to agree with each other when a third party is Byzantine. So when can red learners agree with blue learners?
Well, suppose that they want to agree when a red acceptor is Byzantine and a blue acceptor is Byzantine, but none of the third-party acceptors are Byzantine. How many third parties do we actually need? If you use a regular old homogeneous Byzantine Paxos, then we still have some red learners who want to agree in the presence of five failures. Five failures means you'll need 16 homogeneous acceptors (n must be at least 3f + 1), which means at least 10 third parties. And your consensus would look like this, with quorums of any 11 acceptors. But Heterogeneous Paxos can save us some resources here.
It turns out that we only need three third parties. Any two red acceptors plus any two third-party acceptors form a quorum for red learners, and likewise any two blue acceptors plus any two third-party acceptors form a quorum for blue learners. You can see that if there were a third-party Byzantine failure, red learners could disagree with blue learners. But that's okay, according to the learner graph; a quick check of these quorums follows.
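A small check of those quorums under the stated assumptions; the names are illustrative:

```python
from itertools import combinations

red, blue, third = {"r1", "r2", "r3"}, {"b1", "b2", "b3"}, {"t1", "t2", "t3"}

def two_plus_two(own):  # any 2 of your own acceptors + any 2 third parties
    return [set(a) | set(b)
            for a in combinations(sorted(own), 2)
            for b in combinations(sorted(third), 2)]

def edge_ok(quorums_a, quorums_b, safe_sets):
    return all(qa & qb & s for s in safe_sets
               for qa in quorums_a for qb in quorums_b)

# Red-blue edge: tolerate one Byzantine red and one Byzantine blue acceptor,
# but no Byzantine third parties.
safe = [(red - {r}) | (blue - {b}) | third for r in red for b in blue]
print(edge_ok(two_plus_two(red), two_plus_two(blue), safe))  # True

# One Byzantine third party breaks red-blue agreement, as the talk notes:
bad = [(red | blue | third) - {t} for t in third]
print(edge_ok(two_plus_two(red), two_plus_two(blue), bad))   # False
```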
So we've implemented this scenario. For our experiments, we used a light client, which wrote transactions and passed them to some proposer machines for the Paxoses to process. Paxos has a best-case latency of three messages, or five once we include the messages to and from the light client. We put 100 milliseconds of artificial latency on the network connections, and therefore the theoretical best time would be 500 milliseconds. We ran homogeneous Byzantine Paxos, which is pretty easy given that our heterogeneous Paxos reduces to the homogeneous case if we give it homogeneous assumptions.
So we used 16 acceptors, and the median latency turned out to be about 555 milliseconds, or 55 milliseconds of overhead processing time. In contrast, the heterogeneous setup, using only nine acceptors, had only 37 milliseconds of median overhead processing time.
Here we've plotted the distribution of latency for appending a thousand sequential blocks to our little blockchain. The bars mark the top 5%, bottom 5%, and median latencies. The reduced time comes chiefly from having to process fewer messages, which means fewer signatures, fewer hashes, and so on. So demanding that this example system use a homogeneous consensus costs an unnecessary extra seven third-party acceptors, which adds an unnecessary 51% to the median latency overhead. So that's the kind of consensus you need when you have heterogeneous acceptors, heterogeneous failures, and heterogeneous learners.
We built a heterogeneous consensus. We formalized a definition for what it means to have a heterogeneous consensus. We expressed heterogeneous failure assumptions with a learner graph. And we have a quorum requirement that tells us when consensus is possible. Our consensus, put very briefly, runs one Paxos per learner, where one of the messages
can include: I've accepted in another Paxos. I'd like to thank the OPODIS committee, as well as my co-authors, Xinwen Wang, Robbert van Renesse, and Andrew Myers. I am Isaac Sheff, and at this URL, you can find links to our paper, this presentation,
these slides, and our technical report. Our paper includes a detailed step-by-step description of heterogeneous Paxos, and our technical report includes additional examples of heterogeneous consensus scenarios and additional experiments, as well as proofs of correctness. Thank you.