
New Fault Tolerant Subset Preservers


Formal Metadata

Title
New Fault Tolerant Subset Preservers
Subtitle
Q/A Session E - Paper E1.F
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Transcript: English (auto-generated)
This is joint work with Greg Bodwin, Keerti Choudhary, and Merav Parter. Today we will talk about new fault-tolerant subset distance preservers.
This work is about fault-tolerant distance preservers, so let me start with the general definition of a distance preserver. The input for this problem is a graph G and a set of node pairs P, and the goal is to compute a sparse subgraph H that preserves the exact distances between the node pairs of P. In the illustration, you can see that the distance between the yellow pair of nodes is exactly the same in H and in G,
and the same goes for the green pair, but we try to make H as sparse as possible. Formally, for every pair (s, t) in P, we require that the distance between s and t in H is the same as their distance in G.
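As a quick illustration of this definition (a minimal sketch of my own, not code from the paper), here is a brute-force check of the preserver property, assuming networkx graphs whose edges carry a 'weight' attribute and that H is a subgraph of G:

```python
# Hypothetical helper, only illustrating the definition of a distance preserver.
import networkx as nx

def is_distance_preserver(G, H, P):
    """Check dist_H(s, t) == dist_G(s, t) for every demand pair (s, t) in P."""
    for s, t in P:
        d_G = nx.shortest_path_length(G, s, t, weight='weight')
        try:
            d_H = nx.shortest_path_length(H, s, t, weight='weight')
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            return False        # pair disconnected (or missing) in H
        # H only uses edges of G, so d_H can never be smaller than d_G
        if d_H != d_G:
            return False
    return True
```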
Clearly, we can take H equal to G, that is, take all the edges of the graph; this is always a distance preserver, but our goal is to make H as sparse as possible. Our work focuses on the special case where P = S × S. Now all the yellow nodes are part of the same set S,
and we add edges to H to make it a preserver of S, meaning that for every pair of nodes from S, the distance between them remains the same in H as in G. This is called a subset distance preserver, and it is quite common; it is actually one of the primary reasons distance preservers were first developed.
One of its main applications is communication networks over real-life graphs, where parts of the graph can spontaneously fail. So what happens if there is a failure in this network? Since H is very sparse, the failure can completely disconnect it,
while in G not only does the graph stay connected, the distance between the yellow nodes is not even affected. This motivates an additional requirement, called fault tolerance. For a subset fault-tolerant preserver, we are given a graph G and a set of nodes S,
and our goal is to find a sparse subgraph H that preserves exact distances between the pairs of S even upon a failure. What do I mean? If there is a failure x, then H − x is still a preserver of G − x,
meaning that in the presence of the failure, the shortest path between every pair of yellow nodes has the same length. This holds for every possible failure; it may also be this one. Formally, for every pair of nodes from S and for every fault x, the distance between them in G − x is the same as in H − x.
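The vertex fault-tolerance condition can be stated the same way as a brute-force check over all possible failures; again this is only a hedged sketch of the definition (assuming networkx and that H spans the same vertex set as G), not anything from the paper:

```python
# Hypothetical checker, only illustrating the definition of a subset
# vertex fault-tolerant preserver: for all s, t in S and every vertex x,
# dist_{H-x}(s, t) must equal dist_{G-x}(s, t).
import networkx as nx

def _dist(G, s, t):
    try:
        return nx.shortest_path_length(G, s, t, weight='weight')
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        return float('inf')     # disconnected pairs compare equal via infinity

def is_subset_vft_preserver(G, H, S):
    for x in G.nodes:
        G_x, H_x = G.copy(), H.copy()
        G_x.remove_node(x)
        if H_x.has_node(x):
            H_x.remove_node(x)
        for s in S:
            for t in S:
                if len({s, t, x}) < 3:   # skip s == t and pairs involving x
                    continue
                if _dist(H_x, s, t) != _dist(G_x, s, t):
                    return False
    return True
```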
I will also mention that the definition of fault tolerance for edges is the same, except that x is an edge; but this work is about vertex fault tolerance, so when I say fault, I usually mean a vertex fault. So now we have added enough edges to H so that H is resilient to a single vertex fault. Our work is about fault-tolerant subset distance preservers. There are various settings in which one can study fault-tolerant preservers,
which include directed and undirected graphs, and weighted and unweighted graphs. Let us begin with undirected graphs and the state of the art so far, starting with the single-pair edge fault-tolerant case. This is the case where we have s and t and the path between them,
and any edge on this path can fail. What we want from our fault-tolerant preserver is that, for any such edge, it contains a shortest path between s and t avoiding that edge. Bodwin et al. showed a linear bound for this problem.
Next, a result by Bilò et al. gives Õ(ns) edges for a subset edge fault-tolerant preserver. This means that we have a set S, just like before, and we want to preserve the distances within this set following the failure of an edge.
The current literature reflects a world in which vertex faults are harder to analyze than edge faults, and many questions in this area are closed for edge fault tolerance but open for vertex faults. Another relevant result is the one of Coppersmith and Elkin: an upper bound of O(ns) for subset distance preservers in the fault-free setting.
Note that this is very close to the state-of-the-art result for edge fault tolerance, and we will come back to it in the next slide. So here is our result for undirected graphs: given an n-node undirected weighted graph G
and a set of source nodes S of size s, there is a subset vertex fault-tolerant preserver with at most Õ(ns) edges. When s = 2, meaning that we only have a single pair, this bound improves to be strictly linear.
We also give a matching lower bound, which means that for every size s of the source set and for every n, there is a weighted graph that requires at least Ω(ns) edges in any subset vertex fault-tolerant preserver.
So not only is our result as sparse as the subset fault-tolerant preservers for edge faults, it is also almost as sparse as the known upper bound in the fault-free setting proved by Coppersmith and Elkin. I believe that this reflects the weakness of our current understanding of non-faulty subset distance preservers,
rather than the conclusion that fault tolerance actually comes for free, or for a log n factor. Still, the non-faulty subset distance preserver bounds have resisted improvement for the last 15 years, so this unintuitive hypothesis of free fault tolerance may be hard to refute. Our hypothesis is that, within the bounds of current knowledge, the presented lower bound holds in the unweighted case as well.
This completes the picture with regard to undirected graphs, so let us now consider directed graphs. In directed graphs, the only non-trivial prior result is the one of Bodwin et al.,
showing that for a single-pair fault-tolerant preserver, which again means that we only consider one s and one t and want to preserve the distance between them no matter which edge fails, the correct size bound is exactly that of a non-faulty distance preserver
in the fault-free setting with n demand pairs, whose bounds were given by Coppersmith and Elkin: it is at least n^{4/3} and at most n^{3/2}. The most interesting case that was left open by the Bodwin et al. work is the one of directed and unweighted graphs.
For this case, we provide the following result. For a directed unweighted graph G, a set of sources S, and a fixed root r, we define P to be {r} × S, meaning that every pair consists of r and some vertex of S; this is our pair set.
Then there is a fault-tolerant preserver with Õ(n^{4/3} s^{5/6}) edges. When s = 1, which means that we only have one single pair to consider, this bound improves to O(n^{4/3}). Like before, we also show conditional tightness of this bound,
meaning that until there is an improvement in the fault-free setting, we believe this is optimal. Lastly, we turn to weighted directed graphs, and we show that for a directed weighted graph G, with the same setting and definition of P,
there is a fault-tolerant preserver with Õ(n^{3/2} s^{3/4}) edges, which extends the state of the art by Bodwin et al. In today's talk, we focus on our first result, for undirected graphs. Just to remind you of the problem: we are given an undirected graph G and a set of sources S.
Let us go carefully through the output. Our H is defined to contain all the edges of the replacement paths P(s, t, x), where s and t are sources and x is the failure, and a replacement path P(s, t, x) is defined to be the shortest path from s to t in G − x.
You can see a yellow illustration of this replacement path. Also remember that it is parameterized by three vertices: the source, the target, and the failing vertex x.
A bit of notation: we use T_s to denote the shortest path tree of a source s; you can think of the BFS tree in an unweighted graph. And T_S is just the union of all of these trees. Our algorithm is as follows. First, we add all of these shortest path trees to H.
This costs us O(sn) edges, n edges for each tree. Next, we carefully choose O(n log n) additional edges for every source node, ending with O(ns log n) edges for the entire preserver.
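Here is a minimal sketch of that first step only (my own illustration, again assuming networkx and a 'weight' attribute); the carefully chosen additional edges are the technical heart of the construction and are not reproduced here:

```python
# Hypothetical first step: add the shortest-path tree T_s of every source
# s in S to the preserver H.
import networkx as nx

def add_shortest_path_trees(G, S):
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    for s in S:
        # predecessor map of a shortest-path (Dijkstra) tree rooted at s
        preds, _ = nx.dijkstra_predecessor_and_distance(G, s, weight='weight')
        for v, plist in preds.items():
            if plist:                       # v is reachable and v != s
                u = plist[0]                # keep one tree parent per vertex
                H.add_edge(u, v, **G[u][v])
    # at most n - 1 tree edges per source, so O(sn) edges in total
    return H
```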
Our technique is based on two simplifications, which we can make without loss of generality. The first one is that we allow ourselves to analyze short replacement paths only, and this is due to a hitting set argument. The second one is a reduction from a general vertex fault in the tree T_s
to the specific case where the fault occurs on a single path of T_s, and this reduction is based on the useful tool of heavy path decomposition; we will talk about it shortly. So for the first one, we observe that it is enough to consider only short replacement paths.
We define P(s, t, x) to be short if the number of edges in P(s, t, x) is no more than n/s, and we define P_short to be the set of all the short replacement paths that we want to preserve. Our claim is that it is enough to bound the edges of P_short only,
and this is by a hitting set argument. We can sample O(s) random vertices and add them as new sources; then, on every longer replacement path, each edge lies between two of the new sources, and the subpath between them has at most n/s edges.
So each edge of a longer replacement path actually lies on a short replacement path, and this is why it is enough to consider only edges of short replacement paths.
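As a hedged restatement of the standard hitting-set calculation (my own sketch, with an unspecified constant c rather than the paper's exact parameters): if every vertex is sampled independently with probability c·s·ln n / n, then a fixed stretch of n/s consecutive path vertices avoids all samples with probability at most

```latex
\left(1 - \frac{c\, s \ln n}{n}\right)^{n/s} \;\le\; e^{-c \ln n} \;=\; n^{-c},
```

and a union bound over the polynomially many replacement paths shows that, with high probability, the sample hits every long replacement path.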
Now for the second simplification, which allows us to consider faults only on a path. For that, we use the heavy path decomposition, which is a decomposition of the tree T_s into vertex-disjoint paths that was introduced by Sleator and Tarjan. The input for this decomposition is the tree T_s, and the output is a partition of the tree into vertex-disjoint paths
with some useful properties. Just to be on the same page, let me briefly explain how this decomposition can be computed. A basic definition is that of a heavy child, which is the child with a maximal subtree size, and the decomposition is built by repeatedly following the heavy child from the root downward, like this.
The desired decomposition is obtained by recursively applying this process to the remaining subtrees, and then we are done. A useful lemma is that for any vertex v, its path to s in T_s intersects at most log n paths of the decomposition.
Let us consider v and the yellow path; I will just give the intuition for this. Any black edge means that we did not follow a heavy child, which means that the size of the subtree is cut at least by half,
and this is why there can be at most log n such edges. This will be very useful in our analysis.
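Since heavy path decomposition is a standard tool, here is a minimal self-contained sketch of how it can be computed (my own illustration, assuming the tree is given as a dict `children` mapping each node to the list of its children):

```python
# Hypothetical heavy path decomposition (Sleator-Tarjan style): returns a list
# of vertex-disjoint paths covering the tree rooted at `root`.
def heavy_path_decomposition(children, root):
    size = {}

    def compute_size(v):                    # subtree sizes, bottom-up
        size[v] = 1 + sum(compute_size(c) for c in children.get(v, []))
        return size[v]

    compute_size(root)
    paths, heads = [], [root]               # heads of decomposition paths
    while heads:
        head = heads.pop()
        path, v = [], head
        while True:                         # always descend to a heavy child
            path.append(v)
            kids = children.get(v, [])
            if not kids:
                break
            heavy = max(kids, key=lambda c: size[c])
            heads.extend(c for c in kids if c != heavy)   # light subtrees start new paths
            v = heavy
        paths.append(path)
    return paths
```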
We now turn our focus to the edges of replacement paths with the failure on a specific path Q of the decomposition, where Q is denoted x_0, ..., x_k. We define H_{s,Q} to include all the replacement paths with the failure x on Q and the target t in S, which are all the replacement paths of interest anyway. Our goal is to bound the number of edges in H_{s,Q} by roughly n_Q,
where n_Q is the size of the subtree rooted at x_0. Again, since we want to use the heavy path decomposition, we want to analyze Q locally, which means that the bound should depend not on the entire number of vertices in the graph,
which is n, but on a smaller quantity, namely the number of vertices in the subtree rooted at x_0, which is the subtree containing Q. Our strategy is as follows. How do we handle each replacement path?
We partition them into several classes. We have many kinds of replacement paths, each with a failing vertex on Q, and we partition them into classes depending on the location of the target t with respect to the failing vertex.
Then the edges in each class will be bounded separately. So let us fix Q and s, and let us also fix the failure; we divide the tree minus the failure into three parts,
a partition that was introduced by Baswana and Khanna. The first part is U_i, which includes all of the shortest path tree except the subtree rooted at x_i, as we see in the picture. The second part is D_i, which is the subtree of the vertex of Q that follows our fault,
as we can see in the illustration, and the remaining part, O_i, is just the part in between, again as in the illustration. So we have three options for where the target t can be. Remember that s is fixed and x is fixed now,
so the only thing that determines the replacement path now is t. So t can be in U_i, t can be in O_i, or t can be in D_i, and those are exactly our classes.
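For concreteness, here is how this three-way classification could be read off the tree (a hypothetical helper of my own, not code from the paper); `subtree[v]` is assumed to be a precomputed set of the vertices in the subtree of T_s rooted at v, and Q is the list x_0, ..., x_k:

```python
# Hypothetical classification of a target t with respect to the failure x_i on Q.
def classify_target(t, i, Q, subtree):
    x_i = Q[i]
    if t not in subtree[x_i]:
        return 'U_i'   # unaffected: the tree path from s to t avoids x_i
    if i + 1 < len(Q) and t in subtree[Q[i + 1]]:
        return 'D_i'   # below x_{i+1}, the next vertex of Q after the fault
    return 'O_i'       # hangs off x_i outside Q: the part "in between"
```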
When t is in U_i, this path is already in the shortest path tree of s, because t is not in the subtree of the failure, meaning that the failure was never on the original path from s to t; so this case is easy. For the case that t is in O_i, we use a counting argument which relies heavily on the fact that x_i is the LCA of t and x_k.
Let me just convince you of that: if t were here, for example, we would expect the LCA to be somewhere here, and if t were here, we would expect the LCA to be somewhere here. So indeed, t is in O_i if and only if x_i is the LCA of t and x_k.
The counting argument is based on the heavy path decomposition. We know that there are at most log n paths Q such that x_i is really on the path to Q, and I remind you that we have only short paths,
so this means that we have enough edges to add all the replacement paths with this property. In other words, the counting argument says that this is not too many edges, so we just add all the short replacement paths with the property that t is in O_i.
Next we have the most involved case, where t is in D_i; this is D_i, and the yellow nodes are the options for t. So fix x_i, with t in D_i, and the replacement path looks somewhat like this;
I will not go over it in detail, only briefly, to give you the intuition. In this class we ask which edges of the replacement path are already in H, and for that we define some special vertices. For example, for this vertex,
we know by our previous argument that this path is already in T_S. We also define this vertex, and we show that the path you see right now is actually in T_t, since the failure x_i cannot be on its shortest path to t.
So all that is left for us to handle is this small segment in the picture, which might be long in reality. What are we going to do? Our strategy is, for any x_i, for any failure, to embed this shortest path into a smaller graph Ĝ_i on the nodes of O_i.
There may be several replacement paths in here, and we define a new graph which also contains edges that are not in G. Then we compute a shortest path tree in Ĝ_i,
and we use it in order to decide which edges from G we can add to H; this shortest path tree in Ĝ_i covers all of these replacement paths. In addition, we add two edges at the endpoints of every replacement path,
and this gives us the bound. We also use the fact that the O_i's are mutually disjoint, as you can see in the picture quite easily. So the summation over all O_i's, over all failures that can possibly happen on Q, is O(n_Q), which is the size of the subtree rooted at the beginning of Q.
So this is what we wanted. To sum up, if H_{s,Q} is the set of all edges of replacement paths whose failure lies on Q, we can show that its size is O(n_Q) plus the number of such replacement paths;
as you recall, we added two edges for each replacement path. This is actually a good point to mention that for the case of a single pair, we already get a linear bound here, since n_Q is just n:
if we take Q to be the shortest path between s and t and apply this analysis, then Q is the path between s and t, so n_Q is just n. And the number of replacement paths, with t fixed and x on Q, is bounded by n, since that is a bound on the length of a shortest path.
So we get a linear bound here for a single demand pair. In the general case, we have many paths Q from the heavy path decomposition, and we would like to check how many edges we add overall.
So let us check this. First we bound the sum of n_Q over all Q. The reminder is that each vertex appears in at most log n of the subtrees rooted at the first vertex of some Q,
so each vertex v is counted on the left-hand side at most log n times, and this is why the summation is O(n log n).
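In symbols, with T(x_0^Q) denoting the subtree rooted at the first vertex of Q, the argument above reads:

```latex
\sum_{Q} n_Q \;=\; \sum_{Q} \bigl|T(x_0^{Q})\bigr|
\;=\; \sum_{v \in V} \#\bigl\{\, Q : v \in T(x_0^{Q}) \,\bigr\}
\;\le\; n \log n .
```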
Next, to bound the number of replacement paths P(s, t, x) with x on Q: since the Q's form a partition of T_s, these sets of replacement paths are mutually disjoint, so all we need to bound is the total number of short replacement paths from s, which is P_s. How do we do it?
We define P_{s,t}, which includes all the replacement paths from s to t, and we show that it is actually sufficient to prove that for every t, the number of replacement paths from s to t is no more than n/s. Again, why is this enough? Since we have s such targets t,
it immediately gives a bound of O(n) on the number of paths in P_s. So now all that is left for us is this last step; I do not know if I will have time to show you the weighted case, but let us begin with the unweighted case.
If the replacement path is short, it means that the path between s and t is short, with at most n/s edges. And as we said, in a replacement path P(s, t, x), s is fixed and t is fixed, and for x we have only n/s possibilities. This gives us the bound on the number of replacement paths from s to t.
For the weighted case, I will just give a proof sketch. Consider any replacement path with the failure in here, where by "in here" I mean between s' and t', which are at distance roughly n/s from s and from t; we know that the replacement path must avoid all of these segments.
Otherwise, it would not be short. So again, we have a fixed s and a fixed t, and for x we have only n/s possibilities between s and s', n/s possibilities between t' and t, and one more for the part in the middle, so we finish with at most O(n/s) possibilities for the failure as well.
Putting it all together, we show that the union over all paths of the heavy path decomposition has no more than O(n log n) edges, and summing this over all sources in S yields the desired bound of Õ(ns).
This summarizes the main ideas of our approach, and I will be happy to answer any questions in the Q&A session. Thank you for your attention, and I hope you enjoyed this talk.