Graph algorithms and population protocols: Fast Hybrid Network Algorithms for Shortest Paths in Sparse Graphs
This is a modal window.
Das Video konnte nicht geladen werden, da entweder ein Server- oder Netzwerkfehler auftrat oder das Format nicht unterstützt wird.
Formale Metadaten
Titel |
| |
Serientitel | ||
Anzahl der Teile | 30 | |
Autor | ||
Lizenz | CC-Namensnennung 4.0 International: Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen. | |
Identifikatoren | 10.5446/52873 (DOI) | |
Herausgeber | ||
Erscheinungsjahr | ||
Sprache |
Inhaltliche Metadaten
Fachgebiet | ||
Genre | ||
Abstract |
|
00:00
Zellularer AutomatRechnernetzLASER <Mikrocomputer>FunknetzDatennetzStellenringGraphMessage-PassingKnotenmengeBildschirmmaskeDatenmodellMobiles InternetKürzester-Weg-ProblemApproximationDurchmesserSchwach besetzte MatrixGraphentheorieDatennetzEndliche ModelltheorieHybridrechnerRechnernetzZusammenhängender GraphInterprozesskommunikationATMDurchmesserTypentheorieStichprobenumfangSprachsyntheseProzess <Informatik>GraphDifferenteDreiecksfreier GraphLaufzeitfehlerAlgebraisches ModellSchnelltasteBildschirmmaskeMereologieApproximationBitMinimalgradStellenringGRASS <Programm>Message-PassingInformationGüte der AnpassungRechenzentrumDatenstrukturGrenzschichtablösungLokales NetzDatensatzNetzwerktopologieParametersystemResultanteOffice-PaketWald <Graphentheorie>Hecke-OperatorKategorie <Mathematik>Lokales MinimumZahlenbereichPhysikalischer EffektCAN-BusKanalkapazitätTotal <Mathematik>MultiplikationsoperatorNeuroinformatikLASER <Mikrocomputer>BildverstehenCharakteristisches PolynomKonditionszahlMultifunktionVerschlingungSchwach besetzte MatrixStandardabweichungLoginZeiger <Informatik>GeradeAuflösung <Mathematik>AlgorithmusAbstandVollständigkeitKnotenmengeEbener GraphKürzester-Weg-ProblemTeilbarkeitFokalpunktZellularer AutomatSpannweite <Stochastik>CoprozessorDrahtloses lokales NetzIdentifizierbarkeitGebundener ZustandSynchronisierungCachingMobiles InternetOverlay-NetzComputeranimation
07:40
Kürzester-Weg-ProblemDurchmesserGeradeZeiger <Informatik>GraphGraphentheorieAbstandKnotenmengeDreiecksfreier GraphGerichteter GraphVirtuelle RealitätLokales MinimumGraphDurchmesserNetzwerktopologieDreiecksfreier GraphKnotenmengeKonstruktor <Informatik>Quick-SortPolygonzugAbstandLokales MinimumNeuroinformatikBroadcastingverfahrenGeradeKürzester-Weg-ProblemResultanteReverse EngineeringExzentrizitätDickeBitRechter WinkelRichtungVektorpotenzialTotal <Mathematik>MinimalgradSchlussregelOrientierung <Mathematik>Wald <Graphentheorie>RechnernetzGewicht <Ausgleichsrechnung>DifferenteFramework <Informatik>MAPPartitionsfunktionCASE <Informatik>Helmholtz-ZerlegungVirtualisierungGerichteter GraphGraphentheorieInterprozesskommunikationComputerspielProdukt <Mathematik>Klasse <Mathematik>SkalarproduktVersionsverwaltungUniformer RaumElektronischer FingerabdruckStapeldateiSoftwareschwachstelleEinsGRASS <Programm>Kartesische KoordinatenMechatronikLokales NetzElektronische PublikationPartikelsystemGruppenoperation
15:19
DurchmesserGraphKürzester-Weg-ProblemZusammenhängender GraphKnotenmengeWurzel <Mathematik>Netzwerktopologiep-BlockMultiplikationsoperatorAbstandGraphExzentrizitätDurchmesserPartitionsfunktionDickeMessage-PassingGerichteter GraphBitAlgorithmusNetzwerktopologieDifferenteKnotenmengep-BlockDreiecksfreier GraphLokales MinimumGewicht <Ausgleichsrechnung>VirtualisierungEinsRoutingExploitLie-GruppeMultifunktionMusterspracheVollständigkeitParallele SchnittstelleSummierbarkeitPolygonzugPASS <Programm>Zusammenhängender GraphRichtungRechter WinkelArithmetischer AusdruckReverse EngineeringMereologieGeradeProzess <Informatik>EINKAUF <Programm>SichtenkonzeptKlasse <Mathematik>Trennschärfe <Statistik>Elektronische PublikationReelle ZahlStandardabweichungARM <Computerarchitektur>Endliche ModelltheorieEinfügungsdämpfungSchlussregelOffice-Pakett-TestDiagramm
21:18
SchnelltasteKnotenmengeGraphLokales MinimumApproximationGewicht <Ausgleichsrechnung>MultiplikationZufallszahlenMatrix <Mathematik>AbstandDurchmesserRundungHelmholtz-ZerlegungNetzwerktopologieHelmholtz-ZerlegungGraphKnotenmengeApproximationSchnelltasteAbstandNetzwerktopologieRechnernetzLokales MinimumDickeTeilbarkeitTermGraphfärbungMereologieDurchmesserMultiplikationsoperatorInformationAbgeschlossene MengeMAPSelbstrepräsentationGraphentheorieKategorie <Mathematik>ZahlenbereichGewicht <Ausgleichsrechnung>DifferenteParallelrechnerMetrisches SystemDreiecksfreier GraphMultiplikationLoginMatrix <Mathematik>EinsDreiecksungleichungPaarvergleichEndliche ModelltheorieGRASS <Programm>Bildgebendes VerfahrenKomplexe EbeneProzess <Informatik>UngleichungChord <Kommunikationsprotokoll>BeobachtungsstudieSoftwaretestMittelwertSystemaufrufResultanteExogene VariableKanalkapazitätQuick-SortComputeranimation
Transkript: Englisch(automatisch erzeugt)
00:00
Hi, everybody. My name is Kristian Hintoy. Welcome to my talk. I'm going to talk about fast hybrid network algorithms for shortest paths in sparse graphs with just joint work with Michael Feldmann and Christian Scheideler. So first of all, what's a hybrid network? We all communicate in a hybrid fashion or in hybrid networks, so to say.
00:22
For example, these two guys on the left, they use communication via smoke signals on the one hand and on the other hand, they also speak to each other and use verbal communication as a different mode of communication. All right. So in the right picture, this is depicted by two different edges, two different types
00:44
of edges, one, yeah, giving long range communication, smoke signals in this example, and also a short range communication mode, this blue edge via speech. So we do this all the time.
01:01
In computer networks, this idea has also been, yeah, incorporated into practical use in recent years. For example, in data centers, in many data centers, not only can the processors or computers communicate via fixed cable networks, but they can also use, for example, lasers
01:21
or wireless connections to establish dedicated links between different computers, which gives them the opportunity to communicate large amounts of data to some dedicated other computer. For cars, this has also been envisioned in car-to-car communication by using both ad-hoc
01:43
and cellular networks and exploiting the different characteristics of these two types. So our work is based on the model from Augustin et al., which distinguishes two types of communication. So it distinguishes a local network and a global network, whereas the global network
02:03
forms some fixed communication network indicated by the blue edges in this example, the global network forms a clique. So in the global network, every node can, in principle, reach any other node. We assume that nodes have identifiers of size of log n, and the two different modes
02:24
of communication, so this is the clique global network. They have two different communication capacities, communication bounds. So we have a synchronous network or synchronous model in which each node can send and receive our log n messages over each local edge, so over each edge of the local network,
02:42
which corresponds to the congest model. On the other hand, we also have the opportunity to, the capability to send O of log n messages in total over global edges at each node. So each node can communicate O of log n messages over global edges, which corresponds to the so-called node-capacitated clique, which is a fairly recent model.
03:05
So the difference here, of course, is that in the local network, we have a communication cache capacity at each edge of the network, whereas in the global network, we have a capacity bound on each node. This is justified or motivated by the idea that the local network corresponds rather
03:23
to a fixed physical infrastructure such as a cable network, where we really have a dedicated link between, some real link between any two nodes, and the global network is rather seen as an overlay network that's formed on top of some shared infrastructure.
03:42
So for example, the data center network that I discussed briefly, the cable network, which corresponds rather to the local network, it's a fixed network where you really have the capacity of a cable, whereas the laser connections are connections established in
04:01
some global network, where you can, in principle, talk to any other guy, but you can only send a few messages in total. You can't speak to all the other guys in parallel. For car-to-car, our talk connections would be local network and cellular infrastructure would be the global network, and if you think of mobile phones, for example, that
04:23
combine device-to-device communication with cellular infrastructure, this would also correspond to local and global networks. So we look at shortest path problems in this model, and we already know some results for shortest paths for APSP diameter and SSSP in particular, those are the problems that
04:41
have been studied before in this model, and we have some bounds, all of them are polynomial as you can see in runtime, by the way, the tilde hides polynomial factors in the runtime. And it's interesting that this lower bound for APSP even holds for trees by some just
05:01
simple communication argument. You need to receive information from all the other nodes, so that's even in a hybrid network that even allows the nodes to communicate even more over the local network, this is still not doable in faster-than-scale-with-n.
05:25
So our focus in this work is to look at certain sparse graphs and see whether we can come up with efficient algorithms for diameter and single-source shortest paths, and we particularly look at so-called cactus trees, which are trees that look like cacti, so they have
05:43
the property that each edge is contained only in one cycle, and then we also look at what we just call sparse graphs here, which are graphs that have at most n plus o of n to the one-third many edges, and arboricity o of log n, where the arboricity is some graph property that is defined as the minimum number of forests into which
06:03
the edges of a graph can be partitioned, and for example, for planar graphs it's a constant. And we come up with exact algorithms for diameter and SSP that are as efficient as it gets, so log n is an obvious lower bound for these problems, and for sparse graphs we come
06:22
up with log n squared algorithms, so very efficient, but these are only constant approximations, so not exact solutions for sparse graphs. All right, let's start with something very simple, and I will then make this a bit more complicated until we finally argue about cactus trees.
06:43
So, for line graphs, you can solve SSP and diameter by simply using pointer jumping, which is the standard technique from PRAM literature, where the idea is to kind of, yeah, let's, in our distributed setting, let nodes introduce each, yeah, its neighbors to each other until
07:02
you have formed shortcuts that span the complete graph, basically, so when you let the nodes introduce the neighbors to each other and continue this process, you establish shortcuts in exponentially increasing length, so after log n rounds, you have distance
07:20
preserving shortcuts that you can use to very easily solve SSP in the global network, and by the way, this still, so the degree of this will be of log n, so we can really run algorithms then in the global network, and we don't really need any local communication at this point, actually. So for SSP, we can just perform a broadcast for diameter.
07:43
We can simply perform two broadcasts, two SSPs, so that's very simple. Cycle graphs, it's also, you can also reduce that to lines, basically, so the idea is if we want to solve SSSP, we divide the cycle at S and obtain two line graphs,
08:02
L1 and L2, and then we just perform SSP on both line graphs because, well, the shortest path to any node must either go over the left or over the right side of the cycle, so that will do the trick if we just take the minimum of the two computer distances. So diameter, I don't want to talk too much about how we do this.
08:21
This is a bit more complicated, actually. What we compute is for each node, first of all, we compute the farthest edge of the node, for example, for some source S, by first agreeing on some notion of left and right of the cycle by performing SSSP, and then we want to find the farthest
08:41
node, V, which is, if you go along the left traversal of the cycle, the farthest node until the distance exceeds W over two, which would mean that the shorter path would go over the other direction, so that's kind of the farthest node
09:03
in one direction, and the incident edge of that node is the farthest edge. So, but a different, a removal of this edge would not destroy shortest paths to S. So these are the nodes that we want to compute for all of the nodes of the cycle,
09:24
and as a byproduct, we also compute the eccentricity, which is, yeah, the length of the longest shortest path of that, that begins at that specific node. So it's kind of how far you can go from that node. All right. And I, yeah, maybe, maybe just very briefly, the idea is basically to assign
09:44
certain potentials to the nodes and then sort the nodes from, from where we can compute these farthest nodes of each node, basically sorting. All right. How do we do this in trees? The idea is also kind of known from literature, which is called the so-called
10:02
Euler 2 technique, where the idea is to reduce basically the, to look at the depth first reversal of the tree and look at the line of nodes that results from traversing the tree in that manner. So yeah, this is a depth first reversal and the nodes can construct a line that
10:24
corresponds to this traversal by introducing, or by, by basically employing the following rule, when I'm reached from my i-th largest neighbor, I forward, or I proceed the traversal at my i-th largest neighbor. Now this is some, some local, some simple local rule. And then for each neighbor, every node introduces one virtual node and
10:44
these virtual nodes are then connected according to this local rule. And this obviously gives a line of virtual nodes in which every node kind of is contained with all of its virtual nodes. The problem is nodes have a high degree. So whereas we can, in fact, yeah, we can connect the nodes, the virtual
11:05
nodes accordingly using the local network, we can't use it in the global network, but we really want to use it in the global network to make efficient use of it. Yeah. So what do we do? We reassign the virtual nodes. To do so, we compute an orientation, which is an assignment of directions
11:20
to the edges, such that every node has only at most three outgoing edges. Um, we can compute this orientation using the so-called Nash-Williams forest decomposition technique, which gives us such an orientation because a tree has arboricity one. Uh, and, and then we have the following rule. We look at, um, the first clockwise was the first edge from each
11:43
virtual node reached in a clockwise order around the, uh, the corresponding node. And, um, we ask ourselves, is the node, is the edge directed towards me? So if it is, then the virtual node is simply passed to the corresponding
12:00
neighbor, which leads to the following redistribution and this can easily be seen. Um, the outcome is that each node simulates at most six virtual nodes. No. So now we have an assignment basically of this virtual construction to actual nodes that, uh, allow them to communicate the necessary communication in the global network.
12:22
So for SSB, this is basically also known from PRAM literature. Um, if we reroute the tree, which we can do using our, this sort of two technique, uh, and assign weights to the edges that are minus or plus the original way, depending on whether the, whether the traversal goes up or down in
12:43
the tree, um, we can achieve that every virtual node, the distance of every virtual node equals the node distance, the nodes distance in G. So thereby we can compute SSSP for diameter. Um, we use a different well-known trick, uh, just performing SSP once
13:01
from some arbitrary node, then take the node that has maximum distance to this one and performing SSSP from that node. And then, um, the interesting thing is that the eccentricity of this node must be the diameter. This is only true in trees, but well, that's, uh, that's fine with us. Okay. So sort of trees, that's kind of the next level.
13:22
Um, this is a tree with an additional edge that contains one cycle. And so we have one cycle, uh, attached some trees to this cycle. That's a sort of tree. Um, if we just throw the Euler to the solid to a technique on top of this, um, we will yield, we will get two cycles in this case.
13:42
Um, blue one and an orange one, and every cycle node, every node that's contained in the cycle will also be contained in both cycles in both of these, uh, these virtual cycles. So for SSSP, we can, um, use this partition of the graph to just solve it like in three steps of first in the tree in which S is contained, then
14:05
in the cycle, and then finally in the other attached trees. So that's very easy. Um, and I'm neglecting lots of small technicalities here. So of course we again need to use this redistribution framework and so on that I just introduced to you.
14:22
Diameter is a bit harder. Um, for diameter, every node needs to contribute two candidates to, uh, the diameter. The first candidate is the diameter of its tree TV, which is a tree that is attached to the cycle node here is for example, um, and the second candidate
14:40
is the eccentricity of the node V, uh, which is the length of the shortest path in some direction here, plus the height of its tree, which is here, this value, um, but this, um, value can only be contributed if the eccentricity of V is larger than its height because, because otherwise,
15:01
uh, it may be that, um, both values correspond to the same shortest path. So this candidate would actually heals. Well, the, the, the, the length of a path that's not actually a shortest path in total. Okay. And what we can show is that, uh, the maximum of all these candidates will give
15:23
us the, um, the actual diameter of the pseudo tree. So the only thing that remains is how to compute these values. So we know how to compute this because we've just argued how this can be done. Uh, we know to compute this, but this part, part, um, how to compute it.
15:40
So first of all, we define the value MLV and analogously a value MRV, uh, which is just where the height of, um, some node, some cycle, not you minus the distance along a left traversal of the cycle from V to you. So for example, um, if this is the note that maximizes the expression,
16:02
the, the, yeah, it would be the height of this notch minus the distance from V to you along the left traverse where left is this example, just this direction. Yeah. All right. And MRV again, and obviously, so how do these values help us now? When we know, when we compute the farthest edge from the node V, uh,
16:24
defined as I introduced to you using the algorithm that I introduced to you. Then, um, that L DL be the distance along the left traversal to X and DR be the distance along the right traversal to Y then the eccentricity of V is given as the maximum of DL plus MRX and DR plus MLY so intuitively, why is this true?
16:46
Um, for example, that you'll be the one that maximizes the right expression here, um, then the value, this, this right value here will give us exactly this distance, this distance minus this distance plus this distance.
17:02
So actually this one, so it corresponds to an actual length of, of the shortest path, um, on the right side and on the left side, we do the same. And the maximum message turns out is actually the eccentricity. Nice. So how can this be generalized to cactus trees? Now, so the nice thing that we had for pseudo trees, we were able to easily
17:23
kind of, um, compute the, or partition this graph into the cycle notes and tree notes. So we want to do roughly the same here, but it's more complex. So, um, actually want to find the biconator components that are also called blocks. Um, that's a bit hard, uh, to do that efficiently.
17:42
We perform the algorithm of Gotthardt et al, which is a very new approach. Um, and we have to adapt it a little bit and exploit the, uh, low arboricity of a cactus tree to do this and lock in time in our model, which is not trivial. Um, I want to talk about this at this point, let's assume that we have this.
18:01
So green ones are some biconator components and all the, the brown ones are, yeah, tree, tree edges or, uh, uh, cut edges actually. So now we want to find the anchor notes of each cycle. Um, and I can notice just the note that's closest to the source note S.
18:23
And when we, we compute these by, um, yeah, replacing each cycle by a binary G in a global network, basically, and then use our trees, uh, algorithms on, on this, uh, and this gives us the anchor notes and having these anchor notes, we can also compute the farthest edge within each cycle and remove it.
18:44
Which gives the shortest path three. It actually, this one is the shortest path three from S. Um, so now we can just throw our standard, our tree algorithms from before on top of this. So diameter again is, uh, it's harder. Um, and we want to compute different candidates again to do so.
19:04
We first route the shortest path three that we just constructed towards S and then we compute the height of each note as a first candidate, clearly the height of a note, um, corresponds to the length of some shortest path. So for Y, this is maybe this one for X, this is this one.
19:21
And, um, yeah, this is our first candidate of each note. So the second candidate of each note is, um, when we look at the heights of the adjacent subtrees of every note that lie in different blocks. So for example, let V have two neighbors, these two neighbors Y and X. And obviously at this example, they lie in different blocks.
19:40
Then V takes the heights of these two, um, and the, the weights of the edges towards these two takes the sum of this. And, um, well then for, for all neighbors that lie different blocks, it takes the maximum of all these pairs, uh, and takes this as a candidate. Now, so this will, um, be the length of a shortest path that looks like this.
20:01
And the reason for why the two adjacent subtrees must be in different blocks is because then there can't be a different path that shortcuts the shortest path. So actually these, these lengths together must be, must correspond to length of an actual shortest path again. So that's a nice candidate, but still there might be paths that are not covered by such candidates.
20:21
This one, for example, is not covered by any candidates so far. So to cover these paths as well, we, um, look at each cycle and we transform, or we construct for each cycle, a virtual pseudo tree where we attach virtual edges to each note that corresponds to the height
20:40
of the note in the corresponding or the adjacent subtrees, roughly speaking. So that the length of this path here will be exactly the length of the, of the patterns that before exactly this, this complete length here, and all of this can be done in parallel in each, um, virtual pseudo tree, so, um, we
21:06
just compute the diameter of each pseudo tree using our pseudo tree algorithm from before and contribute this as an additional candidate. And as it turns out, um, the sum of all these candidates then actually,
21:22
uh, or not the maximum of all these candidates gives you the diameter of the cactus tree. All right. So, um, lastly, let's briefly talk about graphs with N plus O of N to the one third edges. Um, so for example, this could be such a, such a graph and now such a graph may also have nested cycles, uh, different cactus graphs where
21:43
nested cycles would not be allowed. Um, here we can have lots of nested cycles. So first of all, we compute an MSTM, a spanning tree would actually also suffice, uh, this can be done in log N square time deterministically in our model. So, um, we define these orange edges, which are the ones that are not
22:03
contained in BMST as non-tree edges and nodes that are adjacent to such edges, shortcut nodes. Okay. So now we want to compute two different values. First of all, different, um, information for each node. First of all, every node U wants to compute its distance U VU to
22:21
its closest shortcut node VU. This one, for example, wants to find the distance to its closest shortcut node, which might be this one, or it might also be this one. Uh, and secondly, for every pair of shortcut nodes, UV, we want to compute the distance between the two nodes, uh, in G so for example, this is a shortcut
22:41
node, this is a shortcut node. We want to compute the pairwise distance in the original graph, not in the MST. All right. Um, and then a three approximation for this, uh, for, for the distance between any two nodes, S and T is given as the minimum of the distance between S and T and the MSTM and the distance from S to its lowest shortcut node.
23:04
VS distance from VS to VT, where VT is the closest shortcut node of T and then D VT and T. So it's not too hard to see that this gives us three approximation. Um, it's a pretty simple argument of this, the actual shortest path.
23:21
And this is our shortest path. Um, according to this, to the second term here in the approximation, then, um, first of all, the, this is upper bounded by, by the length of, of this, uh, because triangle inequality. And if you then study, uh, divide this red path and color it appropriately, you can see that each color, the, um, the, the part of each color is at most
23:47
of the length of the orange, um, the orange path because VS and VT are closest to S and T respectively. So, um, yeah, very high level now how we can compute these, uh, this information.
24:03
For one, we construct a so-called decomposition tree, which is a graph again, that is built in the global network and that preserves the distances. Uh, in the spanning tree M and that has additionally hop diameter of lock-in, which allows us to perform massive parallel computation.
24:22
Um, first of all, we use it to perform, um, simply Bellman-Ford and the global network from every shortcut node to let each other node learn it's close shortcut node, and then for a property number two, we also heavily use it. Um, first of all, we assign random representatives to each pair of shortcut nodes, um, who are tasked with computing the distance between
24:43
these nodes or actually computing first the distance in M and then also the weight of the corresponding edge, a non-tree edge between these two nodes. If it exists, uh, and to compute these things, we make use of techniques from the node-capacitated pick, uh, to distribute all the requests and all
25:03
the responses, uh, in the global network to gain or to, to retrieve all these values and once we've retrieved them, we kind of have filled a matrix, uh, of, of size N to the two-third. And, um, we perform metrics multiplication in this metrics of
25:21
log N times, which, um, lets us essentially cover all shortest paths. So I, unfortunately, I don't have any time to go into detail here. Um, if there are any questions, then I'm happy to answer them somehow. So with that, I thank you for your attention.
25:41
I hope to see you virtually in person at the conference. And, um, yeah, stay safe. See you.