Temporal walk based centrality metrics for graph streams
Formal Metadata
Title: Temporal walk based centrality metrics for graph streams
Number of Parts: 17
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifier: 10.5446/46365 (DOI)
Transcript: English (auto-generated)
00:17
Thank you for inviting me here.
00:21
I will talk about temporal centrality metrics, which is joint work with András Benczúr from the Hungarian Academy of Sciences. The topic of this talk, I guess, is strongly connected to the previous one.
00:41
First, I'll explain the data set and the setting that we are interested in. We are interested in temporal networks, where the network changes over time. In data sets, this can be captured in different ways.
01:01
What we are looking at is a time series of edges, where each link has a timestamp. Basically, your data is a long time series of events marking when an edge appears. An edge can be, for example, a mention
01:23
in the Twitter mention network. A typical example of this kind of data set is how people mention each other on Twitter around a certain topic or event.
01:44
Our objective is to define over this a network centrality metric which is temporal, reflects the changes in the data set, and is also online updatable. I will explain what online updatable means a little later,
02:03
but the basic idea is that we want a centrality metric that can always be updated on the fly as we read the time series of edges coming in as a stream. The concept of our temporal centrality metric
02:24
will strongly rely on time-respecting paths, which are getting more and more attention in current publications. The concept of a time-respecting path, as you, I guess,
02:42
have already figured out, is super simple. It is nothing but adjacent edges following each other in an edge stream, where the edges are not just adjacent but also ordered in time. Here you can see three edges with three different timestamps
03:00
and the timestamps are, of course, increasing. A time-respecting path may have an actual meaning in some data sets. For example, a time-respecting path on Twitter can represent an information flow inside the social network.
03:23
Another example can be a flow of funds or goods in the economy: when you investigate transaction graphs, adjacent transactions that appear one after the other may represent the flow of funds in the economy.
03:43
So our objective is to develop an online centrality metric which relies on these time-respecting paths. We call this metric temporal Katz centrality. The name Katz may be surprising here.
04:04
I will explain the relation of this centrality metric to Katz centrality later. For now it is just a name, but the definition you can see here is pretty simple. We will formulate it in the upcoming slides, but the idea is simply to take
04:22
the weighted sum of all time-respecting walks that end in a single node u at time t. This will be the definition of temporal Katz centrality. Let me illustrate this sum on a very small toy example.
04:42
You have seen this figure before. Say that you have an edge stream that consists of these five edges: (a,b), (b,c), (c,u), (d,e) and (e,u). At the end, you want to compute
05:01
the weighted sum of time-respecting walks that end at node u. These edges have five different timestamps: the first edge is the earliest and the fifth is the latest. In this very small example, the possible temporal paths in this network
05:21
will be the following. First, from a, you can get to b and then from b, you can get to c and then from c, you can get to u, right? These are the first three edges that are actually adjacent. And besides these three, there are also two other smaller paths
05:44
along this long path: one starts from b and ends in u, and the other starts from c and ends in u. The final two edges again give two paths: one starting from d and one starting from e.
06:01
So temporal Katz centrality is nothing but the sum of these time-respecting paths ending in a specific node at time t. But when I talk about the sum, I mean a weighted sum, and this is what I will explain next.
06:25
Formally, what we did on the previous slide with those colorful nodes is captured by the equation on this slide, which states that the temporal Katz centrality
06:42
of a node u at time t is the sum over all other nodes v and, for each such node, the sum over all the temporal paths that start from that v and end at node u,
07:03
where each path z gets a weight. The definition of the weight can be arbitrary. The way we define it is as a product
07:20
of per-edge-pair weights along the path, and you can see a figure illustrating this definition at the bottom of the slide. Say I have a temporal path with these timestamps; then the final weight of this specific path
07:45
in temporal Katz centrality will be the product of these three terms, and each of them depends on the delay between adjacent edges. First we have the delay
08:02
between the second and the first edge, then the delay between the third and the second edge, and so on. Based on these delays, we assign a weight to every adjacent edge pair, and the product of these weights defines
08:22
the weight of the final path configuration. In temporal Katz centrality, for a given timestamp t and a given node u, we sum these weights over all the time-respecting walks that end in the single node u.
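Written out compactly (the notation is mine, not necessarily that of the slides; I also adopt the convention that the current time t acts as an extra final timestamp, so that walk weights keep decaying toward the present, which matches the online update rule later in the talk):

$$
r_u(t) \;=\; \sum_{v}\ \sum_{z:\ v \leadsto u,\ t_1 < \dots < t_k \le t} w(z,t),
\qquad
w(z,t) \;=\; \prod_{i=1}^{k} \varphi\bigl(t_{i+1} - t_i\bigr), \quad t_{k+1} := t,
$$

where the inner sum runs over time-respecting walks from v to u whose edge timestamps are t_1 < ... < t_k, and phi is the weighting function applied to each delay.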
08:43
Now, about the weighting functions: on this slide I give two examples that we actually used in our experiments, and I think both are pretty natural first choices.
09:00
The first is a simple constant weighting and the second is an exponential decay. Consider the constant one first: here we do not care about the delay itself, we just penalize the number of edges the time-respecting walk is built from.
09:21
So in the end, the final weight of a single walk z of length k will be beta to the power k. Just like in Katz centrality,
09:42
we penalize the weight of the walks: the longer the walk, the less likely it is to happen. You can think of these weights as something like the probability of the walk. The longer a walk you see in your data set,
10:01
the less likely it was an actual flow of information or an actual flow of goods, as in the previous examples. The second weighting that we will use is an exponential decay, where besides penalizing the length with beta
10:23
as before, we also penalize the delays over the edges, so the delay appears inside an exponential function with some predefined constant.
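Written out for a single delay tau between adjacent edges (notation mine; c is the decay constant of the exponential):

$$
\varphi_{\mathrm{const}}(\tau) = \beta,
\qquad
\varphi_{\exp}(\tau) = \beta\, e^{-c\,\tau}.
$$

With the constant weighting, a walk of length k simply gets weight beta^k. In the earlier toy example, the five walks ending at u have lengths 3, 2, 1, 2 and 1, so the temporal Katz centrality of u would be beta^3 + 2 beta^2 + 2 beta.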
10:47
As we know, the exponential function has a nice property: if we take the product of the weights along a specific walk,
11:01
then in the end we get a very nice formula. Think again about the walk at the bottom of the slide and say that every weight defined for a pair of edges is an exponential function
11:22
of the delay between the two edges. If we take the product of these, then from this product we get one simple term which is nothing but the penalization from before,
11:42
the constant weight penalizing the length of the walk, and another term which penalizes the walk in terms of the time it took the information to flow from the beginning to the end
12:02
of this time-respecting walk. So, as a summary of my notation: temporal Katz centrality is nothing but the sum of all temporal paths that flow into a single specific node u
12:21
up to some point in time. This sum is always weighted, and we do the weighting by defining a product of weights over the adjacent edge pairs in the time-respecting walk. What we will care about the most
12:43
is the case where we apply exponential weighting over the walk. In that case, we have a particularly nice formula for the weight of a specific walk in our sum.
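Concretely, for a walk with edge timestamps t_1 < ... < t_k <= t (again with the convention t_{k+1} := t), the product of exponential weights telescopes:

$$
\prod_{i=1}^{k} \beta\, e^{-c\,(t_{i+1} - t_i)} \;=\; \beta^{k}\, e^{-c\,(t - t_1)},
$$

so one factor penalizes the length of the walk and the other the total elapsed time from its first edge to the current time.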
13:00
Okay, so I defined temporal Katz centrality, but it still remains unclear why I call this metric temporal Katz centrality. In the upcoming slides I will show the relation of our centrality metric
13:21
to the original static Katz centrality. As you may already be familiar with, Katz centrality is nothing but a weighted sum of all walks ending at a single node u,
13:44
where a walk of length k is weighted by beta to the power k, so the penalization is again based on the length of the walk.
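In formula form (standard definition; n_k(u) denotes the number of walks of length k ending at node u in the static graph):

$$
\mathrm{Katz}(u) \;=\; \sum_{k \ge 1} \beta^{k}\, n_k(u).
$$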
14:03
What I will show you is that, in a specific setting, our temporal Katz centrality converges to a formula very similar to static Katz centrality. We will make two assumptions in these calculations. I will assume that we have a fixed underlying graph
14:24
with an edge set of size E, and what we will do is simply sample uniformly over time from this edge set of the underlying static graph.
14:40
So there is no temporal structure in the data at all: the time series from which we get the edges is fully uniform, just a uniform sample of size t from the edge set.
15:00
Our question now is: in this setting, what will be the value of temporal Katz centrality at time t? We will compute this for both decay functions. Remember, I had two: the constant beta and the exponential decay.
15:20
First we will see what happens if we apply a constant weighting along the time-respecting paths, and then I will show what happens with temporal Katz centrality in this setting if we have an exponential decay over the time-respecting paths.
15:41
If we have a constant weighting, then it is easy to see what the expected number of times is that a specific path occurs in this uniform sample,
16:04
and for a specific path of length k this will be t choose k times E to the minus k, because you have a time series of length t.
16:21
You want to find a path of length k in this stream, so you select k positions from the time series, and for a specific configuration of a time-respecting path the probability is E to the minus k.
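Putting this counting argument into a formula (E is the number of edges of the underlying static graph, t the length of the stream):

$$
\mathbb{E}\bigl[\#\,\text{occurrences of a fixed length-}k\text{ temporal walk}\bigr] \;=\; \binom{t}{k}\, E^{-k},
$$

since we choose the k positions of its edges among the t stream elements, and each chosen position must carry one specific edge, which happens with probability 1/E.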
16:45
So the expected number of occurrences of a specific time-respecting walk in a uniform edge stream is t choose k times E to the minus k. Based on this,
17:06
we can compute the expected temporal Katz centrality for a time series of length t, because temporal Katz centrality of a specific node
17:26
is nothing but the sum over all possible walks ending in that node u, and for a single walk the expected number of times we see it is t choose k times E to the minus k,
17:42
and because of the constant weighting we also penalize a walk of length k with beta to the power k. So in the end, the expected value of temporal Katz centrality over this edge stream will be the term on the right side of the slide,
18:02
which is pretty similar to static Katz centrality. For the exponential decay the derivation is a bit harder, and here
18:21
I will only highlight a few tricks from the derivation. Let s_{t,k} denote the expected total weight of a given path of length k.
18:41
At this point we do not yet know the value of this expected total weight, but if we can somehow compute it in an edge stream for a specific walk, then the expected temporal Katz centrality is nothing but the sum over walks of different lengths,
19:10
each multiplied by this weight. To compute this weight over time, there is just one single trick
19:21
that we should notice, which holds specifically for the exponential decay. It is the following: if you have a walk of length k that starts at time t minus j
19:46
and ends at time t, then no matter how the internal edges of the path are placed between t minus j and t, the overall weight of the time-respecting path
20:02
is always the same: it is beta to the k times the exponential decay of j. Using this, the derivation of the expected value of temporal Katz centrality
20:20
with exponential decay becomes simple. I am not getting into the details, but what we finally get for the expected weight of a specific path, in an edge stream whose length tends to infinity, is the nice formula at the end of the slide.
20:43
In this formula, c is nothing but the decay constant of the exponential function. Now assume that this c is much smaller than the number of edges
21:02
from which we sample our time series, and write c as c prime divided by E. This is just notation, introducing a new constant instead of the original one, but with it we can easily
21:23
rewrite our previous formula. In the end, for the exponential decay, what we get is that the temporal Katz centrality of the nodes, in the case of a stationary edge stream whose length tends to infinity, is nothing
21:40
but the original static Katz centrality, where the constant that penalizes the length of the walks is slightly different: it is beta divided by c prime, where, remember, c prime is just the rescaled version of the decay constant c
22:03
of the exponential weighting (c prime equals c times E). Okay, so this was the theoretical part: temporal Katz centrality and how it is related to static Katz centrality.
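Summarizing the two results in formulas (my notation; n_k(u) is again the number of static walks of length k ending at u): for the constant weighting we obtained

$$
\mathbb{E}\bigl[r_u(t)\bigr] \;=\; \sum_{k \ge 1} n_k(u)\, \beta^{k} \binom{t}{k} E^{-k},
$$

while for the exponential decay, as the length of the stream tends to infinity,

$$
\mathbb{E}\bigl[r_u(t)\bigr] \;\longrightarrow\; \sum_{k \ge 1} n_k(u) \left(\frac{\beta}{c'}\right)^{k},
$$

which is static Katz centrality with parameter beta / c'.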
22:21
What I want to talk about now is how we can actually compute temporal Katz centrality from a graph stream, and show you that the computation is very easy: over an edge stream it is based on a simple online update formula.
22:44
To understand this, you just have to understand a very simple fact about time-respecting walks, which is the following. Say that a new edge appears in our stream, pointing from node v to node u.
23:04
The question is how the number of time-respecting walks ending at node u changes. The change is very simple: first, a new edge appears, so of course a new walk of length one
23:22
appears that ends at u, namely the edge itself. Besides that, all the time-respecting walks that were ending in v continue to node u, so we propagate all these time-respecting walks
23:41
that originally ended in v over to node u. Hence, if a new edge appears in our edge stream, then the total increase in centrality of the single node u is given by the following formula,
24:04
where the constant one term corresponds to the newly appeared edge and r corresponds to the temporal centrality of node v at that specific time
24:22
when this edge appeared. So all the time-respecting walks that were ending in v at the time this edge appeared continue to node u. Then, as time flows and the current time becomes larger than t_vu,
24:42
this contribution is also decayed with our decay function. Based on that, we can set up a recursive formula for temporal Katz centrality over this temporal network,
25:06
and it is the following. For a single node u at time t, we do nothing but sum over all of its edges that appeared before time t, and for every such edge, we take the centrality
25:24
of the node that u is connected to, evaluated at the time when that edge appeared, and we compute a weighted sum over these time-respecting walks. What you should observe here,
25:42
and what is interesting, is that this centrality, the centrality of node v at the time the edge was created, does not depend on the current time. It is a fixed weight value,
26:02
always equal to the centrality at the moment that edge appeared in the past. It stays fixed: as time flows, it does not change in the definition. We will heavily rely on this in the actual computation of temporal Katz centrality.
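In formula form, this recursion reads roughly as follows (my notation; t_{vu} is the time the edge (v,u) appeared):

$$
r_u(t) \;=\; \sum_{(v,u,t_{vu}):\ t_{vu} \le t} \bigl(1 + r_v(t_{vu})\bigr)\, \varphi\bigl(t - t_{vu}\bigr),
$$

where the stored, never-changing quantity for each edge is 1 + r_v(t_{vu}), and only the decay factor phi(t - t_{vu}) depends on the current time.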
26:21
So say that you have an edge stream over time; here I explain how we can finally compute temporal Katz centrality over that time series. We initialize the centrality of every node u
26:41
to zero, and then we maintain three quantities: the current centrality of each node, the weights defined over the edges, and the timestamps at which the edges appeared.
27:03
Whenever a new edge (v,u) appears, we first compute the current centrality score of node v at that given time and then propagate this centrality
27:23
to node u, as I explained before. After computing the current centrality score of node v, we save the weight I was describing before, the one that does not change over time,
27:41
on this edge. So we define a weight on every appearing edge, and this weight is nothing but one plus the centrality score of node v at the moment that specific edge appeared.
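A minimal Python sketch of this bookkeeping, under the convention above (the function and variable names are mine, not taken from the paper):

```python
from collections import defaultdict

def temporal_katz_stream(edge_stream, phi):
    """Online computation sketched above.

    edge_stream: iterable of (v, u, t) triples ordered by time.
    phi: delay-weighting function, e.g. lambda tau: beta * exp(-c * tau).
    """
    in_edges = defaultdict(list)            # node -> list of (fixed weight, timestamp)

    def centrality(node, now):
        # Recursion from the slides: stored edge weights, each decayed by the
        # time elapsed since that edge appeared.
        return sum(w * phi(now - s) for (w, s) in in_edges[node])

    for v, u, t in edge_stream:
        r_v = centrality(v, t)              # centrality of v when the edge appears
        in_edges[u].append((1.0 + r_v, t))  # fixed weight stored on the new edge

    return centrality                       # query r_u(t') as centrality(u, t')
```

Note that querying a node still requires a summation over its in-edges; the exponential decay removes exactly this step, as the simplification sketched a bit further below shows.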
28:07
When you have an exponential decay for weighting, then the formulas will further simplify because of the property of the exponential function.
28:24
The updates become much simpler. The takeaway message of this slide is that in the previous, general case we had to compute the centrality score of a node as a summation over its in-edges,
28:44
and this step is not necessary: it can be done immediately, without any summation, if you have an exponential decay function. Here the update is nothing but the following.
29:06
For every node, we also store when it was last updated, and the centrality score of node v at time t is nothing but the score at the last time we observed it,
29:22
multiplied by an exponential decay. If you accept this, then the update formula for the centrality score of u when a new edge arrives is the following: the left term updates all the old walks
29:41
that were ending in u, decayed in the same fashion as for v, and to this we add the contribution of the new incoming edge from the direction of node v to node u.
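The exponential-decay case in a short Python sketch (class and method names are mine; this is my reading of the update, not the paper's reference code):

```python
import math

class TemporalKatzExp:
    """Simplified online update for phi(tau) = beta * exp(-c * tau)."""

    def __init__(self, beta, c):
        self.beta, self.c = beta, c
        self.score = {}   # node -> centrality at its last update time
        self.last = {}    # node -> last update time

    def _decayed(self, node, now):
        # Stored score, decayed by the time elapsed since we last touched the node.
        dt = now - self.last.get(node, now)
        return self.score.get(node, 0.0) * math.exp(-self.c * dt)

    def add_edge(self, v, u, t):
        r_v = self._decayed(v, t)                                      # centrality of v right now
        self.score[v], self.last[v] = r_v, t                           # refresh v lazily
        self.score[u] = self._decayed(u, t) + self.beta * (1.0 + r_v)  # old walks + new edge
        self.last[u] = t

    def centrality(self, u, t):
        return self._decayed(u, t)
```

Each arriving edge is processed in constant time, which is what makes the metric online updatable over a long edge stream.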
30:01
Before explaining our experiments, I just want to mention a single related result, which is, I think, the only centrality metric defined so far over time-respecting walks in the current literature: temporal PageRank, by Polina Rozenshtein and Aristides Gionis. Actually, their work is pretty close
30:24
to the formulas and definitions I explained before, but in their case, for temporal PageRank, the weighting function is somewhat different from ours. Remember, for us the weighting function
30:41
was either a constant term or an exponential decay over time. In the case of temporal PageRank, the penalization of a walk is the following. Say that these two edges appear one after the other over time.
31:00
For an edge pair like this, in temporal PageRank the penalization is based on the number of other edges starting from b that appeared after edge (a,b) but before edge (b,c).
31:21
Thinking about the flow of information, or the flow of anything, the intuition is that after edge (a,b) appeared, the flow may have already continued over those intermediate edges that appeared before edge (b,c).
31:45
Besides defining temporal Katz centrality, we of course try to evaluate its performance, whether and how we can use it over edge time series. In general, centrality metrics are very hard to evaluate,
32:04
because you can rarely conduct supervised experiments to evaluate them. For temporal networks, the evaluation is even more challenging, because if you have a temporal network and over it a definition
32:22
of a temporal centrality that evolves over time, then the best would be to have data sets where you can conduct a supervised evaluation with labels that change and evolve over time. Such data sets are very rare and hard to find.
32:43
We crawled our own: a Twitter mention data set collected over various tennis tournaments. What we did is crawl Twitter to see how people mention each other during a sports event.
33:00
This defined a temporal network for us, and besides that, we defined temporally changing labels over this network: for every day of the tournament, we checked which players that have actual Twitter accounts
33:21
are playing on that given day. These were defined as the relevant samples in our experiments. I just want to show you a single result; it is pretty confusing at first,
33:41
but let me explain what is going on here. Averaged over the days, what we try to understand is the performance in terms of predicting the players that are actually playing on that specific day of the tournament,
34:01
that is, how well these different centrality metrics can predict the players over the hours of a given day. Of course, in the early hours the Twitter mention network is inactive, so the methods can hardly predict
34:21
what is going to happen today and which players will actually play. But over time, people start to mention the players that are actually playing on that day, and the different metrics (and of course Katz centrality is the best at this)
34:41
can capture this, getting better and better over time at predicting which players are actually participating in the tournament. On the y-axis you can see a somewhat confusing metric, NDCG@50. I am not going to explain it now;
35:00
just take it as a ranking-based metric where higher is better: high values mean that the centrality metric can predict the players actually playing on that specific day, and low values mean that it cannot.
35:22
As a summary of temporal Katz centrality: this is a centrality metric that we define over edge time series. The basic definition is that it is the weighted sum of time-respecting walks ending in a specific node, where the weighting can be arbitrary.
35:41
We defined two different weightings: a simple constant penalization based on the length of the walk, and an exponential decay which penalizes the time that elapses from the beginning of the walk to its end.
36:01
We also proved theoretically that, with these weighting functions, if we assume that our time series is a simple uniform sample from a fixed graph, then the temporal Katz centrality definition converges to something very similar
36:21
to the original static Katz centrality. Besides that, I also tried to highlight that if you want to compute the temporal Katz centrality values of the nodes over time, the computation is very easy and we can define an online updating algorithm
36:43
for temporal Katz centrality. In my final slides, I want to talk about work-in-progress results and open questions, so we would be very happy to get feedback
37:04
on these issues and problems that we are thinking about. The first one is, I think, quite natural. For temporal Katz centrality, I said that the metric is the sum of time-respecting walks
37:23
ending in a single node, which is a very reasonable definition, but think about the following case. What happens in a Twitter mention network if first a user mentions CNN,
37:42
because she has seen something interesting on the news portal, and then someone else in her social network, in the social network of user u1, sees this mention and then mentions u1?
38:01
So the order of events in the graph is that the first mention happened in the past, and the mention between the two users happened in the present. In that case, our temporal Katz centrality definition
38:21
will give the highest centrality to the node mentioned in the present, because the propagation of centrality always follows the edges ordered in time. But in this case it would be more reasonable to give a higher centrality to CNN.
38:46
The question is how we can do that. A reasonable definition of this kind of centrality would be to say that the centrality of a given node is nothing but the sum of all the time-respecting paths
39:02
starting from that node and ending in the present. The problem with this definition is that you cannot define an online update formula for it, so you cannot compute it as efficiently as temporal Katz centrality.
39:26
We have one idea for how to overcome this, but before that, let me explain again why it is hard to compute
39:42
all the time-respecting paths that start from a single node. Assume that in the present an edge appears in your edge stream, pointing from b to u. If you would like to count, for every node, the number of time-respecting paths starting there up to time t,
40:02
then, of course, the number of walks starting from b has to be updated when this edge appears. Similarly, for all other nodes that lie along time-respecting paths that end in this edge,
40:23
the number of walks starting from each of those nodes should be updated whenever such an edge appears. This would be really costly if we wanted to do an exact computation.
40:45
Here I am just going to sketch an idea for how it might be possible to overcome this challenge. Remember that for temporal Katz centrality, what we were doing
41:02
to compute the current time-respecting walks ending in a single node was to store weights on all of these edges, and these weights were fixed over time.
41:20
The online computation was: whenever a new edge appears, we sum these weights and write a value on the new edge, which is then never touched again and does not change over time. These weights were essentially the weighted number of time-respecting paths flowing from the direction of b to node u.
41:46
This is a structure that we can easily update online. I am just going to state a single claim here; I leave the derivation to you.
42:02
If you can maintain this weighted graph online, then over it you can define a random walk procedure, which is the following. You start from a single node at time t.
42:22
First you flip a coin, and with a certain probability you stop; otherwise you continue the random walk over these weighted edges. In the end,
42:40
the walk resulting from this process is a time-respecting walk, sampled from all time-respecting walks proportionally to its weight. Again, the random walk that we define over these weights
43:04
is the following: at the current node, we stop with a probability related to the centrality of that node, and otherwise we take a step.
43:21
If we continue, the next edge is selected proportionally to the weights stored on these edges. So the random walk is: first there is a probability of stopping, and then the selection among the neighbors is proportional to the edge weights.
43:42
If you keep doing this, at the end you stop somewhere and you obtain a walk over this weighted graph. This walk is a time-respecting walk, and with this random walk approach it is sampled proportionally to its actual weight
44:01
as a time-respecting walk. Anyway, if we can do this, and we can sample time-respecting walks in such a way that the sampling probability is proportional to their weights,
44:23
then based on that we would be able to approximate the number, or rather the weighted sum, of the walks that start at some node. I am not sure if I have time for this.
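One possible reading of this work-in-progress sampling idea as a Python sketch (the helper names and the exact use of decayed weights are my choices; the stop probability 1/(1 + centrality) is taken from the discussion at the end of the talk):

```python
import random

def sample_time_respecting_walk(u, t, in_edges, phi):
    """Walk backwards from node u at time t over the stored in-edge records.

    in_edges[x]: list of (source, fixed_weight, timestamp) kept by the
    streaming algorithm; phi: the decay function. At each node we either stop,
    with probability 1 / (1 + total incoming decayed weight), or step further
    into the past along an in-edge chosen proportionally to its decayed weight.
    Reversed, the visited nodes form a time-respecting walk ending at u.
    """
    walk, node, now = [u], u, t
    while True:
        records = [(src, w * phi(now - s), s)
                   for (src, w, s) in in_edges.get(node, []) if s <= now]
        total = sum(w for (_, w, _) in records)
        if not records or random.random() < 1.0 / (1.0 + total):
            break                                            # stop / absorb here
        src, _, s = random.choices(records,
                                   weights=[w for (_, w, _) in records])[0]
        walk.append(src)
        node, now = src, s                                   # continue further into the past
    walk.reverse()
    return walk
```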
44:44
Okay. Finally, I want to talk about something related to these kinds of computations, in the sense that we again use time-respecting walks, but this final part of my presentation has nothing to do with centrality metrics.
45:01
It is about network node embeddings, which is currently a very popular topic in computer science. Node embeddings are popular now, but they actually existed long before 2017 and 2018.
45:21
Here the objective is to find a low-dimensional representation of every node in the network in such a way that these vectors capture the similarity of the nodes inside the graph structure.
45:42
These kinds of embeddings existed before. For example, for bipartite graphs in recommender systems research, all the matrix factorization methods actually compute such embeddings and try to capture the similarity of the nodes in a bipartite graph. Somewhat similarly,
46:03
for general graphs there are graph factorization methods which do basically the same. In the current literature, people are trying to improve these methods, and a large portion of the currently proposed methods
46:25
that improve the quality of node embeddings over static graphs are random walk based methods, like node2vec. Let me just highlight
46:41
the core idea of random walk based methods. To do that, first I want to explain a simple graph factorization based method. If we want to compute embeddings, as in recommender systems and matrix factorization, we generally have a loss function
47:03
that we intend to optimize, and during this optimization process we obtain the embeddings we want. For simple graph factorization, the loss function is nothing but a square loss defined over all the edges that appear in the network,
47:22
where we have embedding vectors p_u and p_v for the two endpoints of every edge in the network, and for all edges we want these vectors to be close to each other.
47:41
If these vectors are close to each other, they point in a similar direction, which means their scalar product is high. This is a loss function that we typically minimize with, for example, stochastic gradient descent, if we want to learn the model parameters, namely the vectors p_u and p_v, over a static graph.
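A minimal sketch of such a square-loss factorization trained by stochastic gradient descent (the target value 1, the regularization and the learning rate are illustrative choices of mine, not the setup of any particular cited method):

```python
import numpy as np

def graph_factorization(edges, num_nodes, dim=16, lr=0.05, epochs=50, reg=0.01, seed=0):
    """Square-loss graph factorization: push the dot product of connected
    nodes' embeddings toward a target value (here 1, an illustrative choice)."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((num_nodes, dim))   # one embedding vector per node
    for _ in range(epochs):
        for u, v in edges:
            err = 1.0 - P[u] @ P[v]                   # distance of the dot product from the target
            grad_u = -2.0 * err * P[v] + 2.0 * reg * P[u]
            grad_v = -2.0 * err * P[u] + 2.0 * reg * P[v]
            P[u] -= lr * grad_u                       # SGD step on both endpoints
            P[v] -= lr * grad_v
    return P
```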
48:04
Then there are the random walk based methods which appeared in the last one or two years. Here the loss function people generally use is a log-softmax objective, which is somewhat different
48:23
from the mean squared error, but do not worry about that; it is not the important part of these methods. The interesting part is that in the previous case, for graph factorization, the loss was defined over adjacent nodes.
48:45
In contrast, in current papers the random walk based methods define some kind of random walk strategy over the static network.
49:00
These generate node pairs over the static network, and the loss function is defined on these pairs, which are the beginnings and ends of random walks sampled from the original network. It is like a second or third order similarity:
49:22
two nodes are not only similar if they are connected in the network, but also if random walks frequently connect them. Here I just want to sketch our future research direction,
49:42
and we are really open to any kind of comments and critique. Based on our experience with time-respecting random walks, our ongoing work is to define an edge-stream-based, online updatable model which builds on time-respecting walks and can learn temporally evolving embeddings
50:02
over edge streams for temporally evolving networks. And this is all I wanted to talk about. First, I explained temporal Katz centrality, which we developed and which is defined generally over edge time series.
50:23
It is the weighted sum of time-respecting walks ending in a specific node. It is related to static Katz centrality in case the edge stream is a uniform sample from a static graph. Besides that, for an arbitrary edge stream,
50:40
I gave you online update formulas for temporal Katz centrality. We also have two ongoing works that are highly related to this. One is how to generalize temporal centrality metrics to consider not only walks that end in a single node
51:00
but also walks that start from a single node. And finally, we intend to develop temporally changing network embeddings which are also based on time-respecting walks. Thank you so much for your attention. Thank you very much. Questions?
51:21
It was interesting; I do not know if I understood it well, but when we were discussing the ongoing work on reversing the direction of this centrality measure, essentially you were showing that highly
51:45
central nodes are those which essentially absorb walks, whereas low centrality nodes are those that pass them on to other nodes, right?
52:01
You mean this example? Go ahead, go ahead. Yes, stopping with probability one over one plus the centrality means that if you are very central, then the walk is going to be absorbed there.
52:24
If your centrality is very low, you are going to pass it on. Yes, because the idea of this sampling is to select a walk proportionally to its probability. If I have a node here with a high centrality,
52:43
it means that a lot of paths flow through this node. If a lot of paths come into this node, then the random walk sampling should select, say, uniformly over these walks, and then it will continue and end
53:02
at one of the endpoints of those walks. So this sampling strategy that I explained is for selecting time-respecting walks over this network structure,
53:22
and the stopping criterion serves that purpose. You said that a node can appear over time but cannot disappear?
53:41
Slowly it will disappear. All the weights, which are proportional to the probabilities of these walks, slowly vanish with the exponential decay. If a node disappears in an edge stream,
54:02
that means it no longer gets edges in its neighborhood. If it no longer gets edges in its neighborhood, then everything that happened around that node in the past decays exponentially, and we eventually forget it, its value becoming very low.
54:23
Would it be interesting to play with the acceleration of temporal Katz centrality? What do you mean by acceleration? Because I see that there are evolutions of temporal Katz centrality on the nodes; maybe it could be interesting to see how those evolutions
54:42
look and to measure them. Yes, definitely. For example, we also defined other tricks in our paper to handle certain anomalous evolutions.
55:03
One kind can be when there is a spammer core which is too active, and we try to take care of those by further tweaking these definitions. But yes, certainly.
55:21
And of course, at the beginning it basically follows the in-degree of every node, and then, depending on the parameters of the decay and the penalization beta, it may lead to something different. Something else also comes to mind.
55:42
We had another experiment, not shown in the slides, which was the following. We created two synthetic networks over the same set of nodes. Say that for one week,
56:00
we were generating edge streams sampled from the first network, and then we changed the underlying network and started sampling from a different one. What we could observe is that this centrality could adapt over time: first it learned what was going on in the first network,
56:22
but when the network was changed, it started to adapt to the new setting and the centrality values were really changing. Thank you. Questions?