Axioms for centrality: rank monotonicity for PageRank
Formal Metadata
Title: Axioms for centrality: rank monotonicity for PageRank
Number of Parts: 17
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/46362 (DOI)
Transcript: English (auto-generated)
00:16
Thank you, Andres. First of all, thank you for inviting me.
00:21
It's a great pleasure for me to be here. And today I will be talking about a topic that I've been working on lately in the last few years, mainly with Sebastiano Vigna and sometimes with other people. So if you already know about centrality and the history of centrality,
00:41
you may take a nap for the first part of my talk. But if you are not so familiar with the topic, then I think I will provide a sort of a bird's-eye view of the subject and above all of the history of this topic. And then in the very last part, I will focus on a specific measure of centrality, PageRank.
01:04
I have chosen PageRank because this conference is about the Google matrix. So it's natural to take this as an example. And I will focus on a specific axiom that is rank monotonicity. But this will happen only in the last part of my talk. And if I have enough time, I also want to show you a sketch of the proof.
01:26
More or less, the content of my talk will be as follows. First of all, I will start with a sort of a plea for centrality. I will try to convince you that centrality is very ubiquitous and very simple, but very interesting.
01:44
And it pops up in many different situations. And doing this, I will mention the fact that by now we have literally hundreds of centrality indices. And of course, I don't have time to present them all. But I will try to provide you with a sort of a guide to classify them.
02:04
I will have a sort of a taxonomic approach in this talk. So I will try to give you a classification of centrality indices. And of course, I will single out some of them. In this talk, I will concentrate on six of them, two for each of the three families that I will define.
02:23
Then I will also try to convince you that a good way to study centrality measures is using axioms. This is not the only way, but it's a very nice one. And many people have tried or are trying to use this approach to study centrality. And by now, we also have a jungle of axioms.
02:43
So once more, I will try to give you a guide and a taxonomy of the possible axioms. This is quite new. This is something we're working on these days. And I will focus on some important examples of axioms. And then in the very last part of the talk,
03:01
I will consider rank monotonicity for PageRank. And for this case only, I will provide a sketch of a proof. Let me start with a very, very general scenario that I have in my mind. And that explains why we as computer scientists
03:20
come to like the idea of centrality. I describe this as, as I said, an information retrieval system; in fact, it's a much more general thing. What I have in my mind is that I have a document repertoire, whatever it is. So it may be the set of pages retrieved by Google,
03:41
or it may be the set of items that are on sale on Amazon, or it may be the members of a social network like Facebook, can be many, many different things. I will call, in each of these examples, I will call the items or the objects there documents, and I write D for the set of documents.
04:00
You should think D is finite, but very, very large. And then there is a set of queries. Now, the set of queries is typically infinite. You, in most cases, have a query language to write the queries, but don't think that the queries are always explicit.
04:22
So, in some cases, the queries are really explicit, right? When you type a query on the Google search engine, that query is explicit.
04:48
So it's an actual user typing an actual query using an actual query language. But in fact, part of the query is implicit, because the query that arrives at Google is probably enriched with information
05:00
that you don't explicitly type, like your location, whether you are typing on a mobile device or not, etc. So this is an example where the query is partly implicit and partly explicit. But in some cases, the query is totally implicit. For example, in product recommendation on Amazon, say,
05:21
then you are going to be recommended a product based on your actions. There is no explicit query there. There is your last purchase, or the fact that you have been looking for some specific thing on Amazon and so on. All of these I would call queries. And then there is the system.
05:40
So the system takes a query in, and what it should do is select the appropriate documents, whatever they are. So for the search engine, it should select the appropriate pages for your query, but for product recommendation, it should select the product
06:02
that you should be recommended, etc. And typically, the output is two things. One is a set of documents that are appropriate, and besides, for each of them, a score. That is a value that measures how much that specific document is appropriate for your query,
06:21
or relevant for your query. And, well, more often than not, D is endowed with a graph structure. The simplest example probably is pages retrieved by a search engine, because in that case, pages are part of a graph
06:42
that is called a web graph. But in all the other cases, you can think of the graph that is underlying D. Now, the selection part of this can be, in fact, simplified. Many people prefer not to think of the selection, but rather to think that you score all the documents.
07:01
Simply assign a very, very low number for the documents that you wouldn't select. With this simplification, which is not a big one, the whole system can be described simply as a function that for every query and document pair assigns a score.
07:23
And the larger the score, the more appropriate is the document for the query. Now, the second simplification that I will be doing is apparently more crazy, because now I am assuming that there are no queries. This seems absurd,
07:40
but in fact, there are two good justifications for this simplification. The first one is that in some cases, you are not interested in how relevant a document is for a query, but rather how important a document is overall. So you are interested in importance and not in relevance.
08:01
The second justification for this simplification that I'm going to apply is that in some cases, you can take the query and modify the document repertoire so to reflect the query you have in your hands. And then after this, you can throw away the query. In both cases, the query is not there anymore and the system is now simplified
08:23
as something that assigns a score to every document. The query is not there anymore. And the third simplification is that this scoring happens based only on the linkage structure. So we forget about the rest of information,
08:40
only look at the graph structure and assign a score to every document. And this is what people call centrality. Centrality is a way to assign a measure of importance to every vertex in a graph. This is called a centrality measure.
09:01
Always think that the nodes of our graph are the documents now. Keep in mind that this is called a centrality index or centrality measure or centrality score. I prefer not to use the word rank if I can, because rank is something else. You use scores to assign a rank to the documents, right?
09:24
You can sort the documents by decreasing score. This is what people do. But the rank is something different from the score. At the end of my talk, this will be super clear. It's unfortunate that PageRank is called PageRank.
09:40
The correct name should be page score, because it assigns a score to every page, not a rank. But nonetheless, I will try to refrain from using the word rank in this context. Now, where was centrality born? Where does the word centrality come from? In fact, the first usage of the word centrality
10:02
comes from social sciences. And it was used for the first time by a group of sociologists working at MIT at the end of the 40s. This group was led by Alex Bavelas. And from this very seminal work,
10:21
there was a stream of works that started back then in the 50s and is still going on. Centrality is used a lot by social scientists. Psychometricians, sociologists, economists, they all use different kinds of indices for centrality.
10:43
And here I wrote that it was brought to computer science through information retrieval. This happened sometime around the advent of search engines. But this is sort of an a posteriori reconstruction
11:01
because as computer scientists, we prefer to ignore the decades of history of centrality that preceded us. In fact, if you read, for example, the PageRank paper, they make no mention of the fact that they are developing a centrality measure.
11:23
It is only recently that we are looking back to the past and see that many things that we, as computer scientists, are doing have been done much earlier, either by social scientists or by mathematicians. I will try to give you some pointers later.
11:41
So, as I said, it has a key role, and by now there are so many centrality indices that it is sometimes difficult to find guidance. Now, the way I prefer to look at indices is by clustering them into three families. Remember that the overall purpose is to assign a score
12:03
to every node in a graph that measures how important that node is. The first family is what I call path-based indices. Path-based indices are based on the number of paths or sometimes shortest paths passing through or arriving at the vertex you are interested in.
12:24
Two examples in this family are very famous, they are called betweenness and Katz's index. I will give you the exact definitions in a second. The second family that we like much more than the first one as computer scientists is the spectral family. It's a very general thing,
12:41
but we can say that it's based on some linear algebra construction, on some eigenvector that is defined on a matrix that is derived from the adjacency matrix of the graph. PageRank is a good example, you all know about it. Another example is Seeley's index, but there are many more.
13:03
And finally, there is another family that I call geometric indices. Geometric indices are based on distances, distances from a vertex to the other vertices. Probably the best example is closeness centrality, harmonic is another example. The six names of centrality indices that you see here
13:23
will be defined in a while in my talk. I should say that these three families are not completely distinct. And in fact, the first two families have a large overlap. Many times the same index can be defined using paths or using a spectral construction on some matrix.
13:46
Let me give you the examples that I was mentioning. So I started with path-based centralities. And as I said, the path-based centrality depends on the paths entering or passing through a node. And without further ado, let me define first betweenness.
14:06
Betweenness is very popular among social scientists. If you talk with people from social science, many of them think that it's the only measure of centrality. It was defined by Anthonisse in the early 70s.
14:22
And it goes like this. The betweenness of a node X is defined with this summation. So you take every possible pair of vertices YZ and you look at all the shortest paths connecting them and count how many of those shortest paths pass through X.
14:43
You get that ratio there and you sum over all possible YZ. If you want, it is proportional to the probability that if you select a shortest path at random, that path passes through X. In my view, this is not a measure of importance,
15:03
but it's rather a measure of robustness. But still people use it a lot, many, many times with the intention of measuring importance of nodes. Sometimes robustness is a measure of importance. It depends on your problem. Katz centrality was defined much earlier.
15:23
It can be defined in many different ways, but probably the simplest one is this one. First of all, Katz centrality is a parametric centrality measure. So you have to decide a parameter that is called alpha in this formula. What you do is you count all the paths
15:42
ending at X of length T for every possible T. Here I mean all the paths, not elementary paths. And this number is discounted exponentially by this factor here. So essentially a node is important
16:00
if there are many short paths arriving at the node, or very many long paths, because long paths are counted less, right? Because there is this decaying factor. Starting from where? Wherever. They should end in X, but they can start anywhere. You count all the paths.
16:20
This is called Katz centrality. By the way, many of these measures were defined originally on undirected graphs. I am in fact defining them for directed graphs. And in these cases, there is always the problem of whether to consider outgoing paths or incoming paths.
16:42
And I always use what is usually called the negative definition, which means I consider incoming paths, because in most situations you are more interested in incoming paths than outgoing paths, because importance is endogenous. It's given by the people that point to you.
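As a side note not from the talk, here is a minimal sketch of how these two path-based indices can be computed with the networkx library; the random graph and the value of alpha are illustrative assumptions, and alpha must stay below the reciprocal of the spectral radius for the Katz summation to converge.

```python
import networkx as nx

# Illustrative directed graph (any DiGraph would do).
G = nx.gnp_random_graph(50, 0.1, seed=42, directed=True)

# Betweenness: for each x, the (normalized) fraction of shortest paths
# between pairs (y, z) that pass through x.
bw = nx.betweenness_centrality(G)

# Katz: counts incoming paths of every length t, discounted by alpha**t.
# networkx uses in-edges on directed graphs, i.e. the "negative" definition.
katz = nx.katz_centrality(G, alpha=0.05, beta=1.0)

print(max(bw, key=bw.get), max(katz, key=katz.get))
```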
17:02
So these are the two definitions of the path-based tribe. And let me move to spectral centralities. Spectral centralities in a way have a longer history. And well, the first probably simplest definition of spectral centrality is just take the left or right
17:21
dominant eigenvector of the adjacency matrix. This is called eigenvector centrality. It's not very used, for many reasons, but it's a simple one. Sometimes it's considered. In fact, we can trace its usage back to the works of Landau at the end of the 19th century.
17:43
Landau was interested in evaluating the importance of chess players based on chess tournaments. He was manipulating matrices that contain 0, 1, and 1/2. 0 means a defeat, 1 means that x won over y,
18:04
and 1/2 was used for a draw. Landau observes that if you take M times the all-ones vector, you get a reasonable measure, right? Because you are simply summing 1 for every win,
18:22
1/2 for every draw, and 0 for every defeat. But it is even better to consider the powers of M. And at the end of a long discussion, he says that what you actually need is solving this eigenvector equation. I think this was the first time, as far as we know,
18:43
that somebody tried to measure importance based on an eigenvector. And this idea was brought to social sciences by Burch at the end of the 50s. I don't think he was aware of the work by Landau.
19:02
I don't think he quoted Landau. Now Seeley, at the end of the 40s, proposed a similar idea to measure children's popularity. He had this group of children, and he wanted to measure who was the most popular. And he came out with this recursive formula.
19:24
The popularity of x, of a guy x, is a summation of the popularities of the guys y that like x. Here the arrow in the graph means y likes x. But there is a small variant
19:41
with respect to eigenvector centrality. That is, the popularity of a guy is divided by the number of people he likes. This is the out-degree of y in the graph. This is the difference between eigenvector centrality and Seeley's index.
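In symbols (my reconstruction from the spoken description; the slide itself is not part of the transcript), Seeley's formula reads

$$ s(x) \;=\; \sum_{y \to x} \frac{s(y)}{d^{+}(y)}, $$

where $y \to x$ means that y likes x and $d^{+}(y)$ is the out-degree of y; eigenvector centrality would instead satisfy $\lambda\, s(x) = \sum_{y \to x} s(y)$, without the division.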
20:02
Eigenvector centrality is the left dominant eigenvector of the adjacency matrix, whereas Seeley's index is the left dominant eigenvector of G after row normalization. I write GR for this. And they are two different indices. Now let me come to PageRank,
20:21
which is something you all know. It was defined in 1998. And the idea is that you can define it in many different ways. It's the left dominant eigenvector of this guy here. And you recognize two things, right? You recognize that there is the row normalized version
20:41
of the matrix, like in Seeley's index, but there is also this decaying factor. And if you write this as a summation, you will easily see its similarity with Katz. In fact, Katz can be defined almost in the same way,
21:01
except that in Katz you don't have the row normalization. So you can think that PageRank is the son of two parents. One is Seeley and one is Katz. Like in Seeley, you get row normalization, and like in Katz, you get the damping factor.
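In formulas (again a reconstruction from the description, writing $\bar G$ for the row-normalized adjacency matrix, $\alpha$ for the damping factor and $v$ for the preference vector introduced below):

$$ r \;=\; \alpha\, r\, \bar G \;+\; (1-\alpha)\, v, \qquad\text{i.e.}\qquad r \;=\; (1-\alpha)\, v \sum_{t \ge 0} \alpha^{t}\, \bar G^{t}, $$

whereas Katz's index is, up to normalization, $\mathbf{1} \sum_{t \ge 0} \alpha^{t} G^{t}$: the row normalization comes from Seeley, the damping factor from Katz.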
21:21
This is the recursive definition of PageRank, but you all know about it. (Oh, there should be a y here. Okay, thanks.) I will use this definition later when I talk about PageRank. Remember that I'm using the slightly generalized definition
21:41
with a preference vector v. In the original one in the paper by Brin and Page, v is the uniform vector. It doesn't change much. And let me pass to this last family, geometric centralities. Geometric centralities depend on distances.
22:00
You can precisely define what I mean by using what I call the distance count function. The distance count function for the node x at distance t is the number of nodes z that are at distance t to x. And if you let t range over all the possible positive integers,
22:25
you get what I call the distance count vector. Of course, it's ultimately zero. After n, it will be certainly zero. This distance count vector is in one-to-one correspondence with something that is probably better known. It's called the neighborhood function.
22:42
The neighborhood function is defined exactly in the same way, but with a less-than-or-equal-to here. Geometric centralities are the centralities that are a function of the distance count vector. What I mean is that if two nodes have the same distance count vector,
23:00
they will have the same centrality. For example, trivially, in-degree is a geometric centrality, because it's just the projection of this vector on its first component. The first component of this vector is the in-degree, which is by itself a measure of centrality, and it is a geometric centrality.
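In symbols (reconstructed), the two functions just defined are

$$ \mathrm{dc}_x(t) \;=\; \bigl|\{\, z : d(z,x) = t \,\}\bigr|, \qquad N_x(t) \;=\; \bigl|\{\, z : d(z,x) \le t \,\}\bigr| \;=\; \sum_{s \le t} \mathrm{dc}_x(s), $$

and the in-degree is just $\mathrm{dc}_x(1)$.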
23:24
Two more famous examples of geometric centralities are closeness and harmonic. Closeness is usually defined in terms of the average distance from the other nodes of the network. But in fact, you want to take the reciprocal
23:41
because you want important nodes to have large centrality, as usual. So you take just the reciprocal of this summation here. You must be careful, because closeness was defined by Bavelas. It was actually the first centrality measure that was defined in social sciences. But Bavelas had in mind undirected and connected graphs.
24:02
If the graph is directed and possibly non-strongly connected, you must be careful in this summation. And what you do is you take this summation only in the same connected component as x, or if you prefer, you take it only for the y's that are at finite distance to x.
24:21
This causes a number of problems, a phenomenon that is called the "big in Japan" effect. That is, people that live in a small community tend to be considered more important than they should be. There are some ways to adjust this.
24:40
One possible way is considering instead what is called harmonic centrality. The difference between the two formulas is very small, but at least here you can consider the summation over all possible y's with the proviso that when this guy here is infinite, you just sum zero.
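For the record, the two formulas being compared should be (my reconstruction; the slide is not in the transcript):

$$ c(x) \;=\; \frac{1}{\sum_{y \,:\, d(y,x) < \infty} d(y,x)}, \qquad h(x) \;=\; \sum_{y \ne x} \frac{1}{d(y,x)}, $$

with the convention that $1/\infty = 0$ in the harmonic formula, so unreachable nodes simply contribute zero.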
25:03
The idea of harmonic centrality was inspired by works by Marchiori and Latora of the early 2000s, but can be dated back to the 50s. So, with all these definitions in mind, remember that the six definitions that I gave
25:23
are just the tip of the iceberg. There are so many possible centrality measures, but one problem that one often tries to solve is which centrality measure is better, or how does this centrality measure compare to this other one.
25:41
The study of centrality measures can be done either individually, where you focus on a specific measure and study it in depth, like PageRank: how it depends on the graph structure and on the parameters, which in the case of PageRank are the damping factor and the preference vector, and so on.
26:03
So you can study a measure in depth individually, or you can try to compare different measures and you can do both things in two different ways. One is by using some external source of ground truth, if you have one, or you can assume an axiomatic approach.
26:22
You state a property that is desirable or undesirable for centrality, and then you check which of the measures satisfy that property. I think that the axiomatic approach is nice because at the end of the day, it allows you not to decide which measure is more important,
26:40
but rather how measures are related to each other, much like the T axioms in topology that allow you to classify topological spaces. So axioms for centrality are not a new thing. You can trace the study of these axioms back to Sabidussi in the 60s.
27:04
Many people are trying to do this, right? They are trying to define axioms that are suitable and then trying to see which measures satisfy them. Sometimes in a comparative way, but sometimes aimed at specific indices. For example, here I'm putting two works
27:22
where they only study axioms for page rank. Sometimes they even have, like in this work, they even have a sort of a hardcore approach. They want to define a set of axioms that is necessary and sufficient for a given index.
27:41
By the way, they succeed in this case, but only for a sort of a degenerate version of PageRank without alpha. Now, once more, I think that the axioms are so many that it is wise to try to classify them. And I am trying to do that using three classes that are interesting.
28:04
One I will call invariance properties. Then I will talk about score dominance properties. Then I will talk about rank dominance properties. And I should say that there are also many other axioms that don't fall in these categories, but the first three categories are interesting by themselves.
28:22
Let me start precisely with invariance properties. And first I want to give you a flavor of how those properties are usually stated. Usually they are stated like this. You have two graphs, g and g prime. And you have two nodes, one living in g and the other one living in g prime. And these two things satisfy a number of constraints.
28:44
And under these constraints, you want that the score of x in g is equal to the score of x prime in g prime. Now, depending on the constraints, the actual constraints, you get different invariance properties. Let me start with a very simple one. There is what I call invariance by isomorphism.
29:04
This one is the following. g and g prime are two graphs that are isomorphic. And f is the graph isomorphism. You take a node in g and its image in g prime, and you want the two nodes to have the same score. Now, this property is so obvious
29:22
that in fact it is taken for granted. When we say centrality, we mean a measure that satisfies this axiom, right? Because we want centrality to depend only on the graph structure, not on the names of the nodes. So when we define centrality, we should always state it using this property,
29:41
which is fundamental, even if usually people don't do that. So this is a very nice invariance property and all the centrality measures that I define and that are defined in the literature do satisfy this. You can do it one step further and consider another kind of invariance,
30:01
which I call invariance by neighbors. Invariance by neighbors, in this case, you have just one graph, g, and two nodes living in the same graph. And these two nodes have the same in-neighbors and the same out-neighbors. Under this condition, you want x and x prime to have the same score.
30:24
Otherwise said, if you take two twin nodes, nodes that have exactly the same interface to the graph, by interface, I mean incoming arcs and outgoing arcs, then you want them to have the same score.
30:41
Is it satisfied by the measures I defined? Well, yes, the answer is yes. In fact, it is easy to verify that all geometric centralities do satisfy this property and the same happens for spectral centralities. And you can take this one step further, right? Because all the measures that I mentioned so far
31:01
only depend on incoming links. So you may think that this invariance property can be strengthened into what I call invariance by in-neighbors. Invariance by in-neighbors simply means that if you have two nodes in a graph and they have the same in-neighbors,
31:20
they should have the same score. Is it true for the measures I defined or isn't it? What do you guess? Well, at the beginning, when I saw this, when I thought about these axioms, I thought that all the measures that I defined
31:40
in fact satisfied it, but it's wrong, it's not true. And I give you an example of why it's not true. I will show that geometric centralities in general cannot satisfy this kind of invariance. Take these two guys. So x and x' are two guys living in the same graph.
32:01
They have exactly the same set of in-neighbors, but they may have different out-neighbors. So the in-neighbors are the same, but the out-neighbors can be different. Now take any other guy, z, and consider a shortest path, let's say from z to x. Now, however this shortest path goes,
32:22
at the very last step, it will contain an arc from one of the in-neighbors of x to x. And you can slightly modify this path to obtain a path which is exactly the same, but going to x', right? So what I said is that a shortest path from z to x
32:42
can be turned into a shortest path from z to x', and the other way around. This means that the distance of every other node to x is the same as its distance to x'. But there is this important difference.
33:00
I am considering a guy which is not x or x'. Because x and x' cannot be considered here. The distance from x to x' and the distance from x' to x can be completely different. And the result is that, if you consider their distance count vectors,
33:23
they are almost the same, but not exactly the same. There will be one single unit that is moving. For example, here, the single unit is moving from here to here. Because in this example, the distance from x' to x is 4,
33:42
whereas the distance from x to x' is 8. And since these two vectors are different, in general, you cannot assume that they have the same centrality. And in fact, we have counterexamples for all the families of geometric centralities. This is a small difference, but it's enough to make the axiom fail.
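Since the slide is not visible in the transcript, here is a minimal concrete instance of this phenomenon (my own construction, not the one from the talk), using harmonic centrality as the geometric measure:

```python
import networkx as nx

def harmonic(G, x):
    """Harmonic centrality of x: sum of 1/d(y, x) over all y that reach x."""
    total = 0.0
    for y in G.nodes:
        if y == x:
            continue
        try:
            total += 1.0 / nx.shortest_path_length(G, source=y, target=x)
        except nx.NetworkXNoPath:
            pass  # d(y, x) is infinite and contributes zero
    return total

# x and x2 have the same in-neighbors ({a}) but different out-neighbors;
# as a result d(x, x2) = 3 while d(x2, x) is infinite.
G = nx.DiGraph([("a", "x"), ("a", "x2"), ("x", "b"), ("b", "a")])
print(harmonic(G, "x"), harmonic(G, "x2"))  # 1.5 vs about 1.83: different
```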
34:02
On the other hand, if you consider symmetric graphs, this does not happen. So there geometric centralities do satisfy invariance by in-neighbors. And by the way, spectral centralities all satisfy invariance by in-neighbors.
34:21
So far, I've considered invariance properties. Now I move to score dominance properties. Score dominance properties go like this. You have two graphs, you have two nodes, some constraints, and based on these constraints, the centrality score of x in G is at least as large as the centrality score of x' in G'.
34:42
Sometimes you require strictly larger. In this case, I talk about strict dominance. One example of dominance was considered recently in this wonderful work by Schoch and Brandes of 2016.
35:01
This, by the way, is a very nice piece of mathematics. They try to define a large family of centrality measures based on some semigroup construction. But at some point, they consider this dominance that I call dominance by in-neighbors.
35:20
It goes like this. You have two nodes in a graph and the in-neighbors of x are a subset of the in-neighbors of x'. Then you want the score of x to be not larger than the score of x'. Why should it be like that? Because, of course, if x has a subset of the in-neighbors of x',
35:43
it means that x' is at least as important. But we already know that in directed graphs, for geometric centrality measures, this property cannot be satisfied, because if it were satisfied, invariance by in-neighbors would also be satisfied.
36:03
In fact, Schoch and Brandes studied this property only on undirected networks. Another very famous example of score dominance is score monotonicity. Score monotonicity goes like this. Pay attention because this is the property
36:22
that I will be focusing on in the rest of my talk. So you have a graph that does not contain a specific arc x to y. You add this arc and what you want is that this addition will increase the score of y. This is, in fact, the property that was considered by Katja
36:44
in her talk two days ago. She used to call it sensitivity. So this property was studied a lot for many different indices
37:02
and this is the result for the indices we are looking at. You see that there is no apparent pattern. There are some spectral measures, like Seeley's index, that don't satisfy it. PageRank does satisfy it, and in a sense Katja took it for granted during her talk.
37:24
If you add an arc to a node, the score of that recipient node will always increase. Betweenness does not satisfy it. Katz does, et cetera. The situation is sometimes different on strongly connected graphs. Let me just show you why for closeness
37:45
and PageRank you have this situation. First, let me start with PageRank. Score monotonicity for PageRank was proved by Chien and others in this work of 2003. They, in fact, had to assume that all nodes have non-zero PageRank
38:06
for their proof to work. We generalized it under a much weaker assumption. We only assumed that the source node has a positive score.
38:21
For closeness, the fact that closeness in general does not satisfy score monotonicity can be shown in a very small example. This is the example. Take this graph here. Now, the centrality of Y is one because there is only one node that can reach Y and it is at distance one.
38:40
But if you add the arc from X to Y, now you have two guys at distance one, and the overall centrality of Y becomes one half. So in this example, you add an arc towards Y and you decrease the centrality of Y instead of increasing it. (The small sketch below reproduces these numbers.)
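The slide itself is not in the transcript, but the following sketch uses the smallest graph I could find that is consistent with the numbers just quoted (node names are mine):

```python
import networkx as nx

def closeness(G, y):
    # Reciprocal of the sum of d(z, y) over the z's at finite distance to y.
    dists = [nx.shortest_path_length(G, source=z, target=y)
             for z in G.nodes if z != y and nx.has_path(G, z, y)]
    return 1.0 / sum(dists)

G = nx.DiGraph([("w", "y")])  # only w reaches y, at distance one
G.add_node("x")               # x cannot reach y at all
print(closeness(G, "y"))      # 1.0

G.add_edge("x", "y")          # add the arc x -> y
print(closeness(G, "y"))      # 0.5: the score of y decreased
```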
39:02
On a strongly connected graph, though, closeness centrality does satisfy the axiom, and the reason is very simple. This is the definition of closeness, right? In a strongly connected graph, this summation ranges over all the possible Ys. So after you add the arc, you have exactly the same set of Ys
39:22
over which the summation is done. But adding an arc will in general decrease, or at least not increase, all the distances, and there is at least one distance that gets decreased: the distance from X to Y.
39:40
So the summation strictly decreases, which means that the closeness strictly increases. Now, score dominance properties are interesting, but in fact, they are not so interesting, because as we saw in the talk of Katja two days ago,
40:02
it may be the case that you add this arc from X to Y and we know that Y increases its score, but this doesn't tell us much because maybe there are other guys that increase their score as well and maybe they become more important than they used to be with respect to Y. That's why people usually prefer
40:21
to consider rank dominance properties. Let me skip the general approach and go directly to the rank monotonicity definition, which is the one that I want to focus on. So this is the rank version of the score monotonicity property
40:42
I discussed a second ago. You have a graph, you add one arc X to Y and you want this to happen. So if Y used to be better than Z, after the addition of the arc, the same must be true. And if Y was at a tie with Z,
41:02
then after the addition, either Y is still at a tie with Z or it is even better. This is called rank dominance. So you don't want only Y to increase its score, but you want to keep its position in the rank. And this is the weak version of rank monotonicity.
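In symbols, writing $r$ for the scores before the addition of the arc $x \to y$ and $r'$ for the scores after, weak rank monotonicity requires, for every node $z \neq y$ (my transcription of the property as stated):

$$ r(y) > r(z) \;\Longrightarrow\; r'(y) > r'(z), \qquad r(y) = r(z) \;\Longrightarrow\; r'(y) \ge r'(z). $$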
41:23
The strict version requires that all ties are broken in favor of Y. Not only you have the same position, but you also break all the ties in favor of Y. This table tells you what is the situation for rank monotonicity.
41:42
And for comparison, here I am putting in parentheses the score monotonicity properties. You see that in all cases when score monotonicity was satisfied, also rank monotonicity is satisfied. And in many cases, it's satisfied in a strict way. The star here means strict.
42:02
So at least for these six measures, there is nothing surprising. If the score increases, the rank is also preserved. And in the few minutes that I have, I will focus on rank monotonicity for PageRank. The story goes like this.
42:22
In the work by Chien and others, where they proved that PageRank is score monotone, they also proved weak rank monotonicity under the assumption that every score is positive, that is, every PageRank value is positive. We did the same more recently,
42:42
but we were able to prove the strict version of rank monotonicity with a slightly stronger hypothesis. That is, we had to assume that the preference is everywhere positive. This is stronger, of course, because if the preference vector is everywhere positive,
43:03
then all the scores are positive. But we can prove that this condition is in fact necessary. We have counterexamples of graphs that are strongly connected, where some preference entry is zero, and that fail to satisfy the strict version of rank monotonicity.
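As a quick empirical illustration (not part of the talk), the statement can be checked with networkx's pagerank, whose personalization argument plays the role of the preference vector; the graph, the nodes and the parameters below are arbitrary, and networkx's handling of dangling nodes may differ in detail from the model in the talk:

```python
import networkx as nx

G = nx.gnp_random_graph(30, 0.15, seed=1, directed=True)
v = {u: 1.0 for u in G.nodes}  # strictly positive preference vector
x, y = 0, 1
if G.has_edge(x, y):
    G.remove_edge(x, y)

before = nx.pagerank(G, alpha=0.85, personalization=v)
G.add_edge(x, y)               # add the arc x -> y
after = nx.pagerank(G, alpha=0.85, personalization=v)

# Strict rank monotonicity: every node z that y dominated (or tied with)
# before must be strictly dominated after.
for z in G.nodes:
    if z != y and before[y] >= before[z]:
        assert after[y] > after[z]
```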
43:22
Now, the difference between what Chien and others did in 2004 and what we did is also in the technique that we had to use. Because in their work, if you look back at their work, which was, by the way, very interesting, they use the fact that the Google matrix
43:41
is a regular Markov chain, so they use Markov chain properties, whereas we had to use properties of M-matrices, which we think are interesting because they are very general. In fact, what we did can be applied in many different ways, to many different measures,
44:01
and we are able to apply our theorem at the same time to PageRank and Katz, for example. So let me give you an idea of what we did. We worked in a more general setting, of which PageRank is a special case and Katz is also a special case, that is called damped spectral ranking.
44:23
You start with a non-negative matrix and with a factor that is smaller than the reciprocal of the spectral radius of M and with a strictly positive vector v. And you define the rank to be like this. Now, if you are confused by the generality of this definition,
44:43
think that you get PageRank when M is the row-normalized version of G, the adjacency matrix, and you get Katz when M is exactly the adjacency matrix. But this is, as you can see, more general. So this is the kind of centrality we are working on.
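Spelled out (my reconstruction), the damped spectral ranking is

$$ r \;=\; v \sum_{t \ge 0} \alpha^{t} M^{t} \;=\; v\,(I - \alpha M)^{-1}, \qquad 0 \le \alpha < 1/\rho(M), $$

where $\rho(M)$ is the spectral radius: taking $M = \bar G$, the row-normalized adjacency matrix, gives PageRank up to the factor $1-\alpha$, and taking $M = G$ itself gives Katz.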
45:01
And the basic lemma that we use is the following. It's an interesting lemma and I want to discuss it in detail because I think it can find applications even to other problems. It's a lemma about inverses of M-matrices. So you have this M-matrix here
45:20
and you take its inverse and you define the rank by multiplying C by a preference vector v. Now, consider in this matrix C two columns, y and z, and compute all the ratios between elements in the yth column and elements in the zth column.
45:41
Of course, to do this, you must, for the moment, assume that column z does not contain any zero. Now, among all these ratios, there is one that involves the diagonal in the column y, this ratio here. Now, it turns out that this ratio is the largest,
46:03
assuming that c_yz is non-zero, this ratio is the largest of all ratios. You don't need to write it as a ratio. In fact, you can write it this way. So the only thing that you are assuming to be non-zero in this column is c_yz.
46:20
This is the first property, which is interesting and which we use. And the second property, as a consequence of this: if you take the rank of y, which is this guy here, well, this guy can be bounded using q, and you get that r_y is less than or equal to q times r_z.
46:40
and you get that ry is less than or equal to rz. And as a consequence, if rz is instead less than or equal to ry, then q must be larger than one. This is the key property that we will use in the theorem.
47:02
So the theorem goes like this. Remember that we are looking at this guy here, and for PageRank M is the row-normalized version. Now, let me observe that if you add an arc and look at the difference between the new matrix and the old matrix, what you get is something like this. All the rows that are not the x-row are zeros,
47:23
and the only difference is in the x-row, and the x-row contains some negative entries that are the old out-neighbors of x, and exactly one positive entry that is in position y. I call this vector delta,
47:43
and what we do is observe that the difference between the rank after the addition of the arc and the rank before the addition of the arc can be written in this way using the Sherman-Morrison formula. Kappa is a positive constant; I will get rid of it in the next step.
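Schematically (the exact bookkeeping is on the slide, not in the transcript), the update has the shape

$$ r' - r \;=\; \kappa\,\delta\,C, \qquad \kappa > 0, $$

where $r'$ is the rank after adding the arc, $\delta$ is the one-row update just described, and $C$ is the inverse of an M-matrix of the form $(I - \alpha M)^{-1}$; which version of the matrix appears here is not recoverable from the transcript, but this is exactly the shape to which the lemma applies.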
48:00
Delta is the vector that I just showed you. Remember, it is all zero or negative and it has exactly one positive entry, and then there is this inverse of an M-matrix, and this is where we use the lemma. Now, what we need to prove is that if r_y used to be better than r_z, now,
48:23
after the addition of the arc, it is strictly better, and in fact, what we prove is that the increase in z, the increase of score in z, is less than the increase of score in y, and the increase of score by this formula
48:41
can be written in this way. So what we need to prove is that the z-th element of this vector is less than its y-th element, under this condition. Well, you have to discuss many cases, but the only non-trivial one is when the yz element of this matrix is positive.
49:04
The other cases are easy. And in this case, using the lemma and the fact that r_z is less than or equal to r_y, we know that q is larger than or equal to one. So if you look at this, the y-th element of this guy here is delta times...
49:24
Delta_y times c_yy (C is this inverse matrix), minus these elements. These elements are all negative, so I can write them with a minus sign and take the absolute value. And using the fact that q is larger than or equal to one,
49:41
we just collect q from here and get the inequality that we want. In fact, we proved the weak inequality. Now, to go to the strict version of the inequality, we need some more work. But I wanted to give you the flavor of the proof
50:01
because the proof is very different from the one that Chien and others used in their work. They used the properties of regular Markov chains. So what is the take-home message of my talk, if there is one? First of all, I wanted to convince you, and I hope that I did, that centrality is important and it is very ubiquitous.
50:25
Unfortunately, we have so many centrality indices and I think we should make an effort to classify them in families. Our classification into three families is still very rough, I think. I think we can do a better job.
50:41
Axiomatization is one of the ways in which you can study centrality. I'm not claiming that it's the only one or that it's the best one, but it's a good one. And by now, there are so many axioms that also axioms should be, in a way, taxonomized. This is the very beginning of an attempt
51:03
to look for a taxonomy of axioms. And be careful because sometimes you see a property that appears to be so trivial, that is either true or false, but sometimes proving that it is actually true
51:21
or actually false sometimes requires a lot of work. So in case you are interested in this kind of proofs, you should be careful, because the subject is much wider than you think at first sight. Thanks a lot for your attention.
51:40
Thank you very much. Questions? You have examples? Examples of what? Examples with concrete data with Google or... Yes. So I didn't want to present this in this talk,
52:01
but the axiomatic approach is not the only one that we use. In fact, it's not the first one. At the beginning, we looked at actual data and the web graph... Sorry, we used the main example coming from the web.
52:21
And the comparison between all the indices... I don't have this slide here, but you can see that some indices are pretty much alike. Like, for example, PageRank is very, very similar in practice to in-degree.
52:42
PageRank is also quite similar to Katz, but it's, for example, very different from closeness. And by the way, it is based on these observations that we noticed that betweenness is an index by itself. It doesn't have any relation with any other index.
53:03
So, I mean, in a nutshell, we did and other people did a lot of work on practical graphs and on these indices. The problem is that what you come out with is you get these two ranks and then what? You have closeness centrality of all the UK pages
53:26
of a large web graph, and then you have page rank centrality. You can compare them with Kendall's Tao, let's say, to see if they are similar or not. But if you decide that they are not similar, then deciding which one is better is a tough job.
53:42
You must have some ground truth to do that. You discussed the change of PageRank when you add a link. So if you have a weighted graph and you change a weight, can everything be generalized?
54:00
Probably so. Yes, most probably so. So it's a matter of sensitivity. Every result we used is in fact continuous. So I think it's easy to generalize to weighted PageRank. But in that case, it is not really clear
54:22
what you want for rank monotonicity. That if you increase the weight by epsilon... I discussed this variation of the weight by epsilon. Yeah, I am pretty sure that for every positive epsilon, if you increase the weight by epsilon,
54:42
you get strict dominance. But I'm not 100% sure, but probably it's like that. More questions? You spoke about centrality in terms of vertices.
55:04
Measures of centrality for edges. Yeah, betweenness was originally defined for edges, yes. Yes, of course. Many people study centrality on arcs or edges, depending on whether you are considering directed or undirected networks,
55:20
which is also quite interesting. You can do it directly or you can do it indirectly looking at how sensitive the network is by the deletion of an edge. An edge is important if when you delete it, you change a lot, you disrupt a lot centrality of the nodes.
55:47
So we have two questions. One is: you mentioned you have counterexamples. If I understand, when the preference vector has zeros, then that monotonicity doesn't hold. Does it have an effect on the usability of PageRank,
56:02
or is it some degenerate case only? It's only the strict version that doesn't hold, right? I don't think it has any practical impact because the weak version will in any case hold as long as all pages have positive rank. The other question is, are there axioms for stability,
56:23
like adding something will not significantly change the scores? Well, you can think of rank monotonicity as a form of stability. What you are saying is that the addition of that arc
56:41
does not change the relation between the target node and the other nodes. I'm not sure about the rest of the network. I'm not sure whether if you add an arc here, you cannot change the relation between two guys that are not involved in the addition. Probably so.
57:00
This would be another very interesting axiom to study. Thank you.