Optimization using Flow Networks in NetworkX.
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 160 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/33791 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
EuroPython 201734 / 160
10
14
17
19
21
32
37
39
40
41
43
46
54
57
70
73
85
89
92
95
98
99
102
103
108
113
114
115
119
121
122
130
135
136
141
142
143
146
149
153
157
158
00:00
Mathematical optimizationComputer networkMenu (computing)Vertex (graph theory)Convex hullFunction (mathematics)Repeating decimalControl flowGraph (mathematics)Parallel portOpen sourceChannel capacityGraph (mathematics)Maxima and minimaComputer clusterSpherical capPartition (number theory)TheoremConnectivity (graph theory)Bipartite graphOpen setData miningScheduling (computing)BildsegmentierungGaussian eliminationReduction of orderLevel (video gaming)Polygon meshTotal S.A.Web pageGradientComputer wormModemMaß <Mathematik>Open sourceTerm (mathematics)SoftwareGraph (mathematics)SummierbarkeitFilm editingAlgorithmFunctional (mathematics)Set (mathematics)Cartesian coordinate systemMaxima and minimaProjective planeControl flowChannel capacityTotal S.A.Execution unitLogical constantPlanningProof theoryApproximationGraph theoryMultiplication signVertex (graph theory)Conservation lawArithmetic meanGraph (mathematics)Directed graphConnected spaceWeightCASE <Informatik>MereologyStudent's t-testForcing (mathematics)NumberLetterpress printingIntegerLevel (video gaming)Sign (mathematics)InformationScripting languageGraph coloringString (computer science)DialectSatelliteFunction (mathematics)Optimization problemMathematical optimizationError messageDistanceTheoryDifferent (Kate Ryan album)Object (grammar)Electronic mailing listLine (geometry)Type theoryAttribute grammarShortest path problemSubsetPiPosition operatorNegative numberMobile appParallel port2 (number)Library (computing)BitSelectivity (electronic)MathematicsClient (computing)Video gameWell-formed formulaPartition (number theory)Medical imagingPlane (geometry)Default (computer science)Theory of relativitySlide ruleCondition numberVector potentialGaussian eliminationBootingLaserGraph (mathematics)Rule of inferenceCellular automatonData storage deviceStapeldateiElectronic visual displayDemosceneWordAcoustic shadowInteractive televisionObservational studyBit rateOcean currentPairwise comparisonRight anglePeripheralPhase transitionStandard deviationRevision controlFeasibility studyGodPower (physics)View (database)Form (programming)Traffic reportingWebsiteMetropolitan area networkSpherical capGoodness of fitSound effectGrand Unified TheoryProcess (computing)Conjugacy classDimensional analysisArmGroup actionJunction (traffic)Connectivity (graph theory)Text editorElement (mathematics)Computer animationDiagram
Transcript: English(auto-generated)
00:05
Hello everyone, my name is Rishabh Dal and I am a student at IIT, Varanasi. The topic of today's talk is optimization using flow networks. The talk can be seen as in three parts. In the first part, we will have pretty basic introduction to NetworkX library.
00:22
It's a library we use for dealing with graphs. In the second part, we will discuss a thing about what are the flow networks, what is the maximum flow, what is the minimum cut. These things are mathematical concepts but we will see how we can use these concepts for dealing with some optimization problems.
00:43
Flow networks are generally used for getting optimized solution for something and these days they are also used for approximate solution like emergency evacuation planning and all that. In the third part, we will discuss a problem called project selection problem.
01:02
And we will discuss how we can model the problem into a flow network and prove that the minimum cut of this problem gives us the solution and within five to six lines of code, I will solve the problem in NetworkX. So, we'll start. So, in NetworkX, we have four types of graphs.
01:22
First is the simple graph, undirected graph which do not have any parallel edges. Another is directed graph which have parallel edges and then undirected graph with parallel edges and similarly, directed graph with parallel edges. So, first of all, it's a pretty basic functions like add node,
01:41
node can be an integer and then we can add some attributes to the node too like color, weight, these are used for various algorithms. And one good thing is the node do not need to be integer. It can be any hashable object like it can be string, it can be a mathematical function or even it can be a graph itself
02:02
like I have added the graph itself as in vertex into the graph and some pretty basic function like g.node gives us the nodes, we can iterate all the nodes, all the nodes are here and for adding edges, there is add edge, a to b, a to b were not originally added as vertex in the graph
02:21
but it will automatically add them as vertex in the graph and weight, some colors and all that. For accessing the edge, you can access them just like you access a multi-dimensional array. So, g25 and g52 are the same thing because we are dealing with undirected graphs and we can add the edges from a list as well,
02:43
similarly we can add nodes from a list as well and some pretty basic function like number of edges, number of nodes, number of self loops, network access, a lot of this function which comes very handy while writing small script when you are dealing with graphs and this is the thing I like the most,
03:01
like you can draw the graph pretty easily. It comes very handy when you are dealing with very large graphs like you are trying to find some relationship between the population and all that and it can be customized very easily. You can change the edge length, edge color and everything like that.
03:21
So till now, the nodes were not integers like the one node was a mathematical function sign, one node was the graph itself, one node was string somewhat integers, so you can just convert them to integers if you want and keep the old levels too. So like here, the old node that was a mathematical function sign
03:42
is now number 5. So the graph remains the same, just the labeling is changed. It's pretty easy to iterate all the edges, print the information and something like that and these are some algorithms that, these are just one, two, three algorithms that I'm discussing. There are a lot of, almost all the graph algorithms are available with NetworkX
04:04
like the connected components is the set of vertices that are connected to each other, going over the edges. So here, 0, 1, 2 and 7 are in different connected components. So if I connect 0, 7, they will be in the same connected component.
04:24
You can relabel the node to create a new graph and you can take union, intersection of the graphs. The shortest path gives the shortest distance using the Digistra algorithm. Shortest path here between 10 and 12 gives an error because the 10 and 12 do not have any path between them.
04:41
Then shortest path between 0 and 8 is 0, 7, 8 and then is dominating set. If you know about the dominating set, I'm just showing you the simple algorithms. So this was just a simple introduction like how to create a graph, add a node, add an edge in the NetworkX.
05:02
Now we will see something about the flows and the cuts in the graph. So first of all, let's just see about the history of the network flows. So around 1955, some US Air Force researcher published a classified research
05:23
in which they studied the 40 regions of Soviet Union. The Soviet Union was connected to satellite Eastern European countries with railway networks. So they studied 40 of the regions, 105 edges. So edges are just the railway network between two regions.
05:45
So they were studying it and they gave a weight to each edge and the weight was how much material you can pass from one region to another. What their intention was, just like these cars,
06:00
they wanted to bomb the railway networks. They wanted to disconnect the Soviet Union with its European countries and there was a course related with bombing East railway network. So they found a set of edges such that bombing them was cheapest and it was cheapest to disconnect the Soviet Union from its European countries
06:27
and they were successful. So let's see what did they use. So what's a flow network? Flow network is a directed graph which generally do not have parallel edges and we have two very distinct nodes in it.
06:43
One is the source and another is the sink. Source is from where the flow starts and sink is where the flow ends. And for each edge in the flow network, each has a capacity attached to it. The capacity is the maximum amount of material that you can flow through that edge.
07:05
So this is an example of flow network. S is the source, T is the sink and given the capacities of all the edges, the capacity of the edge S to 4 is 15. It means you can pass 15 units of material through edge S to 4.
07:28
So the flow from source S to sink T is a function which satisfies two assumptions like the flow on any S is greater than zero obviously
07:40
and it should be less than the capacity of that edge. And for every vertex that is not the source and the sink, we have a conservation of flow. Like for every vertex other than source and sink, the flow that is coming is equal to the flow that is going out of the vertex.
08:02
And the total flow is how much we are getting at the sink. Like this is an example of flow. So here we are pushing 4 units of material from source S to 2, then from 2 to 3, 3 to 6 and 6 to 4. This is a valid flow because we are not breaking any of the assumption.
08:24
The flow is less than the capacity in every time and all the vertex have the conservation of the flow. So this is the maximum flow that we can get from this flow network. So we are pushing a flow value of 28 from the source,
08:43
getting it at the sink and all the edges are satisfying the assumption like the flow in them is less than the capacity of the flow and for each vertex the conservation of flow is working. We cannot pass more than 28 units of flow through this network.
09:00
Just remember the number if you can. So this is the maximum flow and now we jump to the cuts of graph. So till now we discussed what is the maximum flow in a graph. Now we will discuss what is the cut of a graph. So in cut of a graph we try to partition the graph into two subsets, A and B.
09:23
Generally we say that source belongs to A and sink belongs to B. The idea here is to cut some edges such that the maximum flow from source S to sink T becomes 0. Just like the air force researchers wanted, they do not want any material to flow from the Soviet Union to the Eastern European countries.
09:45
So they just wanted to cut some railway networks in between. The capacity of the partition is calculated as the sum of the capacities of the edges that are going from set A to the set B. We are not considering the edges that are going from set B to set A
10:04
because they will not participate in the flow. So the capacity of the cut is the capacity of the edges that we cut that are going from set A to set B. The pretty simple cut of the graph will be to disconnect the source from the graph.
10:22
Like here we are just disconnecting the source from the graph. So we are cutting all the three edges that are connected to the source which have capacity 10, 5 and 15. So the capacity of this cut is 30 but this is not the optimal case.
10:40
We want to find the minimum cut. This is the trivial case. We can always separate the source from the network. So we want to find the minimum cut. This is the minimum cut. If we destroy the edge from S to 2, 3 to 6 and 7 to T, the sink,
11:01
then the capacity of the cut will be 10 plus 8 plus 10 which is 28. The max flow was also 28. So this is another theory that some guys from the Soviet Union also studied the same network at different time. They wanted the good thing. They were calculating how much material they can actually pass
11:21
from Soviet Union to European countries and the air force searcher wanted to cut the network. So they actually found the same number if you give them the same weight. So in every network flow, the max flow is always equal to the minimum cut. So there's a guy called Folt-Fukerson who gave this theorem
11:41
that in any network, the value of the max flow is always equal to the minimum cut. Like here, all the flows and the cut of the graph are shown on the same flow network. So you can see some of the capacities of the cut and the flows are same and is equal to 28.
12:05
So what are the applications of network flow? Network flows are used in network connectivity, bipart matching, airline use them to find the minimum number of planes they need to serve all their flights. It's used in image segmentation to separate the background or foreground of an image.
12:23
It's used in project selection which we will discuss. It is used in baseball elimination like they eliminate the teams which do not have any chance of going further. Now we go to the project selection problem. There will be a little bit of mathematics involved.
12:43
So you have a company which need to carry out some projects and let's say you have a set of projects P and with every project is associated a value, the revenue of the project. So if the PI is greater than zero, then the project is a profitable project like you are gaining some new clients
13:01
and if PI is smaller than zero, then it is a losing project like you need to hire some new guys and just like in real life scenario, the projects are interdependent. You need to do some project before doing another so some projects are prerequisite of another project. So we are given with a graph G
13:21
where edge from A to B shows that you need to complete the project B before doing project A. So now we need to find a subset of project which gives us the maximum profit like a subset of the project which is complete in itself
13:40
like we are doing all the prerequisite of the project that we are involved and the profit is maximum. So just like this cat, we are looking for the maximum profit and projects we need to do. So how we will reduce the problem to network flow?
14:01
So first of all, the problem we discussed right now that there is a set of projects we want to do, there is no potential source and sink in this problem. So first of all, we will make a source and a sink for the problem and now we need to add more edges, more conditions between the projects.
14:25
So for every project that have a positive revenue, we will add an edge between source and the project with the capacity of the revenue of the project. And for every project that have a negative revenue, we will add an edge from the project to the sink with negative revenue.
14:47
We are doing it for the projects with negative revenue so it means the capacity of the edge remains positive. And the edges that were given us in the graph,
15:02
like the edges that represented the relations between the project, the projects are prerequisite of one another, we will set the capacity of these edges to a really high number because while we are finding the cut set of the network, we do not want to cut those edges because if we cut those edges, we will not get a feasible project.
15:24
I mean feasible set of projects. If we are cutting a prerequisite edge, it means we are missing some prerequisite for some project that we are taking. So we will make the capacity of these edges to be a really high number so that they are not included in the cut set.
15:41
So if you make it to the equal to the sum of positive revenues, it will be fine. So now we have represented the graph into a network flow. We have three types of edges. One is going from the source to the projects which have positive revenue. The other type of edges are the interrelated edges,
16:01
like some projects are prerequisite of another project. And the third type of edges are edges going from the project with negative revenue to the same. So now let's see what will be the cut value of this graph,
16:21
so this bit of mathematics. So we have three types of edges in our graph and absolutely we will not be cutting the prerequisite edges because they have really high capacity. So we will be cutting two types of edges, one that is going from the source to the project and second which is going from the project to the sink.
16:43
So when we are cutting an edge which is going from source to the project, the value it contributes to the cut is the second term in the first line. And then if we are including a project with negative revenue
17:01
due to the prerequisite need, then we are cutting it from the sink side so it will be contributing the first term of the cut. So we can expand the second term, like summation of Pi when i does not belong to set A,
17:21
we can expand it to summation of Pi when i belong to the whole project, subtracted summation of Pi when i belonging to A, the subset. So it will reduce to this, the summation of Pi when i belonging to complete set P minus profit of A, because this is the profit of A.
17:45
If I take the summation of all the Pi's in our subset A, that is the profit we are getting from A. So the cut value is reduced to summation of Pi for which Pi is greater than zero and i belonging to our set P minus profit of A.
18:03
Now since the first term is a constant term for a given set of projects, we see the cut of graph is a constant minus profit of A. So if we minimize the cut, we will be maximizing our profit. So now we have proved that the minimum cut of the graph
18:21
gives us the set of projects that have the maximum revenue. Now we will just go for an example, like there are some projects which some have positive revenue, some have negative revenue, and there is interdependence between these projects. Now we need to select projects which have the highest revenue
18:42
and that is a feasible set. So I guess it's not possible to do it by hand or by just saying it. So we will just represent the whole graph in network X and use the minimum cut function on network X to find a solution.
19:03
So for the network flow, we will be creating a directed graph, line 44. I cannot point it. So line 44, we will be creating a directed graph and like the image I've shown here, I'm adding the edges manually.
19:25
I mean, if you will be using network with some script, you will not be adding the edges manually. So I added the edges, the pairwise, all the edges, and we needed to set the capacity of these edges to be a really high number
19:40
and network X do it by default. Like I have not mentioned the capacity of these edges, so it will consider that the capacity of these edges is infinite. So in line 46, I have calculated the sum of all the PI that are greater than zero and that are present in our set.
20:01
Like P. So that is the constant that we found in the formula we derived. And so if we found the minimum cut, and next to minimum cut from source to sink, we see the value of the cut is 21 and we get two sets, 4 at 10, 11, 12, 14, minus 13, minus 3 with the source.
20:20
So the set with the source is the set we should choose. It is the feasible set which have the maximum profit. And if we subtract the minimum cut from the constant C, we will get our revenue like 43 is the total revenue you can get while choosing projects from this graph.
20:45
Like if you choose all the projects on the left side of minus 1, we are good to go. That's the end of slides. Any questions?
21:02
Yeah. At the end of the talk, any questions? Yeah, so we have time for a few questions if anyone has any.
21:23
Well, thanks again. We can continue in the coffee break and do remember to vote, sorry, break talks in the app. Thank you.