Modelling pollution from traffic, using Smartphone data and Python
Formal Metadata
Number of Parts | 160
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/33674 (DOI)
EuroPython 2017, 148 / 160
Transcript: English (auto-generated)
00:04
So, my name is Anders Lehmann and I'm from Denmark. I work for the University of Aarhus in, yeah, well, Aarhus. More specifically, I'm employed in the Aarhus School of Engineering.
00:21
So, just a quick agenda. So, first I'll talk a little bit about why we think it's important to do this work and then go more into the traffic modeling part, which is the bulk of... It turns out that it's actually the hardest part of the pollution modeling stuff.
00:46
So, why is it a good... Why do we think it's important to try to model traffic pollution? Well, there's actually two reasons for it. The first one is, of course, the pollution.
01:03
And in the bottom corner of the slide, you can see a special situation in a street where we have pollution in the street and we have a wind coming across the street.
01:22
And because of the high buildings in the street, we get what we call a canyon effect. So, you can imagine that if you want to have a precise model of the street-level pollution, this is quite a challenging problem to actually model how much of the pollution
01:41
is going to be concentrated on the lee side of the road. So, one part of this is to be able to model precisely how high the concentrations of the pollutants are at street level,
02:02
because, of course, it's not healthy to be in a very polluted street. So, the other reason why we think it's a good idea to model pollution from traffic
02:21
is because we have an obligation to report our climate gas emissions in Denmark. Well, through the Kyoto Protocol, we all have these obligations to be able to measure or model our climate gas emissions.
02:44
In Denmark, it turns out that 24% of all climate gas emissions actually come from the transportation system. So, it's important to know how much it is, of course, but it's also important to know where we have this pollution
03:03
and how we can mitigate or how we can adapt to these pollutant emissions. So, one idea was to use smartphones as a way of registering how people are moving in the transport system,
03:23
because before we tried to use smartphones, we really didn't have very precise information on people's movement in the traffic system, because we had point measurements with tubes on the streets,
03:44
we have coils in the streets, also point measurements, and then we have surveys where we ask people how they move around and how they use the transportation system, so which modes of transportation they are using and when they move and when they travel,
04:01
which also has some problems with coverage and with people not remembering correctly which routes they take and stuff like that. Also, people have tried to use data from cell phone towers
04:22
where we can measure how many cell phones are in the neighborhood of a given tower. And if we can track the cell phone IDs, we can see, not very precisely, how these cell phones move from cell tower to cell tower.
04:44
So, that's also been a source of information about how people travel. But now we have smartphones and they are with people all the time and they have several sensors and maybe we could use these sensors
05:03
to get a more precise idea of how people are moving around in the transport system. In the upper right-hand corner of the slide there is a spectrogram of the accelerometer in a cell phone,
05:23
in a smartphone, which is inside a car in idle. And what you can see here is actually the vibrations from the engine. So, in the start there is, I don't know if you all know what a spectrogram is,
05:40
but on the y-axis we have frequencies and we have time on the x-axis. So, this is a 10-minute measurement of a car in idle. So, in the beginning there are some red bars, which are the handling of the phone
06:03
and then it's green, which means there are no vibrations. And then we get these red, almost horizontal lines, which are the frequencies of the idling car. So, it was on a cold day, so in the start the motor is cold
06:24
and then the rotation speed of the motor is high and then when it gets warmer it slows the idle speed of the motor down. So, that's what you can see from these data from a smartphone. So, we wanted to see if we can use these smartphones for getting more data.
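As a rough illustration of how an engine vibration shows up as a frequency line, the sketch below finds the strongest frequency in an accelerometer signal with a naive discrete Fourier transform. The signal is synthetic and every number in it (100 Hz sampling, a 15 Hz vibration, the amplitudes) is an assumption made for the example, not data from the talk. A spectrogram like the one on the slide would repeat this kind of analysis over short sliding windows in time.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-constant component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # drop the constant gravity offset
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):  # skip bin 0, stay below the Nyquist frequency
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


# Synthetic stand-in for a phone lying in an idling car (all values assumed).
SAMPLE_RATE_HZ = 100
ENGINE_HZ = 15.0
signal = [
    9.81 + 0.2 * math.sin(2 * math.pi * ENGINE_HZ * i / SAMPLE_RATE_HZ)
    for i in range(200)
]
```

Calling `dominant_frequency(signal, SAMPLE_RATE_HZ)` recovers the 15 Hz line that was put into the synthetic signal.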
06:48
So, more precise data. Of course, it turns out that not all people use smartphones and not all people using smartphones in traffic systems want to give us the data.
07:05
So, we end up with incomplete coverage of our transport system. So, how can we fill in the blanks?
07:20
And that's what we can use traffic modeling for. So, traffic modeling is actually a science that goes back like 60 years or something like that. And it has been used to...
07:42
So, the basic use of it is to try to model how much traffic there is on each road in a transport network. And we can use this knowledge of the traffic flows to predict where and when there will be congestion.
08:03
And it has been used for planning the effects of new roads. If we want to make a large road work, how could we do that with a minimal inconvenience and something like that. It has also been used before for pollution predictions.
08:26
And, well, with quite some success, I must say. And we can also use it if we are planning or want to see the effects of new residential areas or business areas.
08:43
So, if we build new parts of the city, what will the effect be on the transportation network? But the core of it is to try to find out how big are the traffic flows on each road in the network.
09:02
So, let's go to some of the basics in traffic assignment. So, we have this idea of a transportation network. And the smallest part of the transportation network we call a link.
09:20
So, a link is the connection between two points. And the traffic that flows into a link has to flow out of the link. It can't disappear in between two nodes. So, that's the definition of a link. And we then look at routes for travelers.
09:42
And a route is then a combination of links which the travelers use to navigate through the network. So, a transport network is actually a graph. And we can use the individual streets, cut them up in links.
10:04
And then we have the edges of our graph. So, a route starts somewhere at an origin and it ends at a destination. And the transportation modeling tries to find out which links are used to serve the travel demand from that origin to that destination.
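The link and route definitions above can be sketched in a few lines of Python. The `Link` type, the toy node names and the helper `is_simple_route` are all hypothetical names chosen for this illustration; the talk does not show its own data structures.

```python
from collections import namedtuple

# A link is a one-way connection between two nodes of the network.
Link = namedtuple("Link", ["start", "end"])

# Hypothetical toy network with four nodes.
links = [Link("A", "B"), Link("B", "C"), Link("A", "C"), Link("C", "D")]


def is_simple_route(route):
    """Check that consecutive links connect and that no node is visited twice."""
    if not route:
        return False
    for current, following in zip(route, route[1:]):
        if current.end != following.start:
            return False  # the chain of links is broken
    nodes = [route[0].start] + [link.end for link in route]
    return len(nodes) == len(set(nodes))  # a repeated node would be a loop
```

For example, `is_simple_route([Link("A", "B"), Link("B", "C"), Link("C", "D")])` is true, while a route that returns to a node it has already visited is rejected.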
10:32
So, of course we don't expect people to go the shortest path through our network.
10:46
So, we don't want to have any loops in our network. So, every route is a simple route. So, a part of the travel model is to look at...
11:12
We try to model how people are choosing the route through the network to serve their wishes to come to a specific place.
11:23
So, when we talk about route choice, we also talk about how to assign traffic flows to a specific route. So, you can imagine that if you have people coming from different places, going to different destinations,
11:46
they might share links at some point in their journey. And that means that the flow on an individual link is the sum of all the routes that use that link.
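The statement that a link flow is the sum over all routes using that link can be written down directly. This is a minimal sketch; the route flows below are made-up numbers and the tuple representation of links is an assumption for the example.

```python
from collections import defaultdict


def link_flows(route_flows):
    """Sum route flows onto links: each route adds its flow to every link it uses."""
    flows = defaultdict(float)
    for route, flow in route_flows:
        for link in route:
            flows[link] += flow
    return dict(flows)


# Two hypothetical routes that share the link ("C", "D").
route_flows = [
    ((("A", "C"), ("C", "D")), 300.0),
    ((("B", "C"), ("C", "D")), 200.0),
]
```

Here `link_flows(route_flows)` gives 500 vehicles on the shared link and 300 and 200 on the two feeder links.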
12:04
So, we need to model all the routes and then we have to take care of all the links that are used by more than one route. And we try to model how our travelers are using their free choice.
12:24
And we will try to find an equilibrium. We will state some ideas on how we think that people are choosing routes. And given their way of choosing, we will try to find the equilibrium of all travelers in the network.
12:43
So, one of the challenging things about route choice is that there is a very large number of routes which can serve a demand from one origin to one destination.
13:04
There is another science which is called discrete choice. But that field is really more about only a few choices. So, if you only have like a handful of different choices, then there are ways to model that quite efficiently.
13:24
So, one example is what you are going to use for transport. Are you going to bike or go by car or go by bus or train or whatever. So, there are only a few different choices when you consider which transportation mode you want to use. But here in the route choice, there is a very large number of routes to choose from.
13:48
So, how many are there actually? I am going to go through a small artificial example just to show that this number is going to be very large.
14:00
So, how many routes are there in the first figure? We have four nodes, including A and B. So, how many ways are there to come from A to B? Exactly, two. So, let's make it more difficult with nine nodes. Can anyone count how many routes there are, simple routes without loops?
14:29
How many? Yes, actually there are 12. So, there are 12 in all. So, just by going from four nodes to nine nodes, you go from two routes to 12 routes.
14:49
And actually if you go to the next one where we have 16 nodes, we are going to get 184 routes. And you can see, of course it is an artificial example, but you can see that even for n equals 7,
15:13
you get this very large number, 575 million different routes.
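The counts quoted above can be checked by brute force. The sketch below assumes that the figures are square grids of n by n nodes and that A and B sit at opposite corners, which is consistent with the numbers 2, 12 and 184 but is not stated explicitly in the talk. It enumerates every loop-free route with a depth-first search.

```python
def count_simple_routes(n):
    """Count loop-free routes between opposite corners of an n-by-n grid of nodes."""
    target = (n - 1, n - 1)
    visited = {(0, 0)}

    def walk(node):
        if node == target:
            return 1
        total = 0
        row, col = node
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            inside = 0 <= nxt[0] < n and 0 <= nxt[1] < n
            if inside and nxt not in visited:
                visited.add(nxt)       # mark the node so the route cannot loop
                total += walk(nxt)
                visited.remove(nxt)    # backtrack and try the next direction
        return total

    return walk((0, 0))
```

Running it for n = 2, 3 and 4 reproduces 2, 12 and 184. The brute-force search becomes slow well before n = 7, which is itself a small demonstration of why enumerating all routes in a real network is not an option.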
15:22
And that is for a very small network. So, of course we don't have to consider all the routes maybe, because we have Dijkstra and A*, algorithms for finding shortest paths, but it is still not that easy.
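For reference, a compact version of Dijkstra's algorithm using Python's `heapq` module is sketched below. The toy graph and its travel times are invented for the example, and link travel times are treated as fixed, which is exactly the simplification that congestion breaks.

```python
import heapq


def shortest_path(graph, origin, destination):
    """Dijkstra's algorithm. graph maps node -> [(neighbor, travel_time), ...].

    Returns (total_time, list_of_nodes), or (inf, []) if no route exists.
    """
    queue = [(0.0, origin, [origin])]
    settled = set()
    while queue:
        time, node, path = heapq.heappop(queue)
        if node == destination:
            return time, path
        if node in settled:
            continue  # a faster way to this node was already processed
        settled.add(node)
        for neighbor, link_time in graph.get(node, []):
            if neighbor not in settled:
                heapq.heappush(queue, (time + link_time, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical network with fixed travel times in minutes.
graph = {
    "A": [("B", 4.0), ("C", 2.0)],
    "C": [("B", 1.0), ("D", 7.0)],
    "B": [("D", 3.0)],
}
```

On this toy graph the fastest route from A to D goes A, C, B, D with a total time of 6 minutes.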
15:45
So, could we just ask Dijkstra to give us the shortest path and then we are done? Not quite, because, well, you know of course that if we have a large number of travelers on the road,
16:01
then the speed of the road, the travel speed goes down, what is called congestion. So, we experience that as more and more people use the same road, our travel time increases.
16:22
So, of course if you notice that in the traffic you will ask yourself if there is an alternative route with a shorter travel time. So, our model here should be able to model this so that we get the demand distributed over more routes,
16:47
so that links are not completely congested. So, we will come back to that later. So, of course it would also be nice if we had a way to calculate the travel times for congested links.
17:05
So, we need to have a function of the travel time as a function of the amount of congestion. And then we should figure out a way to formulate an equilibrium so that we could distribute the flows in order to obtain this equilibrium.
17:26
So, the classic way of doing that is through looking at econometrics. So, in econometrics you consider all the actors in a market.
17:43
You assign a utility to... So, we are considered to be selfish actors in this market place and we want to maximize our utility by doing the best thing we can.
18:03
So, another way of looking at the utility is that we could maximize our utility but we could also try to minimize our costs. And we consider the time that we use in travel as a cost so we can put a special number on how much we consider the cost of our travel.
18:31
Of course there are fixed costs and variable costs. But the interesting thing here is the variable cost, the travel time mostly.
18:44
Of course the travel length is also an issue because we have to account for the extra fuel that we have to spend to travel that extra length. But this is the basic model of every traveler in the network.
19:00
We try to minimize our costs, that is, try to minimize our travel time. So, I'm not going to go into this too much. There is this idea of a value of time and we can measure that actually.
19:20
Let's talk about congestion instead. So, what happens when more and more travelers are using the same link? Well, the travel time goes up and this is what the figure here in the right-hand side shows. So, to the left we have no cars and we have what we call a free-flow travel time.
19:50
And then as more and more cars come on the link, the travel time increases. And we use this formula here, which is actually a very old formula
20:03
defined by the Bureau of Public Roads in America. But it turns out that it actually works pretty well in a lot of cases. So, we have t0, which is the free-flow time, and then x is the number of cars on the link
20:22
and then c is the capacity of the link. So, we see that when we reach capacity, so x equals c, we add an extra alpha times the free-flow time to our travel time. But the beta is normally a fairly large number.
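The Bureau of Public Roads (BPR) function described here is usually written as t = t0 * (1 + alpha * (x / c) ** beta). A minimal sketch follows. The default alpha = 0.15 is the value conventionally quoted for this function and is an assumption here, since the talk does not give it; beta = 4 is the value the speaker mentions.

```python
def bpr_travel_time(flow, free_flow_time, capacity, alpha=0.15, beta=4.0):
    """BPR link travel time: t = t0 * (1 + alpha * (x / c) ** beta).

    flow (x) and capacity (c) share one unit, e.g. vehicles per hour;
    the result has the unit of free_flow_time (t0).
    """
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)
```

With a free-flow time of 10 minutes and a capacity of 1000 vehicles per hour, an empty link takes 10 minutes, a link at capacity takes 11.5 minutes, and a link loaded to twice its capacity takes 34 minutes, which shows how sharply the time grows once capacity is exceeded.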
20:43
It's four most of the time. So, the increase is going to be quite dramatic when we cross over the capacity of the road. And the figure shows how we can find an equilibrium in this case
21:02
because we have two links connecting the two points. And we have a fixed number of cars wanting to go from A to B and we see that if we, at the crossing point,
21:23
we see that the travel time on each of the links is the same and that means that it doesn't make sense to change your route because everyone has the same travel time. So, that's the equilibrium that we are going to look for.
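The crossing point in the two-link figure can be found numerically. The sketch below uses bisection on the difference between the two BPR travel times; the link parameters and the demand are invented numbers, alpha = 0.15 is an assumed default, and bisection is simply one convenient way to locate the crossing, not necessarily what the speaker's implementation does.

```python
def bpr_time(flow, free_flow_time, capacity, alpha=0.15, beta=4.0):
    """BPR travel time for one link."""
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)


def two_link_equilibrium(demand, link_a, link_b, tolerance=1e-6):
    """Return the flow on link_a at which both parallel links take equally long.

    Each link is a (free_flow_time, capacity) tuple.
    """
    def gap(flow_a):
        return bpr_time(flow_a, *link_a) - bpr_time(demand - flow_a, *link_b)

    if gap(demand) <= 0:
        return demand  # link_a is faster even when it carries everyone
    if gap(0.0) >= 0:
        return 0.0     # link_b is faster even when it carries everyone
    low, high = 0.0, demand
    while high - low > tolerance:
        mid = (low + high) / 2
        if gap(mid) < 0:
            low = mid   # link_a still faster: shift more traffic onto it
        else:
            high = mid
    return (low + high) / 2


# Hypothetical example: 3000 vehicles choosing between two parallel links.
DEMAND = 3000.0
LINK_A = (10.0, 1000.0)   # short but narrow
LINK_B = (15.0, 2000.0)   # longer but wider
flow_a = two_link_equilibrium(DEMAND, LINK_A, LINK_B)
flow_b = DEMAND - flow_a
```

At the returned split, `bpr_time(flow_a, *LINK_A)` and `bpr_time(flow_b, *LINK_B)` agree to within the tolerance, so no traveler can gain by switching links.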
21:41
We want to find the assignment of traffic to each link in such a way that every traveler has the same travel time. This is what we call the deterministic user equilibrium. So, if everyone has the same travel time,
22:01
it doesn't make sense to try to change your route. Simple enough, except that you have to go through all the links and make sure that it's true for everyone. So, but is it really a realistic condition for this equilibrium?
22:29
Because in order to make sure that we all have the same travel time, you expect everyone to know everything about the travel system.
22:40
So, you know where the shortest path is and you know where everyone else is going to drive at this point in time. So, it means that the travelers need to have perfect knowledge about the transport system. So, maybe it's not completely realistic.
23:01
So, maybe we could do something else or we could try to randomize it a bit. So, instead of demanding full or perfect knowledge about the transport system, we can model it as if all travelers think that they have the lowest travel time.
23:24
So, this "think" introduces a randomness into the equation and we can use this to statistically model the traffic assignment.
23:45
So, instead of only considering the value of time and the travel time, we add a random variable, the epsilon here, which is going to model this randomness or this uncertainty
24:05
on how we are going to do. So, now we have two different equilibrium conditions. So, there are of course some problems with that.
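One common way to turn a random cost term into route shares is the multinomial logit model, which results when the random terms are assumed to be independent and Gumbel distributed. The talk does not say which distribution is used, so the logit form and the scale parameter `theta` below are assumptions made for illustration.

```python
import math


def logit_route_shares(route_costs, theta=1.0):
    """Share of travelers on each route under a multinomial logit model.

    route_costs: perceived-cost means, e.g. travel times in minutes.
    theta: assumed scale parameter; larger values mean less randomness,
           so travelers concentrate more on the cheapest route.
    """
    cheapest = min(route_costs)
    # Subtracting the cheapest cost keeps the exponentials numerically safe.
    weights = [math.exp(-theta * (cost - cheapest)) for cost in route_costs]
    total = sum(weights)
    return [weight / total for weight in weights]
```

Unlike the deterministic rule, every route receives some traffic: with costs of 10, 11 and 15 minutes the cheapest route gets the largest share, but the slower routes are still used by travelers who believe they are faster.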
24:21
So, in the stochastic user equilibrium, we need to consider every route in the transportation system. And we have seen that there can be a very large number of those. So, it actually takes a long time to compute the stochastic user equilibrium.
24:44
But on the other hand, the deterministic user equilibrium is a bit unrealistic. So, what can we do? Well, recently there has been research into combining the deterministic user equilibrium and the stochastic user equilibrium.
25:05
So, we can reduce the choice set that we consider in the stochastic user equilibrium in such a way that we don't need to consider all routes in every iteration of the model,
25:21
and in that way we can save a lot of computations. I'm not going into that too much there. So, but I implemented that model and tried it for a large city, a very large city actually.
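The talk does not spell out the rule used to reduce the choice set. Purely as an illustration of the idea, the sketch below keeps only those routes whose cost is within a fixed factor of the cheapest one. The function name, the threshold rule and the factor 1.2 are all hypothetical and should not be read as the speaker's method.

```python
def restrict_choice_set(route_costs, tolerance=1.2):
    """Keep routes whose cost is at most `tolerance` times the cheapest cost.

    route_costs: mapping from a route identifier to its current cost.
    tolerance: hypothetical cut-off factor, chosen only for this example.
    """
    cheapest = min(route_costs.values())
    return {
        route: cost
        for route, cost in route_costs.items()
        if cost <= tolerance * cheapest
    }
```

Routes outside the restricted set get no flow in that iteration, so the random-choice step only has to evaluate a handful of candidates per origin and destination instead of every possible route.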
25:41
It's number 35 on the list of large cities in the world. So, there are 300,000 links which I convert into one-way links so that it's easier when you do the modeling
26:07
to only have one-way links instead of considering which way we are going through that link. And then I made some synthetic data
26:21
to see if I could make it work on this large example. And what you see in the figure is that all the green stuff is all the links in Istanbul. And I don't know if you know Istanbul, but the white part in the middle is the Bosphorus Strait,
26:46
and you can see that there are two bridges. They are working on a third one, but I think it's not completed yet. And the red part in the figure is all the used roads,
27:01
and you can clearly see all the highways which span the city. I could talk a lot about the implementation, but the next step in this work is to use the smartphone data,
27:23
the location data that we collect from the smartphones to create these origin-destination matrices. This is how we spell out the traffic demand in the system. And that is really one of the points where we don't have very good data,
27:43
so if we could get traffic demand from the location data from the smartphones, it would be great. And the figure shows the origin and destinations for a local experiment we did in Denmark.
28:04
So it seems promising that you can use these data, even though we only have a small percentage of the total number of travelers, it seems that we can get this origin-destination matrix going.
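A rough sketch of how trips from location data could be turned into an origin-destination matrix is given below. It assumes each trip has already been reduced to a start and an end coordinate, and it uses a simple latitude/longitude grid as zones. The zoning scheme, the cell size and the coordinates are all invented for the example.

```python
import math
from collections import Counter


def zone_of(lat, lon, cell_deg=0.01):
    """Map a coordinate to a grid cell; the grid zoning is a hypothetical choice."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def od_matrix(trips, cell_deg=0.01):
    """Count trips per (origin zone, destination zone) pair.

    trips: iterable of ((origin_lat, origin_lon), (dest_lat, dest_lon)).
    """
    counts = Counter()
    for (o_lat, o_lon), (d_lat, d_lon) in trips:
        pair = (zone_of(o_lat, o_lon, cell_deg), zone_of(d_lat, d_lon, cell_deg))
        counts[pair] += 1
    return counts


# Made-up trips; the first two start and end in the same pair of zones.
trips = [
    ((56.1534, 10.2031), (56.1712, 10.1885)),
    ((56.1537, 10.2038), (56.1718, 10.1889)),
    ((56.1623, 10.2144), (56.1534, 10.2031)),
]
```

Because only a small share of travelers contribute data, counts like these would still have to be scaled up to the whole population before they can serve as demand in the traffic model.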
28:24
So, thank you. Questions? Alright, so we have time for a couple of questions.
28:54
Thanks for the talk. Can you tell us which kind of data processing module you have used for finding equilibrium?
29:03
So, the traffic model is implemented in Postgres. So, I got the data from the OpenStreetMap project,
29:22
and so it lends itself to a database implementation. And then instead of pulling out data from the road network, I just keep all the data in the Postgres database and do all the processing driven by Python scripts.
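The pattern of keeping the network in the database and letting Python only drive the SQL can be sketched as follows. The speaker's implementation runs on Postgres and its schema is not shown, so this example substitutes the standard library's `sqlite3` as a runnable stand-in, and the table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres database of the talk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE route_links (route_id INTEGER, link_id INTEGER);
    CREATE TABLE route_flows (route_id INTEGER PRIMARY KEY, flow REAL);
    INSERT INTO route_links VALUES (1, 10), (1, 30), (2, 20), (2, 30);
    INSERT INTO route_flows VALUES (1, 300.0), (2, 200.0);
    """
)

# The aggregation runs inside the database; Python only sends the query
# and receives one row per link.
rows = conn.execute(
    """
    SELECT rl.link_id, SUM(rf.flow)
    FROM route_links AS rl
    JOIN route_flows AS rf ON rf.route_id = rl.route_id
    GROUP BY rl.link_id
    ORDER BY rl.link_id
    """
).fetchall()
link_flows = dict(rows)
```

The design choice is that the large tables never leave the database: the script sends a statement per modeling step and reads back only small results.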
29:44
So, the heavy lifting is Postgres. Alright, one more question over there.
30:04
Thanks for your talk. If I understood rightly, you didn't consider any time-varying aspects. So, like dynamically adapting to congestion, lanes being closed, accidents, things like that. Was it possible to investigate that? And I was also curious if you had any access to other data sets,
30:22
like say Uber, to sort of test algorithms on that. So, as far as I can understand, the question is about time aspects. And if I had access to Uber data or more…
30:41
Yeah, but just to the first part, I think you're just assuming that a fixed number of people are trying to travel on the network. But if people know that congestion is really bad, they might either delay their trip or take the very indirect route or switch to an alternative mode of transport like train, or time shifted or just cancel their trip.
31:00
So, it depends on how bad the congestion is and what the cause is. If the road is completely closed, was it possible to algorithmically investigate any of those? Well, I think that you, as I've understood your question, I think you are right that people know more about the road situation now than they did before,
31:24
because they have access to Google Maps and Waze and stuff like that. And intelligent transport systems are also more available, at least in larger cities. So, maybe our travelers are becoming more like those in the deterministic user equilibrium,
31:46
that they actually know exactly what's going on in the transportation system. But, at the same time it seems that even though we have access to these systems, in our daily commute we tend to only use the roads that we are used to.
32:06
So, maybe when we are doing things that we always do, we are only using the habits that we have been used to.
32:27
But, with respect to timing information, the most interesting situation is, of course, when there is congestion. So, we consider only the times of day where we have the most traffic.
32:44
In fact, that makes the most sense. And the data that we collect has this timing information, so we can see how people are moving through when the congestion is there.
33:07
Okay? Okay, well, let's thank the speaker again. Thank you. Thank you.