Modelling pollution from traffic, using Smartphone data and Python
Formal Metadata
Title 
Modelling pollution from traffic, using Smartphone data and Python

Title of Series  
Author 

License 
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
Identifiers 

Publisher 

Release Date 
2017

Language 
English

Content Metadata
Subject Area  
Abstract 
Modelling pollution from traffic, using Smartphone data and Python [EuroPython 2017 - Talk - 2017-07-14 - Anfiteatro 1] [Rimini, Italy] The talk presents results from my PhD project on models for transportation-related pollution. Pollution from personal transport in cities is a big and growing problem. By monitoring the flow and congestion in the transport system, two goals can be achieved. First, adherence to agreed limit values (or breaches of said limits) can be followed and used to decrease the health effects of local pollution hotspots. Secondly, monitoring the total emission of climate-forcing gases from transportation is important for designing climate mitigation actions. Python is used in combination with other tools to convert sensor data from smartphones into pollution concentrations in urban settings. To mitigate the lack of complete data coverage, the missing data is simulated by a traffic model, to locate congestion and model the traffic-related pollution concentration.

00:05
So, my name is [unclear], and I am from Denmark. I work for the University of Aarhus [unclear].
00:15
More specifically, I'm employed in the Aarhus School of Engineering. So, just a quick agenda: first I'll talk a little bit about why we think it's important to do this work, and then I'll go more into the traffic modelling part, which is also the bulk of it. It turns out that it's actually the hardest part of the pollution modelling. So why
00:47
do we think it's important to try to model traffic pollution? There are actually two reasons. The first one is, of course, pollution. In the bottom corner of the slide you can see the situation in a street where we have the pollution [unclear] and a wind coming across the street, and because of the high buildings in the street we get what we call a canyon effect. You can imagine that if you want a precise model of the street-level pollution, it's quite a challenging problem to model how much of the pollution is going to be concentrated at the side of the road. So one part of this is to model precisely the concentrations of the pollutants at street level, because of course it's not healthy to be in a very polluted street.

The other reason why we think it's a good idea to model the pollution from traffic is that we have an obligation to report our climate gas emissions. In Denmark, through the Kyoto Protocol, we have these obligations to be able to measure or model our climate gas emissions, and in Denmark it turns out that 24% of all climate gas emissions actually come from the transportation system. So it's important to know how much it is, of course, but it's also important to know where we have this pollution and how we can mitigate these emissions.

So one idea was to use smartphones as a way to register how people are moving in the transport system, because before, we didn't have very precise information on people's movement in the traffic system. We have point measurements with the tubes on the streets,
we have coils in the streets, which are also point measurements, and then we have surveys where we ask people how they move around and how they use the traffic system: which modes of transportation they are using and when they travel. That also has some problems with coverage, and with people not remembering correctly which routes they take, and stuff like that. So that's sparse. People have also tried to use the data from cell phone towers, where we can measure how many cell phones are in the neighbourhood of a given tower. If we can track the cell phones, the idea is that we can see, not very precisely, how these cell phones move from tower to tower. So that has also been a source of information about how people travel.

But now we have smartphones. They are with us all the time and they have several sensors, and maybe we could use the sensors to get a more precise idea of how people are moving around in the transport system. On the right-hand side of the slide there's a spectrogram from the accelerometer in a cell phone which is inside a car in idle, and what you can see is actually the vibrations from the engine. I don't know if you all know what a spectrogram is, but on the y-axis we have the frequencies and we have time on the x-axis. So this is a 10-minute measurement in the car in idle. In the beginning there are some ridges, which is the handling of the phone, and then the green means there are no vibrations, and then we get these red, almost horizontal lines, which are the frequency of the engine. It was a cold day: at the start the motor is cold and the rotation speed of the motor is high, and then when it gets warmer it slows to the
idle speed of the motor. So that's what you can see from these data from a smartphone. We wanted to see if we can use smartphones for getting more data, or more precise data. Of course, it turns out
06:55
that not all people use smartphones, and not all people using smartphones in the traffic system want to give us their data. So what we end up with is still not complete coverage of the transport system. So how can we fill in the blanks? That's what we can use traffic modelling for.

Traffic modelling is actually a science that goes back something like 60 years. The basic use of it is to try to model how much traffic there is on each road in the transport network, and we can use this knowledge of the traffic flows to predict where and when there will be congestion. It has been used for planning the effects of new roads, and, if we want to do road work, for working out how to do that with minimal inconvenience, and things like that. It has also been used before for pollution predictions, with quite some success. We can also use it if we want to see the effects of new residential areas or business areas: if we build new parts of a city, what will the effect be on the transportation network? But at the core, it is all about trying to find out how much traffic flows on each road in the
09:01
network. So let's go over some of the basics of traffic assignment. We have this idea of the transportation network, and the smallest part of the network is what we call a link. A link is the connection between two points, and the traffic that flows into a link has to flow out of that link; it can't disappear in between the two nodes. That's the definition of a link. We then look at routes, which travellers choose: a route is a combination of links which the traveller uses to navigate through the network. So the traffic network is actually a graph: we can take the individual streets, cut them up into links, and then we have the edges of a graph.

A route starts somewhere, in an origin, and it ends in a destination, and transportation modelling tries to find out which links are used to serve the traffic demand from that origin to that destination. Of course, we expect people to go, to some extent, by the shortest path through our network, so we're not going to have any loops in the routes: every route is a simple route. One part of the travel model is route choice, which tries to model how people are choosing routes in the network to serve their wish to get to a specific place. And then we talk about traffic assignment: how to assign traffic flows to the specific routes. You can imagine that if you have people coming from two different places and going to different destinations, they might share links at some point in the journey, and that means that the flow on an individual link is the sum of the flows on all the routes that use that link. So we need to model all the routes, and then we have to take care of all the links that are used by more than one route.
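The statement that a link's flow is the sum of the flows on all routes using it can be illustrated with a minimal Python sketch. The node names, the two routes and their flow values are hypothetical and are not taken from the talk.

```python
from collections import defaultdict

# Hypothetical toy data: each route is a sequence of nodes plus a flow
# (vehicles per hour). Node names and flow values are made up.
routes = [
    (["A", "C", "D", "B"], 300.0),  # origin A -> destination B
    (["E", "C", "D", "F"], 200.0),  # origin E -> destination F, shares link C->D
]


def link_flows(routes):
    """Sum route flows onto the directed links (node pairs) they use."""
    flows = defaultdict(float)
    for nodes, flow in routes:
        # Consecutive node pairs along the route are its links.
        for link in zip(nodes, nodes[1:]):
            flows[link] += flow
    return dict(flows)


flows = link_flows(routes)
for link, flow in sorted(flows.items()):
    print(link, flow)
```

The shared link C->D ends up carrying both routes' traffic, while every other link carries only the flow of its own route.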
So we try to model how travellers are using their free choice, and we try to find an equilibrium: we state some ideas on how we think people are choosing their routes, and given that they have their own free choice, we try to find the equilibrium of all the travellers in the network.

One of the challenging things about route choice is that there is a very large number of routes which can serve a demand from one origin to one destination. There is a science called discrete choice, but that field is really more about only a few choices. If you only have a handful of different choices, there are ways to model that quite efficiently. One example is what you are going to use for transport: are you going by bike, or by car, or by passenger train, or whatever? There are only a few different choices when you consider which mode of traffic you want to use. But here, in route choice, there is a very large number of routes to choose from.

So how many are there, actually? I'm going to go through a small artificial example just to show that this number is going to be very large. In the first figure we have four points, with A and B. How many ways are there to get from A to B? Exactly. So let's make it more difficult, with nine nodes. Can anyone count how many routes, simple routes without loops, go from A to B?
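The counting question can be explored by brute force. The sketch below assumes the figures show square grids of nodes with A and B in opposite corners and links only between horizontal and vertical neighbours; that layout is an assumption, since the slides are not part of the transcript, although it does reproduce the counts of 12 and 184 quoted in the talk for 9 and 16 nodes.

```python
def count_simple_routes(n):
    """Count loop-free routes between opposite corners of an n-by-n grid of nodes."""
    target = (n - 1, n - 1)
    visited = {(0, 0)}

    def walk(x, y):
        if (x, y) == target:
            return 1
        total = 0
        # Try the four neighbouring nodes, never revisiting a node (no loops).
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < n and 0 <= ny < n and (nx, ny) not in visited:
                visited.add((nx, ny))
                total += walk(nx, ny)
                visited.remove((nx, ny))
        return total

    return walk(0, 0)


for n in (2, 3, 4):
    print(n * n, "nodes:", count_simple_routes(n), "routes")
```

Exhaustive enumeration like this is only practical for very small grids, which is exactly the point the example makes: the number of candidate routes explodes long before the network reaches a realistic size.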
Actually, there are 12. So just going from four nodes to nine nodes, it goes from 2 routes to 12 routes. And if we go to the next one, where we have 16 nodes, we get 184 routes. Of course this is an artificial example, but you can see that even for N equals 7 you get this very large number, 575 million different routes, and that is for a small network.

Of course, maybe we don't have to consider all the routes, because we have Dijkstra and A* and those methods for finding shortest paths. But it's still not that easy. We could just ask A* to give us the shortest path and then be done, but we all know that when we have a large number of travellers on a road segment, the speed of the traffic goes down, which is what is called congestion. We experience it like this: as more and more people use the same route, our travel time increases. And of course, if you notice that in traffic, you would ask yourself if there's any alternative route with less travel time. So our model should be able to capture this, so that we get the traffic distributed over more routes and the links are not completely congested. We'll come back to that later. It would also be nice if we had a way to calculate travel times for congested links, so we need to have the travel time as a function of the amount of congestion, and then we should
17:16
figure out a way to find an equilibrium, so that we can distribute the flows in order to obtain this equilibrium. The classic way of doing that is through looking at econometrics. In econometrics you consider all the actors in a market, and you assign a utility to each choice. We are all considered to be selfish actors in this marketplace, and we want to maximise our utility by doing the best thing we can. Another way of looking at it is to say that, instead of maximising our utility, we could try to minimise our costs, and we consider the time that we use in travel as a cost. So we can put a number on how much we consider the cost of our travel to be. Of course there are fixed costs, but the interesting thing here is the variable cost: the travel time, mostly. The travel length is also an issue, because we have to account for the extra fuel that we have to spend to travel that length. But this is the basic model of every traveller in the network: we try to minimise our costs, that is, we try to minimise our travel time. I'm not going to go into this too much, but there is this idea of a value of time, and we can measure that, actually.

So let's talk about congestion. What happens when more and more travellers are using the same link? Well, the travel time goes up, and this is what the figure here on the right-hand side shows. To the left in the figure we have no cars, and there we have what we call the free-flow travel time. Then, as more and more cars come onto the link, the travel time increases. We use this formula here, which is actually a formula defined by the Bureau of Public Roads in America, but it turns out that it works pretty well in a lot of cases.
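The formula itself is on the slide and does not appear in the transcript. The usual textbook form of the Bureau of Public Roads (BPR) function is t = t0 * (1 + alpha * (x / c) ** beta), with t0 the free-flow time, x the number of cars on the link and c its capacity. A minimal sketch follows; beta = 4 matches the value mentioned in the talk, alpha = 0.15 is the conventional default and is an assumption here, and the example link is hypothetical.

```python
def bpr_travel_time(t0, x, c, alpha=0.15, beta=4.0):
    """BPR link travel time: t0 * (1 + alpha * (x / c) ** beta).

    alpha=0.15 is the conventional default (an assumption, not from the talk).
    """
    return t0 * (1.0 + alpha * (x / c) ** beta)


# Hypothetical link: 10 minutes free-flow time, capacity 1000 cars per hour.
for x in (0, 500, 1000, 1500, 2000):
    print(x, round(bpr_travel_time(10.0, x, 1000.0), 2))
```

Below capacity the travel time barely moves, while at twice the capacity it more than triples, which is the steep rise the figure is meant to show.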
So we have t0, which is the free-flow time, then x is the number of cars on the link, and c is the capacity of the link. We see that when we reach capacity, so x equals c, we add an extra alpha to our travel time. But beta is normally a fairly large number, it's four most of the time, so the increase is going to be quite dramatic when we cross the capacity of the road.

The figure shows how we can find an equilibrium in this case. We have two links connecting the two points, and we have a fixed number of cars wanting to go from A to B. At the crossing point, we see that the travel time on each of the links is the same, and that means it doesn't make sense to change your route, because everyone has the same travel time. That's the equilibrium we are going to look for: we want to find the assignment of traffic to each link in such a way that every traveller has the same travel time. This is what we call the deterministic user equilibrium. If everyone has the same travel time, it doesn't make sense to try to change your route. OK, simple enough, except that you have to go through all the links and make sure that it's true for everyone.

But is it really a realistic condition for this equilibrium? In order to make sure that we all have the same travel time, you kind of expect everyone to know everything about the transport system: you know where the shortest path is, and you know where everyone else is going to drive at this point in time. So the traveller needs to have perfect knowledge about the transport system. Maybe that's not completely realistic. So maybe we could do something else: we could try to randomise a bit. Instead of demanding full,
perfect knowledge about the traffic system, we can apply a model where all travellers think that they have the lowest travel time. This introduces randomness into the equation, and we can use this to statistically model the traffic assignment. So instead of only considering the value of time and the travel time, we add a small random variable, epsilon, which is going to model this randomness, this uncertainty about what the travellers are going to do.

So now we have two different equilibrium conditions, and there are, of course, some problems with that. In the stochastic user equilibrium, we need to consider every route in the transportation system, and we have seen that there can be a very large number of them, so it takes a long time to solve the stochastic user equilibrium problem. But on the other hand, the deterministic user equilibrium is a bit unrealistic. So what can we do? Well, recently there has been research on combining the deterministic user equilibrium and the stochastic user equilibrium, so that we can reduce the choice sets that we consider in the stochastic user equilibrium problem in such a way that we don't need to consider all routes in every iteration of the model. In that way we can save a lot of computation. I'm not going into that
25:28
too much here. But I
25:33
implemented that model and tried it for a very large city; actually, it's number 35 on the list of the largest cities in the world. There are 300,000 links, which are converted into one-way links, because it's easier to do the modelling with one-way links instead of considering two-way links. We're not going into that. Then I made some synthetic data to see if I could make it work on this large example. What you see in the figure, in green, is all the links in Istanbul. I don't know if you know Istanbul, but the white part in the middle is the Bosphorus Strait, and you can see that there are two bridges. They're working on a third one, but I don't think it's completed yet. The red part in the figure is all the used roads, and you can clearly see all the highways which span the city. So that's the approach.
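To make the deterministic user equilibrium from earlier in the talk concrete, here is a toy sketch for the two-link case: a fixed demand is split over two parallel links until both have the same BPR travel time. It is not the speaker's implementation, which handles a full network; the bisection search, alpha = 0.15 and the example links are assumptions for illustration.

```python
def bpr(t0, x, c, alpha=0.15, beta=4.0):
    """BPR travel time; alpha=0.15 is the conventional default (assumed)."""
    return t0 * (1.0 + alpha * (x / c) ** beta)


def two_link_equilibrium(demand, link1, link2, tol=1e-9):
    """Split `demand` over two parallel links so both travel times are equal.

    Each link is a (free_flow_time, capacity) pair. Returns (x1, x2).
    """
    def gap(x1):
        # Travel time on link 1 minus travel time on link 2; increasing in x1.
        return bpr(link1[0], x1, link1[1]) - bpr(link2[0], demand - x1, link2[1])

    if gap(0.0) >= 0.0:      # link 1 is slower even when empty: nobody uses it
        return 0.0, demand
    if gap(demand) <= 0.0:   # link 1 is faster even when it carries everything
        return demand, 0.0
    lo, hi = 0.0, demand
    while hi - lo > tol:     # bisection on the monotone gap function
        mid = (lo + hi) / 2.0
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x1 = (lo + hi) / 2.0
    return x1, demand - x1


# Hypothetical example: a fast, wide link and a slower, narrower one.
x1, x2 = two_link_equilibrium(3000.0, (10.0, 2000.0), (15.0, 1000.0))
print(round(x1), round(x2), round(bpr(10.0, x1, 2000.0), 2))
```

At the returned split no traveller can save time by switching links, which is the crossing point of the two curves in the two-link figure. On a real network the same condition has to hold for every origin-destination pair at once, which is why an iterative assignment algorithm is needed there.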
27:11
I could talk a lot about the implementation, but the next step in this work is to use the smartphone data, the location data that we collect from smartphones, to create these origin-destination matrices. This is how we spell out the traffic demand in the system, and that is really one of the points where we don't have very good data. So if we could get the traffic demand from the location data from the smartphones, it would be great. The figure shows the origins and destinations for a local experiment we did in Denmark. It seems promising that you can use these data: even though we only have a small percentage of the total number of travellers, it seems that we can get this origin-destination matrix going. So, thank you.
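The origin-destination matrix mentioned above can be sketched as a simple count of trips per zone pair. The zone names, the trip list and the idea of scaling by a known sample share are all hypothetical; the talk does not describe how the location data is mapped to zones or scaled up.

```python
from collections import Counter

# Hypothetical trips: (origin_zone, destination_zone) per recorded journey.
# In practice these would come from mapping start/end locations to zones.
trips = [("north", "centre"), ("north", "centre"), ("centre", "south"),
         ("south", "centre"), ("north", "south")]


def od_matrix(trips, zones, sample_share=1.0):
    """Count trips per origin-destination pair and scale up by the sample share.

    sample_share is the assumed fraction of all travellers who share data.
    """
    counts = Counter(trips)
    return [[counts[(o, d)] / sample_share for d in zones] for o in zones]


zones = ["north", "centre", "south"]
matrix = od_matrix(trips, zones, sample_share=0.05)
for origin, row in zip(zones, matrix):
    print(origin, [round(v) for v in row])
```

Rows are origins and columns are destinations. Dividing by the sample share is the crudest possible expansion and assumes that the people sharing data travel like everyone else.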
28:30
So, we have time for a couple of questions.

Thanks for the talk. Can you tell us which kind of data processing modules you used for [unclear]?

So the traffic model is implemented in Python and PostgreSQL. I got the data from the OpenStreetMap project, so it lends itself to a database implementation. Instead of pulling the data for the road network edges out, I keep all the data in the PostgreSQL database, and all the processing is driven by Python. So the heavy lifting is done by Postgres.

All right, one more question over here.

Thank you for the talk. If I understood correctly, you didn't consider any time-varying aspects, like dynamically adapting to congestion, lanes being closed due to accidents, things like that. Would it be possible to investigate that? I was also curious whether you have access to other datasets, like, say, Uber's, to test the algorithms.

As far as I can understand, the question is about time aspects, and whether I had access to Uber data. Could you say a bit more about the first part?

I think you're just assuming that a fixed number of people are trying to travel on the network. But if people know the congestion is really bad, they might either delay the trip, or take a very indirect route, or switch to an alternative mode of transport like the train, or time-shift it, or just cancel the trip. It depends on how bad the congestion is and what the cause is, for instance if the road is completely closed. Would it be possible to investigate that algorithmically?

Well, I think I understood your question. I think you're right that people know more about the road situation now than they did before, because they have access to Google Maps and Waze and stuff like that. And intelligent travel systems are also more available, at least in larger
cities. So maybe travellers are becoming more like in the deterministic user equilibrium, in that they actually know exactly what's going on in the transportation system. But at the same time it seems that, even though we have access to these systems, in our daily commute we tend to only use the routes that we are used to. So maybe we keep doing the things that we always do, out of habit. With respect to timing information, what we consider the most interesting is, of course, when there is congestion, so we consider only the times of day where we have the most traffic; that makes the most sense. And the data that we collect has timing information, so we can see how people are moving through the network when there is congestion.

OK. Well, let's thank the speaker again. Thank you.