An environmental modelling and information service for health analytics
Formal Metadata
Title 
An environmental modelling and information service for health analytics

Title of Series  
Part Number 
32

Number of Parts 
193

Author 

License 
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
Identifiers 

Publisher 

Release Date 
2016

Language 
English

Production Place 
Bonn

Content Metadata
Subject Area  
Abstract 
Enriching patient information with environmental information, such as individual exposure to air pollution or noise, is a relevant procedure in health care and research. Generating this exposure information, however, is computationally intensive, because large datasets with fine spatial and temporal discretisation have to be processed; this usually exceeds the hardware resources available to, for example, family doctors. In an interdisciplinary research project (Healthy Urban Living) we develop an environmental modelling and information service (EMIS) consisting of three parts: (1) a set of environmental models calculating exposure (e.g. NO2, PM10) at the national scale of the Netherlands; (2) a set of algorithms to calculate the exposure of individuals along their space-time paths; (3) a set of microservices to maintain a flexible workflow for generating and executing queries. The microservices architecture enables us to perform the computationally intensive modelling tasks on institutional or national computing facilities, and allows lightweight client applications such as web portals to query the EMIS and thereby give health researchers straightforward access to exposure data. The presentation gives a general overview of the research project and the EMIS system architecture, and outlines how open-source software tools (e.g. GDAL, PCRaster, flask, docopt, sqlalchemy and more) are used to process the spatio-temporal datasets. We additionally demonstrate use cases from the health researcher's perspective.

Keywords  Utrecht University The Netherlands 
00:00
OK, continuing with
00:18
our next session. I would like to introduce [speaker name unclear] from Utrecht University, who is
00:25
going to tell us something about
00:27
an environmental modelling and information service for health analytics. [unclear] Yes, thanks. I will talk about the development of an environmental
00:41
modelling and information service. It is developed in the context of a project called Healthy Urban Living, a research project at Utrecht University that covers all kinds of research topics related to health. Nowadays we are especially looking at
01:02
air quality and its relation to health. As you may be aware, air pollution is related to health effects in people. We are a group of geoscientists and health scientists, and we are working on personal exposure estimation. Essentially, this means that for individual persons we would like to know their personal exposure and then relate it to health effects. If you look at the two domains working in the project, geosciences and health research: geoscientists usually have fairly detailed knowledge about specific environmental processes and how to map them, and some knowledge about the spatial context people live in. The health scientists, on the other hand, have very detailed models of many personal attributes, such as age, weight or sports activity. We try to combine both knowledge domains to get a more precise estimation of personal exposure. Health researchers look at everything that
02:29
in the environment may have some influence on health: for example air pollution, [unclear], climate effects, infections or stress. These are
02:46
basically factors that affect human health. The health researchers are interested in the relationship between personal exposure and health effects. By investigating individual exposure for, say, several thousand people, and by using this detailed knowledge, they try to gain some process understanding of how personal exposure is related to health effects. Once that process understanding is there, it can be applied to the wider population beyond the sample. If we look at calculating air pollution exposure, there are some challenges associated with it. Data with quite high spatial and temporal resolution is required, and if you think of moving people, you also need information that represents the space-time paths of those people. Usually this is simulated in computer models, which requires quite some computing power and storage. Geoscientists are familiar with these problems, but health researchers, for example medical doctors, [unclear] have information about their patients. So how can we bridge the gap between highly detailed spatio-temporal environmental data and health information? Let us look at the workflow and the tasks that need to be done. On the
04:27
left side you have the health research side: database or spreadsheet information with attribute information about persons, and possibly movement information. On the other side you have environmental information, for example air pollution or other kinds of sources. You want to combine these, so the environmental data needs to be spatially and temporally aggregated, and the aggregated values finally need to be combined by the health researchers with the attributes of the individual people, so that they can do their statistical analyses. So
05:19
that is the workflow of our
05:22
environmental modelling and information service. The structure is basically as follows: a health researcher uploads some kind of coordinate information, that is, location information about where a person actually is at a particular time. These data pass through the whole system, and after some time the researcher retrieves the enriched information. This information comes from one or more of our environmental models, which calculate these values. The medical doctors can then use it in their own analyses, but that is more their part of the process. I am going to talk more about the environmental part, calculating the spatio-temporal models.
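The enrichment step described here, attaching an exposure value to a person's locations over time, can be illustrated with a minimal Python sketch. It is not the EMIS implementation: the function names, the toy concentration field and the trapezoidal time weighting are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# (time in hours, x, y): one sample of a person's space-time path
PathPoint = Tuple[float, float, float]


def time_weighted_exposure(
    path: List[PathPoint],
    concentration: Callable[[float, float, float], float],
) -> float:
    """Average concentration along a path, weighting each segment by its duration."""
    if len(path) < 2:
        raise ValueError("need at least two path points")
    total = 0.0
    duration = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(path, path[1:]):
        dt = t1 - t0
        # Mean of the values at both segment ends (trapezoidal rule).
        c = 0.5 * (concentration(t0, x0, y0) + concentration(t1, x1, y1))
        total += c * dt
        duration += dt
    if duration <= 0.0:
        raise ValueError("path must cover a positive time span")
    return total / duration


def toy_field(t: float, x: float, y: float) -> float:
    """Hypothetical field: rises eastwards and is raised between 07:00 and 09:00."""
    rush_hour = 10.0 if 7.0 <= t <= 9.0 else 0.0
    return 20.0 + 0.01 * x + rush_hour


# At home from 06:00 to 08:00, then a one-hour trip 1000 m to the east.
path = [(6.0, 0.0, 0.0), (8.0, 0.0, 0.0), (9.0, 1000.0, 0.0)]
exposure = time_weighted_exposure(path, toy_field)
print(exposure)
```

In a real service the `concentration` callable would be a lookup into the modelled raster stack rather than a formula; the weighting scheme is one of several possible aggregation methods.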
06:20
To talk about exposure, [unclear] we have to consider environmental models, such as those for air pollution. We deal with spatial processes: we have spatial data that represents [unclear] at a moment in time,
06:46
and this is actually what happens in forward modelling: we have spatial data representing the state of the system at a particular time, and a set of operations that change the system and bring it to the following time step and state. You basically end up with a two- or three-dimensional space-time stack of model results. In our research group we are especially interested not in building just one air pollution model; we want to construct several kinds of models, hydrological models for example.
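The forward-modelling idea, a state plus operations that carry it to the next time step and a resulting space-time stack, can be sketched in a few lines of plain Python. This is an illustration only: the grid representation as nested lists and the `decay_with_source` transition are hypothetical and not taken from the talk.

```python
from typing import Callable, List

Grid = List[List[float]]


def run_forward(
    initial: Grid,
    step: Callable[[Grid, int], Grid],
    n_steps: int,
) -> List[Grid]:
    """Apply a transition function repeatedly and keep every state.

    The returned list is the space-time stack: index 0 is the initial
    state, index t the state after t time steps.
    """
    states = [initial]
    for t in range(1, n_steps + 1):
        states.append(step(states[-1], t))
    return states


def decay_with_source(state: Grid, t: int) -> Grid:
    """Hypothetical transition: 10 % decay everywhere plus a source in cell (0, 0)."""
    new = [[0.9 * value for value in row] for row in state]
    new[0][0] += 1.0
    return new


stack = run_forward([[0.0, 0.0], [0.0, 10.0]], decay_with_source, n_steps=3)
print(len(stack))
```

Keeping the transition function separate from the time loop is what makes the loop reusable for different kinds of models.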
07:35
So we are interested in making very generic functions, so that environmental models can be built from them. For spatio-temporal environmental modelling, our research group has PCRaster, which basically implements about 200 operations [unclear]. We also have a modelling framework for dynamic modelling and uncertainty analysis [unclear], so there is a lot of freedom in what people can do with their models. This has been going on for a while, since the 1990s, and people are using it quite intensively, for example at the global level [unclear]. It is all open source. If we look at what we are doing, there are more or less two main tasks: the software engineering task and the environmental modelling task. In the first we deal with the data models for spatial data and implement the operations [unclear]. There are software engineers working on this, because performance matters in simulation models. Domain scientists are usually not that familiar with programming [unclear], so we expose all functions and operations as Python modules. Then we come to the second part, the environmental modelling domain. We, and many other people around the world, are building simulation models with these tools [unclear], also at other universities, in research and in courses. We are currently designing and developing a new data model in which we can combine fields and agents in one data model [unclear]. That would be the
09:44
latest thing that we are developing. What we did here is implement the air pollution models. We used existing air pollution models from a European project, the ESCAPE project [unclear]. The model basically calculates an average concentration [unclear]; in short, it is a combination of predictor variables, and the prediction maps of our models are based on land use information, population information and traffic-related datasets. To give an example: we
10:27
have the PM10 model. It uses three predictor variables, so three datasets [unclear]. These are spatial datasets, for example population density: within a buffer around each location, how many people live there. That is how these regression models were set up. This has to be done for all cell locations in the raster map [unclear].
11:07
With the Python module this is relatively straightforward; we only need three statements [unclear]. Then you see the model implementation: as a domain scientist you just write process descriptions, multiplication and addition for example, and it is calculated for the whole raster. So it is really a process-like description of the model. Finally, of course, we want to see the results.
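The "multiplication and addition" mentioned here is the core of a land use regression model: an intercept plus a weighted sum of predictor rasters, evaluated per cell. The following sketch shows that arithmetic with plain nested lists instead of PCRaster maps; the predictor names and all coefficient values are invented for illustration and are not the coefficients of any published model.

```python
from typing import Dict, List

Grid = List[List[float]]


def land_use_regression(
    intercept: float,
    coefficients: Dict[str, float],
    predictors: Dict[str, Grid],
) -> Grid:
    """Evaluate intercept + sum(coefficient * predictor) for every raster cell."""
    names = list(coefficients)
    first = predictors[names[0]]
    rows, cols = len(first), len(first[0])
    result = [[intercept] * cols for _ in range(rows)]
    for name in names:
        grid = predictors[name]
        for r in range(rows):
            for c in range(cols):
                result[r][c] += coefficients[name] * grid[r][c]
    return result


# Hypothetical predictors on a 1 x 2 raster; values and units are made up.
predictors = {
    "population_in_buffer": [[1000.0, 0.0]],
    "road_length_in_buffer": [[100.0, 0.0]],
    "green_area_in_buffer": [[2.0, 4.0]],
}
coefficients = {
    "population_in_buffer": 0.001,
    "road_length_in_buffer": 0.02,
    "green_area_in_buffer": -0.5,
}
concentration = land_use_regression(10.0, coefficients, predictors)
print(concentration)
```

With a map algebra library the two inner loops disappear and the model reads as a single expression over whole rasters, which is the point made in the talk about domain scientists writing process descriptions.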
11:44
Here you can see that we calculated six different models with national coverage for the Netherlands at 5 metre resolution. That is about 1 . 100 million [figure unclear] raster cells for the Netherlands [unclear], so that is quite a technical challenge. On the right side you can see the level of detail of our results. We are not only interested in the
12:15
static layer; we are also interested in the dynamics. A colleague of mine is busy trying to derive a dynamic model from measurement data. What he did was derive regression coefficients that represent hourly time steps. This is a very, very preliminary result, but you can already see an increase in the morning and evening due to traffic. We would like to have that kind of dynamics represented in this model. As you can imagine, 5 metre resolution with hourly time steps for a year or more will result in a lot of data, and we need to think about how to solve that. But we are
13:06
not only interested in that; we are also busy with other kinds of questions. We have the fields, for example, [unclear] raster cells, and it is interesting how we can combine them [unclear] so that we can assign the exposure values to a network, a road map for example. It was a lot of work figuring out how to do this, but basically we assign the air pollution
13:38
values as edge weights. Why do we do this? Because there is a strong interest in activity-based modelling. We would not only like to sample at points, for example at people's home or work locations; we would also like to know their exposure during a trip. On the other hand, you can look for healthier routes where possible. That is what you see here, for example, for the city of Rotterdam: one route is the shortest and the other, in blue, is the healthiest. [unclear] We are interested in these field and network operations.
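The shortest-versus-healthiest comparison can be reproduced on a toy network with Dijkstra's algorithm by switching the edge weight. This is a sketch under assumptions: the talk only says that pollution values are assigned as edge weights, so the weight used below (edge length multiplied by concentration, as a rough dose proxy) and the four-node graph are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# node -> list of (neighbour, length in metres, pollutant concentration on the edge)
Graph = Dict[str, List[Tuple[str, float, float]]]


def best_route(graph: Graph, start: str, goal: str, healthiest: bool) -> List[str]:
    """Dijkstra search; the weight is length, or length * concentration if healthiest."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, route = heapq.heappop(queue)
        if node == goal:
            return route
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length, concentration in graph.get(node, []):
            if neighbour not in visited:
                weight = length * concentration if healthiest else length
                heapq.heappush(queue, (cost + weight, neighbour, route + [neighbour]))
    return []  # goal not reachable


# Toy network: the direct road via B is short but polluted, the detour via C is cleaner.
roads: Graph = {
    "A": [("B", 100.0, 50.0), ("C", 150.0, 10.0)],
    "B": [("D", 100.0, 50.0)],
    "C": [("D", 150.0, 10.0)],
}
shortest = best_route(roads, "A", "D", healthiest=False)
cleanest = best_route(roads, "A", "D", healthiest=True)
print(shortest, cleanest)
```

Only the weight definition changes between the two routes, which is why assigning field values to network edges is the step that needs the careful work.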
14:28
That is pretty much [unclear] what we are now mainly busy with. We also have the problem that the end users, the health researchers, are interested in the results, which means they need access to the data. We have quite some challenges there: simulation models have long calculation times, while a doctor or researcher [unclear] wants smaller datasets, not all the data, and also needs a faster response time. We might have different locations where we compute models and where we store model results, and we do not know exactly what the researcher wants: how many datasets he is asking for, or which type of spatial and temporal aggregation. So there is quite a lot of variability and uncertainty regarding the requirements that we have to deal with at the moment. What we try to avoid at all costs is some kind of monolithic system. That is why we have a microservices architecture, in which every task gets a dedicated service. In that way we hope to remain flexible and adaptable to changing requirements. This is just a
15:54
very brief overview of the microservices as they are at the moment. [unclear] The health researcher sends a request to the model service, which is more or less a maintenance and bookkeeping service for all kinds of other services [unclear], and which basically arranges the task
16:25
management [unclear] and all
16:29
this kind of processing, as mentioned here.
16:38
[unclear] which web services you can use. We are building a prototype of a web portal where [unclear] you can select a few pollutants, select from aggregation methods, and upload your own coordinate information for the locations you want to have information for,
17:04
and then you can see the processing status of your request, and finally you will get your data back.
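The request, status and result cycle described for the portal can be mimicked with a small in-memory Python sketch. It is only a stand-in for the idea of a bookkeeping service that queues tasks for a worker: the class, method names, request fields and the dummy worker are all hypothetical, and a real deployment would put these pieces behind separate web services.

```python
import uuid
from collections import deque
from typing import Callable, Deque, Dict, Optional


class TaskService:
    """In-memory stand-in for a bookkeeping service that hands tasks to a worker."""

    def __init__(self, worker: Callable[[dict], dict]) -> None:
        self._worker = worker
        self._queue: Deque[str] = deque()
        self._requests: Dict[str, dict] = {}
        self._status: Dict[str, str] = {}
        self._results: Dict[str, dict] = {}

    def submit(self, request: dict) -> str:
        """Register a request and return an identifier the client can poll with."""
        task_id = uuid.uuid4().hex
        self._requests[task_id] = request
        self._status[task_id] = "queued"
        self._queue.append(task_id)
        return task_id

    def status(self, task_id: str) -> str:
        return self._status[task_id]

    def process_next(self) -> None:
        """Run the oldest queued task; a real system would do this on another machine."""
        if not self._queue:
            return
        task_id = self._queue.popleft()
        self._status[task_id] = "running"
        self._results[task_id] = self._worker(self._requests[task_id])
        self._status[task_id] = "finished"

    def result(self, task_id: str) -> Optional[dict]:
        return self._results.get(task_id)


def dummy_exposure_worker(request: dict) -> dict:
    """Hypothetical worker: returns a constant value for every uploaded coordinate."""
    return {
        "pollutant": request["pollutant"],
        "values": [25.0 for _ in request["coordinates"]],
    }


service = TaskService(dummy_exposure_worker)
task_id = service.submit(
    {"pollutant": "NO2", "aggregation": "mean", "coordinates": [(0.0, 0.0), (5.0, 5.0)]}
)
status_before = service.status(task_id)
service.process_next()
print(status_before, service.status(task_id), service.result(task_id))
```

Separating submission from processing is what lets a lightweight client return immediately while the heavy computation runs elsewhere.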
17:18
So, to summarise: we are building high-resolution spatio-temporal datasets for the Netherlands, which we use for personal exposure estimation [unclear]. We are looking at field and network operations, all kinds of generic operations, such that people can use them in other models as well, and we make our results accessible via web services. There is a whole set of challenges that we are facing in this project. One is computational: [unclear] the fine spatial and temporal discretisation makes performance a challenge. Because we are not developing one model but a generic modelling platform, we need to provide flexibility for model development, and that poses quite some implementation challenges. For the health researchers as well: we can generate a lot of spatio-temporal data, but the question of analysing and validating it also remains. We are now in the second year of the project. If you are interested in the software: it is open source, so you can just use it for free; there are no licence costs or anything like that. If you want to know more about the Healthy Urban Living project, [unclear] there is the project address, so you can look that up there. With that, I am basically at the end of the presentation. What I would still like to do is give you an idea of our software. Here you see [unclear] time steps of model results [unclear]. This is nitrogen dioxide. [unclear] You can also go through the maps interactively [unclear].
Here you see the time series development [unclear]. Since we have a time series, we can also make an animation of it, and you can see the temporal distribution of the pollutant in this case. This is just an example of the functionality for exploratory work with these tools. These are preliminary data; we still need to work on the model and its validation, so I would like to mention that. With that, I think we have some time for questions. Thank you.
21:06
[question unclear] No, we do not believe that our approach is new. We have a relatively simple model, which is really based on land use, population and these kinds of datasets. It is basically a land use regression model, which does not incorporate processes such as wind or climate effects that one might want to look at. Our approach is to have a more straightforward model, but to calculate it at the national level, because we are interested in a really large extent: we would like to estimate the exposure for Dutch people. Process models with detailed process knowledge are even more computationally intensive, and some of them work at a much coarser grid size [unclear]. [unclear] That would be interesting to do. [question unclear] There is no ESCAPE model for noise; noise modelling is not part of the ESCAPE models at the moment. It would be of a lot of interest to include more stressors, such as those related to noise, and there are plans to derive this from traffic datasets [unclear]. [question unclear] Yes, some models are small-scale and really very detailed. It would be very interesting if you could upscale those to the national level, but I assume that would be a separate research project. [unclear, concerning the order of magnitude of the computational time] Especially when we have more dynamics, we need to have a more detailed look at that. Finally, the ESCAPE models were developed using
measurement data [unclear] regression models [unclear]. In our case we started with the computational side [unclear] and finally the model results. Extending that with uncertainty would be interesting, but [unclear] that is for the future; we need more time there. [question unclear] When we started with the project [unclear], the ESCAPE project validated their model against measurements and observations at, as far as I know, about 8 [figure unclear] stations. [unclear] In our case, we built a 5 metre model and compared our results with [unclear] the ESCAPE model, and we got good agreement. [unclear]