Semantic assessment and monitoring of crowdsourced geographic information

Video in TIB AV-Portal: Semantic assessment and monitoring of crowdsourced geographic information



Formal Metadata

Semantic assessment and monitoring of crowdsourced geographic information
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date
Production Year
Production Place
Seoul, South Korea

Content Metadata

Subject Area
Whilst open source software allows for the transparent collection of crowdsourced geographic information, in order for this material to be of value it is crucial that it be trusted. A semantic assessment of a feature’s attributes against ontologies representative of features likely to reside in this location provides an indication of how likely it is that the information submitted actually represents what is on the ground. This trust rating can then be incorporated into provenance information to provide users of the dataset an indication of each feature’s likely accuracy. Further to this, querying of provenance information can identify the features with the highest/lowest trust rating at a point in time. This presentation uses crowdsourced data detailing the location of fruit trees as a case study to demonstrate these concepts. Submissions of such crowdsourced information – by way of, say, an OpenLayers frontend – allow for the collection of both coordinate and attribute data. The location data indicates the relevant ontologies – able to be developed in Protégé – that describe the fruit trees likely to be encountered. If the fruit name associated with a submitted feature is not found in this area (e.g. a coconut tree in Alaska) then, by way of this model, the feature is determined to be inaccurate and given a low trust rating. Note that the model does not deem the information wrong or erase it; it simply deems it unlikely to be correct and of questionable trust. The process continues by comparing submitted attribute data with the information describing the type of fruit tree – such as height – that is contained in the relevant ontologies. After this assessment of how well the submitted feature “fits” with its location, the assigned trust rating is added to the feature’s provenance information via a semantic provenance model (akin to the W3C’s OPM).
Use of such semantic web technologies then allows for querying to identify lower quality (less trustworthy) features and the reasons for their uncertainty (whether it be an issue with collection – such as not enough attribute data being recorded; time since collection – given degradation of data quality over time, i.e. older features are likely less accurate than newer ones; or because of a major event that could physically alter/remove the actual element, like a storm or earthquake). The tendency for crowdsourced datasets to be continually updated and amended means they are effectively dynamic when compared to more traditional datasets that are generally fixed to a set period/point in time. This requires them to be easily updated; however, it is important that efforts are directed at identifying and strengthening the features which represent the weakest links in the dataset. This is achievable through the use of the open source software and methods detailed in this presentation.
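The kind of query described above, finding the least trusted features together with the likely reasons for their uncertainty, could be sketched as follows. This is only an illustration under stated assumptions: the `FeatureRecord` fields, the thresholds (`min_attributes`, `max_age_days`) and the sample records are hypothetical and are not taken from the project, which stores this information as semantic provenance rather than as Python objects.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FeatureRecord:
    """Hypothetical summary of one feature's trust and provenance."""
    feature_id: str
    trust: float              # trust rating on a 0-100 scale
    collected: date           # date the feature was submitted
    attributes_recorded: int  # number of attributes the contributor filled in


def uncertainty_reasons(record, today, min_attributes=3,
                        max_age_days=365, last_major_event=None):
    """Return the reasons (collection, age, event) that apply to one record."""
    reasons = []
    if record.attributes_recorded < min_attributes:
        reasons.append("collection: too few attributes recorded")
    if (today - record.collected).days > max_age_days:
        reasons.append("age: collected too long ago")
    if last_major_event is not None and record.collected < last_major_event:
        reasons.append("event: collected before the last major event")
    return reasons


def weakest_links(records, threshold, **kwargs):
    """List features below the trust threshold, least trusted first."""
    flagged = [(r.feature_id, r.trust, uncertainty_reasons(r, **kwargs))
               for r in records if r.trust < threshold]
    return sorted(flagged, key=lambda row: row[1])


# Illustrative records only.
records = [
    FeatureRecord("tree-a", 90.0, date(2014, 6, 1), 5),
    FeatureRecord("tree-b", 35.0, date(2010, 1, 15), 1),
    FeatureRecord("tree-c", 55.0, date(2014, 3, 1), 4),
]
result = weakest_links(records, 60.0, today=date(2014, 9, 1),
                       last_major_event=date(2011, 2, 22))
for feature_id, trust, reasons in result:
    print(feature_id, trust, reasons)
```

Sorting by trust puts the weakest links first, which matches the stated aim of directing effort at the features most in need of strengthening.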
Hi everyone. We are PhD students from the University of Canterbury in Christchurch, New Zealand, and we would like to present an application of our research using free and open source software, titled "Semantic assessment and monitoring of crowdsourced geographic information". We would also like to acknowledge the support of the CRCSI in the undertaking of this research.
So here is a bit of an overview of the presentation. I will start by introducing our research, then outline our project and the free and open source software used throughout, and then cover some of the finer details of our project, such as the crowdsourcing model and determining the trust of the information. Hamish will then take over to describe ontologies and consuming the information as linked data, and finish with an outline of our future directions in research.
Our research revolves around expanding the applications of crowdsourcing to spatial information and its use, so that the crowd can be more than just the sensors. Through my PhD I am investigating ways to improve the trust of crowdsourced information through assessments of the information's quality and the reliability of the source of the information. So trust, in the context of crowdsourced geographic information, is knowing the quality of the information and also the reliability of the source of the information. Hamish is looking at the implications of this trust beyond simply the capture of the information, in the consumption of the information, for example in analysis and presentation, and also in complementing other existing geospatial information.
Our project is an application of our research and is based around the fruit trees in the residential red zone in Christchurch. Christchurch is a relatively small city on the east coast of the South Island of New Zealand and has a population of roughly 340,000 people. In September 2010, Christchurch was struck by a magnitude 7.1 earthquake, which was followed by a series of aftershocks and further earthquakes, including the deadly February 2011 earthquake. The earthquakes and the following aftershocks caused major damage to infrastructure, buildings and land in the city. Due to major land damage, some areas of the city, mainly near rivers, were deemed uneconomical to rebuild upon, because the land would simply be too costly to fix to ensure that buildings on it would not be damaged in future quakes. This land was identified by the government and is known as the residential red zone. It includes some 7,860 properties and 680 hectares. All of the landowners within the zone were paid out for their houses by the government, and all of the houses have either been demolished or are to be demolished.
So today the residential red zone has been cleared of many of the houses that once stood there, but fortunately many of the trees still remain, including fruit trees that were once part of people's gardens, such as apple, pear and lemon trees. Our project is based around these trees and looks at ways to crowdsource fruit tree information, determine the trust of that information through semantic assessment, and then consume the information and its trust as linked data.
This is the framework for our project. As you can see, we use free and open source software throughout. The main components of the framework are: the crowdsourcing application, which is set up to collect information from the crowd; the trust rating section, where we determine the trust of the information; the ontology, which is used for the semantic assessment of the information to help determine the trust rating; linked data, which is used to aid in the consumption of the information; and the output, which is how the information is consumed. The crowdsourcing component of the framework is built using GeoServer, OpenLayers and Django. This component acts as the interface for the users: it presents existing fruit tree information and also allows users to create and submit new fruit tree information. At this stage the crowdsourcing part is just a simple web map, but over time it could become part of a larger site.
So we use the Django web framework, as it provides us with functionality for creating a larger site that the web map can become a part of. We use OpenLayers to construct the web map, which allows us to display an OpenStreetMap base map with the crowdsourced fruit tree points shown on top. OpenLayers also provides us with the tools for creating, editing, deleting and saving features. We then use GeoServer to serve the already existing fruit tree features and to save changes to these features and the creation of new features. This is done through a transactional Web Feature Service (WFS-T).
To store the information we use PostgreSQL with PostGIS. We chose PostgreSQL and PostGIS due to their interoperability with other free and open source software and their ability to expose the information to processes through direct queries. We modelled the fruit tree information in a way that makes it easy for us to store information, run processes on the information and also store the results of those processes. Our tree model is a relational model that contains tables for information about the trees themselves, observations of whether the trees are fruiting or not, and observations of the trust of the tree features. The fruiting and trust observations are kept in tables separate from the main fruit tree information, to allow us to record repeated observations of this information. This means we can look back through the history of the tree features and see how the trust has changed over time as the main features change. This also applies to observations of the trees' fruiting, where the information depends on the date for which the feature is queried.
So once we have received crowdsourced fruit tree information we have to determine the trust of that information. The trust rating part of the framework is where we do this, through a trust model. The trust model is largely built in PostgreSQL, and we use Python to run the trust calculations. Conceptually, our trust model determines the intrinsic semantic trust of the crowdsourced information. This semantic model is a component of a larger conceptual crowdsourcing trust model that focuses on the spatio-temporal, social and semantic trust of crowdsourced information through assessment of the intrinsic and extrinsic dimensions of these components. Spatio-temporal trust looks at when and where the information was captured, social trust looks at the information by the reliability of its source, and the semantic component looks at the "what" of the information. Each of these components in the larger conceptual model contains intrinsic and extrinsic dimensions, with the intrinsic assessing the individual piece of information or the individual source of the information, and the extrinsic assessing a piece of information against its surroundings, or how the individual source is reviewed by their peers. So in the trust model for this project we are looking at the intrinsic semantic component of the crowdsourced fruit tree features, but the understanding of the trust of this information could be strengthened through additional assessments of the spatio-temporal or social components of the information.
Our semantic trust model is built using PostgreSQL, Python and OWL. This structure performs the assessment of the crowdsourced information and records the trust rating against the feature. PostgreSQL acts as the store of the information, OWL provides the rules that the information should comply with, and Python is the catalyst that brings it all together. We use Python scripting to query PostgreSQL, retrieving the records that have not yet had a trust rating attributed. This information is then compared against the rules, classes and literals in the OWL file through SPARQL queries. For the fruit tree information, we assess the quality of the metrics of the tree, being the height, diameter, etc., the fruiting observation, and also the location of the tree.
So in our project, a lemon tree feature that is two metres tall, one metre in diameter and fruiting now would be considered a trustworthy feature, as it is a tree that fits within the limits set out in the ontology. On the other hand, a coconut feature would be considered less trustworthy, solely because coconut trees cannot grow in Christchurch. We know the feature is less trustworthy through the comparison of the tree's location with the area in which coconut trees can grow, as outlined in the ontology. So the attributes of the fruit tree feature are each compared to the ontology and given an attribute-level trust rating. These trust ratings are then aggregated into an overall rating of trust for the feature and are written back to the feature's record in the database. In this example the attribute trust ratings are evenly weighted in the aggregation, but the weightings may be changed to emphasise important components of the trust rating. So now I will pass you over to Hamish to discuss ontologies and linked data.
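The attribute-level rating and evenly weighted aggregation just described might look something like the sketch below. It is a simplified illustration, not the project's code: the limits in `ONTOLOGY_LIMITS` are invented placeholders for values the talk says come from the OWL ontology, the pass/fail scoring of 100 or 0 per attribute is an assumption, and the names `TreeFeature`, `attribute_ratings` and `overall_trust` are hypothetical.

```python
from dataclasses import dataclass

# Placeholder limits standing in for the ontology; the values are illustrative only.
ONTOLOGY_LIMITS = {
    "lemon": {"max_height_m": 6.0, "max_diameter_m": 4.0,
              "grows_in": {"christchurch"}},
    "coconut": {"max_height_m": 30.0, "max_diameter_m": 9.0,
                "grows_in": {"tropics"}},
}


@dataclass
class TreeFeature:
    fruit: str
    height_m: float
    diameter_m: float
    region: str


def attribute_ratings(feature):
    """Rate each attribute 100 (within the ontology limits) or 0 (outside them)."""
    limits = ONTOLOGY_LIMITS[feature.fruit]
    return {
        "location": 100.0 if feature.region in limits["grows_in"] else 0.0,
        "height": 100.0 if 0 < feature.height_m <= limits["max_height_m"] else 0.0,
        "diameter": 100.0 if 0 < feature.diameter_m <= limits["max_diameter_m"] else 0.0,
    }


def overall_trust(ratings, weights=None):
    """Weighted mean of the attribute ratings; evenly weighted by default."""
    if weights is None:
        weights = {name: 1.0 for name in ratings}
    total_weight = sum(weights[name] for name in ratings)
    return sum(ratings[name] * weights[name] for name in ratings) / total_weight


lemon = TreeFeature("lemon", 2.0, 1.0, "christchurch")
coconut = TreeFeature("coconut", 10.0, 3.0, "christchurch")

print(overall_trust(attribute_ratings(lemon)))    # 100.0
print(overall_trust(attribute_ratings(coconut)))  # about 66.7: only the location check fails

# Doubling the weight of the location check lowers the coconut's rating to 50.0.
location_heavy = {"location": 2.0, "height": 1.0, "diameter": 1.0}
print(overall_trust(attribute_ratings(coconut), location_heavy))
```

Passing a different `weights` dictionary is one way to emphasise a component such as location, as the talk suggests; a graded score per attribute instead of pass/fail would be an equally plausible design.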
So, ontologies are a programmatic way of defining a concept based on human reasoning: by defining classes and the links and relationships that exist between these classes, you end up defining a body of knowledge. They are very useful in crowdsourcing because of their accessibility. When published in the Web Ontology Language (OWL) they can be accessed via a single URI. They are also highly extensible: you can call up multiple ontologies and join them together so that they cover the whole subject. For non-developers there are GUIs out there such as Protégé, which is open source software from Stanford. This allows experts to focus their knowledge on putting together the ontologies, as opposed to having to code.
An example of the simple class and rule structure that we use in our ontology is, in this case, the apple tree, which becomes the subject; the property or predicate, which in this case would be the maximum height, as we are defining the apple tree's maximum height; and the object, which completes the statement and contributes to the definition of the apple tree as a value, in this case in metres.
In Protégé you can simply enter into the class all the properties that would define this feature and all the objects that make these statements true. These objects can be classes themselves, and so the graph grows. Because this is published as linked data, it can be accessed via a SPARQL query, SPARQL being a query language for RDF. In Python this can be achieved using a library such as RDFLib, which reads in the ontology, forms the graph, and then uses the patterns within this graph to match the SPARQL query. So in this case we want to find out the maximum height of an apple tree: the class is here, at this first URI, then the property of maximum height, and then the object that completes the statement based on the ontology graph. It is all defined by this one ontology, and everything else follows from this. This is an input into the trust model: each feature submitted by a user can be compared against all of these properties that define the limiting case, in this case the maximum height, to see how trustworthy the feature that has just been submitted is. The resulting trust can then be added as an attribute to this particular feature. As it becomes an attribute itself, it is immediately in a form that we can publish back as linked data. This follows the same subject, predicate, object triples. In this case the subject is the feature that has been submitted, in this case tree 44; the property is one of the attributes of that feature, here the height, "has height"; and the object is a literal, 2.5 metres.
This is appropriate for crowdsourced information because the goal is not just to harvest all this data from everybody out there but also to give it back to them. Using structures such as this means that the crowd does not have to be familiar with complex data structures such as spatial formats or databases; they can access the data via a single URI. Because crowdsourcing is largely web-based, it is assumed that users will have entered a URL before to see a simple human-readable web page; in the same way they can enter a similarly structured URI and receive the data. It also makes it very easy for developers out on the web to create mashups, so they can bring in data from almost anywhere. This is really the power of crowdsourcing: we unleash the real potential of human intuition, and goodness knows what people will bring in that might be related, which really brings another meaning to the data.
To see how well it works in this particular example, I performed a simple mashup using a Python library called Folium, which is a Python-based Leaflet wrapper. In this case I want to look at the most trustworthy fruit trees that we have, which in this case are those with a trust rating above 70 out of 100. For the purpose of demonstration we will pull in the wind speed at each tree, because if there is a lot of wind there may be more fruit on the ground that we can go and pick up. So this is a simple SPARQL query, and basically it shows that you can filter these queries based on the trust rating. The query returns the attributes of all the features that meet the trust rating criteria, including the tree IDs and the latitude and longitude of each tree. With the latitude and longitude for each tree, I can then use Weather Underground to find the nearest weather station and the conditions at that weather station. For each tree I now have the ID, the trust rating, the wind speed and all the other attributes, and I create the map with Folium.
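The subject, predicate, object structure and the trust filter described above can be illustrated with a small pure-Python sketch. It deliberately stands in for the RDFLib and SPARQL setup mentioned in the talk, so that it runs with the standard library alone: the prefixed names (`tree:44`, `ex:hasTrust` and so on), the coordinates and the helper functions `match`, `value` and `trusted_trees` are all hypothetical.

```python
# Each statement is a (subject, predicate, object) triple.
# Names and values are illustrative, not the project's vocabulary or data.
TRIPLES = [
    ("tree:44", "ex:hasFruit", "apple"),
    ("tree:44", "ex:hasHeight", 2.5),
    ("tree:44", "ex:hasTrust", 85.0),
    ("tree:44", "ex:lat", -43.52),
    ("tree:44", "ex:lon", 172.68),
    ("tree:45", "ex:hasFruit", "coconut"),
    ("tree:45", "ex:hasTrust", 40.0),
    ("tree:45", "ex:lat", -43.51),
    ("tree:45", "ex:lon", 172.70),
    ("tree:46", "ex:hasFruit", "pear"),
    ("tree:46", "ex:hasTrust", 72.0),
    ("tree:46", "ex:lat", -43.50),
    ("tree:46", "ex:lon", 172.66),
]


def match(triples, s=None, p=None, o=None):
    """Yield triples matching a pattern; None acts as a wildcard (a query variable)."""
    for subject, predicate, obj in triples:
        if ((s is None or subject == s)
                and (p is None or predicate == p)
                and (o is None or obj == o)):
            yield (subject, predicate, obj)


def value(triples, subject, predicate):
    """Return the first object for a subject and predicate, or None if absent."""
    for _, _, obj in match(triples, s=subject, p=predicate):
        return obj
    return None


def trusted_trees(triples, min_trust):
    """Roughly the role of a SPARQL FILTER on trust: keep trees rated above min_trust."""
    rows = []
    for subject, _, trust in match(triples, p="ex:hasTrust"):
        if trust > min_trust:
            rows.append({
                "id": subject,
                "trust": trust,
                "lat": value(triples, subject, "ex:lat"),
                "lon": value(triples, subject, "ex:lon"),
            })
    return sorted(rows, key=lambda row: row["id"])


for row in trusted_trees(TRIPLES, 70):
    print(row)
```

Each returned row carries the ID and coordinates that a mapping step would need; in the talk that step is done with Folium, which is not reproduced here.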
This is for a Python developer, but it is very easy: one line of code for the map, a line of code for each tree, and you end up with a Leaflet map just as you see here, with pop-ups to tell you which tree you are looking at and, of course, the wind speed.
So where do we go from here? We have seen the why and the how, but what about the here and now? With current datasets, credibility and trust largely result from the legacy of authoritative datasets. National mapping agencies or large corporations have traditionally performed their roles relatively well, and so people assume that when they pick one of these datasets it is going to do the job. Provenance comes into this because it is basically used for tracing errors. These datasets have that built-in trust, and we only look back at what has been done when something goes wrong; until then the whole dataset is considered reliable, and it is considered as pretty much one unit. Following the W3C guidelines for provenance, the data was generated by some form of collection activity, which was carried out by some agent, in association with the collection form that they used. This produces a graph of the provenance of the data that can be used to retrace where it came from. So if something goes wrong there might be a bit of a trail to follow, but you can work your way back to find out what went wrong in the analysis for that particular dataset.
It is difficult to apply this approach to crowdsourced data, as it is largely considered feature by feature. If you think about a medium-sized dataset of about 100,000 features, or even just a browser shot of OpenStreetMap and how many features come up in that view, consider that each of those has been submitted by a user. Because of the anonymity of the web it is very hard to know who that user is and exactly what they have done. Tracing problems when something goes wrong quickly becomes nigh on impossible.
So what these trust ratings provide us with is the ability to gauge how reliable an attribute is. We can aggregate that to the feature level and to the dataset level to provide an overall indication of just how trustworthy the dataset is. Alternatively, we can look at the features themselves and take the most trustworthy features, depending on what sort of analysis we want to run; if we take the most trustworthy features, the analysis is going to be even more reliable. What this provides us with is a proactive form of provenance, a way to stop these major errors from happening before they appear. It allows us to use this data in the most reliable way possible and increase the usability of this really valuable source. Thanks very much. Are there any questions?
Audience question: You defined some ontology models for trees in this case. Have you worked with any other ontology models for this kind of thing, to make it easier to use crowdsourced data? Specifically, I am interested in using a similar method for a bunch of [unclear] data in the near future. I am wondering if you have considered building on this using larger datasets and different types of ontology models in OWL.
Answer: We haven't thought of that yet [unclear], but absolutely, that is something we would be interested in as well. We need more data out there as linked data, and that includes ontologies: the more that we have, the more reliable the results will be.
Are there any other questions? Thanks very much.



