Turning Data into Information with Geo-Ontologies
Formal Metadata
Number of Parts | 183
License | CC Attribution - NonCommercial - ShareAlike 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/32072 (DOI)
Production Year | 2015
Production Place | Seoul, South Korea
Transcript: English (auto-generated)
00:03
application intelligence software, for lack of a better term, on dealing with things like consolidating disparate data sources and giving entities the ability to map and analyze that data.
00:21
Specifically, we worked with primarily international non-profits, but we've also done work in the oil and gas space, as well as advertising, but primarily, most of the work we've done is in international development space, specifically trying to prevent scary creatures like this from spreading malaria across large swaths of the world.
00:42
So this is a difficult space to work in. As you can imagine, a lot of the customer needs are quite difficult to achieve, and we started working on this almost nine years ago when the state of technology was different than it is today. Needless to say, this was not easy
01:03
to achieve the goals we needed to achieve to do things like try to prevent malaria across multiple countries in the world. But some common problems we found across these customers over the years have been pretty common themes
01:21
that I'm sure a lot of you have experienced. Things like needing to consolidate disparate data sources, entities having limited resources, specifically in the GIS staff and GIS technology realm, a lack of good data, especially GIS data, and especially when coming from developing countries,
01:42
and a need to consolidate and map and analyze and visualize your data. So pretty common things that I'm sure a lot of you have experienced over the years. The common thread to all this is data, and specifically the need to turn messy and incomplete data into something that's useful.
02:01
So how do we do this at TerraFrame? First, gotta give a shout out to all the FOSS4G and FOSS tools in general. Without the awesome work by these groups, there's no way a small company like ours could do what we've done so far. So thank you.
02:20
But additionally, we needed another way to approach data and visualization, something that was practically non-existent almost nine years ago. And we chose the technique of modeling data as ontologies. So every time I bring up this term ontologies and programming,
02:40
people tend to think I'm talking about some voodoo magic. So I'm here to try to dispel some of those concerns by giving you guys a quick ontology crash course of this. I'm gonna have to ask you to bear with me a little bit because this talk is gonna be kind of complicated, but I swear you'll see no code, so you all should be able to grasp it for me.
03:01
But before I get too deep into it, I need to explain that there's two primary avenues of data in this system. There's ontology data, and there's user data. User data can have a relationship to ontologies, but in general, these are two different pipelines of data in this system we're talking about. So to start off with, ontologies.
03:21
What are ontologies? Ontologies are a style of programming that allows us to make human-like inferences about data nodes. So you can imagine a basic ontology, something like Justin is a person, where "is a" is an ontological relationship between two data nodes. Justin has a brain, again,
03:41
is a "has a" relationship between two data nodes. A geo-ontology is really similar to this, but we're talking about spatial. So Colorado is a state, or Colorado is located within the United States of America. You can see these are very human-like references between data objects.
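The "is a", "has a" and "located within" statements above can be sketched as subject-relationship-object triples. This is only an illustration of the idea, not how Runway SDK stores ontologies; the `TRIPLES` list, the relationship names, and the `objects_of` helper are all assumptions made for this example.

```python
# Hypothetical triple store: (subject, relationship, object).
# The relationship names are illustrative, not taken from Runway SDK.
TRIPLES = [
    ("Justin", "is_a", "Person"),
    ("Justin", "has_a", "Brain"),
    ("Colorado", "is_a", "State"),
    ("Colorado", "located_within", "United States of America"),
]


def objects_of(subject, relationship, triples=TRIPLES):
    """Return every object linked to `subject` by `relationship`."""
    return [o for s, r, o in triples if s == subject and r == relationship]


print(objects_of("Colorado", "located_within"))
```

Asking what Colorado is located within is then a lookup over relationships, with no geometry involved.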
04:01
In order for us to do this in the stack that I'm talking about, I'll talk about that a little later, we have to come up with two central concepts of the system. One is the universal, the other is the geo-entity. So this shouldn't be too foreign to you. A universal is essentially a collection of features.
04:21
You can imagine this on the political hierarchy where there's countries, states, provinces, districts, and so on and so forth. For example, countries is a universal. A geo-entity, on the other hand, is an individual feature within that collection, within that universal. So South Korea is a country within the countries universal.
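As a rough sketch of these two concepts, a universal can be modeled as a named level in the hierarchy and a geo-entity as a labeled member of one universal. The class names, fields, and the `ancestors` helper below are hypothetical and chosen only for illustration; the talk does not show Runway SDK's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Universal:
    """A collection of features, e.g. all countries or all states."""
    name: str
    parent: Optional["Universal"] = None  # e.g. State sits under Country


@dataclass
class GeoEntity:
    """One individual feature that belongs to a universal."""
    label: str
    universal: Universal
    located_within: Optional["GeoEntity"] = None


country = Universal("Country")
state = Universal("State", parent=country)

south_korea = GeoEntity("South Korea", country)
usa = GeoEntity("United States of America", country)
colorado = GeoEntity("Colorado", state, located_within=usa)


def ancestors(entity):
    """Walk the located_within links upward and collect the labels."""
    chain = []
    current = entity.located_within
    while current is not None:
        chain.append(current.label)
        current = current.located_within
    return chain


print(colorado.universal.name, ancestors(colorado))
```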
04:44
To try to beat this home a little bit, in this diagram you can see geo-entities on the left representing individual features within a universal set. So Colorado is a state, and it can be found within the state universal,
05:02
and the universals have a relationship between each other. So countries, or states are within countries, and counties are within states. Another way to look at this is in this simple tree widget you might see on the web.
05:21
So Colorado, again, is a state within the state universal, and you can travel up and down the universal hierarchy to navigate these ontological relationships between spatial entities. So what's the purpose of all this? We can already do this, right? With spatial processing.
05:41
The purpose is to provide a central geographic context for the system itself. Specifically, it provides well-defined spatial and non-spatial relationships between data nodes. And there's no dependency on GIS,
06:00
or on geometries, excuse me. So this means that we can input data into our system that doesn't have geometric data, so it can come from an Excel file, or CSV, and still work with it. This is huge, and we'll try to beat this point home. So the software I'm talking about here is Runway SDK and GeoDashboard.
06:21
Runway SDK is a data management platform. You can also consider it an ontology engine of sorts. More recently, we've been developing GeoDashboard, which is a visualization user platform that sits on top of Runway SDK. And unfortunately, I can't get my computer to hook up to this presentation, but I have a really quick screenshot video
06:42
I saved real quick, and I'll give you a really quick demo of GeoDashboard in a minute. So what about user data? What I just described is the first branch of data, and it provides a reference level of spatial data inside of Runway SDK and GeoDashboard.
07:02
So these are the political hierarchies in most situations, right? User data's a little different. User data's the data coming in from the user that pertains to their domain of knowledge. So if you're working with malaria, your user data's gonna be like CSVs, Excel files
07:22
related to malaria, right? Not necessarily mapped to some spatial feature. But, user data can have relationships to ontologies, which gives us a lot of power. So this user data may come in as JSON,
07:42
it may come in as GeoJSON, or any other GIS format, but more often than not, comes in as Excel files. And as we all know, Excel data, and all data for that matter, is often incomplete, messy, and non-existent in terms of geometries, especially when working in developing countries.
08:01
So another complicated slide. This is a slide I put together to try to explain how user data can map against this ontology structure. So here on the left, you have user data coming in, whether it's from a JSON API, an Excel file, or whatever. It gets pumped into the system,
08:21
and up at the top, records 46 and 47 are just standard records of data. They get mapped against a geo-entity using a location field. So again, notice that you don't need a geometry. In this case, we're working with semantics. Like, every customer we've ever worked with can work with semantics, with labels,
08:41
with names of countries. They can't always deal with having geometries. So you map those user records against a geo-entity, which automatically gives those records a reference to spatial processing in the system itself. And notice, this lower tier, or this lower branch of the system maps a synonym.
09:01
So we have this mechanism to map typos and semantic differences between locations to a single geo-entity. So we're piping data into the system every minute, and there's a known typo in the system. You register that typo, and every single record coming in in the future with that typo will map to the same geo-entity.
09:22
So it also buys us a lot of data-cleansing power. And finally, when you have this user data mapped against the ontology structure, in other words, the geo-entities, you gain all the power of working with the universal tree.
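The mapping step described here, including the synonym mechanism for known typos, might look roughly like the following. It is a minimal sketch under stated assumptions: the record contents, the `location` field name, the registered typo "Colorodo", and both helper functions are invented for the example and are not the system's real API.

```python
# Hypothetical reference data: known geo-entity labels and registered synonyms.
GEO_ENTITIES = {"Colorado", "Wyoming"}
SYNONYMS = {"Colorodo": "Colorado"}  # a known typo, registered once


def resolve_location(label):
    """Return the matching geo-entity label, or None if nothing matches."""
    label = label.strip()
    if label in GEO_ENTITIES:
        return label
    return SYNONYMS.get(label)


def map_records(records, location_field="location"):
    """Attach a geo-entity to each record using only its text location."""
    mapped, unmatched = [], []
    for record in records:
        entity = resolve_location(str(record.get(location_field) or ""))
        if entity is None:
            unmatched.append(record)  # left for a person to review
        else:
            mapped.append({**record, "geo_entity": entity})
    return mapped, unmatched


records = [
    {"id": 46, "location": "Colorado", "widgets": 2},
    {"id": 47, "location": "Colorodo", "widgets": 3},  # typo, resolved by synonym
    {"id": 48, "location": "Atlantis", "widgets": 1},  # no match
]
mapped, unmatched = map_records(records)
print([r["geo_entity"] for r in mapped], [r["id"] for r in unmatched])
```

None of the records carries a geometry. The match is made on the label alone, and the synonym table means a typo only has to be corrected once.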
09:45
Thank you. So, why is this valuable? I mentioned some of the reasons why, but a big reason why is it allows us to map data in a generic way.
10:01
What do I mean by generic? I mean, my data, your data, everyone's data. It doesn't matter what data you have, it doesn't matter what format it's in. We can pipe this into our system and map it against ontologies as long as you have something that indicates location. It could be a geometry, it could be a field indicating some text location name.
10:21
It has to have something, but the doors are pretty wide open in that regard. So it opens, it gives a lot of flexibility in terms of the types of data and users we can interact with, or we can help. So issues like no geometries, like I said multiple times, not a problem at all. Of course, if we have geometries,
10:41
it only adds to our ability to build apps and help deliver better solutions, but it isn't a requirement to get the system running and to provide visualization and analysis. So how does this work in a web application? This is kind of a big leap, but this diagram demonstrates something
11:02
you should be familiar with by now. Here you have user data and geo-entities. We've mapped this user data against some known geo-entities, and because the universal hierarchy has some awareness, we can aggregate these geo-entities up to the parent universal geo-entity.
11:23
So we'll know that by summing all the records that join with these two geo-entities, that Colorado has sold five widgets, and we can aggregate up the universal stack even further to the United States to see that five widgets were sold in the United States.
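The roll-up described above can be sketched as a walk up the parent links, adding each record's value to every ancestor. The county names, the `PARENT` lookup, and the `roll_up` function are assumptions made for illustration; only the totals of five widgets for Colorado and for the United States come from the talk.

```python
from collections import defaultdict

# Hypothetical located-within links between geo-entities.
PARENT = {
    "Denver County": "Colorado",
    "Boulder County": "Colorado",
    "Colorado": "United States",
    "United States": None,
}

# User records already mapped against geo-entities.
RECORDS = [
    {"geo_entity": "Denver County", "widgets": 2},
    {"geo_entity": "Boulder County", "widgets": 3},
]


def roll_up(records, parent=PARENT, value_field="widgets"):
    """Sum a value for each geo-entity and for every ancestor above it."""
    totals = defaultdict(int)
    for record in records:
        entity = record["geo_entity"]
        while entity is not None:
            totals[entity] += record[value_field]
            entity = parent.get(entity)
    return dict(totals)


totals = roll_up(RECORDS)
print(totals["Colorado"], totals["United States"])  # prints: 5 5
```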
11:41
And the beauty of this is it works with all types of data. This doesn't have to be sales. This could be counting bunnies in the desert. This is a reiteration of what I just said. So what about geometries? I keep saying how we don't need them.
12:00
However, they're incredibly useful. Geometries are still used to visualize geo-entities. So everything you see in a map will be a geometry stored on the geo-entity. If the geo-entity does not have geometric data, it can't be visualized, at least if it's an ontology record. If it's user data, it can still be mapped
12:21
against a geo-entity that has to do with political hierarchy. Sorry, that's gonna get confusing. Come bug me afterwards if that's confusing. It also allows us to visualize user data at the lowest level. So if we get an Excel file with some lat-long coordinates, we still allow users to visualize that. They don't have to aggregate their data
12:41
up the universal hierarchy to see it on a map. We can also use geometries to algorithmically enhance data or do some QA/QC on data coming into the system. Of course, just like always, geometries are incredibly useful. It's just we've had to find a solution for mapping and visualizing data that doesn't require them.
13:03
And again, an important point is that these geometries are optional. So we've currently deployed this technology to, actually I think it's eight countries. We're expected to be in 14 within the next six months. And we have a lot of other expansion opportunities coming up soon.
13:20
So we fully expect to be possibly quadrupling the amount of countries that we've deployed this technology to in the next couple years. Which is really great, because we're a very small company of seven people. So like I said, I don't have the ability to hook my computer up
13:40
to this computer, but luckily, I did a little desktop screencast of the most basic use of GeoDashboard.
14:04
So, if I can get this in here. How am I doing on time, really? Five minutes, great.
14:24
Okay, all right, so. Sorry, I really wish I could click through my computer. I'd love to show anyone who is interested in this later.
14:42
I'd be happy to sit down with you and walk you through this. GeoDashboard is much more feature-rich than you're about to see here. So this is a basic dashboard, as you can see. It's very familiar to all of us. We have layers on the left, and some data, a list of data on the right.
15:10
So, this right here is a simple data set. You can imagine this as user data. Again, we can pipe any user data into the system. You can have more than one. Here we only have one.
15:21
Cage delivery summary actually stands, is a Cambodian terminology for sales, salesman. And we have a bunch of attributes on the data, on the right. So if I hit play, real quick, here I'm simply opening up a form to create a layer.
15:41
So this is, we enable a bunch of dynamic mapping of various flavors, much like a lot of the other hosted mapping solutions offer. And voila, we've mapped some data. But then if you want to work with the data
16:01
and analyze it a little further, you can change your aggregation methods. So before, we aggregated by province, and I just changed the aggregation level to district. So here's an example of us navigating up the universal tree to dynamically map data based off of an ontology structure, and voila.
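Switching the aggregation level, as shown in the screencast, amounts to grouping each record under its ancestor at the chosen universal. The sketch below assumes placeholder province and district names and invented sales figures; `ENTITIES`, `ancestor_at_level`, and `aggregate` are hypothetical helpers, not GeoDashboard functions.

```python
from collections import Counter

# Hypothetical geo-entities: label -> (universal level, parent label).
ENTITIES = {
    "Province A": ("province", None),
    "District A1": ("district", "Province A"),
    "District A2": ("district", "Province A"),
}

SALES = [
    {"geo_entity": "District A1", "sales": 4},
    {"geo_entity": "District A2", "sales": 6},
    {"geo_entity": "District A1", "sales": 1},
]


def ancestor_at_level(entity, level):
    """Climb the hierarchy until an entity of the requested level is found."""
    while entity is not None:
        entity_level, parent = ENTITIES[entity]
        if entity_level == level:
            return entity
        entity = parent
    return None


def aggregate(records, level, value_field="sales"):
    """Total a value per geo-entity at the chosen universal level."""
    totals = Counter()
    for record in records:
        target = ancestor_at_level(record["geo_entity"], level)
        if target is not None:
            totals[target] += record[value_field]
    return dict(totals)


print(aggregate(SALES, "province"))
print(aggregate(SALES, "district"))
```

The same records produce either map. Only the `level` argument changes, which is why no spatial query is needed to re-aggregate.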
16:22
There we have a bunch of sales data represented as points, sized by the number of sales, and mapped against a generic ontology structure. So one other interesting piece to this is the ability to filter data.
16:43
So I think I just did it real quick on that example, but on the right, you can manipulate the data through simple little widgets. Here we're working with a number field, so it's a simple, are these data values greater or less than or equal to whatever? But you can also query on ontology types
17:01
and date, time, all that stuff to work with the data. It's really nice. You don't have to write any SQL or anything to do with it. And one final thing, we have this bunch of data management tools to work with the ontology data in the backend. So here I'm navigating the universal tree.
17:22
Users can move these universals around the system to redefine the universal hierarchy, and they can also visualize all the geo-entities in the system. Here we have a bunch of geo-entities for Cambodia and Zambia and a bunch of identified problems that are identified through the ontology structure.
17:41
So again, there's no spatial querying or PostGIS behind the scenes to figure out what these problems are. This is all derived from ontology information. And here we're going to confirm a new location or fix a problem in the geo-entities through a simple web widget.
18:02
And that's kind of it. I believe this is the end of this movie. Does anyone have any questions?