
Fixing GIS Data Discovery


Automated media analysis

Speech transcript
OK, so I got the thumbs up to go ahead and start. My name is Jack Reed, I'm a software engineer at Stanford University Libraries, and my talk is called "Fixing GIS Data Discovery". It introduces the GeoBlacklight project I have been working on.

I'll start off by asking a question: what is the hardest part of GIS work? I hear you saying "finding the data", right? I hope you don't say projections. So yes, the hardest part about doing GIS work is finding the data, and the last talk touched on that as well. I worked at the University of Iowa, another university, before I worked at Stanford, and in working with students, researchers, anybody, the hardest part of all the work is getting at the data. I see heads nodding, so there is agreement here. Yesterday I was at a talk where the speaker said, quote, "One of the biggest problems for Code for America projects is getting access to the data." So it's not just university people. For Code for America, for a lot of organizations, for anybody who's doing GIS work, regardless of what it is, this is a really difficult problem.

I would argue here at FOSS4G that getting the data should be simple. We have data, it's hopefully in a machine-readable format, and we should be able to get at it in a simple way. Right now I believe it requires a lot of domain knowledge. There's a tremendous wealth of GIS expertise in this room right now, and at FOSS4G we all have the ways we've learned, through doing geospatial work, of where to go get data, what data is good data, and what data is bad data: "Oh, for this kind of project I go get Natural Earth." We have all this domain expertise, but for people who are not
geospatial experts, it's a big learning curve, and it's a really poor user experience. I really liked Mike's keynote yesterday, and Vlad talked about simplicity in GIS. I thought those were both great, talking about the tools we use and the user experience behind the tools, because honestly, the user experience of finding GIS data is not really good. We've worked really hard on it, and there's a lot of really great software, platforms and applications out there, but for the new GIS user it's still a problem.

I started at Stanford in January, and one of the reasons I came out there is that they sold me on an idea I had really never heard talked about in the geospatial community: preservation of GIS data, access to GIS data, and accessibility and discoverability of GIS data. If you think about it, all those things go together. When I heard "preservation of GIS data", that concept was foreign to me, but when you think about it, a lot of organizations don't really have a preservation strategy for their geospatial data. Do your organizations have a preservation strategy for geospatial data? A few hands. That's great that you do, but most organizations don't. I work in a library, and I have a lot of library colleagues in here from other institutions. Libraries are collecting geospatial data, and a lot of the time there isn't even a preservation strategy for it yet.

So some of our desks look like this, right? We have a stack of DVDs or CDs with data on them, and maybe we have it
on hard drives, and that's great too: it's accessible and we can share it. Maybe we have it on a network disk that's available at our institution or organization. Maybe we share it online, in a file system available on a server somewhere, in "New Folder (4)", right? How many of your projects look like this?
So the point is, we're all guilty of this, right? I am guilty of this. I had never thought about preservation. I always thought accessibility of GIS data was really good, and I'm really passionate about open geodata and open data, but this concept of preservation together with accessibility was kind of foreign to me. I never thought about it in my workflow: the data I created, the data I found, the data I used for projects. It was just never really thought about.
What I've come to believe is that we treat GIS data in an ephemeral manner. We don't think about it for the long term, or about preservation. But we really do need to treat these data like durable assets, because they are. So why do you
care, right? You're sitting here wondering why you should care about some guy from some library talking about preservation, access and discoverability of GIS data. First, money. It's really expensive to acquire data and really expensive to create it; across all the work we do, it's a really big cost for us. A second reason: put your altruistic open data reason in here. We're at FOSS4G, and we're all somehow contributors to open source software just by being here. I have many reasons why I believe in open data and how it can help the world and help others, but put your own reason in there. And the third reason is money again. Data is really, really expensive, and your time is also really, really expensive. How much time do you spend finding data? I'd argue it could be a large portion of a project, whether you're creating a map or doing an analysis for a city government.

So that's what I came into when I started at Stanford. I was sold on this idea: let's do this right, let's work on preservation of this content. My job is to look at discovery of it: how do we discover and find this geospatial data? I'm a developer, and I just want to start writing code, right? Who's with me? Let's go grab a room, spend the next three days, and knock this out, because we can do this; there are really smart people in this room right now. I'm joking, of course. I have managers, and somebody signs my paycheck. We all have great ideas about how we want to do this, but there's this thing called design. There are a lot of great GIS and geospatial solutions out there, and we really wanted to take a step back and think about how we designed this
project, not just from the user interface and user experience side, but also for the stakeholders: the people deploying the software, maintaining it, and potentially contributing to it. So my talk, as it said in the description, is a little bit about the design process of this open source software project, covering not just user experience and user interface but also the project as a whole and its sustainability.

We embarked upon hours of user interviews. One of the starting points of this project was interviewing students and researchers (maybe you were one of the people I interviewed) and talking to experts: librarians, government agencies, GIS analysts, people with experience in the field. How do we solve this problem of GIS data discovery? What are the sticking points? We recorded a lot of the interviews and had a set list of questions, and we got a lot of great responses and feedback. I think we came to understand a lot about our users and stakeholders from that.

We also tried to do an exhaustive environmental scan. We don't want to recreate the wheel here, and we don't have the resources to recreate the wheel, but we want to understand what's out there and what's available. Given our user interviews and our institutional needs, is there a platform out there that we can just adopt and contribute to, one that already does everything we need? We went through open source solutions: what open source GIS data discovery platforms are out there? We looked at some closed source applications and what they do well. We looked at other geospatial-type search applications: what are some cool things in Foursquare, what does Craigslist do for their stuff? And so we built all these feature matrices,
looking at all the features a person doing a geospatial search could be looking for, what they need, and what's useful to them. We hashed all of this out and came to the conclusion that, for us, these tools were great but there wasn't anything out there that met our institution's needs. In the same process we found
that there wasn't really a metadata schema available that met these discovery use cases for us, discovery use cases that are focused on the users. So my colleagues Darren Hardy and Kim Durante wrote a paper during this design process called "A Metadata Schema for Geospatial Resource Discovery Use Cases", and it was published in the Code4Lib Journal this year. My slides are linked, so if you want to read this paper about the metadata schema, it's a great paper. I'll direct any questions about the metadata schema to Kim or Darren.

But we didn't want to recreate the wheel when doing this either. We didn't just go create yet another standard; we wanted to adopt and adapt the things that are out there and currently being used. So we used the controlled vocabulary from the FOSS4G Cat-Interop project. Is anybody here from that project? The schema also works with ISO, FGDC, and ArcGIS metadata, so we have conversions from all of those to this discovery schema. Really, the GeoBlacklight schema is a Solr schema. We also have conversions from the OpenGeoPortal project at MIT.

So we went into the next phase of the design process: rapid prototyping. We're looking at what's out there, we're talking to users and trying to understand their needs, we're building the schema in conjunction, and we're looking at what the institution already has experience in and what we can leverage. So let's do some rapid prototyping, see what we can get out the door during the design phase, see what might work, and put it in front of users. We did that, and we have a super-alpha version of it at geoblacklight.stanford.edu. Before you go check it out, though, keep in mind that it's a very, very early version.

With all this information coming in, we developed a group of personas around geospatial discovery for a higher education institution and what that looks like. How many people have created personas in their workflow? Yes, several people in here. This is really useful for us, and it's something our group is now doing for all our projects. Has anybody been involved in a project where there was scope creep? Yeah, OK. I think all of us have been involved in a project like that. So one helpful thing we're trying to do is implement this design process with persona creation, so we can all agree and have consensus on what we're moving towards, and we can iterate on it in an agile way. But also, when a stakeholder comes to us and says, "Hey, we want to go in this different direction now," we can say, "OK, that's a great idea, but we agreed on all these personas. We talked about this, we spent a lot of time developing use cases around these personas, so let's go back to that and see where this fits in." This is helpful for us with scope creep and with the project management piece of sustainability.

I'll go through these quickly. How am I doing on time? OK. This whole design document, which I've converted into a presentation, is available online. Brian Diaz is a professor of history and an experienced scholar. He's not a GIS user, but he wants to integrate GIS into his workflow and his research, and he wants to be able to point students to where they can find GIS data. That's really hard for him to do, to say, "Go here and you can find geospatial data that's going
to help you further your research" (insert your research topic in there). It's hard for him to be confident in recommending that to his students.

We also have Andrea Payne. She's a PhD candidate and an experienced GIS user. She knows how to find her data. She's not going to use some portal that we build for her; she just wants to use Google, because she already knows how to find the data. Then we have our next persona:
He's a new GIS researcher, a sophomore. This is your new GIS user, but he has also grown up with technology. He's got a smartphone, he's got a tablet, he's used to using Pinterest and Foursquare, and he expects the applications he uses to have a really great user experience, like the commercial applications he's used to.

Beverly Arnold is an Earth Sciences Librarian. She wants to create more access to geodata for her customers and patrons at the library. It's really hard for librarians like her to make decisions on what data to purchase when they don't have any numbers on who downloads what data or how much data is downloaded.

Then there is the GIS instructor at the university, who is also a lab manager and wears many different hats. He wants to be able to point students to a great resource for geospatial data in his trainings, and to point somebody to a persistent URL and have that URL still be there, with the dataset available to the students whenever they need it.

And we have our own web engineer, who can be a code contributor to the project. He's an open source advocate and wants to contribute to the project.

So these are the personas we're designing the software around. Through this process (the user interviews, the schema development, the rapid prototyping, the environmental scans, and the persona creation) we've developed a feature list: a pretty good set of features and functionality, with a lot of features and functionality removed, that we think will create a successful project.

So what did we learn through this whole process? Users are frustrated with finding geodata. We didn't interview one person who said, "You know what, I've never had an issue finding data; it's really easy for me." We heard things like, "When I do find it, I can't download it. Why can't I do this?" We heard a lot of
different reactions: "How do I use this?", "Why would I ever use that software?", "Is it supposed to work like this?" Have you ever heard that from a user: is it really supposed to work like this?

We also found out that the current software offerings just don't meet our institutional objectives, and I'll talk a little bit more about that. I work in the Digital Library Systems and Services group at Stanford, and we have a lot of expertise in certain software areas. The software offerings that were out there from the open source geospatial community just didn't line up with our objectives and our expertise.

We also found that spatial search is important for spatial data. You find spatial data with a map; imagine that. We found a lot of users for whom text-based search by categories and keywords didn't always work to find the things they needed. We have a lot of researchers doing research on, say, one village in Brazil, and they want to be able to pan over a map and see whether there's data available for it.

Another big thing that came from our stakeholders is focusing on discovery and leaving out analysis. There's a ton of amazing geospatial software out there (GIS software, web tools, applications, mapmaking and cartography software) that does a great job with analysis and things like that. We want to completely leave analysis use cases out of the software and focus only on discovery.

So we came up with a user feature list, with the software focused on those features, and we also came up with stakeholder features, things we're incorporating based on stakeholder needs. And of course, a FOSS4G presentation wouldn't be good without a map slide. As I mentioned before, there are our institutional priorities. Stanford, and to some degree I, by working
at Stanford, is part of a digital repository group called the Hydra Project. We have partners from all over the world; probably some of the institutions you work at, or attended, are Hydra partner institutions. We have core contributors to these different software projects: the Fedora repository project, Hydra, and Blacklight. We really wanted to leverage the expertise and knowledge that we have in house, so that we can sustain this project and add the functionality that supports our needs.
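
The spatial search finding described in the talk (users panning a map to see whether data exists for a place, rather than relying on keywords alone) can be illustrated with a small bounding-box filter. This is a minimal Python sketch under stated assumptions, not GeoBlacklight's implementation, which delegates search to Solr. The `Record` class, the sample titles and the coordinates are all hypothetical, and boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (west, south, east, north) in decimal degrees; a hypothetical convention
BBox = Tuple[float, float, float, float]


@dataclass
class Record:
    """A hypothetical, minimal discovery record: a title plus its extent."""
    title: str
    bbox: BBox


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when two boxes overlap or touch."""
    a_west, a_south, a_east, a_north = a
    b_west, b_south, b_east, b_north = b
    return (a_west <= b_east and b_west <= a_east
            and a_south <= b_north and b_south <= a_north)


def search(records: List[Record], viewport: BBox,
           keyword: Optional[str] = None) -> List[Record]:
    """Keep records whose extent overlaps the map viewport.

    The keyword is optional, so a user can find data by panning the map
    without knowing the right search terms.
    """
    hits = []
    for record in records:
        if not intersects(record.bbox, viewport):
            continue
        if keyword and keyword.lower() not in record.title.lower():
            continue
        hits.append(record)
    return hits


# Made-up sample records, for illustration only
SAMPLE_RECORDS = [
    Record("Village boundaries, northern Brazil", (-50.0, -3.0, -48.0, -1.0)),
    Record("California roads", (-124.4, 32.5, -114.1, 42.0)),
    Record("Brazil rivers", (-74.0, -34.0, -34.0, 5.0)),
]

# A small viewport over northern Brazil
VIEWPORT = (-49.5, -2.5, -49.0, -2.0)

if __name__ == "__main__":
    for hit in search(SAMPLE_RECORDS, VIEWPORT):
        print(hit.title)
```

Making the keyword optional reflects the point made in the talk: the map alone should be enough to surface what is available for a place.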
GeoBlacklight is the project I'm talking about. It's a discovery application for geospatial data. It's built off Blacklight, which is an open source Ruby on Rails gem that provides a discovery interface for Solr. It also builds on the OpenGeoPortal community. OpenGeoPortal has done an amazing job of building a discovery application for geospatial data and of building up a community around metadata sharing. We want to leverage both of those projects rather than starting from scratch.

One of the things we're also doing is leveraging different types of services in our organization. This concept of preservation of geospatial data is built around the Fedora repository software, with geospatial connectors from the GeoHydra project. When we're accessioning digital content into our library, we do a whole bunch of things to the metadata and we put the content into our preservation system in Fedora. So now all the geodata that we're bringing into the Stanford University Libraries is also being preserved. It's backed up to tape in three different places, and we're working on a project right now where it will be shared
between three other universities, so that if California falls off the face of the earth, this geodata will still be available for the long term.

We also serve OGC web services from all the data that's in the repository. For that we use a clustered OpenGeo Suite deployment: GeoServer, GeoWebCache, PostgreSQL, and PostGIS.

The GeoMonitor project was spun out of our user interviews. One thing we found out was that a lot of the time, when users wanted to download data, they couldn't actually get it. So we created GeoMonitor, which is a WMS monitoring service. Have you ever tried to go to a WMS server and it was down and you didn't know? This is a Rails app that monitors WMS endpoints. We check all of Stanford's WMS services every hour, and we check the other OpenGeoPortal consortium members' services every two days, to make sure that the endpoints we're giving to our users are actually available and users can get at the data.

I've posted links to these different projects and all the GitHub repositories. We're really excited about the work we're doing in this space, and we want to open up this opportunity. We're working with a lot of other institutions and libraries, but if there are any other partners, people, organizations, or governments who would like to work on this together, we would love to hear from you.

I don't think I really have time for demos, but quickly, the current status of the project: we've completed this design phase. There are a couple more UI/UX things that we want to work on. Right now we're scheduling our code sprints; we're working with a couple of partner institutions, trying to figure out when we're all going to be able to work together on it. So that's the current status of the project. Thanks. Questions are welcome.

Audience member: You're indexing metadata about geospatial resources, so it can be a catalog where people come to find what they're looking for. Like,
is the scope wide open? If some government agency or whoever has data, could you potentially add their metadata as well, or is it strictly limited to Stanford's and other universities' holdings?

Jack Reed: Yeah, we don't want to recreate all this metadata from all these organizations. We don't need 50 organizations hosting the TIGER/Line data. We just want to leverage the fact that Harvard already created the metadata for it and is serving it in a great way; we just want our users to be able to access that. So I think it's for an institution deploying GeoBlacklight to decide what they want in their instance of it. So yes, it's open to anything, beyond higher education institutions.

Audience member: OK, but what it points to, say Harvard, isn't running GeoBlacklight. It's unlike, say, Google, which points to URLs to help you find stuff on the Internet, wherever it is on the Internet. I'm using this analogy to try to figure out: how do you point to resources that don't even know what GeoBlacklight
is? Or does it only know about what's in it?

Jack Reed: Yes, we can point to resources that don't even know what GeoBlacklight is, through the GeoBlacklight schema in Solr, and we also point to our own resources. So essentially it can
take WMS endpoints, or an IIIF endpoint if that is a use case for some organizations, or even static download points.

Audience member: Great, thanks.

Audience member: What's the underlying search engine?

Jack Reed: We're using Solr as the underlying search engine for GeoBlacklight.

Audience member: OK. So does it use semantic-based search? What are its characteristics compared with, say, a Google-style engine? Can you tie that in with some sort of faceted navigation, or with some sort of semantic-based search? A good example that is common when we talk about geospatial searches is when a researcher searches for "the Big Apple": that's associated with New York, or New York City, so it's about knowing that those two things are the same thing. Or knowing, for example, that there are multiple Manhattans: Manhattan, Kansas; Manhattan, Montana; Manhattan, New York.

Jack Reed: Yeah, OK. That one is kind of down the pipeline. We're looking at that as a library in a larger context, because we want to integrate with something like GeoNames in a linked data way, and then we would have those types of searches available. But currently our infrastructure doesn't do that. Thank you.
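
The WMS monitoring idea described in the talk (check each endpoint on a schedule and record whether it answers) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the GeoMonitor code, which the talk describes as a Rails app. The `Endpoint` class, the helper names and the example URL are hypothetical; the only WMS-specific facts relied on are the standard `SERVICE=WMS` and `REQUEST=GetCapabilities` query parameters and the fact that a capabilities document has a root element whose name contains "Capabilities".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def capabilities_url(endpoint: str) -> str:
    """Add the standard WMS GetCapabilities parameters to an endpoint URL."""
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))
    query.update({"SERVICE": "WMS", "REQUEST": "GetCapabilities"})
    return urlunsplit(parts._replace(query=urlencode(query)))


def http_fetch(url: str) -> bytes:
    """Default fetcher: a plain HTTP GET with a timeout."""
    with urlopen(url, timeout=10) as response:
        return response.read()


@dataclass
class Endpoint:
    """Hypothetical bookkeeping for one monitored WMS endpoint."""
    url: str
    interval: timedelta                  # e.g. hourly for local services
    last_checked: Optional[datetime] = None
    available: Optional[bool] = None


def is_due(endpoint: Endpoint, now: datetime) -> bool:
    """An endpoint is due if never checked or its interval has elapsed."""
    if endpoint.last_checked is None:
        return True
    return now - endpoint.last_checked >= endpoint.interval


def check(endpoint: Endpoint, now: datetime,
          fetch: Callable[[str], bytes] = http_fetch) -> bool:
    """Request the capabilities document and record the outcome."""
    try:
        body = fetch(capabilities_url(endpoint.url))
        # Crude availability test: the response looks like a capabilities doc.
        endpoint.available = b"Capabilities" in body
    except OSError:  # urllib's network errors and timeouts subclass OSError
        endpoint.available = False
    endpoint.last_checked = now
    return endpoint.available
```

Passing the fetcher in as a parameter keeps the scheduling logic testable without network access; a real monitor would also want to store a history of results rather than only the latest one.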

Metadata

Formal metadata

Title Fixing GIS Data Discovery
Series title FOSS4G 2014 Portland
Author Reed, Jack
License CC Attribution 3.0 Germany:
You may use and modify the work or its content for any legal purpose and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they have specified.
DOI 10.5446/31700
Publisher FOSS4G, Open Source Geospatial Foundation (OSGeo)
Release year 2014
Language English
Producer Foss4G
Open Source Geospatial Foundation (OSGeo)
Production year 2014
Production place Portland, Oregon, United States of America

Content metadata

Subject area Computer science
Abstract Discovery of GIS data is broken. It is overly complicated and incomplete. Organizations spend time and money on creation and acquisition of data, yet it sits on hard drives, DVDs, and shelves without a straightforward way for others to discover it. Discovery tools that do exist have usability issues which alienate users and prevent wide adoption. Too often, data discovery is an afterthought, grafted onto tools that have been designed for analysis, or treated as one feature among many in a map portal. Thus these tools attempt to serve every possible user need and in the process become unusable. Simply put, we need an application which enables discovery of GIS data with an emphasis on user experience, integrates seamlessly with other tools, and streamlines the use and organization of geospatial data. We present GeoBlacklight, a collaboratively designed and developed open source software project focused on discovery use cases. The project builds on existing, widely adopted open source projects. GeoBlacklight fills the gap in discovery tools for geospatial data by providing a simple yet powerful and intuitive interface. To reach this goal, Stanford University embarked upon a comprehensive design process. Our process includes an environmental scan, stakeholder interviews, user interviews, inter-institutional collaboration, and rapid prototyping. We will present the user personas that have been distilled from our interviews, the user stories and feature prioritization process these inform, and prototypes of the software we have developed so far, as well as plans for future development.
Keywords discovery
data
open-data
user experience
accessibility
