Adding value to Open Data using Open Source GIS.
Formal Metadata
Number of Parts | 188
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/31647 (DOI)
Production Year | 2014
Production Place | Portland, Oregon, United States of America
Transcript: English (auto-generated)
00:00
New Zealand is a dateline country. If you saw the presentation this morning on D3, there are tools now that work across it seamlessly. For many years GIS hasn't, and for 14 years ESRI has failed to address that for us. It is an example of how, if you release open data and your data includes spatial data,
00:20
there are issues that you need to address, that users have to be aware of, and that the tools they use have to be aware of, to be able to work with these data. And spatial data, as everyone here should know, has some fairly special characteristics. A quick look at the other example there: inside or outside of a polygon is fairly straightforward,
00:41
isn't it? Inside or outside of the equator? And it's a legitimate question and users are confused. So we need to address those sorts of issues. What is NIWA? So we're gonna cover first a very brief overview,
01:00
a look at open data, a look at open source, enabling open data, and then what we've done working with the open source community to add value to our open data initiatives. So if anyone's interested in a birds of a feather type topic, discussing this sort of thing, let me know, we'll try and get something organized. The birds of a feather opportunities are here
01:21
and we should make use of them if we can. So NIWA, New Zealand's National Institute of Water and Atmospheric Research, undertakes a wide range of atmosphere and water-related environmental research in New Zealand. It's government-owned, but it operates as a commercial business, operating under
01:41
some legislation, the CRI Act. Open data itself, we would all have a fair idea what that is. It's a global phenomenon; INSPIRE is driving it in a big way in Europe, and most countries now have some sort of data.gov.whatever website.
02:00
The focus is on deriving economic value from taxpayer-funded data, which is leading to the democratization of data. The data is no longer available only to a specialized group; it's available to anyone to play with. For your data to be useful, you need it to be standards compliant, it needs to be able to be used with other people's data,
02:21
other countries' data, other institutions' data. So a federation at a wide range of levels is pretty much a requirement. It also allows citizen science not just to reuse your data but to re-provide added value products, crowdsourcing not of raw data but of derived products.
02:43
And the concept in this space of vendor lock-in is anathema, it has to be completely open and available to everyone to reuse. The open data philosophy is quite new compared to open source. There are a number of issues in common. One of the ones that I run into most frequently is,
03:02
if I let you see my data, you'll see all the mistakes. The open source community learned about sort of fixing bugs in code by releasing it many years ago, and the open data community can learn from our experience. So the goal is that data and data reuse become pervasive, not an occasional exercise
03:23
but a standard operation. Go way back to the 70s, those of you who remember that far back: you used to write your papers longhand and give them to someone else to type in on a word processing machine. With democratization, pervasiveness and personal computers, people now do their own word processing.
03:40
They do their own spreadsheets and you no longer have a mainframe running Minitab, SAS, et cetera. Desktop publishing is no longer the domain of a few experts, you do it on your desktop, and GIS is heading down the same path. QGIS, I think, is the tool above all that is leading the way down that path.
04:01
Open data itself is a tripod: it's sitting on three legs, and if any one of them isn't there, it will fall over. People have to be able to find data and determine its fitness for their purpose. They need to be able to get the data, and the data has to be available under a license
04:21
which allows them to reuse it. And once they've got the data, they have to be able to do something with it. And I found that as part of our open data program, that's what I'm trying to address at this stage. The New Zealand background, in the mid 2000s,
04:42
the government produced a several hundred page guideline that was too big and cumbersome for anyone to use, but the concepts were really good. And that was the e-government interoperability framework. In 2009, the open data license was published. In 2011, the legislation guiding government agencies
05:02
to release data was also enacted. So at the moment, there are few guidelines on how to implement those directives, and the agencies involved are discussing and working through those issues themselves. We are focusing very much on an OGC-based capability:
05:25
Catalogue Service for the Web (CSW) for discovery, WMS and WFS for spatial data delivery, and SOS for time series. The one thing we found is missing in the framework is the idea of a vocabulary service within the OGC.
05:42
And we are working on embedding a SKOS vocabulary service, through some work done in Australia on the SISSVoc capability, for a vocabulary service to work with the OGC services. We're new efforts in the scheme. It's a research institute.
06:01
Lots of scientists have been playing with their own tools, developing their own tools for many years. So culturally, it's not difficult to get open source working there. Getting it accepted as part of a core philosophy and strategic direction for the organization has been a different story, but we're getting there. For the past couple of FOSS4Gs,
06:22
I've been able to present on the use of open source tools to provide open data, to support an open data program. And that list describes the tools that we're using and I've presented on those previously. This year, it's a slightly different focus.
06:43
As well as the tools we're using to provide and support our open data initiative, this year I'm talking about how we're adding some value to this, which is the third leg of the tripod: if you're giving your data away
07:00
and users don't have access to tools that allow them to use it to best advantage, you aren't helping your own open data program. You're doing it a disservice. And if users don't have the tools, the data's largely irrelevant. So essentially, in support and as part of our open data program,
07:22
we have an open source GIS program, delivering the tool, a modified version of QGIS, and running training workshops, et cetera, to encourage people not just to use our data, but to have a tool that shows them how to do it, and to train them on how to do that. So we have an enhanced QGIS tool.
07:44
The next slides I'll race through. Essentially we have a catalog of web services in New Zealand. We have a plugin for QGIS that talks to that catalog, and users use that tool to find data from NIWA plus a number of other agencies in New Zealand
08:02
using OGC services to provide their data. It was developed by Sourcepole in Switzerland. It now harvests a list of services from a catalog via CSW. It allows anyone to enter such a catalog URL
08:20
into the tool, so you can use it with your own catalog; it's not limited to ours. The tool allows users to connect to the catalog to identify a service, to connect to that service to identify its layers, to add any layers to a local favorites list, and also to cache layers locally for offline use.
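As a rough illustration of the harvesting step described here, the sketch below parses a CSW GetRecords response and lists the services it advertises. The sample XML, the example.org URLs and the function name `list_services` are invented for illustration; this is not the plugin's actual code, and a real catalogue response carries far more detail.

```python
import xml.etree.ElementTree as ET

# Namespaces used by CSW 2.0.2 responses that return Dublin Core records.
NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response standing in for a real catalogue reply.
SAMPLE_RESPONSE = """<csw:GetRecordsResponse
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <csw:SearchResults numberOfRecordsMatched="2">
    <csw:Record>
      <dc:title>Example rainfall WMS</dc:title>
      <dc:type>service</dc:type>
      <dc:URI protocol="OGC:WMS">https://example.org/wms</dc:URI>
    </csw:Record>
    <csw:Record>
      <dc:title>Example coastline WFS</dc:title>
      <dc:type>service</dc:type>
      <dc:URI protocol="OGC:WFS">https://example.org/wfs</dc:URI>
    </csw:Record>
  </csw:SearchResults>
</csw:GetRecordsResponse>"""


def list_services(csw_xml):
    """Return (title, protocol, url) for every URI found in the records."""
    root = ET.fromstring(csw_xml)
    services = []
    for record in root.iter("{%s}Record" % NS["csw"]):
        title = record.findtext("dc:title", default="", namespaces=NS)
        for uri in record.findall("dc:URI", NS):
            url = (uri.text or "").strip()
            services.append((title, uri.get("protocol", ""), url))
    return services


for title, protocol, url in list_services(SAMPLE_RESPONSE):
    print(protocol, title, url)
```

A client along these lines only needs the catalogue URL as input, which is why the same tool can be pointed at any CSW catalogue.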
08:41
And for WFS that's fairly straightforward; for WMS the tiles are resampled and downloaded and cached locally. So people can actually use WMS backgrounds from a WMS service on a local laptop. The catalog's a GeoNetwork instance
09:01
and for those of you who don't know, GeoNetwork is able to harvest metadata from other catalogs, and that includes WMS and WFS services: it can harvest the service metadata and the layer metadata for each layer from those services. So it's a standard metadata catalog,
09:24
the records can be harvested or entered manually and that describes the layer or the service that's available. Individual layer described there. What the tool allows you to do is you can view the various services available,
09:46
connect to one service to identify the layers, select a layer and a variety of options. You can add it to your favorites list,
10:01
you can add it to the map, you can switch it between online and offline, and at the bottom, as you can see, there's a toggle which allows you to just view favorites or not. So once you have a set of layers set up from any servers anywhere in the world, you just have them available in that one list.
10:21
The WMS download is obviously problematic. You don't want to sort of try and download seven terabytes of WMS tiles to your local laptop. So the dialogue allows you to pick the resolution you want to work with to get your job done offline. As far as I'm aware, this is the only offline caching of a WMS service
10:40
I've come across. We have a number of other plugins we're working on. Again, allowing users to do things with the data that we provide. So it's all part of our open data initiative. For people who want to design surveys, it allows them to do that.
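To make the resolution trade-off for the offline WMS cache concrete, here is a back-of-the-envelope sketch. It assumes square 256-pixel tiles, a projected bounding box in metres and an average of 20 KB per tile; all three are illustrative assumptions, not figures from the plugin.

```python
import math


def tile_count(bbox, metres_per_pixel, tile_px=256):
    """Number of tiles needed to cover bbox = (minx, miny, maxx, maxy) in metres."""
    minx, miny, maxx, maxy = bbox
    tile_span = metres_per_pixel * tile_px  # ground distance along one tile edge
    cols = math.ceil((maxx - minx) / tile_span)
    rows = math.ceil((maxy - miny) / tile_span)
    return cols * rows


def estimated_megabytes(bbox, metres_per_pixel, kb_per_tile=20):
    """Rough download size, assuming a fixed average tile size (an assumption)."""
    return tile_count(bbox, metres_per_pixel) * kb_per_tile / 1024


# Hypothetical 256 km x 256 km working area.
area = (0, 0, 256000, 256000)
for resolution in (100, 10):
    print(resolution, "m/px:", tile_count(area, resolution), "tiles,",
          round(estimated_megabytes(area, resolution), 1), "MB")
```

Each tenfold improvement in resolution multiplies the tile count by a hundred, which is why the user is asked to pick the coarsest resolution that still gets the job done.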
11:01
The SCP tool was based on a tool that we developed with CCAMLR for the Antarctic, for designing marine protected areas in the Ross Sea. And we're also working now more on sensor-based data for time series work and funding some developments
11:23
in the Map Composer for maritime maps. As a general principle, because of the role we now have with open source tools, both for providing open data and for encouraging users to use open data,
11:40
our upper management has now actually recognized the value of open source and is committing funds out of its normal IT budget to support and enhance those tools. And that's a fairly significant change for an organization that's been one of New Zealand's main ESRI customers for 20 years.
12:00
As I mentioned initially, 180 degrees has been an issue for us. We also, as a maritime organization, work quite a lot with GMT. That's not an OGC tool, sorry, OSGeo tool, but it is a very capable open source GIS tool.
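The 180-degree issue can be shown in a few lines. The sketch below is a minimal illustration, not how any particular GIS handles it: it compares the longitude extent of a set of points in the usual -180 to 180 framing with the extent after shifting to 0 to 360, and keeps the narrower one. The function names and sample coordinates are hypothetical.

```python
def to_0_360(lon):
    """Map a longitude from the -180..180 range onto 0..360."""
    return lon % 360.0


def lon_extent(lons):
    """Return (west, east), choosing whichever framing gives the narrower span."""
    plain = (min(lons), max(lons))
    shifted_values = [to_0_360(lon) for lon in lons]
    shifted = (min(shifted_values), max(shifted_values))
    if shifted[1] - shifted[0] < plain[1] - plain[0]:
        return shifted
    return plain


# Hypothetical survey points straddling the dateline east of New Zealand.
points = [178.0, -179.0, 179.5, -177.0]
print(lon_extent(points))
```

In the plain framing these four points appear to span almost the whole globe; after the shift they span five degrees, which is what the data actually covers.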
12:22
The aspect that became obvious once we started providing QGIS to people was that they could get our data, they had a tool, but they weren't totally sure how to use it. And so we are now, in conjunction with other QGIS-supporting groups around New Zealand, running training workshops for people.
12:43
There's one being run next week, and next week should see the incorporation of the New Zealand QGIS User Group. The focus we're taking has been that the ability of you to use data empowers you,
13:03
whether you are a Maori indigenous group, whether you are a local conservation group, the ability to have these tools and the data, not just NIWA's, but everyone else's as well, is quite empowering. So we're expected to provide data.
13:21
The government has said so. They've also said everyone else is supposed to do it as well, and there's no point in doing it unless we do it in a consistent way. Reuse requires that consistency. QGIS is a core component of our strategy, and that's been extremely successful.
13:43
I've got two very good indicators of how successful it's been. We initially had a version that didn't use a catalog as a backend, it had a website that it scraped the URLs from for the web services, and a website upgrade, NIWA's website upgrade broke that.
14:01
And the good news from my perspective was the number of complaints I got from people I had no idea had downloaded and used this tool, who said, hey, it's not working. Generally, that's a negative. In this case, it actually gave me some feedback to let me know what was actually happening. The other thing that was very positive was when I kicked this thing off, I was frantically chasing up all the councils
14:20
around the country saying, can I add your data to our list? And as of the last six months, I'm now getting agencies coming to me saying, can you add our services to your list? So to me, both from the perspective of users who are using the tool, and from a provider's perspective,
14:41
they're seeing this as valuable in the open source and open data arena. So it was on time. You have the mic here?
15:00
Any questions? Are you the guys who made the plugin for SOS service to QGIS, or this is basically a fork of QGIS, if I understand it correctly? Are there any plans to make a regular plugin out of it, and have it in the plugin repository of QGIS?
15:22
We have lots of plans. The initial intent was for, who's familiar with SOS? Is anyone here not familiar with SOS? Essentially, it does for time series data what WFS does for spatial data. And if you're dealing with hydrometric or climate stations or anything like that, you obviously have long time series of data.
15:42
It's at a point. At FOSS4G in Barcelona, there was a presentation from 52°North, who develop an SOS server, on the use of WFS as a discovery tool for SOS.
16:03
A number of SOS vendors have recognised the lack of a good discovery capability within the SOS specification, and have created their own extension to the standard, which is a GetDataAvailability call. And that, to some extent, does the same thing as WFS.
16:24
And that was discussed, again, in another presentation at FOSS4G in Nottingham. The problem is now that when we try and build a generic client in QGIS that will talk to all these different services, they all operate in slightly different ways. Some use the WFS service as a discovery tool,
16:40
some use GetDataAvailability, and we have not been able to develop a generic tool that's able to talk through all the idiosyncrasies of the SOS implementations. So we'd love to, but we can't yet. The intent is that GetDataAvailability will become part of the OGC spec. Once it's part of the spec, we can then support it,
17:02
and there will be a standard way. At the moment, there is a working group in New Zealand which is looking at this issue, and we are likely to use WFS in the short term, simply because for OGC compliance, we can't use GetDataAvailability, because it's not part of the OGC specification.
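The two discovery routes described in this answer can be sketched as plain key-value requests. This is only an illustration of why a generic client is awkward: the endpoint URLs and the WFS feature type name `stations` are hypothetical, and real servers differ in the parameters and versions they accept.

```python
from urllib.parse import urlencode


def discovery_request(endpoint, method):
    """Build a KVP discovery URL for one service.

    method is "gda" for the GetDataAvailability extension to SOS,
    or "wfs" for a WFS GetFeature call against a station layer.
    """
    if method == "gda":
        params = {
            "service": "SOS",
            "version": "2.0.0",
            "request": "GetDataAvailability",
        }
    elif method == "wfs":
        params = {
            "service": "WFS",
            "version": "1.1.0",
            "request": "GetFeature",
            "typeName": "stations",  # hypothetical layer name
        }
    else:
        raise ValueError("unknown discovery method: %r" % method)
    return endpoint + "?" + urlencode(params)


# Hypothetical providers, each needing a different discovery route.
providers = [
    ("https://example.org/sos", "gda"),
    ("https://example.net/wfs", "wfs"),
]
for endpoint, method in providers:
    print(discovery_request(endpoint, method))
```

A generic client has to know in advance which route each provider supports, and then parse two different response formats, which is the idiosyncrasy problem raised above.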
17:21
Okay, thank you. So is anyone else here doing anything similar? Are you part of an open data release, or are you more of the user community? I could probably...
17:49
Yep, yep. One of the big reasons we went for the OGC approach, I've left plenty of time for questions and comments.
18:01
If you go for an FTP download, then you have the whole issue of metadata and everything else as well. If you go for an OGC approach, then you have information delivery and metadata capability, and we've gone so far as to make, and as a research organisation,
18:21
scientists cannot sign off on a project as complete until the metadata and data are all into the corporate system. So there's the stick. If you don't behave, you cannot close off your project. The carrot is that scientists traditionally get credit for publications. This is the first year where performance assessment
18:42
includes credit for data publishing. So I jumped up and down and said, if they're gonna give me this hat and make me do this stuff, then they have to sort of make it part of the institutional process, and it actually wasn't a battle at all, they agreed, which was cool. The other aspect is the point of making the data
19:04
available is to have it reused. If you make it available via a web service, the clients, your clients don't have to manage the data, download the data, update it when you do it, et cetera. Their client accesses it live. If you update your data, their client shows your new data on the map.
19:22
If you have an error and you fix your error, the error isn't still on their system being shown on their map. The other thing: the first one we set up was some years ago, and it took two weeks before some, technically they're competitors as much as collaborators,
19:42
but they were using the data from our web services in their own websites. So we had our own websites, all the nice branding, et cetera, so we had that as a showcase, but the data was being reused by others almost straight away. And that, it's essentially just a client.
20:01
The service is there. The analogy I use for people who don't know what a web service is has been: if you open a website in your browser, the website doesn't care if you're running a Mac or Linux, or Firefox or Internet Explorer or whatever. It takes some data off a URL and puts it on the screen in a nice way that you can understand.
20:20
And these web services do the same thing with spatial data. So you have QGIS, you have ArcGIS, you have uDig, you have whatever, and it does exactly the same thing as your browser, but it's map data. And it seems to be, I'm not always a great fan of analogies, but that one seems to work.
20:42
Thanks for the talk, Brent. I was wondering, you talked about it a little bit, but just judging by some of the plugins you talked about that are sort of in development with QGIS and also some of the web mapping services and other delivery products of the open data,
21:01
I sort of started making some guesses around users and you addressed this a bit when you talked about the feedback, but I was wondering what your experiences have been on the other side of sort of local council and local government when they've come to you and said, can you take our data? What do they usually expect back? Are they the ones who want QGIS plugins
21:23
for their GIS shops? Are they the ones that want web mapping services that really provide no analysis? Most of them are struggling to cope internally, let alone externally. What they need, New Zealand is a country of 4 million people,
21:42
so resources are limited financially and people-wise. Councils do not have a good history of collaborating. I mentioned there that one of the issues we have is a lack of a good vocabulary service. If you have 27 councils and every one of them
22:01
calls their rainfall record something different, like 24-hour rainfall, rainfall 24-hour, precipitation, whatever, and you have a nice common web service that delivers these things, how do you actually work out that the 24-hour rain from this one is the same as precipitation from that one, so you can treat it as the same value? And in the sensor arena, it gets worse
22:22
because your 24-hour rainfall is quite often a cloudburst dump that might take an hour and a half. Now an automatic station, I was talking to someone from this area, their new automatic stations are running to UTC midnight. Their old automatic stations are running on local time.
22:42
Manual stations are a bucket that someone goes and reads and they may get there at nine in the morning, 10 in the morning, or whatever. So your 24-hour rainfall, trying to associate with radar data, it's interesting. And the minute you put that data out there for people, you don't want to foster misuse
23:01
by allowing them to use the data inappropriately. You have a responsibility to try and make sure that the information is there, so if they're going to use it, they do so usefully and robustly. And that's as much for your reputation as anyone else's. "We got this from NIWA and it says this": that's not a path we really want to go down.
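The vocabulary problem in this answer comes down to two checks before values from different providers are treated as the same thing: do the local names map to one shared term, and do the observations cover the same 24-hour window? The sketch below is illustrative only. The mapping table, the shared term `rainfall_24h` and the `day_ends` field are hypothetical stand-ins for what a real vocabulary service and station metadata would supply.

```python
# Hypothetical local names mapped to one shared term; a real vocabulary
# service (for example one based on SKOS) would supply these mappings.
LOCAL_TO_COMMON = {
    "24-hour rainfall": "rainfall_24h",
    "rainfall 24-hour": "rainfall_24h",
    "precipitation": "rainfall_24h",
}


def normalise(name):
    """Return the shared term for a local parameter name, or None if unknown."""
    key = " ".join(name.lower().split())
    return LOCAL_TO_COMMON.get(key)


def comparable(rec_a, rec_b):
    """Records are comparable only if the term and the daily window both match."""
    term_a = normalise(rec_a["parameter"])
    term_b = normalise(rec_b["parameter"])
    same_term = term_a is not None and term_a == term_b
    same_window = rec_a["day_ends"] == rec_b["day_ends"]
    return same_term and same_window


new_station = {"parameter": "Precipitation", "day_ends": "00:00 UTC"}
old_station = {"parameter": "Rainfall 24-hour", "day_ends": "00:00 local"}
print(comparable(new_station, old_station))
```

Here the two names resolve to the same term, but the records are still not comparable because one day ends at UTC midnight and the other at local midnight.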
23:24
Does that help? Yeah. Yeah, what we have found is the councils are now looking at this as leadership and proof of concept and are recognising that it's an extremely cost-effective way of doing this sort of thing.
23:43
Councils are not required to make their data available under legislation, they're encouraged to, which was a recognition of their limited resources more than the government goal. It was also a recognition that the only ones that were required to were the core government agencies because that was all the monitoring agency was able to sort of monitor for the first part as well.
24:03
So the whole process was starting top down. But yeah, what it has kicked off is an interest across New Zealand from agencies of all sorts to come up with a national best practise. And the fact that the bigger agencies have been doing it longer
24:21
and we have an international best practise with OGC. We're tending to model our New Zealand best practise on that. The exception is going to be the addition of a vocabulary service that we will have to have before it becomes part of an OGC spec. Yeah, I had a question about cascading your services.
24:44
You said earlier that people were taking your service and kind of rebranding it. Are you doing anything to prevent that? And also, you're not? You think you ever would? If we give the data away as free data, the only thing we have on it is a CC BY licence.
25:04
And therefore, somewhere in their website, on the back page in small print beneath the screen, where no one will ever see it, it will say data supplied by NIWA. And that's all they're required to do. Where we have commercial data, we get paid for it. Where we're giving it away for free
25:20
as part of the government initiative, it's there. Take it, do what you want with it. That's not popular with all parties in NIWA. The branding issue is still seen as an important one. But equally, if we can say we are doing it
25:40
and people are doing this with it, and document that, then we are meeting our obligations under the Act without generating money or anything for it, that's where the monitoring of that sort of use kicks in. And as long as people are doing it, we know they're doing it, that's good. It's when they're doing it and we don't know that we can't sort of get the kudos for the public good that it's providing.
26:01
We need to see some value, whether it's public good or financial or whatever out of it, to some extent, because we have to operate as a commercial business. But essentially, if it's free public data, it's free public data. There are data sets that we actually do sell commercially, and that's dealt with quite differently.
26:21
So. Thank you.