The Past, Present, and Future of Government Data Publishing

Formal Metadata

Title
The Past, Present, and Future of Government Data Publishing
Alternative Title
The future of open data
Title of Series
Number of Parts
50
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The future of open data
Transcript: English (auto-generated)
Thank you all for coming. It is the last talk before lunchtime, so we'll keep the energy high. I'm here today to talk about the past, the present and the future of government data publishing. I'm the Customer Experience Manager at Koordinates. I've spent 10 years or so in geospatial.
For about five years I was a geospatial analyst at the Department of Conservation, doing all conservation work. The next five years I spent working mostly in geomarketing.
At university I studied environmental studies and economics, because those two things were my interests. Economics had this thing called externalities, and it just seemed like a little side thing that they talked about. You work out the economics and do the thing you want to, and externalities are a thing that you don't worry about.
And I was like, oh, that's the environment. Let's bring that back in. Why I got so into geospatial was because I felt that with geospatial data and modelling we could actually start to have conversations about what those externalities are, and make sure that when we create and look after new products and businesses we take communities and the environment into account.
So that's how I got into geospatial and why I'm still here. At Koordinates our goal is to realise the potential of geospatial data. Why I'm so passionate about this, and why I've stayed in the industry, is that I feel all these geospatial data assets are out there but they tend to be locked away.
As an analyst working for ten years doing all sorts of modelling across a huge range of businesses, projects and environmental things, I've found it so challenging to create a full picture of the scenarios that people want to know about, because of a lack of access to data,
and a lack of value placed on creating data sets that can communicate all facets of an environment, whether it's a physical environment or a business environment. So our company is a geospatial data publishing platform.
We have users, who are professionals or people that come to take data sets to put into their projects. And we have publishers, who are folks who publish the data out and who might have huge amounts of it in their organisations. To give you an example of the frustration I've had throughout my ten years, this is the most recent one I have.
About three years ago, because I was working in business but my passion is the environment, I decided to work four days a week instead of five.
And Friday is nature day. On nature day I volunteer on conservation projects. That's one reason I'm enjoying this conference so much: everybody here, all the people I'm meeting, have their side hustles, they've got their jam. You all have things that you're passionate about and really into, and you're fitting them around paid employment.
Whether that's working for a large agency, finding that within your work and giving your free time to open source, or, for those people who have gone freelance and are doing contracting, figuring out how you want to spend your time, basically.
So, Cloudbreak. I had some friends who work for a charity called Sustainable Coastlines, which is all about reducing single-use plastic but also picking up trash on the beach. And they're really passionate surfers, like, they love surf.
So who here has heard of Cloudbreak? Oh awesome, awesome. Cloudbreak is one of the top surf breaks in the world and it's in Fiji. It was managed by American resorts for a long time and then it was opened up. Suddenly local businesses flourished, taking people out to this break so that they could surf it.
But that meant quite big pressures on the break as well, because you've got people dropping anchors and all these boats out there. And they're also starting to be really aware of what climate change could do to a reef that is coral. So my friend said, oh, do you want to help us out and do some mapping for us?
And I was like, yes I do, trip to Fiji. And I thought, well, I'll just do some research before I leave and figure out what data sets I can get to kick this off, because I don't know if they really need me. Someone over there I'm sure could do it. So I went online and I managed to find an atlas of all
these marine reserves, and it said there was this massive marine reserve already there. So I printed out a map and took it to the locals and they were like, no, no one's ever spoken to us about this. There is no way this marine reserve already exists. So it was like, oh okay, the one thing I've found that was authoritative is completely wrong.
And I couldn't find any topographic data. So by chance I rang up some friends in New Zealand and said, oh hey, you guys work in Fiji. I'm going over on these dates, can you help me out with some topographic data? And they happened to be there at exactly the same time, for a week, working with the Ministry of Lands and Mineral Resources.
I'm just using this as an example, but I have so many examples in New Zealand leading up to this one of exactly the same kind of story. So my friends were in Suva, I was around at Cloudbreak, and it took them a good four days of working with the Ministry to say,
look, can you please find a licence and can you hook us up with some data to take to our friend so that she can do this work? It took them a few days to get it downloaded in a format that I could use and get a licence associated with it. They put it on a USB, drove all the way around the island, and then I took them out to the break and we went to North Glen and it was amazing.
And I created this map. So it was a complete mission. And yeah, that happened many, many times in New Zealand. All my career it tended to be based around who I could phone up to say, first of all, do you have this data, and do I need to call somebody else, like a policy analyst, to get you permission to share it with me?
So all that roundabout missioning is where we were, and we're mostly out of that now. As for how that space has evolved: first of all, Creative Commons licensing has been awesome, and I feel people are a lot more socialised about how to use the licences and what they mean.
There's some awesome YouTube videos to explain what they are. I remember when they came out I was like, wicked, licence, Google it, legal text, not getting it. But now they have quite common uptake. And we also have an open access licensing framework, and the policy has kind of filtered down right through government agencies now.
So that's kind of settled in. It has left us in a space where we have quite a lot of data available but it's not that usable. And when it is out, it might be out only in a viewer.
So for starters you're like, wicked, this data exists, I can see it. But you can't necessarily take it and put it into your system and realise the value of it by using it, for one. And the second part is that you can't embed it in your own businesses and processes, because you don't really know how frequently it might be updated or what the quality of it is.
And there may not be enough people using it to create enough demand for the agency to prioritise caring for this data. This is where we come to the concept of dormant data. This is data that exists, so it might be in a little zip file somewhere on the internet, but it's not realising its potential.
I just want to go quickly through these ideas. First, we have data that requires technical expertise to access. Someone asked the question in the last talk about whether journalists are using data sets.
And it's people like that. Is it discoverable for those people? How would they even know it exists, or who might hold that data, to even go and find it? Again, the data sets are view only. With a lot of things I've seen online, I'll talk to people and they're like, yeah, data is available, you can find it.
And then I find an online catalogue or a list and it's like, now I have to email someone and ask them and go through that whole engagement process. That's a lot of time, and there are no updates associated with that. Then there's data that requires proprietary software to access. We all know about that, hence this being an open source conference.
Data that is not available by API or web service. And a key part of an API or web service is reliability. So it's about raising the standard and saying it's not good enough to just go, yeah, we've got a WMS available.
Because if you have an emergency and half the country is smashing that WMS, it's not designed for that purpose. And your team may not be in the business of providing that. Data that can't be previewed or appraised before export, so you start downloading a huge data set when you only need certain slices of it.
Data that is not easily discoverable within a site or within Google. I'm sure we all know about navigating through websites and then, when you come back for the update, trying to figure out how you even got there last time. And also confusing or contradictory licensing statements, and we can thank Creative Commons for helping us along with that.
I love this quote because it shows it's not just us feeling this pain. This is a quote from the Civic Analytics Network saying that getting data out there is arguably the single most important obligation, but it's not enough anymore.
We can raise our standards and expectations. A zip file? We can do better than that. We need to think about data as a product that can be published and served like other products. And it needs to be easily accessible and usable for everyone.
It's an easy trap to fall into, to think that because I can use it and my mates can use it easily, great, job done. I feel I'm talking to an audience that's very friendly to this concept. But it's really for a much, much wider group of people, and it's about how we can get it into the hands of everyone, not just technical experts.
So the parts of the open data puzzle that we see: we've got cultural change, and I'm feeling really good about the changes I've seen in the last 10 years on that. That's coming from someone who once had to sneak data onto a DVD and send it to an island
because we had gatekeepers who were fearful that the people who needed the data weren't able to use it. There was no licensing and it wasn't someone else's data, but the attitude was very much, yeah, we can't share it.
We have the legal tools now. Government policy is in a pretty good place. So the last part, and this is where Koordinates fits in, is the technology, and that's changing dramatically with cheaper cloud storage. We've got better browsers than we had and ultimately easier ways to distribute data. That's a major thing: 10 years ago it was a very different technological environment, so even if
we had all the other things in place we didn't quite have the technology to meet our objectives. So we've designed this data life cycle, and these are what we believe are all the steps that need to happen for data to be in a really good place, with people sharing it and getting what they need.
The green side is about data users. We believe a data user wants to find the data really easily. They don't need to know who's publishing it. They usually just want to go onto Google or a browser, type in some words and have data sets returned to them.
They then want to appraise the data. They want to actually see it, pull it apart, check what attributes are in it and what the licensing is, and have all of that human readable and accessible. They then want to access it in whatever format they want to consume it in. That means projection transformations, APIs, PDFs, whatever you need.
You want to be able to get it already prepared that way. You then want to use it in your own systems. And this is when we jump over to the publisher side. If you're holding data, you want clear insights into who's using it, what formats they're using, how they're using it and what value they're getting from it.
So you then know how to give great customer service around your data sets. That then allows you to prepare the data, store it in the cloud or wherever is suitable, and publish it again. With all these things in place, we believe you're then able to realise the value of it.
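To make the "appraise" step a bit more concrete, here is a minimal sketch of what a quick appraisal could look like in code. It assumes the data set has already been loaded as GeoJSON-style features (plain dictionaries); the function name `appraise`, the sample records and the `notes` attribute are made up for illustration and are not from the talk or from any particular platform.

```python
from collections import Counter
from typing import Any


def appraise(features: list[dict[str, Any]]) -> dict[str, Any]:
    """Summarise a list of GeoJSON-style features before committing to a download.

    Reports how many features there are, which geometry types appear,
    and how many features have a non-null value for each attribute.
    """
    geometry_types: Counter[str] = Counter()
    filled: Counter[str] = Counter()
    for feature in features:
        geometry_types[feature["geometry"]["type"]] += 1
        for name, value in feature["properties"].items():
            if value is not None:
                filled[name] += 1
    return {
        "feature_count": len(features),
        "geometry_types": dict(geometry_types),
        "attributes": dict(filled),
    }


# Hypothetical sample records, for illustration only.
sample = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [174.76, -36.85]},
        "properties": {"name": "Auckland", "notes": None},
    },
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [174.78, -41.29]},
        "properties": {"name": "Wellington", "notes": "example"},
    },
]

summary = appraise(sample)
print(summary)
```

A summary like this is the kind of thing a user would want to see in a human-readable form before export: what is in the data set and how complete each attribute is.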
There are three key concepts that we also want to bring into the idea of data services. Universality: by that word we mean everybody can use it, in whatever format. Usability: making sure, again, that it's already prepared, so you're not spending loads of time preparing data before you use it.
And reliability: that's talking about uptime, APIs that you can rely on to be there, URLs not changing, regular updates. Basically, that you can trust it enough to build it into your business processes
and not worry about how it's going to affect or change any of your analytics. One of our customers is Land Information New Zealand, and we've seen a radical increase in the use of their data sets through publishing this way.
Ten years ago they were issuing 30 DVDs a month that went out to various agencies and now they're getting 226,000 views and 30,000 downloads in a year. And just to give you some context, we've got 4 million people in New Zealand, so yeah, it's heaps.
So that's the present, and I'm really proud of where we've got to so far. Now I want to talk a little bit about the future. I've talked a lot about data services and the idea of data being a published product that's then served out with customer service and everything you need wrapped around it.
What we see in the future, and what we're thinking about and getting really excited about (and come and chat to us please, because this is all our new exciting roadmap stuff), is looking at geospatial data as a supply chain. So it's not just one little piece, it's the whole end-to-end supply chain.
We've come from a place where we thought, this is the geospatial data market: we've got great big publishers and they publish out to many professional users. And we're like, yay, job done. That's great. The reality? It's a connected network of loads of different people doing lots of different things.
There are huge amounts of data sharing before you might get one little published product that's nice, shiny and cared for, that you can then put out to a large range of people. But that's sort of the tip of this massive iceberg of project sharing and on-the-go data stuff. We also think that there's a bit of a flow. You have at the top the federal or
national level, and that filters down a little bit to, say, state level, and that filters down to city level. Then you've got tech firms, big tech firms, dotted in there, and then businesses down the bottom doing a lot of consumption of data products.
What we're working to do now is make it really trivial to distribute data across these complex networks, and to be able to synchronise updates back up. So moving from a place of, here's a data product, we've published it, rad, to, here's some data we're sharing, I need feedback from you, and comments and pull requests and all that kind of stuff.
And so that you can basically customise data for your projects and then prepare the data for use by many people. The kind of thing we're thinking about is: I publish a data set with 50 attributes for everyone to use.
And you only want that data set filtered down to a region or an area, and you want to drop the attributes you don't need. So it's really about doing that preparation in an app, rather than having 50 different contractors all trying to do that for themselves, each with their own slightly different flavour.
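As a rough illustration of that kind of preparation, the sketch below filters point features down to a bounding box and keeps only a chosen set of attributes. It is a minimal sketch under stated assumptions, not how any particular platform does it: the features are GeoJSON-style dictionaries with point geometries, and the names `in_bbox`, `customise`, `published` and the sample attributes are hypothetical.

```python
from typing import Any

Feature = dict[str, Any]
BBox = tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def in_bbox(feature: Feature, bbox: BBox) -> bool:
    """Return True if a point feature lies inside the bounding box."""
    lon, lat = feature["geometry"]["coordinates"][:2]
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def customise(features: list[Feature], bbox: BBox, keep: set[str]) -> list[Feature]:
    """Keep only features inside bbox, and only the attributes named in keep."""
    result = []
    for feature in features:
        if not in_bbox(feature, bbox):
            continue
        properties = {k: v for k, v in feature["properties"].items() if k in keep}
        result.append(
            {"type": "Feature", "geometry": feature["geometry"], "properties": properties}
        )
    return result


# Hypothetical published data set, for illustration only.
published = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [174.76, -36.85]},
        "properties": {"name": "Auckland", "source": "example", "notes": "a"},
    },
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [174.78, -41.29]},
        "properties": {"name": "Wellington", "source": "example", "notes": "b"},
    },
]

# A region roughly around Auckland, keeping a single attribute.
customised = customise(published, bbox=(174.0, -37.5, 175.5, -36.0), keep={"name"})
print(customised)
```

Doing this once, in one shared place, is the point being made: every consumer gets the same slice with the same attributes instead of each contractor writing a slightly different version of the filter.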
So come and talk to us for a demo. We're based in Auckland, New Zealand. We're here this afternoon and really keen to chat to anyone. We'll give you an email address and talk soon. Thank you for your time. It's almost lunchtime.