Building Open Source Projects in Government Esri Ecosystems
Formal Metadata
Number of Parts | 188
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/31646 (DOI)
Production Year | 2014
Production Place | Portland, Oregon, United States of America
Transcript: English (auto-generated)
00:00
My name is Lyzi Diamond. I am a fellow at Code for America, and I am one of two community managers of Maptime, which we'll have plenty of time to talk about later. We have things to do now. This talk is called Building Open Source Projects in Government Esri Ecosystems, and there are some Esri folks in the room, so if there are questions I can't answer,
00:23
I'm sure that they'll be able to answer them for you. This talk is about building open source web applications with government GIS data. There are a lot of different meanings to the term "open source project." We could be talking about open source GIS software, we could be talking about a whole
00:46
bunch of other things, but what we are going to focus on today is open source web applications, because that's what we do at Code for America and it is pretty neat. So really this process is a two-way street. It requires not only the civic technologist
01:03
to do work, but also the government employee who is managing data to do work. So whether you are from one side or the other, there are things for you to do. So for those unfamiliar, Code for America is a nonprofit based in San Francisco.
01:23
We work with local government to bring cities into the 21st century, to create 21st century cities, which really means trying to implement technology solutions to government problems. We work actively with local governments. In the fellowship, you work directly with specific
01:42
cities. This year I'm working with the city of Lexington, Kentucky. There are a few different tenets of a Code for America project that are pretty common. Using open data: open data is a big focus. They are mostly constituent-facing web applications, although now Code for America projects include
02:02
backend projects. They are city managed, and the idea basically is to attack government problems with small tech solutions. So a lot of times this involves a web application. So Esri is clearly the dominant vendor in enterprise GIS. I don't think anyone would disagree.
02:23
And civic technologists, in order to work with government in the geo space, need to learn how to play nicely with it. There isn't likely going to be a time in the next few years where this is going to change. So what we need to do is learn how to work with Esri technology.
02:40
So let's talk about that. What we're going to talk about is four basic steps to building an open source web application inside of an Esri technology stack. Step one is that we need to understand the ecosystem infrastructure, basically what is that tech stack that they are working with.
03:00
Cities of various sizes use different types of Esri technology. And sometimes it is a combination of Esri and non-Esri tools. Just as a disclaimer before I start talking about this, I'm not a backend developer. I'm not a systems administrator. I've never managed a city government GIS before.
03:23
This is all based on research and knowledge. So take it with a grain of salt and verify everything I say before you go and bring this back to your city. I have, however, worked in city and state government, so I have a little bit of experience with it. So typically, what we're talking about right now is hosted data.
03:44
One of the biggest challenges of GIS is actually managing geographic data. It's big, it's hefty, and it has a lot of nuance to it to store the geographic information. The S in GIS stands for systems. So it's about data storage and data management.
04:03
Typically, a city will have an ArcGIS for Server instance to host their data, and this is a locally managed GIS server, or they'll host their data on ArcGIS Online. Both of these are Esri tools.
04:23
Server is more self-managed, whereas ArcGIS Online is hosted. We can get into the nuances of each of those another time, but you can read all about them online. The idea is to figure out how the city you're working with is managing their data.
04:44
Typically, as I said, Code for America is making web applications. If you're making a web application, your application is going to want to communicate over the web to get access to data. And the web, regardless of what programming language you use, it all speaks mostly the same language of HTTP.
05:03
So we can communicate that way. And plus, you know, people access things on the internet. Mostly, generally, I would say these days. So it makes your data more open anyway if you can access it on the web. There's a lot of different ways that all of this can be kind
05:21
of strung together inside of a city. This is an image from ArcGIS.com detailing Portal, which is a layer that can sit in between ArcGIS for Server and ArcGIS Online if you want to push data up to ArcGIS Online.
05:41
So again, you can get more into that by poking around online. But the main idea is, like, don't try to replace the city's infrastructure. We're just going to, like, take it and work with it. So doing less is, like, the mantra. Just do less. Do less. So once you've sort of identified what that stack looks like,
06:01
the next step is to identify the data that you're going to need for your project. And, you know, it depends on what you're trying to do. Are you building a routing network? Are you trying to update people when things are happening? You know, whatever kind of data you need, you have to identify what that data is so you can find out where it is.
06:22
And there are some challenges to this, too, because certain types of data are protected. They can't just open it up online and hand you a link to it. Maybe it has personally identifiable information. Maybe the person who owns it is particularly territorial about it and doesn't want to put it online.
06:41
And maybe it's stored in such a way that even if it was put online, you wouldn't be able to access or use it. So these are questions that you have to ask when considering the data that you're using for your project. I'll interject here that a good civic tech project starts with a need instead of starting with data. It's best to kind of figure out a problem that you're trying to solve
07:02
or some need you're trying to serve instead of saying, oh, I have this data. Let's do something interesting with it. So problem first and then data. And the most important part is the third step, which is enabling access to the data. If you can't get at the data, you can't really make a project, right?
07:20
So in the Esri context, that can mean a couple of different things. I mentioned before how data can either be stored on ArcGIS for Server or ArcGIS Online. Through both of those mechanisms, the owners of that data can expose a REST service,
07:44
which is an endpoint on the web that you can access via HTTP. It is not open by default. If you have data on ArcGIS for Server, it is closed and only you can access it by default.
08:02
Same thing with ArcGIS Online. You have to set the permissions to say, yes, y'all can access it. There are a few different ways that you can enable access for that. But the point is that this is the sort of government partner kind of role: to make sure that whoever is making the application has access to the data.
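As a rough illustration of what "accessing a REST service via HTTP" can mean in practice, here is a minimal Python sketch that builds a query URL for a publicly shared feature layer. The host name and the `Permits` service are hypothetical placeholders, not a real city endpoint; `where`, `outFields`, `outSR`, and `f` are query parameters of the ArcGIS REST API, and the request only succeeds if the data owner has opened the layer as described above.

```python
from urllib.parse import urlencode

# Hypothetical layer URL; a real one comes from the city's GIS staff.
LAYER_URL = "https://gis.example.gov/arcgis/rest/services/Permits/FeatureServer/0"


def build_query_url(layer_url, where="1=1", out_fields="*"):
    """Return a URL asking the layer's query operation for features as JSON."""
    params = {
        "where": where,           # SQL-style filter; "1=1" matches every row
        "outFields": out_fields,  # "*" requests all attribute fields
        "outSR": 4326,            # ask for longitude/latitude (WGS84) coordinates
        "f": "json",              # response format
    }
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


query_url = build_query_url(LAYER_URL)
```

Fetching `query_url` with any HTTP client is then the extraction step; a closed service would typically answer with an error or a login prompt instead of features.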
08:23
And this is probably the biggest challenge, I would say, with Code for America projects: getting access to data, because there are privacy concerns. I mean, even on the ArcGIS website, poking through, when reading about exposing web services there are several pages about security and privacy and how to lock down your data and make sure that people don't have access to it.
08:42
So that's sort of the rhetoric. Shifting the rhetoric in the other direction, to "this is why we should open data and this is why you should have access to it," is, I think, the way that we should go. But the point is that you can expose a service and access it with an API. And an API is a way that databases and programs allow you
09:04
to access certain parts and control the way that you access them. I talk kind of fast so I'm going to try and slow down a little bit. Another way that governments tend to sort of put data out there and publish data is through open data portals. Maybe instead of exposing an API directly from Esri stuff, they push their data
09:26
to an open data portal like Socrata or CKAN or ArcGIS Open Data, which is new from Esri and directly integrated with ArcGIS Online. So there's lots of different ways that you can put data out there as government
09:43
or if you're just running Esri tools, you can put data out there. And then step four is what we call ETL, or extract, transform, and load. And this is the idea of taking the data from that source, shifting it around
10:01
so that it fits your application, and then loading it into your application. So this is what a REST service might look like, at least the top of it. Let's go get this data and just, you know, grab it, have it. That's the extraction piece. If your application includes a web map,
10:21
you probably want to use GeoJSON as your data format. And with ArcGIS right now, you can't export directly to GeoJSON. So you have to transform the data to GeoJSON. ArcGIS Open Data will have a GeoJSON export. But right now, you have to make that transformation.
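A minimal sketch of that transformation in Python, assuming the REST response contains point features only and that coordinates were already requested as longitude/latitude. The function names are illustrative, not part of any Esri or Code for America tool, and lines and polygons would need extra handling that is left out here.

```python
def esri_point_to_geojson_feature(esri_feature):
    """Convert one Esri JSON point feature into a GeoJSON Feature."""
    geom = esri_feature.get("geometry")
    if geom and "x" in geom and "y" in geom:
        # GeoJSON orders coordinates as [longitude, latitude].
        geometry = {"type": "Point", "coordinates": [geom["x"], geom["y"]]}
    else:
        geometry = None  # GeoJSON allows features without a location
    return {
        "type": "Feature",
        "geometry": geometry,
        "properties": dict(esri_feature.get("attributes", {})),
    }


def esri_to_geojson(esri_response):
    """Wrap every feature of an Esri JSON query response in a FeatureCollection."""
    features = esri_response.get("features", [])
    return {
        "type": "FeatureCollection",
        "features": [esri_point_to_geojson_feature(f) for f in features],
    }
```

The Esri `attributes` dictionary becomes the GeoJSON `properties` dictionary unchanged, so any field cleanup an application needs can be layered on top of this step.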
10:42
And also maybe your application, you know, needs a specific field or it needs to rename a field in order to interpret that data. So that's the transform piece. And then the third piece is to actually load it into your application. And that is a matter of just, you know, depending on your application,
11:02
whether it's loading into a database or exporting to a CSV or some other way that your application is going to manage it. And then, you know, you repeat depending on the different types of data that you need or if that data is updated, then you're going to want to kind of keep that flow going
11:22
between the data repository and your application. So I have three Code for America examples that sort of show how this works and why it's valuable. The first one is an application called Citygram. Citygram was originally created by the Charlotte Fellowship team
11:41
and is now being deployed in Seattle and Lexington as well. And it provides, let's see, how did I write it, location-based opt-in text message and email notifications about city services, including code enforcement violations, building permits, electrical permits, and other data sets.
12:05
The key here is that, you know, without this kind of communication between where the data is being stored and hosted in GIS and the application, like this just wouldn't be possible.
12:20
Forgive the gray slash through this image, but Citygram has an ETL layer in the middle of it called Spyglass, and what Spyglass does is it grabs data from the open data portal or an API, it does that transformation into Citygram-compliant GeoJSON, and then it loads it into Citygram.
12:43
So this layer in between allows for the application to run and communicate with the open data portal or an API constantly. And then, you know, the citizen can sign up for it and get a text message when there's a building permit within a block of their house.
13:02
It's actually pretty cool. Another example is the OpenTrails data converter, and what this OpenTrails, that's the next slide, okay, OpenTrails is a data standard for trail data and trailhead data, and parks data eventually, but, and the idea is that it kind
13:22
of enables parks to move their user experience design into the digital world. Parks are really good at making people excited to be in parks. You know, there's maps and pamphlets and signage and information, but it's all sort of static, and parks are not very good at doing this online or mobile.
13:42
So the OpenTrails data standard allows for the creation of applications by putting this data into a standard format. And the OpenTrails converter itself is a tool for converting trail data into that format. This works a little bit differently than Spyglass, the ETL layer for Citygram,
14:03
because it does require some manual work. The city has to export a shapefile and then upload that shapefile to the converter. This kind of speaks to that situation of, oh, I don't want to expose a service. I'm not sure how to do that. I don't want to do that. I only want to do this one time.
14:20
You can still convert data and use data in open source projects if it's stored in a GIS. The third example is Lexington Geocoder, and what this tool does is it allows you to type in an address and it does fuzzy matching on the address
14:42
so that 322W number 4THST matches 322 West 4th Street written out, which is a big problem, as you heard this morning, Darryl, talking about geocoders and how geocoders are one of the bigger problems that are left in the geospace.
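To make the matching problem concrete, here is a small Python sketch that normalizes both spellings from the example to the same string before any comparison. This is a plain rule-based illustration, not how the Lexington Geocoder works; the abbreviation table and the list of filler words are assumptions chosen for this one example.

```python
import re

# Illustrative tables only; a real geocoder needs far more complete ones.
ABBREVIATIONS = {
    "N": "NORTH", "S": "SOUTH", "E": "EAST", "W": "WEST",
    "ST": "STREET", "AVE": "AVENUE", "RD": "ROAD", "DR": "DRIVE",
}
NOISE_TOKENS = {"NUMBER", "NO", "#"}


def normalize_address(raw):
    """Return an upper-case, fully spelled-out form of a street address."""
    text = raw.upper()
    text = re.sub(r"[.,]", " ", text)
    # Split an ordinal glued to the next word: "4THST" -> "4TH ST".
    text = re.sub(r"(\d+(?:ST|ND|RD|TH))(?=[A-Z])", r"\1 ", text)
    # Split a house number glued to a direction letter: "322W" -> "322 W".
    text = re.sub(r"\b(\d+)([NSEW])\b", r"\1 \2", text)
    tokens = [t for t in text.split() if t not in NOISE_TOKENS]
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)
```

With these rules, `normalize_address("322W number 4THST")` and `normalize_address("322 West 4th Street")` both come out as `322 WEST 4TH STREET`. Rules like these break quickly on real data (for example, "ST" can also mean "Saint"), which is one reason to lean on a search engine with fuzzy matching instead.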
15:02
The geocoder just takes open parcel data and uses Elasticsearch as a way to do fuzzy matching. It's a very small application and it uses this one dataset to enable the connections between all of these other datasets. So, again, this is, you know, GIS data being leveraged
15:21
in an open source project and managed that way. I kind of zipped through that really fast, but there's a ton more projects and examples on the Code for America GitHub, and you just heard a whole bunch about GitHub, so you can go in and play with that.
15:42
There are a few other considerations that you have to think about when building open source projects with government. One of them is hosting. Where is this application actually going to live? There's a lot of other options for that that we can talk about if you want. Also, you have to keep in mind things like changing infrastructure
16:02
and data storage on the part of the city. If the city changes its system, if the city updates or moves, you know, your application, if it's automated, may need to change as well. So that's something to think about. And also relationship management. So many talks about open source and government are about how do you convince people
16:21
to implement open source in government. This talk is not about that, but it is super important, and without that, you kind of can't really do anything. You can't even start this process. So relationship management is super, super important. Also, Code for America has local volunteer groups called Brigades.
16:41
Their goal is to work with the city on civic technology projects. They can help a lot because they can establish relationships with the city and they can kind of have a sort of mass of people who all sort of have different skills and can work together to implement these projects. Also, there is a pretty sweet organization called Maptime.
17:06
And Maptime groups are hands-on, beginner-focused meetup groups for maps, for learning about web mapping and geospatial. And it's typically, as Erin was mentioning earlier, a mix of developers and GIS professionals
17:20
who are there to learn from each other, kind of like this conference. So these groups can help a lot. This talk was about building open source web applications with government data. I think that we did that. It's really, it's not as difficult, I think, as people make it seem. And I should say the impetus for this talk was last year at the Code for America Summit,
17:44
one of the GIS managers from one of the cities came up to me and we started talking, and he said, "This team built this really cool open source application, but I use an Esri stack, so I can't take advantage of this cool application because it exists outside of my ecosystem."
18:01
And that's just not true. There are many ways to connect. And you know, I will mention, Andrew mentioned Koop before. There are several tools that have been created by some of the Esri R&D centers to connect the core Esri stack ecosystem to open source tools
18:20
and kind of do that conversion. So that's pretty cool that you can do that. Opening government data is a big deal. It's really important. It's what Code for America works on all the time. If you're doing that work, then you're like totally awesome, total champion, and it's really worth it because you get to enable all kinds of innovation
18:43
by making these data connections. Then all kinds of people can say, oh, I want to do this cool thing. And then they totally can. And it's because you opened the data and then you feel really good and you can pat yourself on the back. It's totally worth it. And I can answer any questions.
19:02
Thanks. Can you talk a little bit more about Spyglass and how you identify sources
19:26
and then access those sources programmatically? Sure. So I didn't write Spyglass. Spyglass was written by my colleague Danny Whalen, who's kind of a Ruby genius. I'm not a Ruby developer, so I won't be able to talk too technically about it.
19:44
But from what I understand, Spyglass makes regular calls to, right now in Seattle, for example, this is not an Esri example. This is a Socrata example. It makes regular calls to Socrata, and then it transforms and caches the data.
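The caching idea can be sketched in a few lines of Python. This is not Spyglass itself (which is written in Ruby), just an illustration under stated assumptions: `CachedSource` is a hypothetical name, the five-minute lifetime is arbitrary, and the data is refreshed on demand when it is stale rather than on a regular schedule.

```python
import time


class CachedSource:
    """Keep the last result of a slow fetch and reuse it until it expires."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch        # callable that talks to the slow upstream
        self._ttl = ttl_seconds    # how long a cached copy stays fresh
        self._clock = clock        # injectable so tests can control time
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if stale:
            # Only this branch pays the cost of the upstream call.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Wrapping the upstream request in `CachedSource(fetch_from_portal)` (where `fetch_from_portal` stands in for whatever function downloads and transforms the data) means repeated `get()` calls inside the lifetime are answered from memory.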
20:05
That's another thing to mention, actually, is that government servers are typically extremely slow, like really like crazy, crazy slow. And, I mean, unless you're talking about emergency data, that needs to be fast, because it's emergency data and, you know, an ambulance might want
20:22
to have that information quickly. So what Spyglass does is it actually caches the data so that when Citygram makes calls to Spyglass, it can return the cached data extremely quickly. Citygram holds the information about, let's say you signed
20:41
up within a quarter mile of your house. Citygram knows that, and so when Citygram asks for the information from Spyglass, it says, "Get Andrew the data that's, you know, a quarter mile from his house. He only cares about building permits and code enforcement violations." And then that way Spyglass doesn't have to do that work.
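Wherever that filter ends up running, the subscription logic amounts to a distance check plus a type check. The Python sketch below assumes GeoJSON point features with a hypothetical `type` property; it uses the haversine great-circle formula and is not taken from the application's source.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def events_for_subscriber(features, home_lat, home_lon, radius_miles, wanted_types):
    """Return the features of a wanted type that fall inside the radius."""
    matches = []
    for feature in features:
        if feature["properties"].get("type") not in wanted_types:
            continue
        lon, lat = feature["geometry"]["coordinates"][:2]  # GeoJSON is [lon, lat]
        if haversine_miles(home_lat, home_lon, lat, lon) <= radius_miles:
            matches.append(feature)
    return matches
```

A call such as `events_for_subscriber(features, 38.0406, -84.5037, 0.25, {"building_permit", "code_enforcement"})` would then return only the nearby events the subscriber asked about; the coordinates and type names here are made up for the example.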
21:00
And Citygram doesn't care about the data infrastructure. The idea is that Citygram can kind of be built on top of any open data portal. You just have to rewrite Spyglass, and it's kind of a template. And all the code, of course, is online on the Code for America GitHub. So you can check it out. Another question?
21:28
How do you overcome the nontechnical issues of getting into, I mean, because we can all suggest technical solutions to all these problems and getting out the data, but how do you, what have you seen in your experiences overcoming the nontechnical obstacles in kind
21:46
of championing open data within governments? You said the word right there, championing. One of the big things is to find someone who is a part of the system in government
22:00
who can be a champion for your idea. Typically, if you can get someone on board who's inside of government, they can provide influence for your idea. I think that a lot of government IT people feel undervalued, and like the work that they do isn't important.
22:22
A lot of people have their hands tied. You know, we were talking about browsers earlier. In some cities, you know, you can't download any other browser, and you have to use IE8 because your city has services that rely on it. So I think just being kind and respectful to the technical people in government is awesome, and I really like the "show, not tell" philosophy.
22:45
I think if you can build a minimum viable product and show really quickly the value of your application or your idea, that's a lot harder to argue with than a rhetorical argument. And Waldo Jaquith at the US Open Data Institute, right?
23:06
He wrote an interesting blog post recently about the sort of rhetoric we use to talk about open data generally, and there's a lot of, like, "you have to do this," or "you should do this," or "we paid for this, and you need to open it," and that just doesn't work as well as, like, "here are all the benefits to doing this."
23:21
So I think show, not tell, finding a champion, being respectful, and being positive are all great tactics to use. Another question? Awesome. Okay. I would have one comment maybe, if I can.
23:42
Did I understand correctly that you kind of welcomed the fact that the data are published via ArcGIS REST API services then? Can you say it again? ArcGIS REST services. If I understand it correctly, you consider it a positive thing, actually, that the data are published through or via this type of service, so to say.
24:04
My comment would be, let's support open standards, open data standards. Absolutely. Yeah. And I mean, APIs are great, bulk data is great, any way that you can access data is great, considering that the status quo is no data access.
24:22
I think that any data access is better than no data access, and then once you kind of hit that point where rhetorically and in the mind it's like, yeah, open data into it, then you can start having conversations more about like what is ideal for certain types of users and applications.
24:40
But yeah, I think that available data, regardless of how it's available, is great. Are you aware of the five stars of open data classification? Pardon? Five stars classification for open data. Five stars classification system for open data. Yeah, that basically tells this. The first star means the data are available somewhere on the internet.
25:03
The second star means they are in some machine-readable format. The third star means there is a standard for it. The fourth star means the data, help me, they are linked, you can always find the unique URL on the internet in order to download it.
25:27
And five star classified open data means the data are linked between each other. Yeah, so let's go this way. Okay, in case there is a question.
25:42
If you could have one open data spec API, what would it be? What is it? You said choose an open data standard. What is it? For geospatial data, OGC standards, totally. We are in free and open source software for geospatial.
26:01
So for geospatial data, if we are talking about Esri or the ArcGIS REST API, then I would support to go for geospatial. Yeah, but there are some issues regarding.
26:23
Yeah, I used to say the same, but we can talk about it later. I just want to say really quickly also to kind of, you know, lighten this moment. Tonight we are having a party, Maptime is having a party at the White Owl Social Club.
26:43
There are a staggering number of RSVPs, it's kind of terrifying, but it's going to be really fun and we have rainbow stickers and stuff, so you should come out. It's at 8 o'clock. I have a question relevant to this discussion. We're like really possibly considering using the Esri open data portal,
27:07
whatever the hell it's called. Politically that's really easy to make happen, but I want to know would somebody actually be able to use it in the open source community? I mean is it useful?
27:20
Yeah, we should talk to Andrew. I mean I'd like to talk to somebody who would use it. I work for King County in Seattle, Washington. So if somebody would use it, I'd like to talk to you. I'm going to call out the man sitting right in front of you who's the captain of the Code for America Boston Brigade and you should talk to him and you all can have a really productive and wonderful conversation.
27:43
Thanks so much. Thanks.