From The Inside Out Building a City Vacancy Portal

Formal Metadata

Title
From The Inside Out Building a City Vacancy Portal
Series title
Number of parts
70
Author
License
CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release year
Language

Content Metadata

Subject area
Genre
Abstract
The City of Saint Louis has a wealth of data concerning vacant properties. Unfortunately, like many cities, it also has a data management problem. Data regarding vacant parcels is fractured, spread across many databases owned and maintained by numerous agencies and departments using a variety of technologies. In order to fully understand and eventually address this problem, a group of concerned citizens has begun building a single source of truth: a central data repository that normalizes and visualizes the problem not just for policy makers and city officials, but for ordinary citizens as well. In this talk, we will look at how this data portal was made, and how determined volunteers with the right skills can affect local government.
Transcript: English (auto-generated)
organization that focuses on making data in St. Louis more accessible and usable. In addition to that, I co-chair the data working group for the St. Louis Vacancy Collaborative, which we'll talk a little bit more about later in the presentation.
So what we're going to talk about is vacancy. St. Louis has a problem: about 15% of all parcels in the City of St. Louis are vacant. That's 8,000 vacant buildings and about 13,000 vacant lots. And it's not evenly distributed. You can see pretty clearly the line we refer to as the Delmar Divide. There's a street called Delmar that runs east to west that is, at least last time I checked, the sharpest income divide on the face of the earth. You literally have 10-million-dollar homes across the street from vacant lots filled with needles. So 30% of the parcels north of that street are vacant and only 3% south are vacant. You can see it's a huge disparity.
And this is a problem because, as we are very fond of saying, nothing good happens in a vacant building. It's bad for the economy, and they're unsafe. We've got really heartbreaking stories of children who are afraid to go to their bus stop because of all the scary vacant buildings next to it. To help put this in perspective, the thing that St. Louis is unfortunately known for in the last five years is the big Ferguson event, and Ferguson, while outside of this map, is north of Delmar, just outside the boundary of what we're looking at here.
That's not actually the problem we're here to talk about today, though. What's more interesting to me is that everybody kind of knew this was a problem, but we didn't know how to address it. How do you address something if there's so much you don't understand? The city didn't know how many vacant properties there were; it didn't have a solid number. And without that, you don't have a measure of how well the steps you're taking to combat it are working.
How do you prioritize stabilization efforts, demolition, things like that? We didn't even have a solid definition of what vacant meant. It's actually a really hard problem when you start getting into it, because your gut reaction is, oh well, it's a building nobody lives in. But there are a lot of buildings that nobody lives in that aren't actually a problem. Somebody moves out, for a while nobody lives there, then somebody moves in. We're looking for chronic vacancy or blight, things that are a burden to the city.
Still not quite the problem that I wanted to address. So in 2017 the mayor hosted a hackathon to ask for help with this problem. The hackathon was to visualize vacancy in the area. Michael, who's one of my volunteers here, and I went as just kind of a fun thing to do on the weekend. I was a consultant at the time. And we very quickly realized that nobody was going to be able to actually do this, because the data that we were provided was a dumpster fire. The columns weren't named very well (the city owns up to this), nothing was normalized, we didn't have proper data definitions, we didn't know what stuff meant, and it was spread out over a whole host of different data sets.
And this problem interests me. I'm a longtime technology person, I'm a longtime data person. The vacancy problem was kind of like, yeah, it's bad, but I don't have any personal connection to that issue. I have a personal connection to bad data.
So how does this happen? This is probably a tale you've heard: most cities, most local governments, don't have great data, even if they have a good open data program. I would say St. Louis actually has a really good open data program. Most of what the city has is available on our open data portal; it's just really hard to use.
The reason this happens: first off, most of this is adopted from legacy systems. The City of St. Louis has been around for about 200 years. For a long time it was paper. Using the Assessor's Office as an example, everything was done in books. Then Excel came out and someone got the bright idea of, I'll make an Excel spreadsheet and that will make this easier. But eventually that Excel spreadsheet got too full, so they said, okay, I heard databases are a thing, Microsoft provides this tool called Access, let me watch some YouTube videos and see if I can figure this out. And that's just kind of how it springs up. So they reflect antiquated processes, and they're not designed with the technology in mind.
The individual departments are siloed. The Assessor's Office does this with their data, and then the Building Division does the exact same thing with their data, but they're not communicating with each other, so now you've got two separate databases. And sometimes even within that, the Building Division might have three or four different databases that serve different functions, depending on who made them. A great example of this: for identifying parcels, just a unique ID for each parcel, we have three different ones depending on what data set you're looking at. And worse, they look very similar, so it's really hard to tell just from looking at a data set which one it is. If it's not properly labeled, because most of the time it just says "parcel ID", you don't even know which identifier they're using.
To tack on top of this, because everybody's just doing this to try to get their jobs done, there's no real change process, so updates are pretty ad hoc.
Eventually an IT department gets formed, and they're just trying to manage this mess. They're not really in a place to actually improve things, because it's really hard to get strategic support for this type of thing. No politician is going to run on a platform, or get elected, because they're willing to fix some arcane technical issue that, while it has a huge impact, the average voter doesn't understand. All they understand is that it's expensive, and, well, things seem to be running okay right now, so why do we need to invest X million dollars into upgrading our systems? Well, your current systems run on COBOL; they're antiquated. So that's kind of how this all gets put into place. And I'm really impressed by what our city IT is able to do with the limited resources they have, and they've been very helpful to me in our work as we try to address this problem.
So in order to build a vacancy portal, we took data from four different offices: the Assessor's Office, the Building Division, the Forestry Department and our land bank. If you're not familiar with how land banks work, basically if the city forecloses on a property, it goes into a holding unit, in our case called the Land Reutilization Authority or the LRA, whose job is to resell it and get it back on the market. So we took data from all these different agencies, we combined it into one data set, and we built a website.
There we go. So this is what the website looks like today. This data hasn't been updated since June of 2018; you can see we've got a little disclaimer saying, hey, we're working on it. This is the City of St. Louis. Each block represents a vacant parcel, so it could be a vacant building or a vacant lot. We've got a probability attached to it, so darker colors mean we know it's vacant, whereas a lighter color means we're kind of on the fence. You can sort by address, you can look at whether it's privately owned or the LRA owns it, you can look for individual owners, you can filter by neighborhood. And if you look at any given property in detail, it'll pull up more detailed information about that property. This box is adaptive based on what you're looking at, so you'll get different types of information for a lot than for a building, because different data is available. If the LRA owns it, you'll get pricing, because the LRA has rules for how they price their properties: it's cheaper for you to buy a lot that's right next to a house you already own than it is for you to buy a random lot somewhere, that kind of thing. Because those are algorithmically determined, we can put all the prices here so you can see what that looks like. And there are some links to other sites, so that's where you can get more information. That doesn't need to be there.
Okay, so we released this thing and it's had a tremendous effect. It really sparked a lot of conversation about this problem in St. Louis. It's something that everybody kind of knew was going on (I grew up being told, don't go to this part of town), but now we have numbers for it. We can actually discuss it. We can learn things like: one real estate developer owns 14% of all the vacant properties in the City of St. Louis. That is a very different type of discussion than "we have a lot and it's a problem". And it led to the formation of the St. Louis Vacancy Collaborative, which is kind of an informal alliance of nonprofits, private industry and government agencies that are working to tackle this problem. We've been able to use this to help prioritize demolitions. It sparked some informative dialogue around the problem. It's allowed people other than us to work on this, and that's probably the biggest thing for me: I'm trying to clean this data up so that other people who know more about this problem can do meaningful work. We'll talk more about my longer-term solution for that in a minute. But yeah, government and academia are starting to use this data set to answer questions and to actually make change, which is fantastic.
So all this happens, and this takes us up to about four months ago. At that point the St. Louis Regional Data Alliance was formed, which is a similar kind of informal alliance of organizations that are interested in this problem of data transparency. A job posting goes up for data architect. I'm an IT consultant at the time, and I'm like, this description sounds an awful lot like the volunteer work I've been doing for the last two years. So I applied, got the position, and now here I am.
The Regional Data Alliance attacks a bunch of different stuff. These are the four big things that I'm working on right now. We have a whole other guy who's only working in the health care space, which I can't speak to as much. He's building something called the Community Information Exchange that helps these health care organizations talk with each other and with external service agencies. For a homeless shelter, it's really helpful to know if you just got out of the hospital because you had a heart attack, but because of HIPAA laws it's very difficult information to share. So he's tackling that problem, and I can't really tell you how he's tackling it because I'm not working on that. But these are the things I'm working on.
The first is the regional data exchange. Is anybody in here familiar with CKAN, or has used CKAN before? Okay. For those of you who haven't, CKAN is an open data platform. It's open source.
Its main competitor is Socrata, which costs like $10,000 a year. Basically it's a platform so you can build a website that hosts open data and makes it searchable and easy to use. So we wanted a CKAN instance set up, but, for those of you who are familiar with it, it's kind of a huge pain in the butt to set up. You have to know a lot of architecture and a lot of different technologies, because it uses a lot of different things. So we've spent the last three months not just getting ours stood up but writing Terraform scripts. For those who aren't familiar, Terraform is infrastructure as code. Basically we architected a Docker-based installation of CKAN with a load balancer, a search engine and a database that's been separated out, and we open sourced the code that allows you to stand this up in AWS. So now our three months of work you can replicate in a couple of hours in AWS, just by running some code and making the appropriate tweaks to configure it for your environment: use your name and password, your domain name, et cetera. So that's the data exchange.
The portal, I'll come back to the portal. Below that I just have a table, because we don't have much to show for this yet, but later this month I'm kicking off a pilot program where consulting agencies are contributing their benched consultants to our organization to work on a project, which is going to dramatically increase our scalability and what we're capable of doing. Those of you who work in this kind of space, I would love to talk to you more about how we do this. Essentially we're going to be getting six full-time developers for free, and they're going to be working on what we're calling the regional entity database.
All of that work we did combining the data sets, those 12 data sets from the four different agencies, that was a one-time thing. We did it and it was done. And I feel that's how a lot of us do our work: we're interested in answering a question, so we join everything, we do our analysis, and then it's basically static. What we're finding is that a lot of researchers, urban planners, et cetera are doing the same joins over and over and over again. What we're hearing is that it's like 80% of their jobs: I'm taking stuff from the Labor Department, I'm connecting it with census data, with OpenStreetMap data, I do all these joins, then I do my analysis, I package that up, I spit it out, I start my next project, and I take the Labor Department data and combine it with the census data again. So we're doing it once. We're going to take all the data that the City of St. Louis has regarding parcels, and we are automating a process to create a constantly updating data set that anybody can access and use, so at any time you can get the freshest pull of what the data looks like. We should also have an API, so you can just connect to it and you don't even have to download it.
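The "do the joins once" idea described here can be sketched roughly as follows. This is only an illustration, not the Alliance's actual pipeline: the field names (`handle`, `owner`, `condemned`), the sample records and the `join_on_parcel` helper are all hypothetical, and the sketch assumes every source has already been normalized to one shared parcel key.

```python
from collections import defaultdict


def join_on_parcel(sources, key="handle"):
    """Merge records from several agency extracts into one row per parcel."""
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            parcel_id = record[key]
            for field, value in record.items():
                if field == key:
                    continue
                # Prefix each field with its source name so that fields from
                # different agencies never overwrite each other.
                merged[parcel_id][f"{source_name}.{field}"] = value
    return dict(merged)


# Hypothetical sample extracts, keyed by a made-up parcel identifier.
sources = {
    "assessor": [{"handle": "10123400056", "owner": "LRA"}],
    "building": [
        {"handle": "10123400056", "condemned": True},
        {"handle": "10999900010", "condemned": False},
    ],
}
combined = join_on_parcel(sources)
```

Rerunning a step like this on a schedule, rather than once per research project, is what turns a static one-time join into a data set that stays current.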
You can just query it. So that's the project that they're going to be tackling, and that's going to power the next version of our vacancy database. That portal, instead of being outdated by over a year, will update daily or weekly; we've got to look at how quickly the data itself actually refreshes. But it will constantly be up to date, and I won't have to be constantly badgered by people who say, this is great, when are you going to update the data set? Well, it's going to take me like 80 hours to update that data set; it's a lot of work. So we're going to do it right this time.
And then finally, this is something that we just announced the other day: we were accepted into an accelerator program hosted by DataKind, which is a nonprofit that does code for good, essentially, and Microsoft, to build an AI assessor. One of the questions related to vacancy that we've been seeing a lot of is: what is the loss? How much are these vacant properties depressing property values? That's a very difficult question to answer. So what we're going to do is train an AI to act as an assessor. We're going to feed it everything we know about every building in the City of St. Louis, including its proximity to vacant properties, and train it, using that data, to get the assessed value. Once we have that, we can give it the same data set but tell it that there are no vacancies, and it will appraise the entire City of St. Louis as if it had no vacancies in it. That should get us that depressed property value. It's really exciting; we're super pumped to be doing it.
All of this, everything that we're working on, is on GitHub. So if any of these projects sound like, hey, my city could really use this, please connect with me and I'd be happy to share what we have. Yeah, so, questions? No... thank you. Oh yeah, we do, yes. Actually a lot of this work came from us going to the NNIP conference about three months ago, whenever that was. I basically got hired and immediately went to NNIP, which was fun. Oh, it's the National Neighborhood Indicators Partnership. It's basically organizations all over the country that are trying to use data to do economic and community development. The way they work is, for each city that's a member, there's an organization that represents them. There might be other people who come to the conferences with them, but they have a dedicated entity in St. Louis. And luckily one of the board members of the Regional Data Alliance is also a board member for NNIP, so we have a good in there. Other questions? Yeah. Yes. Yeah, I'm actually pulling up my notes on how this all works. There we go. Let's see if I can zoom in here. Pardon the crude drawing; these are my meeting notes from when we discussed this, and we are in the process of revamping this whole system, so this could all change in six months.
In the top left you see our three primary handles. We have what we call parcel 9, parcel 11 and handle, and handle is what theoretically everybody should be using. As you can see, they're all very similar: they all have block number, sub-block and parcel. But parcel 11 also has this condo code in the middle that is almost always zero, and this owner code at the end which is also almost always zero. And they will have some number of arbitrary zeros attached to the front, which makes it even worse, because then you can't even tell how many digits it's supposed to be. And then finally, handle is basically parcel 11, but you take that owner code off, which is almost always zero, you string kind of an arbitrary number of zeros on, and then you throw a one at the front so that those zeros actually stay there.
So what we end up doing is, I'm going through data sets and we want to join different things, and I've got to figure out which one of these they're using and how to convert them all into one thing. The easiest thing usually is to just build handle from data that's in the data set, if it has everything I need to assemble it, and then I can just say, well, there's handle. Otherwise, if we can tell that it's parcel 11, we just cut that zero off and add a one to the front, and that works sometimes. But yeah, it's an interesting challenge. We're working toward a more GIS-based system, which is going to help quite a bit, and we're also revamping the system where it will be... I forget what we settled on. We're still having the meetings, so it's constantly in flux. Anyway, there was another problem with this that I was going to share that I suddenly spaced on. Yeah, we are also having that discussion right now. So, no.
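The parcel 11 to handle shortcut mentioned above (cut the trailing owner code off, add a one to the front) can be sketched as follows. This is a rough illustration based only on the description in this talk, not the city's official rule: it assumes parcel 11 is meant to be 11 digits wide with the owner code as its final digit, and the function name `handle_from_parcel11` is made up for this example.

```python
def handle_from_parcel11(parcel11):
    """Approximate a handle from a parcel 11 value using the talk's shortcut."""
    digits = str(parcel11).strip()
    if not digits.isdigit() or len(digits) > 11:
        raise ValueError(f"not a parcel 11 value: {parcel11!r}")
    # Assumption: parcel 11 is 11 digits wide, so pad back any leading zeros
    # that a spreadsheet or numeric column stripped off.
    digits = digits.zfill(11)
    # Drop the final digit (the owner code, almost always zero), then put a 1
    # in front so the leading zeros survive being stored as a number.
    return "1" + digits[:-1]
```

For example, `handle_from_parcel11("01234000560")` gives `"10123400056"`, and so does `handle_from_parcel11(1234000560)`, where the leading zero was lost. As the talk notes, this only works sometimes, so results from a conversion like this would still need checking against a source that carries the real handle.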
So it sounds like what is currently happening is... this is tied in with another problem, of addresses. There is no central list of addresses in the City of St. Louis, and nobody's responsible for that. There is no canonical source of valid addresses. So we're kind of addressing that now. But it sounds like right now an address and an identifier get assigned when a building permit gets filed, but it doesn't become official until the Assessor's Office actually finalizes all of that.
Oh, I did remember what the other piece was, and this probably pertains to a lot of you. If you want to connect this type of data to OpenStreetMap data: OpenStreetMap doesn't deal with parcels, obviously; that's problem A. But problem B is that all of this data is at the parcel level, but a lot of it is building-specific. So as soon as you have multiple buildings on one parcel, suddenly I've got duplicate records and I can't tell if they're referring to the same building or two different buildings. Condos cause a lot of issues with this sort of data work, and so do buildings that span multiple parcels. If you have a stadium in your town, it's probably on more than one parcel of land, and that also, if you're doing all of your work at the parcel level, causes a lot of confusion, a lot of weirdness.
So our goal in this talk is mostly to familiarize you with other data sets that you might be joining with OpenStreetMap, some of the challenges that lie there, some of the strategies, and some of the community organization type work. We weren't working on OpenStreetMap to create this, but you could very easily see a very similar process for getting volunteers to clean up OpenStreetMap in your town, to get government data sources into our collective data set. Other questions? Yeah. I think like five minutes-ish. Yeah, they charge a lot for that, so no. The Holy Grails for us: the post office would be a great resource, and utilities would be a great resource, but we don't have access to either of those. So our definition of vacancy is based primarily on city services provided and tax delinquency. There is another researcher in St. Louis who is doing similar work. He includes complaint data.
So, like many cities, we have an information line, and all the complaints get recorded into a database, and he uses that: when people complain about problem properties near them. Because of that, his number is like twice ours; it's that ridiculously different. Which shows that even if you do go about this in a really systematic way, the results differ because our processes are different. But I have a really hard time arguing with him, other than the fact that it's subjective, which is why we didn't include it. His analysis is sound. The answer is probably somewhere in the middle: somewhere between his number and our number is the right number. But yeah, any other questions?
All right, well, I'll stick around for basically as long as we have the room to answer questions, and otherwise I'll be here till I'm not. So feel free to come find me, or I'm on Slack. My email's on the screen, so feel free to email me if you want to touch base about this, if there's anything you want to know about how we did it for your city, or if you happen to be in St. Louis and want to get involved. Either way, let me know. All right, thanks.