
Where's your hood at?


Formal Metadata

Title
Where's your hood at?
Subtitle
Crowdsourcing Neighbourhoods using Open Source Tools
Series title
Number of parts
52
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt, copy, distribute and make the work or content publicly available, in unchanged or modified form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Year of publication
Language

Content Metadata

Subject area
Genre
Abstract
Hayden's talk was the second talk in the "FOSS4G in Our Communities" session at FOSS4G SotM Oceania 2019, organised by OSGeo Oceania and held at The National Library in Wellington, New Zealand from November 12-15 2019. FOSS4G SotM Oceania is the coming together of Oceania's geospatial open source and open data community - with four days of workshops, presentations, a community sprint and social events.
Transcript: English (automatically generated)
So I'm here to talk to you about my project, Crowdsourcing Neighbourhoods Using Open Source Tools. Make sure that's all good. First I want to talk about why I care about neighbourhoods, or why you should care about neighbourhoods. First of all, we all live in a neighbourhood, we all have a personal connection to it and traverse
it day to day, and that's quite an important thing. So our personal understanding of our neighbourhood differs from person to person, even within households or between partners, across the board. There's also no one definition for what a neighbourhood is, it's used to fit the purpose. But for our own experiences, it varies quite a lot.
Neighbourhoods are also used for a lot of operational purposes, so if you think of stuff like addressing models, or dissemination of results, or emergency response, using the neighbourhood as a polygon or a boundary is quite important for a range of purposes. So, my research questions are sort of what's guiding my research, and these have changed
about ten times throughout the process, but these are hopefully final. So I'm interested in what personal characteristics impact neighbourhood delineation. So if you think of stuff like age, or gender, or how long you've lived in the neighbourhood, does that impact the size of it, the perimeter, the actual boundary itself?
I'm also interested in thinking about where does one neighbourhood end, and where does the next one begin? So I don't know if everyone's had the experience where you're driving along the street, and you're sort of thinking, where does this neighbourhood stop, where does the next one begin? Is it where the sign is that says, welcome to Brooklyn, welcome to Wellington, or is it another arbitrary point? The answer is probably neither, but I don't know if I can find that out.
I'm also interested in looking at how crowdsourced or community neighbourhoods compare to the official boundaries. So Fire and Emergency have a localities data set, which is the official data set for neighbourhoods in New Zealand. So in terms of things like the size or the actual boundary, how well do these official boundaries represent how we perceive our neighbourhoods?
I'm also looking at what geographic features, so things like roads, or rivers, or tunnels, or green space, inform neighbourhood boundaries. A lot of research has been done on this, but not at the scale that I'm hoping to do it at, across an entire city rather than a single suburb or a single neighbourhood. And finally, is crowdsourcing a viable method for gathering neighbourhood data?
So a lot of previous work has relied on sampling, or been done at small scales, which I'll get to in a bit. So drawing neighbourhood boundaries, this has been done quite a lot since the 1960s. It's often been done in a single neighbourhood, in person, on paper maps.
So an example here is a census unit in Cleveland, which had 20 participants. So they went there, had people draw the neighbourhoods, and then digitised them, and when you think about the amount of work that would be for an entire city, that's why I opted not to do that. Another example was in downtown Santa Barbara, where they had 36 participants, and they got
these by standing on a street corner, just saying, hey, fill out my survey. I'm not too keen on doing that. But through the internet, hopefully, I can reach many more people much more easily. Crowdsourcing neighbourhoods. So VGI, volunteered geographic information, is created by the general public for a range of purposes.
A range of methods are used to collect VGI, and they're either active or passive, and I'll give some examples. So one, if you're from New Zealand, you're very aware of the GeoNet "felt it" report. But for everyone who's not from New Zealand, every time there's an earthquake, you head to GeoNet, and you say if you felt the earthquake, and at what intensity, and it plots that. There's also OpenStreetMap, which we're all aware of.
There's also the Great Kererū Count, which is a survey run for a week in September that asks people to sight kererū, which are New Zealand's native wood pigeon. They're quite beloved, and were Bird of the Year 2018. That's a great example, because it's sort of getting the community involved and thinking about their environment and mapping solutions.
And again, Bird of the Year is not so much VGI, but it's sort of like a crowdsourced voting system, and I'm all for the hoiho winning this year. There's also a lot of passive data collection for VGI, and this often takes the form of geo-tagged social media posts, so thinking about the amount of posts that are happening
across the world, if you were to harvest these and turn them into points, you could sort of infer different things. There's also GPS tracking, which is used quite a lot. The benefits of crowdsourcing are that it can reach a wide population quite easily, and it removes the need for sampling, which is quite a laborious process.
There are some drawbacks, mainly data quality. When you put something out to the general public, people are going to do weird, crazy stuff, so a lot of processes are needed to monitor that. Lack of metadata is also an issue. Self-selection bias is quite a problem for my project: when people fill in the survey, I have no way of verifying who they are or whether they live in the neighborhood, so one person could fill it in 10 times, and I'm not too sure
if those were all valid responses. There's also an urban bias, which is that VGI is primarily produced in urban areas, and without funding, which I didn't get any of, will anyone give me data? When I was looking to launch something that could gather responses, I wanted it to be quite simple, but quite flexible in terms of me programming it.
I looked at some solutions, which included an ArcGIS Online web app, which could have been okay, but it didn't go to the extent that I wanted. There's also a software company called Maptionnaire, which is based in Finland, which makes sort of a simple questionnaire mapping software, but I had to pay for it, and I wasn't keen on doing that.
Esri also have a thing called GeoForm, which would have been perfect, but it only works on points, and I was interested in polygons, so that wasn't the solution. So I ended up making my own website using open source solutions. I started off taking code from a project in Boston called Bostonography. They did a bunch of cool visualizations, including one project called Hoods, and their code is on GitHub, so I sort
of took that as a base and adapted it into what I needed. I used a lot of Leaflet's drawing functions, Node.js, Flux as my web framework, and Mapbox for my base map. It was my first time making a website, and I very much pieced it together, so when something worked, I was like, cool.
But looking back, it was like Frankenstein, stitched together. So yeah, I ended up with a website called wellyhoods.com, and the data collection occurred in three steps, because I was interested in three separate things. So first of all, you draw the neighborhood boundary, and it's very simple, and then afterwards you can move the vertices to change it, as you will.
I was also interested in other aspects of neighborhoods, rather than the boundary, I was interested in the focal points. So if you think about what the focal point might mean to you, it's going to vary for other people, it might be their house, but it might be the shops, or it might be their church. Then there's also a short questionnaire at the end, which asked some questions about them, transport choices, how long you've lived there, stuff that will be used later on.
I mentioned the custom base map, I used Mapbox to make this, so I was looking for a base map that didn't have a neighborhood label, because if you're labeling where the neighborhood is, you're going to create implicit bias. I started with, I can't see very well, that's Stamen's Toner, which is a nice base map,
but it was a bit harsh, so I ended up with something that's a bit nicer. The screen doesn't show it that well, but the green space is a lot lighter and there are no neighborhood labels. So my data collection ran for three months, and I got 920 responses, even without any money spent. 886 were usable; the unusable ones were quite bad, for reasons you can expect.
People given a blank canvas are going to draw some bad stuff, but these are the actual responses for Wellington. So even then, I know a lot of people here aren't from Wellington, but you can sort of see some quite clear boundaries forming, based on the amount of overlap: some of the street networks, and some areas that aren't considered part of the neighborhoods.
So if you look down there, that's a suburb called Island Bay, which is right at the south of Wellington, and these are the polygons for there, and there's the aerial imagery. You can see a lot of people chose to include the island, which is what the suburb was named after, but a lot of people didn't. And I thought that was quite interesting, it's a pretty even split between what sort of features are forming neighborhood boundaries.
There's also the CBD there, which had quite a pronounced gridding effect from streets, so a lot of the neighborhood boundaries sort of follow the streets themselves. You also have Kelburn and the Botanical Gardens, closer in. It's kind of hard to pick out the Botanical Gardens, but a lot of the boundaries ended where the gardens started,
but a lot of them didn't, and it's sort of a question of how green space impacts that. And the final one that I want to point out right now is Karori. So Karori is quite a large suburb at the back of Wellington, and you can see there's not much consensus around strong borders compared to a lot of the other ones, and I thought that was quite interesting.
And these are the focal points as well. So for my analysis, I opted for a lot of open source tools. I did all my programming in Python, which is what that's showing. I also used R, QGIS, and LaTeX for writing my thesis, which is quite awesome for figure placement. I don't know if anyone's had a problem with Word when you try and place an image
and it just doesn't work. LaTeX was a godsend for that, but it was a bit of a learning curve. A lot of my programming was done in GeoPandas. There wasn't an image for GeoPandas, so I kind of made one myself. So if GeoPandas want to contact me, I can license this out. So a lot of what I did was using GeoPandas, and it was quite awesome,
quite simple to use. OSMnx was great for getting raw data from OpenStreetMap and for some network analysis. Then I also used Tidyverse in R for a lot of my plots, with the LINZ Data Service and OpenStreetMap as data sources. So one of the first analyses that I produced was looking at levels of consensus, and this is quite a lot of information right now, but I'll break it down.
So first of all, I laid a grid of hexagons about 60 meters wide across all of Wellington, and for each of them, I found the total number of neighborhoods that intersected it, and then the count of the most common one based on the name, and if you divide them, you get a ratio of consensus. So the darker colors indicate that a lot of people agree that this is a given neighborhood, whereas the lighter colors indicate less agreement.
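Editor's sketch: the consensus ratio described here can be illustrated in a few lines of Python. This is a minimal sketch, not the speaker's code. It assumes the spatial step has already been done (for instance with a GeoPandas spatial join), so that each hexagon comes with the list of names of the drawn neighbourhoods intersecting it. The hexagon ids and the name lists are made up for illustration, and the cut-off of five follows the threshold the speaker mentions for this analysis.

```python
from collections import Counter

# Threshold mentioned in the talk: hexagons intersected by fewer than
# five drawn neighbourhoods are ignored.
MIN_RESPONSES = 5


def consensus_ratio(names, min_responses=MIN_RESPONSES):
    """Return (most_common_name, ratio) for one hexagon, or None if too few responses."""
    total = len(names)
    if total < min_responses:
        return None
    name, count = Counter(names).most_common(1)[0]
    return name, count / total


# Hypothetical input: hexagon id -> names of the drawn neighbourhoods intersecting it.
cells = {
    "hex_001": ["Brooklyn"] * 9 + ["Mount Cook"],
    "hex_002": ["Brooklyn"] * 3 + ["Mornington"] * 3,
    "hex_003": ["Brooklyn", "Vogeltown"],
}

results = {cell: consensus_ratio(names) for cell, names in cells.items()}
```

A ratio near 1 means almost everyone who drew over that hexagon gave it the same name; a ratio near 0.5 marks a contested edge between two neighbourhoods.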
You can sort of see some of the clear distinctions between the neighborhoods, so where the roads are, or where the elevation is. For the CBD, which I can't really point to up there, there's a lot less consensus on whether that's the CBD or one of the neighborhoods, for example. And I ignored any hexagons that had fewer than five neighborhoods, as sort of like
a threshold value, so I didn't get nonsense values. Looking at the personal characteristics: previous studies have found that gender, age, and length of residence have a significant impact on neighborhood size. So it was interesting to see if these were replicated, and if stuff like transport impacted it, so I performed a linear regression. It's a lot of numbers, but I'll explain
it a bit more. So you can see age and gender both had non-significant values, so the previous findings that age and gender had a significant impact on neighborhood size weren't replicated. The number of years you've lived in your neighborhood, and the neighborhood type, so if it was like a single or like a multiple neighborhood, had a significant impact,
and it was quite a positive relationship, so the longer you've lived in your neighborhood, the larger your neighborhood was, which is what you'd expect, but it was quite a cool, significant finding. And also the transport, so the number of days you use each form of transport: they were all significant, with car access being highly significant.
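Editor's sketch: to make the idea of a positive relationship between years lived and neighbourhood size concrete, here is a minimal ordinary least squares fit with a single predictor. It is only an illustration. The speaker's model had several predictors and reported significance, neither of which this sketch attempts, and the numbers below are invented, not the survey data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented illustrative values (NOT the survey data):
# years lived in the neighbourhood and drawn area in square kilometres.
years_lived = [1, 2, 5, 10, 20]
area_km2 = [0.5, 0.7, 1.3, 2.3, 4.3]

slope, intercept = fit_line(years_lived, area_km2)
```

A positive slope corresponds to the pattern described in the talk, where longer residence goes with a larger drawn neighbourhood.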
And the plots are there: you can see the days used on the bottom, so that's how many days in an average week people use each mode of transport, and that's the area on the left. So for cars, there's a significant increase, but for walking, it was the opposite effect. In yellow, mostly, it was a decrease, so there are sort of opposite effects for walking
and cars. That can be explained by people in cars maybe experiencing more of the neighborhood, but it might also be sort of urban bias, where people in the city don't have cars and delineate smaller neighborhoods than rural people. I was also interested in comparing neighborhoods to official boundaries. So, in black are the fire service's official suburb boundaries, and the red are mine.
You can also see some areas where there is some overlap between them, so for example, down here, they seem to match quite well, but I was interested in some other things, so I'm using a measure of congruence to compare them, which is the area of the intersection
between the two polygons divided by the area of their union. It ranges from zero, where they don't overlap at all, to one, where the boundaries are completely alike. For this neighborhood, which is Brooklyn, you're getting a congruence of 0.18, which is a bit strange, because they look quite similar, but then when you look at the big picture, they're very different. That raises the question of how representative the black polygon is of people's experiences,
because they're obviously very different, given that weird one pathway going out to a whole other area. For the focal points, I clustered them. I was interested in whether they clustered around any specific features, so everything in red there were significant clusters, and everything in black were non-significant, and for example,
in Brooklyn, which is the suburb I've just shown, if you zoom in there, you can't see super well, but that's the neighborhood shops, and that was the most common kind of cluster: the neighborhood's main shopping center, where you find your dairy, or takeaways and stuff. But for neighborhoods that didn't have a shopping center, where
were people putting their focal points? That's sort of more of an interesting question. For example, up in Wilton, which is in north Wellington, there isn't really an official shopping center, but when you zoom in, a lot of people have put it near the school. You can't see it that well, but near the school, or near the park, or near their house seem to be some of the common options. Another
example was Lyall Bay, which is in south Wellington, which had a significant cluster, but doesn't really have an official shopping area. When you zoom in, people put it by the beach, which kind of makes sense because it's Lyall Bay, but it's quite a cool finding. If anyone doesn't know, Lyall Bay is quite a lovely beach, and yeah, that's my
progress so far. Any questions? Okay, so we've got a few minutes for questions. Anybody want to get started? Hey, thanks. It's a really... It's nice and close.
It's just been recorded. Okay. So a really interesting talk. Thanks. My question was about the Brooklyn example. Did you look at the residential polygons versus the entire suburb boundary? Would that have given a better congruence with the people's? So this one you mean? Yeah, yeah. So like the LINZ residential polygons? Yeah. Would that have given a better correspondence?
Yeah, so that's something I've included as well. So this was just one example, using the localities, but I've also done the analysis for the residential polygons and Statistical Area 2, which is the new stats boundaries, to see which are more representative of people's experiences, yeah.
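Editor's sketch: the congruence measure from the talk (intersection divided by union) and the point raised in this question can be illustrated without any GIS library by approximating each area as a set of grid cells. This is an assumption made to keep the example self-contained. With real polygons one would typically divide the area of the intersection by the area of the union using a geometry library such as Shapely. The `cells` helper and all shapes below are hypothetical.

```python
def congruence(a, b):
    """Intersection over union of two areas given as sets of grid-cell ids."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def cells(x0, y0, x1, y1):
    """Hypothetical helper: grid cells covering the rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# A drawn neighbourhood of 9 cells.
drawn = cells(0, 0, 3, 3)

# A made-up official boundary: the same core, plus a narrow corridor
# leading out to a large, distant block.
official = cells(0, 0, 3, 3) | cells(3, 1, 13, 2) | cells(13, 0, 17, 5)

# A made-up residential-only boundary that stays close to the core.
residential = cells(0, 0, 3, 4)

official_score = congruence(drawn, official)        # 9 / 39
residential_score = congruence(drawn, residential)  # 9 / 12
```

Even though the drawn area sits entirely inside the official boundary, the extra corridor and block inflate the union and drag the score down, which is the kind of effect described for the Brooklyn example.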
Hi. Have you recorded the perceived map literacy of the people who completed the survey, and also the demography of the survey? I mean, if it's online based, then you will capture a certain demography and neglect the others. Do you take that into account?
Yeah. So because it's web based, most of the sample were younger people, as you would expect. And it's something that I'm looking at at the moment. I'm quite early in my analysis, but I'm looking to sort of sample by the different demographics I got, yeah. Are there any patterns in the areas that people drew lines along?
Like were they geographic barriers? So a lot of them, if I maybe go back to maybe this one. So a lot of the areas people drew lines on were main roads. So the main arterial routes were the most common one.
Another cool example might be down here where I showed. That right there is the airport. And the airport is quite like a hard boundary, so a lot of people drew that as like the boundary for their suburb, which sort of reflects the official boundary per se, yeah. But the main things so far have been roads or tunnels were quite a strong one.
So I don't know if I can point it out, but the tunnel up here, it's the Mount Victoria Tunnel. And if I go back to this one here, that's quite a clear point where like three neighborhoods sort of end, because it's quite a hard boundary that you traverse through and you sort of change neighborhoods, and it also causes some to end, for example.
Did you maybe look at how many nodes people are placing, to try and work out how much effort they're putting into it, or whether they're just roughly sketching it out? So that's one analysis I've done. So this is just on the area, but I've also performed this on the number of vertices, or the length, or the complexity
of the neighborhood. And it was a lot less than your official boundaries: if you think about the number of vertices in like a curve in an official boundary versus what a lot of people drew, it was a lot less. But it was quite hard to tell because I didn't ask any questions on that. It was sort of you filled out my survey and that was it.
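Editor's sketch: the per-polygon measures mentioned in this answer (number of vertices, length, area) can be computed directly from a drawn ring of coordinates. This is a minimal sketch under stated assumptions: the coordinates are in a projected coordinate system measured in metres, the ring is a simple polygon, and the first vertex is not repeated at the end. The `ring_stats` name and the example shape are hypothetical.

```python
import math


def ring_stats(coords):
    """Return (vertex_count, perimeter, area) for a simple polygon ring.

    coords is a list of (x, y) tuples in metres; the first vertex is not
    repeated at the end. Area uses the shoelace formula.
    """
    n = len(coords)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = coords[i]
        x1, y1 = coords[(i + 1) % n]  # wrap around to close the ring
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0
    return n, perimeter, abs(twice_area) / 2


# Hypothetical rough sketch of a neighbourhood: a 400 m by 300 m box.
rough_sketch = [(0, 0), (400, 0), (400, 300), (0, 300)]
stats = ring_stats(rough_sketch)
```

Comparing vertex counts between drawn and official polygons gives a rough sense of how detailed each boundary is, though, as the speaker notes, it does not by itself show how much effort a respondent put in.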
But it would have been cool to capture if you filled it out on your cell phone or on a desktop PC for example, or how long you spent filling it out to sort of gauge the effort people put into it. So if you were to redo the questions, for example,
what's the data you would have also liked to get? Yeah, so something I've thought about a lot actually. One thing I would love to capture is if people owned their house versus if they didn't and also if you were a student versus if you were working full time. Because that's been done before. Those two things and also if you completed it on a mobile versus desktop.
A lot of the other ones I'm quite happy with, especially the transport ones. As far as I'm aware it's a new finding but mainly housing type and job status would have been quite good. With the questionnaire, we were quite careful because it's a VGI and crowdsourcing to not scare people away.
So not asking any personal questions like where do you live or how much do you earn. Immediately people would see that and go, screw this. So it was quite a balancing act between asking questions that were interesting but also weren't overbearing, if that makes sense.
Have you shared your feedback with Fire and Emergency? I haven't yet. I'm currently writing my thesis so hopefully they'll see that and can read that.