Keith's talk was the fourth and final talk in the "Data Publishing" session at FOSS4G SotM Oceania 2019, organised by OSGeo Oceania and held at The National Library in Wellington, New Zealand from November 12-15 2019. FOSS4G SotM Oceania is the coming together of Oceania's geospatial open source and open data community - with four days of workshops, presentations, a community sprint and social events.
So, to explain it for the non-Australians in the room, I guess we should briefly précis what it is. A
democracy sausage is sort of the informal name we Australians give to the fundraising sausage sizzles and cake stalls and craft stalls and so forth held at our polling places on election days. These are usually schools, community halls, churches and so forth. It was also the word of the year in 2016, out of interest, because why not.
Democracy Sausage is also the name given to a crowdsourced map of these polling booths that have cake stalls and sausage sizzles and so forth, and I'm one of a plucky band of volunteers who've run the map for the past 16 Australian state and federal elections. This is kind of our story. It's a story of the history of where the democracy sausage came from as a concept, of how this map came to be in the first place, of what it takes to run this operation as a bunch of volunteers, and of some of the challenges we've had to solve along the way, plus a few challenges we haven't quite solved yet. There are a few musings in here about the future of something that's very much social media driven, where we don't know what's going to happen in 5 or 10 or 20 years' time, and about the rise of social media and everyone having these things in their pocket so they can connect and talk instantly, and how that plays into this space. So, as far
as we know, this is a uniquely Australian thing. We don't know of any other countries that have this sort of thing at the scale we do. As sort of resident experts on the matter, we think it's a combination of a couple of factors. We like to barbecue things in Australia; we don't mind what it is: vegetables, meat, anything. And we're one of the few nations in the world that has compulsory voting, so you need to turn up to the polling station and get your name ticked off. You don't have to actually vote, but you need to be there, which makes a nice captive audience for fundraising barbecues and cake stalls: 22 million of us or so. That fundraising bit is probably key as well. It's all going back into local communities: new sports equipment for the kids, new playground equipment, funding school camps, whatever. It's all very locally focused as well.
So, through some exhaustive historical research on Google and Trove, we think the first occurrence of a barbecue at a polling booth was around the 1970s. We're not sure if this photo here of Gough Whitlam, our 21st Prime Minister, is actually the first occurrence, but we think it's an accurate historical representation of the art of barbecuing a sausage. That's enough on
the concept, I guess. Now we'll turn our thoughts to the map side of it. So, our map. It was the night before the 2013 federal election, and all throughout the land barbecues were being greased and onions were being diced up and so forth. A few of us were gathered round with bread, wine, beers and so forth, speculating as to what would happen the next day. As it turned out, a few things did happen the next day, but that's another story. And we thought: why isn't there a map where you can go and find a good sausage the next day? Because we wanted a good sausage the next day. So the idea was born. We got some laptops out and asked Google how you make a map really quickly, in like two hours. It turned out there were some Google products (I think it was Google My Maps at the time), a sort of Google Drive map GUI, and there wasn't much coding required; it was all quite simple. So we registered a domain name, set up a Twitter account and cobbled together a simple map like this, and the next day we hopped on Twitter, looked at the democracy sausage hashtag, which had some traffic, and just started mapping from there. We asked people to report where they had found a cake stall or a sizzle, and also where they hadn't found one, so an absence of sausage or an absence of cake gets a big red cross of shame. Much to my surprise, folks jumped on board this weird thing that had occurred, and they were sending us reports and getting involved, and it sort of weirdly just exploded from there. Fast forward a couple of years, and what had been a random idea was much bigger than any of us had expected at 10 p.m. on a Friday. We'd had about three or four other sites that were also mapping booths: some were trying to review the booths for quality and stuff, and others had a few other approaches, but they lasted about three years and then sort of withered away, so we're the last one standing. So, a bit about the, I
guess, the tech and the map side of things as well. Given this is a mapping conference we'll touch a little bit on that, but there isn't much tech in here. So the first iteration of the map was this Google My Maps thing back in 2013. That was okay, but there was no API, so the functionality was a bit limited for us; still, we got it put together in about two hours, so that was okay. Google released Google Maps Engine about a year later, so we jumped on board that. That was great: nice API, good tech, good product. When Google turned that off a year later, much to my personal and professional pain, we decided to look at CARTO and rebuilt the map. That was excellent at the time and we used it for a couple of years, but they then changed their pricing structure, so that wasn't going to work for us: we had a budget of a very round figure, like this, and we weren't going to pay what you needed to pay to use the API. So we rebuilt the map for the fourth time, I think. Google had just released their Polymer framework for JavaScript, and we had a nice, sort of hacked-together PHP back end that would spit out GeoJSON for us. Being the good developers we are, when React turned up and React was the hot new thing, we rebuilt for a sixth time in React and replaced the PHP with some Python as well. So we're up to version six, and we'll see what comes after React. I don't know what the hot new thing is at the moment, but maybe number seven will come up. So, lessons we learnt in doing all this: try not to use someone else's services if you have no money or very little money, because it's not going to work out for you, and keep the tech as simple as you possibly can. At the moment it's just a blob of GeoJSON that gets served to the client. That's it; it's in a cache somewhere. It's as simple as that. So we try not to overcomplicate things if we can. So that's a sort of short potted
history of this, anyway. But what does it take to actually run this crowdsourcing operation? The core team is six people, a baby, some parrots, two snakes, some chickens, a bunch of invertebrates, and rabbits as well, and then a whole bunch of really hard-working volunteers on top of that. We had about 20 for the last federal election, around the country, helping out during various parts of the crowdsourcing operation, on social media and so forth. We think part of the reason we've managed to keep going as long as we have is the skills our core team has. We've got people that do stats, who actually work for the ABS doing stats; we've got coders, people who like to deal with the media, people who can write really well, artists and creatives, public servants, researchers, general problem solvers, and people who like to project manage things. That's all in this core team of six, so we can spread the load around quite a bit. We're quite diverse and multi-skilled, and that's worked out pretty well for us. So, as with all things
involving sausages, the grim reality is in how it gets made, or in our case how we actually get the data. Being crowdsourced data, we began by literally just searching for the democracy sausage hashtag on Twitter manually and grazing through the results there. That was okay, but it wasn't going to scale to hundreds of thousands of reports. It also didn't take long for folks to start sending in reports to us or asking to be listed on the map. With a keen eye to making our lives easier, we quickly added a form to the site where you could submit your stall before the day, so we suddenly weren't having to deal with all the stalls on the day. I think right now it's about 60% that we get beforehand, so on the day it's more just verifying and qualifying submissions and so forth. We still do crawl many, many tweets on election day, but that's a semi-automated process for us now, so we can triage and assign certain search terms and so forth to various members of the team, and we're not all having to monitor the one column or the one hashtag. We've also expanded beyond Twitter to Instagram, Facebook, Reddit and so forth, and that's working out pretty well. The Twitter API is interesting; if you ever have to play with it, don't. The real-time streaming is interesting; if you have to deal with that, don't. The Reddit API is nice; it's nice and simple. But that's enough talk
of history and how this works. Let's get to probably the more interesting bit and talk about some of the challenges we've had to solve along the way, both technical and human challenges. So our
first challenge is one you're probably all familiar with from your own work: getting the data in the first place. You'd think polling booth data is pretty fundamentally simple, right? It's got a coordinate, it's got an address, it's probably got a name of some sort as well. We've got nine electoral commissions running state, territory and federal elections in Australia. They manage the process of overseeing and running elections and making sure everything goes properly. You might think they've all got a roughly similar approach to publishing polling booth data: maybe the same schema, maybe even some standards around how they publish polling booth data, common names and so forth. No, definitely not. But they have got addresses in common; they've all got an address field, that's a given. Some of them give us nice spreadsheets with all the columns we need, coordinates and everything else, and nicely structured addresses, but others will just list it on their website across various pages and have only addresses, not coordinates. More recently some have done handy sort of find-your-own-polling-booth maps, but often there's no nice machine-readable data source behind the map, or there's no public API for it, so we're left trying to scrape it or find the right person to ask for the data. And none of them use the same schema; it's all some different local variation they've got. So, our solution: make friends. Go around frontline customer service, go around the public contact point, and just find the GIS person, or the person that knows how to use QGIS there, and say: hey, we're doing a mapping thing, can we have your shapefile? Simple as that. [inaudible] We love the work the commissions do, but their job is not to give a bunch of randoms spatial data; their job is to run elections, so it's fair enough, I don't mind.
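To illustrate the schema problem described above, here is a minimal Python sketch (Python being the back-end language mentioned earlier in the talk) that maps differently shaped source rows onto one common record and emits GeoJSON. It is not the project's actual code. The source names, column names and sample values are hypothetical, since the talk does not say what fields each commission publishes.

```python
import json

# Hypothetical per-source column mappings. The real commissions' column names
# are not given in the talk. A source that publishes addresses only simply
# has no "lat"/"lon" entries.
FIELD_MAPS = {
    "commission_a": {"name": "PremisesName", "address": "Address",
                     "lat": "Latitude", "lon": "Longitude"},
    "commission_b": {"name": "venue", "address": "street_address"},
}

COMMON_FIELDS = ("name", "address", "lat", "lon")


def normalise(source, row):
    """Map one source row onto a common {name, address, lat, lon} record."""
    mapping = FIELD_MAPS[source]
    record = {}
    for field in COMMON_FIELDS:
        column = mapping.get(field)
        value = row.get(column) if column else None
        if isinstance(value, str):
            # Trim whitespace and treat empty strings as missing.
            value = value.strip() or None
        record[field] = value
    for field in ("lat", "lon"):
        if record[field] is not None:
            record[field] = float(record[field])
    return record


def to_feature(record):
    """Build a GeoJSON Feature; geometry is null when coordinates are missing."""
    geometry = None
    if record["lat"] is not None and record["lon"] is not None:
        # GeoJSON positions are ordered [longitude, latitude].
        geometry = {"type": "Point",
                    "coordinates": [record["lon"], record["lat"]]}
    return {
        "type": "Feature",
        "geometry": geometry,
        "properties": {"name": record["name"], "address": record["address"]},
    }


def to_feature_collection(records):
    """Wrap records in a FeatureCollection, the kind of blob a client can load."""
    return {"type": "FeatureCollection",
            "features": [to_feature(r) for r in records]}


if __name__ == "__main__":
    rows = [
        ("commission_a", {"PremisesName": " Town Hall ", "Address": "1 Main St",
                          "Latitude": "-27.5", "Longitude": "153.0"}),
        ("commission_b", {"venue": "Local School", "street_address": "2 High St"}),
    ]
    records = [normalise(source, row) for source, row in rows]
    print(json.dumps(to_feature_collection(records), indent=2))
```

Records that arrive without coordinates keep a null geometry here, so they can be sent on to a separate geocoding step rather than being dropped.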
So our second challenge is one you'll also be familiar with: poor-quality data. We think there's a pretty interesting challenge in polling booth data that's, I think, reasonably unique to us as people who map polling booths. Having the exact coordinate is super important; having an address that's well structured and logical is super important; accurate building names and so forth. But to the electoral commissions themselves that's actually not as important. Much of the organising happens at a really local level, so you know where the local school or the local community hall is, you know what it's called, and you don't need to know exactly where it is on the ground. That works okay until you start trying to map them at a national scale: you start getting more eyes on the data you've got, and then you end up with a bunch of interesting issues. We've had things like polling booths that are actually in the middle of the ocean off Queensland because there's a five where there should be a three in the coordinates; a polling booth that's in North America because there's no minus sign where there should be one in the coordinates, so it's in Idaho, not central Queensland; and polling booths that have the right coordinates, the right name and the right address but the wrong postcode, so it mucks up your data cleansing routines that say, match this postcode to this thing: error. So, our solution: a lot of time writing a very complex set of data validation routines that try to fix all this stuff up, try to use past polling booth locations to predict the accuracy of the current one you've got, and then try to automate that as much as possible, because for most elections we do twelve to twenty-four ingests between the election being declared and election day itself. There are new cuts every couple of days as, you know, the local community hall's being renovated so we're using the public school instead, so we lose one booth and we add one booth. So that sort of continual, ongoing automation is pretty key for us. So our
third
challenge really is around the crowdsourcing side of things. So we began, as I say, by just searching the hashtag manually, but that wasn't going to scale. You can only have six people around the table looking at so many columns in TweetDeck before that becomes completely, utterly unworkable, and you end up just arguing and shouting: are you responding to that? No, I thought you were. So it doesn't work. So we did what every sensible team would do and built
our own damn social media management tool, customised for us. It's got triaging functionality, we can assign tasks to people, and they get notifications. It's basically Hootsuite, but customised for us. We'd tried using Hootsuite, but again the cost was too big for something that runs twice a year and has no budget. And again, I guess, the human side of
that as well. We started out with about six of us trying to run this, but with growing popularity we were facing thousands and then tens of thousands of posts on Twitter, Reddit, Facebook and so forth. Our solution was to ask our friends, then ask them to ask their friends, and then the friends of those friends, and then we just had no more friends left, and then we asked on social media: do you want to help out the sausage? It's fun. Also, have some stickers. [inaudible] We had twenty people on the team for the last election, and we were doing more work, I guess, wrangling volunteers: writing onboarding packs, setting up rosters, making sure they knew what they were doing and when, and that they'd read the onboarding pack, answering questions about the onboarding pack, and making sure they were having fun doing it, because that's why we do this thing. It's fun, we enjoy it, it's a good lark. So, with popularity also comes
attention, and in our case requests from other sites to access the data we've been crowdsourcing. It's like when you wake up in the morning and suddenly you're a provider for [inaudible] and Twitter and Facebook and the political parties, and you have no budget and you're just doing it at night when you get home from work. So, our solution: throw together an API
very, very quickly, with actual access controls and some documentation, so it's not just in your head, and then ask your large partners to please bear with you, because it's ten o'clock at night and you need to sleep: I'll do that tomorrow, Google. The last
one is also partly caused by the explosion in popularity, but is partly a factor that contributes to it, and that's media interest. When we began, we had requests from newspapers and websites asking for stats and lists of, you know, what's the most popular polling booth in this electoral division. That was pretty easy to handle. But then, as things got more popular, we ended up with an ABC camera crew in the lounge room of our HQ filming us on our computers for some reason; interviews with CNN, because they like a colour story from Australia, so that's good; journos from the BBC and The Economist calling for comment, very serious comment; and we've done way more local radio interviews than we ever thought possible. There are so many local radio stations in Australia, and they all care deeply about their local polling booths and the colour stories and so forth. It's all a bit surreal, really. Our solution has been to find the core team people that like to do the media thing and then just let them practise it, and practise and practise and practise, and they're getting really good; they're suddenly media pros. So, there's a couple of
challenges yet to solve; we haven't solved these ones yet. A big one is address geocoding. That's our biggest expense by several orders of magnitude: the last election was about a thousand dollars in address geocoding services, and again, our budget is nothing, basically. People are so used to having the address search bar at the top that we can't not have one, but we also can't really pay for one, so we're still trying to work out how we tackle that. Right now we just sort of beg and borrow credits on Google Maps services, and that's working okay, but we'd like a long-term solution to that as well. It might involve G-NAF, if you want to maybe build a geocoder on that. [Laughter] The second big one for us is rate limiting on social media. Our accounts, particularly Twitter, have a really weird usage pattern where they do this, and then this, and then this, a couple of times a year. That looks like spam behaviour to most social media platforms: suddenly nothing, and then a whole bunch of posting and replying in six hours. So we often get rate limited or, at worst, temporarily banned. We haven't got a solution to that yet, apart from asking Twitter very nicely, and asking people very nicely, to raise the rate limits just for this day, but that's not a long-term solution. We're also not really equipped to support more social media platforms. We can just about handle the four we've got, but we're not set up to cope with further fragmentation of those platforms into further communities, or an increase in the number of platforms. If you're interested in helping us solve these as well, let us know. We're not after, like, technical people or anything, just enthusiasm. And if you know folks that can help us solve any of those three, also let us know. So lastly, there's a
couple of challenges we haven't had to solve yet, we think, but we can see them coming. The first one is vandalism and spam. We've had one spam report in the past six years, and we're not sure why, because there's no real protection against it. We can't verify reports, and there's nothing to stop you sending in thousands of reports, or sending in subtly fake reports, but no one has, which is weird, because it's the Internet. People are nice? I don't know; we'd be curious to work it out. We thought twice about even mentioning this, because we thought someone might actually then go and do it. So if it happens, I've got 200 people I know I can look at. This is our
big sort of risk. There's been an increase in pre-poll voting in Australia, particularly in state elections. We think our compulsory voting probably shelters us from this somewhat, but we're not quite sure how long that's going to last. And this explosion in the popularity of the meme of the democracy sausage has probably happened at the same point where there's been a decrease in polling booth attendance, so we think they'll probably offset each other at some point. We keep thinking this is as big as it gets, there's no election bigger than this election, and then the next election is bigger still, so it's been curious to see. For us, the risk is we stop doing a fun map, and that's that. But the risk is quite real for the groups that rely on this for fundraising: how are the kids going to get new play equipment or go to a sports carnival if suddenly you're not making five grand on a barbecue a couple of times a year? So we'll see where that goes, but as long as there's a demand for it we'll keep mapping these things. So that's our story of a plucky
band of somewhat crazy volunteers, the collision of a quite uniquely Australian thing, we think, and this rise of social media and these things being in our pockets and making it easy. We do what we do, I guess, to celebrate democracy, if that's not too ostentatious, and to encourage participation in that process, and really to support local community groups and what they're doing, making their efforts more visible and so forth. And we hope we make elections a little bit more fun and a little bit less depressing. [Music] So if you're eligible to vote in Australia, remember that real friends don't just go to the map and pick a good booth: they go to a missing data point, risk not getting a sausage or cake, and report it to us. So please do that. We have had some folks do road trips as well, so someone went on a road trip around a bunch of missing data points. That's also cool; we'll send you some stickers. Speaking of which, please buy the merch; it pays
for servers: the stickers and T-shirts. Thank you.

That was fantastic. I'm guessing we've got quite a few questions, so raise your hand if you've got one and I'll come around with the mic. Got one there, one there, over there.

Have you done any analysis on polling booth results and the presence or type of food that was served?

We do have at least one paper that's been published using some data we've provided. We haven't tried ourselves, because we're usually just exhausted by the time it's over, and we're like, oh, it's done, I don't want to think about it again. But yes, there has been some actual academic research on that. I think if you look on our website you'll find the references; I think it might be on the about page.

Awesome talk, amazing collection of photos of prime ministers eating and cooking. Sort of two questions, based on the panel discussion about communities this morning. First, what is preventing
burnout in this community? Is it just that it doesn't happen too often? And secondly, is your community at the point where it would survive the transition of the initial founders? Like, if you left, would it carry on without you?

No. We have one and a half developers, and I'm one of those developers. That's the big risk from my point of view. It's all on GitHub, notionally it's all open source, and I've mostly documented how it works, but there's also a bunch of goop sitting on DigitalOcean that only I have access to. I'd like to solve that. If there are more developers that would like to contribute, I'm very happy; it's all modern TypeScript, et cetera. But it is entirely reliant on me. As to burnout, it only happening a couple of times a year is a big part of it. A federal election for us is about three months of planning and work, so we're done by the end of it; we don't want to think about this for a long time. I think the busiest year has been four elections in one year, and that was a bit too much. We're just quite fortunate we chose the right thing, one that doesn't have a very concentrated set of events to manage, otherwise it'd be too much.

Have you thought about a mobile app to help with collecting location data, anonymised, optional location data?

As in, install an app on your phone, and then what from there?

When you report a location (yes, they've got this here; no, they haven't), it also sends the location based off your phone's location.

What's the benefit? They can use geolocation in their browser already.

Like, ground-level detail of where the actual stall is.

Yeah. Do you want to build the app? Go for it.

Have you had any feedback from the organisations that are doing the sizzles, saying thank you, we've actually increased our sales because of the site? Do you get a cut?

That's a good idea. [inaudible] We've had some really
heart-warming emails from folks saying: hey, we were on the map and we saw a big increase from the last election, that's great, thank you so much, it helped us do whatever it is they were doing locally. That's part of why we keep doing it, because it seems to actually make an impact beyond just being a fun random project that lives on social media. We have had butchers try to get us to do promotion for their sausages, not quite realising that we do the whole country and no one cares what the local butcher is selling. That's also one route we could take.

Two questions. Does the app or the website right now guide you from where you are to the location, as in, like, directions?

No. We'd thought of doing it. My first instinct would be that if we can open a link to whatever native mobile mapping app you've got, that's probably better for us, because I don't want to deal with directions, and again that would be an API cost for us on top of address searching. No one's asked for it yet. We've always just assumed that they're going to a local polling booth, so they know where the school is or where the community hall is, so maybe they don't need directions. That's a good question; maybe there is a need for it.

The second question is, have you thought about what3words?

[inaudible] If it was open, I would. It's not.

It's free to a certain level though, right?

To a certain level, yeah. I have philosophical issues with what3words. So if there was a nice open standard that wasn't what3words or so forth, we'd look at that.

Is that the Google [inaudible]? Are people using it, out of interest?

No. Okay, yeah, totally. [inaudible] We can look at a wider [inaudible]


