
Cheap (and good) data capture for environmental projects

Formal Metadata

Cheap (and good) data capture for environmental projects
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Satellite image archives provide a wealth of valuable historical data that can be used to assess changes in the environment, but extracting high quality information can be costly and time consuming if we restrict the interpretation to experienced image analysts. We attempt to reduce these limitations by crowd sourcing the interpretation process via a web based digitizing system based entirely on open source tools. This approach can lower project costs by eliminating the cost of office space and equipment for the analyst, as well as allowing flexible working hours and locations. The challenge with this approach is to ensure that the quality of the interpretation remains high. Within the context of a project to model historical iceberg occurrences off the coast of Greenland, this talk will discuss the methods we have implemented for quality control while providing training and feedback to our analysts from an interpretation expert. The business case for this approach will also be discussed, including the risks and rewards of paying interpreters for each correct feature digitized. In our case we were able to quickly and accurately interpret several hundred images resulting in the measurement of tens of thousands of features. By using cloud based image archives and client/server strategies, this approach can be economically scaled up to much larger projects.
OK, hello ladies and gentlemen, welcome to the second morning session. It's a very interesting session, I think. We found the three buzzwords "cheap", "economy" and "commerce" in the titles of the presentations, so you could say it's the business session. It's very sad that there are just a few people here, I would say, because open source is business as well. OK, now the first presentation is from David [surname unclear], and I'm very interested in it.
Thanks for coming. My name is David [surname unclear], and I represent [unclear] analytic; we're a consulting engineering company based in Calgary, Canada. My partner Brent Frazier couldn't make it, but he is responsible for all the questions that I can't answer.
What I want to talk about today: the context is going to be remote sensing and satellite imaging. Satellites, as everyone is aware, provide us with a really good archive of environmental information, and it's growing all the time. But one of the key problems we run into is extracting good-quality information. Certainly crowdsourcing is a good way to go for some applications. For example, OpenStreetMap has used satellite imagery and aerial photography to map large areas from a crowdsourced perspective. But when we look at environmental issues, where the image is the single record of what we're trying to record, then the actual quality of the interpretation is a much more difficult thing to quantify, and there's also less motivation for people in the public to take part in providing us with the interpretation. So what I want to talk about is an approach that we used to collect data for environmental projects from satellite imagery and ensure that the quality of the results is high.
My project was looking at icebergs, so just to give you some background on icebergs: I'm sure a lot of you are familiar with what they are. They are large pieces of ice, and they're much different from sea ice, which we see as this gray stuff here, in the sense that sea ice is formed from freezing sea water, so it's got salt in it, and it's formed at low pressure and fairly high temperatures, whereas icebergs are formed at high pressures and low temperatures at the tops of glaciers, and they form from fresh water. So they're much, much stronger and thicker, and they typically can be 10 to 100 times thicker than the surrounding sea ice.
Now, because of all the issues that icebergs present, lots of different studies have been done, and we've got a whole classification scheme for icebergs, not just in terms of the size but also what they look like. I'm going to gloss over that, but I think it's important to point out that it's difficult to look at an image, particularly one from space, and say that's an iceberg, that's sea ice, that's an island, and so on.
So I want to briefly talk about what we're doing. I can't talk a whole lot about who we did it for, but essentially we're trying to minimize the risk when people are working offshore, whether that's shipping, exploration, building facilities and so on. If you're a marine engineer or an architect, the presence of icebergs and sea ice is a major consideration. So what we wanted to do was get historical data for this particular area we're working on and boil it down to a particular statistic that we can graph, in this case a probability of exceedance curve. It basically tells you what your probability is that you will come across an iceberg of a particular size or greater, like this here. You can use that in your design calculation: if the curve tells you that you've got an 80 percent chance of encountering an iceberg that you can't handle, you might want to change the design.
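
To make the probability of exceedance idea concrete, here is a minimal sketch of an empirical size-exceedance curve. It is an illustration only, not the method used in the project: the function names and the sample sizes are invented, and a real encounter probability would also have to account for iceberg density and exposure time, which this sketch ignores.

```python
from bisect import bisect_right


def probability_exceeding(sizes_m, threshold_m):
    """Fraction of observed icebergs strictly larger than threshold_m."""
    if not sizes_m:
        raise ValueError("need at least one observation")
    ordered = sorted(sizes_m)
    # bisect_right gives the count of observations <= threshold_m
    return (len(ordered) - bisect_right(ordered, threshold_m)) / len(ordered)


def exceedance_curve(sizes_m):
    """(size, fraction of observations larger than size) for each distinct size."""
    return [(x, probability_exceeding(sizes_m, x)) for x in sorted(set(sizes_m))]


# Hypothetical iceberg lengths in meters (made-up values, not project data).
sizes = [40.0, 55.0, 80.0, 120.0, 120.0, 200.0, 310.0, 450.0, 700.0, 1200.0]
p_over_100 = probability_exceeding(sizes, 100.0)
curve = exceedance_curve(sizes)
```

Plotting `curve` with size on the horizontal axis gives the kind of graph described in the talk: a designer picks the largest iceberg the structure can handle and reads off the fraction of observed icebergs that exceed it.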
The project location is a very interesting area off the east coast of Greenland. The top of the box there is basically the Fram Strait, between Svalbard and Greenland, and that's where most of the ice from the Arctic exits the Arctic Ocean, streaming down the east coast of Greenland. [unclear] The icebergs come off the glaciers on the coast of Greenland and slowly drift down the coast.
So what we did: we looked at a 12-year period, 1999 to 2011, and we tried to minimize the costs that we incurred, both in terms of satellite imagery and in terms of interpretation and analysis of the same. And we have high standards. We used Landsat; this is a pretty good image. You can see here an iceberg trapped in a piece of ice floe. It's usable at 15 meters once the bands are combined, and the color combinations with Landsat are excellent for this work.
We also used ASTER, which is a sensor carried on the Terra satellite, also 15-meter resolution. It has not quite the same color capabilities, but we used quite a few scenes from that. The underlying theme here is: collect a lot of imagery, put it together, and allow people to interpret it online, in order to fill in some of our time and geographic gaps.
We did purchase some images from the Japanese ALOS satellite. The AVNIR-2 sensor is very similar to ASTER, with slightly better resolution, and we had a couple of those scenes. We also used the PRISM sensor, which is a stereo two-and-a-half-meter sensor. It's very good. [unclear] There are indications of these icebergs here, and just by way of context, those bergs are grounded in 70 meters of water and they're holding up the ice: this is the ice that's crushed against the iceberg and is being held back.
Another image comparison: this is a Landsat 7 image, and one of the challenges that we have is these SLC-off breaks in the image. If you're familiar with Landsat 7 imagery, it had a minor failure in about 2003 that led to these gaps in the images, making it a bit challenging to interpret what we're seeing here. [unclear] I want to point out that interpreting these features is not trivial. It takes a lot of understanding of what's going on with respect to how things form and how they act.
Some more examples: [unclear] melt water sitting on the ice. These are some of the training materials that we provided to our crowdsourced interpreters, which I'll talk about in a second. One last example: here's another image, of a large iceberg grounded in fast ice, and three years later it's still there. To give you a sense, this is one of the reasons that automated interpretation did not work very well for this application, which is why we turned to human interpreters. So how did we do this?
Well, open source all the way. We gathered the images [unclear]; using GRASS we corrected them, color-combined them and tiled them. We served them using [unclear], for the back end we used [unclear], and for reporting and analysis we used [unclear].
Here is what our user interface looks like. It's basically a digitizing capability: it serves the tiled images and provides the digitizing tools. And this is the interface which allowed us to very quickly assess the results that our interpreters were providing. We had an expert interpreter who would go through these and essentially score them, and that gave us the ability to provide feedback to all of our interpreters, regardless of where they were or when they were doing the work.
We also provided an interpretation key, and we built this up as we went. It shows the various types of features that were observed, with information about the size of each, what the image was, and a comment about how it appears and why it looks the way it does. It's very important, I think, to give this sort of guidance. So this was a live document that got built as we went along. One example was this island. This is not an iceberg; it's actually an island that was not on the charts, and it got interpreted as an iceberg several times.
So, our interpreters: we recruited them from universities, and there were other people as well. We selected them because they were interested in GIS, but they had no experience with sea ice and only minimal image interpretation experience. What we asked them to do was analyze as much of the scene as they could, outline that area, digitize the icebergs they observed in there, and then mark the image as completed. We would only pay them on the basis of icebergs that they digitized and that we approved. In other words, if they digitized a piece of ice and we said "that's an iceberg", they got paid. They were also paid for completing an image [unclear].
This fit into our approval process, where our expert interpreter would look at all the results, score them and add a comment if necessary. If they had digitized a valid feature, it was accepted and went into the database; a feature that was not valid was rejected. But he could also score it as "fix geometry" or "fix type" if it was classified incorrectly, and they did go back and fix those.
Assessing their performance is something we actually had to do on two levels. The first one is quality; in other words, what is the rejection rate, and how well did they digitize the outlines. We did a fair bit of analysis, and we actually provided them with a sort of daily score of how they were doing. It was daily because that's how often the supervisor had a look at the work. Overall the rejection rate was less than 2%.
You can see that at the beginning of the work we started off with quite a few interpreters, I think 15, and some never actually interpreted anything. So a lot of people self-selected out of the program, and at the end of the day only a few did almost all the work, over a period of 12 weeks. And there's the rejection rate: you can see that the geometry was usually pretty good, except at the beginning, when there were a few geometry problems and a lot of not-valid features interpreted, and that I think relates back to the difficulty in doing this. But overall I think it's a very good result, and we ended up interpreting some 47 thousand icebergs.
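
The review and payment scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the score labels, the function names, the interpreter names and the pay rates are all invented (the speaker says he cannot recall the actual rates), and the review records here are plain tuples rather than database rows.

```python
from collections import Counter, defaultdict

# Assumed score labels, loosely following the categories mentioned in the talk.
SCORES = ("accepted", "not_valid", "fix_geometry", "fix_type")


def daily_scores(reviews):
    """Share of each review score per (interpreter, day)."""
    counts = defaultdict(Counter)
    for interpreter, day, score in reviews:
        if score not in SCORES:
            raise ValueError(f"unknown score: {score}")
        counts[(interpreter, day)][score] += 1
    report = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        report[key] = {s: counter[s] / total for s in SCORES}
    return report


def payment(reviews, interpreter, rate_per_feature,
            completed_images=0, rate_per_image=0.0):
    """Pay only for approved features, plus an optional per-image amount."""
    accepted = sum(1 for who, _, score in reviews
                   if who == interpreter and score == "accepted")
    return accepted * rate_per_feature + completed_images * rate_per_image


# Made-up review records: (interpreter, day, score).
reviews = [
    ("ana", "day-1", "accepted"),
    ("ana", "day-1", "accepted"),
    ("ana", "day-1", "fix_geometry"),
    ("ana", "day-1", "not_valid"),
    ("ben", "day-1", "accepted"),
    ("ben", "day-2", "fix_type"),
]
report = daily_scores(reviews)
ana_pay = payment(reviews, "ana", rate_per_feature=0.25,
                  completed_images=1, rate_per_image=2.0)
```

Keeping the per-day shares separate for each score makes it easy to show an interpreter whether their problem is outlines ("fix geometry"), classification ("fix type") or digitizing things that are not icebergs at all.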
The other problem, of course, is completeness: if they look at the image and miss icebergs, how do you detect that? The approach we used is essentially a single-blind test: we would give the same image to multiple interpreters and then compare the results. That was a challenge to analyze, but we worked through it, basically using spatial analysis capabilities. We did that for about 10% of the images. What we found was that for small features it was very difficult to get agreement, but for features that were large enough that you couldn't miss them if you were doing a proper job, we found that there was less than one missed per 100 square kilometers. The other thing that affected the results was the SLC-off artifacts: there were a lot of issues where the interpreters would not cross the gap to match up the parts of an iceberg on either side. There's a table listing all the results; we only actually compared four analysts, I think, to do that. The bottom line is that the results were very good.
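
A comparison of two interpreters' results for the same image, as in the single-blind test just described, might look roughly like the sketch below. It is a simplified stand-in, not the analysis actually run in the project: features are reduced to bounding boxes in meters, a feature counts as found if any box from the other interpreter overlaps it, and all names, coordinates and the 250 square kilometer area are invented for illustration.

```python
def boxes_overlap(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def missed_features(reference, candidate, min_size_m=0.0):
    """Reference boxes of at least min_size_m with no overlapping candidate box."""
    missed = []
    for ref in reference:
        longest_side = max(ref[2] - ref[0], ref[3] - ref[1])
        if longest_side < min_size_m:
            continue  # too small to expect agreement between interpreters
        if not any(boxes_overlap(ref, cand) for cand in candidate):
            missed.append(ref)
    return missed


def missed_per_100_km2(n_missed, area_km2):
    """Normalize a miss count to the interpreted area."""
    return 100.0 * n_missed / area_km2


# Made-up digitized features from two interpreters working on the same image.
interpreter_a = [(0, 0, 100, 100), (500, 500, 800, 700), (2000, 2000, 2050, 2040)]
interpreter_b = [(10, 10, 110, 90), (2010, 2005, 2060, 2045)]

missed_by_b = missed_features(interpreter_a, interpreter_b)
rate = missed_per_100_km2(len(missed_by_b), area_km2=250.0)
```

Raising `min_size_m` restricts the comparison to features large enough that a careful interpreter should not miss them, which mirrors the distinction the speaker draws between small and large features.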
So, the business case for this approach, I think, is strong. Our approach was to provide a remuneration scheme to the interpreters that motivated them to do the work, while minimizing the costs that we incurred. We didn't have to provide office space, and we did not have to provide them with software; all the software was hosted in-house. At the same time they got valuable experience and guidance from a very skilled image interpreter, so there was basically payment both ways.
The trick, I would say, is that targeted recruiting is very important. You need to find people that are motivated to do it. When we look at something like OpenStreetMap, people are doing it because they live in the neighborhood. When we look at icebergs off the coast, people are only doing this to get paid. There might be a few with experience there, but we haven't found any. The performance-based remuneration has to be carefully tuned, and I think we did a good job of it, but to be honest, if you ask me what I paid people, I can't recall.
The other thing is that the expert reviewer has a lot of work to do, because these people are working evenings and weekends, and to give instant feedback it's best if he does a couple of hours of work every six or eight hours. So it might be good to have more than one person, depending on the size of the project. As for the open-source tools we used, I think all those tools adequately, more than adequately, provided the results that we needed, and this could readily be scaled up to a much larger project.
And I think that's it. Thank you.
Thank you very much, a very great and interesting presentation that reminds us that geography has something to do with the real world. Are there any questions?
Thank you for the presentation, a pretty interesting approach. I was particularly interested in the crowdsourcing. What sort of channels did you use to actually get the people? That's the first question. The second question is how much did you pay them, what was the motivation scheme? And the third question is why didn't you use something like Amazon Mechanical Turk, if you are basically tying the incentive to monetary rewards?
OK, so the first thing we did was contact geography departments at several universities, who put postings on their websites. That attracted 60 responses, of which there were about 20 that got serious. We didn't go that far afield looking for people, because in the end we had face-to-face training sessions where we brought everyone together and gave them hands-on training. We could have done that remotely. The second question is what we paid them, and I can't remember, it's been a couple of years, but the bottom line was that there were two people that worked really hard at this, and they estimated they were earning about 60 dollars an hour. Once you get good at it, you can click through a lot of icebergs. That's Canadian dollars, I should stress. On the last question, about Mechanical Turk: we did look at that, but there were a number of issues with respect to quality control and, secondly, with respect to delivering the data through the Amazon system. We never used it, we only looked at it. It made more sense to build our own system because we had most of it in place.
Thank you. Other questions?
Thanks. You suggested this could be scaled up. If you scaled up to a much bigger project, would you persist with having an expert verify every single thing, or move to a more statistical approach to error rates?
I think we could use a more statistical approach, but actually the expert assessment approach worked pretty well with the Django interface. Basically we used the Django template language to create a master file for every little image, and it pops them up. You can scroll through them really fast, and once you've been doing it for a while it's not that much work. It's the feedback that's more of a challenge. But I can see where you would want to do that.
Are there any more questions? Then thank you very much.