Barriers to FOSS4G Adoption: OSGeo-Live case study

Video in TIB AV-Portal: Barriers to FOSS4G Adoption: OSGeo-Live case study

Formal Metadata

Barriers to FOSS4G Adoption: OSGeo-Live case study
Title of Series
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Open Source Geospatial Foundation (OSGeo)
Production Year
Production Place
Portland, Oregon, United States of America

Content Metadata

Subject Area
OSGeo-Live is a Linux distribution, available in virtual machine, bootable DVD, or bootable USB formats, containing a curated collection of the latest and best Free and Open Source Geospatial (FOSS4G) applications. This talk investigates the correlations between worldwide download distribution and community participation on the one hand, and indicators of economic, technical-knowledge, and socio-cultural barriers to geospatial technology and FOSS adoption on the other. A better understanding of the barriers to technology transfer is important to the outreach efforts of the FOSS4G community and to understanding the market development potential of FOSS4G around the world. Results of an analysis of the OSGeo-Live community will be shown, but the techniques discussed can be applied to any software project.
Keywords OSGeo-Live Community Education
Covering space Group action Multiplication sign
Point (geometry) Installation art Slide rule Context awareness Open source Projective plane Keyboard shortcut Virtual machine Basis <Mathematik> Parameter (computer programming) Grass (card game) Revision control Right angle Resultant Operating system Window
Point (geometry) Web page Trail Open source Multiplication sign Virtual machine Translation (relic) Water vapor Login Twitter Number Revision control Mathematics Different (Kate Ryan album) Energy level Endliche Modelltheorie Booting Installation art Addition Graph (mathematics) Theory of relativity Touchscreen Mapping Line (geometry) Limit (category theory) Type theory Software Window Operating system
Trail Greatest element Open source Information Personal digital assistant Video game Quicksort Datei-Server Number Physical system
Statistical hypothesis testing Greatest element Multiplication sign Gene cluster Translation (relic) Online help Computer Theory Product (business) Formal language Revision control Coefficient of determination Latent heat Cross-correlation Internetworking Average Computer hardware Cuboid Physical system Installation art Shift operator Dialect Distribution (mathematics) Mapping Chemical equation Mathematical analysis Instance (computer science) Category of being Inversion (music) Personal digital assistant Chain Pattern language Figurate number Bounded variation Window Operating system Geometry
Greatest element Computer file Multiplication sign Connectivity (graph theory) 1 (number) Mereology Wave packet Very-high-bit-rate digital subscriber line Hypermedia Internetworking Average Computer hardware Square number Diagram Software testing 9 (number) Mathematical analysis Electronic mailing list Bit Variable (mathematics) Measurement Connected space Category of being Order (biology) Right angle Summierbarkeit Quicksort Ranking
Algorithm Randomization Linear regression Code Decision theory Mathematical analysis Virtual machine 1 (number) Line (geometry) Price index Variable (mathematics) Measurement Number Category of being Subject indexing Type theory Numeral (linguistics) Cross-correlation Causality Term (mathematics) Forest Network topology Right angle Resultant
Divisor Personal digital assistant Multiplication sign Measurement Wave packet
Point (geometry) Server (computing) Group action Open source 1 (number) Virtual machine Translation (relic) Mereology Thresholding (image processing) Computer Number Formal language Average Term (mathematics) Internetworking Core dump Cuboid Office suite Demo (music) Software developer Projective plane Mathematical analysis Maxima and minima Subject indexing Category of being Software Personal digital assistant Order (biology) Reading (process) Resultant Local ring
Axiom of choice Code Multiplication sign View (database) Materialization (paranormal) Archaeological field survey Water vapor Mereology Stack (abstract data type) Optical disc drive Mathematics Mechanism design Different (Kate Ryan album) Physical system Scripting language Intelligent Network Email Bit Virtualization Repository (publishing) Telecommunication Order (biology) Website Pattern language Right angle Quicksort Laptop Trail Slide rule Server (computing) Open source Letterpress printing Virtual machine Translation (relic) Theory Number Revision control Goodness of fit Internetworking Bridging (networking) Term (mathematics) Software testing Distribution (mathematics) Dependent and independent variables Key (cryptography) Projective plane Mathematical analysis Database Voting Kernel (computing) Software Integrated development environment Query language Video game Point cloud Window
I'm Alex Mandel. I just finished my PhD in geography at UC Davis a week and a half ago, so what you're going to see here is one of the chapters from my dissertation. If I don't cover something in enough depth, or you really want to know what I did, you're welcome to go read that. I've been in the OSGeo-Live contributor group since about 2009, as you'll see later. I think the first time we gave out an OSGeo-Live at a FOSS4G conference was in Australia. [unclear] It had a strange name back then.
So the big purpose of this talk is this: if you work on an open source project, you're often asked questions about who uses your project, why they use it, where they use it, and whether there are reasons why they can't use it. You know, why don't we have more people using it when it seems like the obvious solution for so many things? So the first way I'm looking at OSGeo-Live is knowledge diffusion, which is bringing awareness to people. A conference like this is a method of knowledge diffusion: all the people that come here are going to hear about some stuff from me, which you may or may not choose to adopt. That's the second thing on the slide: if you actually use something, in this way of thinking that's called adoption. Just being aware of something is the first step; you obviously can't adopt something until you know about it. And once you know about it, there's this whole thing of: is it appropriate for what I need to do, do I understand how to use it, and a whole bunch of other things that you need to go through before you decide that, indeed, I'm going to use this tool. So I decided to study these parameters in regard to OSGeo-Live. OSGeo-Live, for those who
don't know about it, is a project that was created by the Marketing Outreach Committee of OSGeo years ago, and it is specifically intended for demonstration and educational purposes. It's a live operating system that can run from a DVD, a USB stick, or a virtual machine, and it lets you try out almost anything you can think of that is open source in geospatial, already installed. It comes with data already preloaded, and it comes with a short tutorial on how to get started. So it's trying to get people over the initial hump of "I have no idea how to install GRASS." Windows users, did you ever try to install GRASS like five years ago? It wasn't easy, was it? Right. They've got that sorted out now, but this is kind of a shortcut, so you don't have to go through learning to install something just to decide if you even want to try it. You can just start with this and go from there, and if you want your own installation, then you can go seeking out additional knowledge on how to do things like installation. What you see here is the version that we made and released for this conference, and I believe you will be able to get a USB stick loaded with it at some point later this week; there were some technical difficulties with the USB supplier. You might look on here and recognize a few of those things, and you might look and say, "Oh, but there's like 30 more things I've never even heard of," and there are talks on most of them at this conference. So it's kind of a way to explore the world of FOSS4G and try out things that may not be relevant to what you do on a normal basis but, who knows, in a couple of years might be relevant to the work you do. OK, so this
is one of the charts about the history of OSGeo-Live. It's really hard to read the bottom graph, but basically it's just charting all of our releases. Since we have different types, the top graph is showing the size difference between the virtual machines and the two different kinds of DVDs; the full one comes with the Windows and Mac installers in addition to the bootable operating system. You can see over the years that there are some limits that we had to stay under, and as we keep adding more and more stuff we have to stay under those limits. Aside from that, it's actually been fairly steady after the first few years. The second chart shows the downloads we've had over the years. The early data is a little muddled because we were using multiple mirrors, we added and subtracted mirrors over time, and we didn't keep all the logs in one place. So if you're using multiple download places, pull your logs as you go; that's the lesson we learned. We're now using SourceForge for downloads. For the last two releases, 6.0 and 6.5, which are what I really analyzed, it's really easy to keep track of the numbers because there's only one place to go to find all the downloads from all the SourceForge mirrors. These days, the top one there is, I think, about 20-some thousand downloads, and it kind of fluctuates. I think our 7.0 release had something like 30 thousand downloads, but our 7.9 release only had about 20 thousand downloads.
I think it has to do with how much time there is in between releases. We try to do about six months, and we're trying to do a whole version for every FOSS4G. The bottom chart shows the coders and, in a line you probably can't see because it's yellow on the screen, the translators, and how those change over time. The one thing I'll point out is that the blue line, the people configuring the software, also happens to include people who wrote the English version of the documentation, so it's not purely coding. Those are what we call contributors, and then there are translators.
A couple of maps, really quickly, that show the downloads for 6.0 and 6.5 combined. There are two different maps here. The top one is purely by number of downloads per country; you can see that the United States had the most downloads of any country. The second map shows that number divided by the number of people in that country, and you can see that it's not the same thing: having the largest number of downloads is not the same thing as having the largest open-source community when you're going by relative percentage of the country. The hot spots you'll actually see on the next page.
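The difference between the two maps described here, raw download counts versus downloads relative to population, can be sketched in a few lines. This is only an illustration: the country names and all numbers below are made up, and `per_100k` and `ranked` are hypothetical helper names, not anything from the talk's own code.

```python
# Hypothetical numbers for illustration only; these are NOT the talk's data.
downloads = {"Country A": 12_000, "Country B": 900, "Country C": 150}
population = {"Country A": 300_000_000, "Country B": 5_000_000, "Country C": 300_000}


def per_100k(downloads, population):
    """Downloads per 100,000 inhabitants, for countries present in both tables."""
    return {
        country: downloads[country] / population[country] * 100_000
        for country in downloads
        if country in population
    }


def ranked(table):
    """Country names ordered from the largest value to the smallest."""
    return sorted(table, key=table.get, reverse=True)


by_count = ranked(downloads)                       # ranking for the first map
by_rate = ranked(per_100k(downloads, population))  # ranking for the second map

print("by raw count: ", by_count)
print("by population:", by_rate)
```

With these invented figures the two rankings come out in opposite order, which is the point made about the maps: the country with the most downloads need not be the one where the largest share of people download.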
Here are the top countries by number of downloads versus the top countries by percent of population downloading. There are some things that you probably wouldn't expect on the percent-of-population side. The first one there [country name unclear], I think something like three of the people attending this conference are from there, so clearly that is an open-source hotspot of some sort, and you wouldn't have known that if you just looked at regular downloads. Then, to get into even more crazy details,
what operating system people use varies highly by where they are. I don't have any explanations for why that is; I can really only describe it in this case. You can see at the bottom here, on the Linux side, there are some countries that you wouldn't necessarily expect to have high Linux usage. They may not have high Linux usage for the whole population, but the people who are into OSGeo-Live happen to all use Linux, in Tanzania for example. On the other side, looking at Mac, apparently people in some countries really like their Macs. So there are some interesting things here when you're thinking about who the audiences for a project are and who you're targeting. This is useful information that you probably didn't know you could get out of your download information, if you're keeping track of the logs. So, a little further into the
analysis, I did some statistical tests, for those wondering what I did. You can see that for OSGeo-Live downloaders in general, Windows is still dominant, as expected, but not as dominant as among the general Internet-going computers of the world. Mac is solidly what you'd expect: almost the exact same percentage of Mac users that are out there in the world is the percentage among the downloaders of OSGeo-Live, and Linux makes up the balance. An interesting thing is that for Windows, Linux, and the "other" category, which means they couldn't figure out what operating system it was, the full ISO, which is the one that contains both the live operating system and the Windows and Mac installers, is the most popular, except on a Mac, where the virtual machine is most popular. I have some theories about that. The Mac hardware is really good, and virtualization is kind of no cost, in that you can actually run an almost full-speed desktop inside of a window, so why not? Another thing might be that the bootable USB sticks don't really work on a Mac; that's one potential reason for it. You might have some other ideas, but it's interesting that they've clearly picked up on it without us having to tell them that that was the case. Then the boxplot at the bottom is showing the variation. At the top you can sort of see the average for everything. At the bottom here, every little dot is a different country, and where there are large clusters of dots, a box is drawn. So you can see the average bar, but there's a whole lot of variability. Then I took
a look at the geographic distribution of who is a contributor and who is a translator. A contributor is someone who wrote some installation scripts or helped with the datasets or the documentation; translators translated from English into something else. You can see there are some patterns in this. Western Europe seems to have a decent amount of everything; not surprising, as a lot of people at this conference are from Western Europe. People in North America don't seem to do much translating; also not a terrible surprise, as our education system doesn't really stress second languages all that much. And you can see that there actually is a fair amount of distribution. There's a time axis here going from the earliest versions to the most recent, so you can see growth over time, and you can see when Asia started getting interested in doing translations; they really picked up in 6.0 and 6.5, and South America also picked up then. For both of those instances, I think you can connect the rise to specific persons. There is a strong correlation here: having contributors and translators, with the two combined, so having more people in either of those categories, does correlate with there being more downloads of OSGeo-Live in that country. I actually did the analysis by country; I'm showing regions here. So local matters: if you have a local chapter, that seems to influence whether you're more likely to download things. So having more local chapters is probably an important thing for OSGeo to think about in the future if we want a lot more people to be using OSGeo products. Right, now getting into the
meatier topic: there are all sorts of things that could be impeding the usage of software. These are some of the ones that I thought about a little bit for this analysis; you can probably think of a lot of other ones. I lumped them into three categories, roughly: economic, technical, and socio-cultural. The point of this diagram is that there are a bunch that overlap, in ways we wouldn't necessarily expect or in obvious ways. One of the easiest ones is training time. It's not enough to say that training time is a technical thing, because obviously you need the know-how to use software in order to adopt it, but training time actually costs money. It's not a free thing: you have to go to school for it, or you have to have work time to do it, or you have to spend your spare time doing it, and you can only spend your spare time doing it if you make enough money that you actually have spare time. So in the next part of the analysis, I tried to find some measurements for a few of these things, to test which of them were actually barriers and which were the more important barriers to people downloading. So here's
a list of the variables that I pulled down. You can see I have a bunch of things in here that are measuring Internet speed. I consider speed to be a technical barrier, as it's hardware related, but also an economic barrier, because you have to pay for high-speed Internet. Downloading a 4-plus-gigabyte file is pretty hefty. I think the world average is about 3.5 megabits per second, so it's two and a half hours to download the ISO. That's quite a sizable chunk of time, and that assumes you have a reliable connection. If you have an intermittent connection, for some people it can take hours or days. I've also got a few more directly economic rankings, income rankings. And then the one at the bottom here is really interesting; I used it as a socio-cultural measure. The Democracy Index is a ranking of how democratic a government is. These are national governments; 0 is completely autocratic and 10 is 100 percent democratic. There are none at 100 percent; there are some in the high nines, the northern European countries. So we take all those and put them into the
analysis. A point of clarification: the economic and income ones were categorical, with pretty broad categories, so they're somewhat useful but obviously not as useful as having purely numerical data; all the other data was purely numerical. I did a regression-type analysis, so if you know about linear regressions, this is kind of like that. But there's a problem where all of the variables we're speaking of are correlated with each other, not just with the number of downloads, and so you have to come up with some ways to work around that. One of the ways is to use a machine learning algorithm called random forests. This is the R code I actually ran to do the random forests. It's a decision-tree approach that helps weed out what's important and what isn't important. It uses regression underneath as the principle for identifying that, but by doing lots of repeated tests and by dropping variables here and there, it can really parse that out, and the results are way easier to understand. Here are the
results. What the first chart, over on your left, is showing is that the Democracy Index was the best indicator, in terms of correlation, not causation. So the government isn't necessarily causing the number of downloads, but the type of government highly correlates with the number of downloads. After that, basically anything to the right of the dotted red line was important enough to consider important; everything else was negligible, and you couldn't tell them apart. So: Democracy Index, then income, and ITU broadband, which is one of the broadband measures. ITU broadband was specifically the one
that says broadband is something faster than 256K, as opposed to the other measures, which set a higher threshold for what counts as broadband.
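The problem raised in this part of the talk, that the candidate barrier variables are correlated with each other and not only with downloads, can be checked before any modelling by computing pairwise correlations among the predictors. The sketch below does only that; it is not the random forest analysis the speaker ran in R. The variable names and every value are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up values for six imaginary countries; NOT the talk's data.
predictors = {
    "democracy_index": [2.1, 3.5, 5.8, 6.9, 8.2, 9.3],
    "avg_speed_mbps": [0.8, 1.5, 2.9, 4.1, 7.5, 9.0],
    "income_score": [1, 1, 2, 3, 4, 5],
}

# Print the correlation for every pair of predictors.
names = list(predictors)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson(predictors[first], predictors[second])
        print(f"{first} vs {second}: r = {r:.2f}")
```

When most pairs come out strongly correlated, as they do with these invented values, the coefficients of an ordinary linear regression become hard to interpret, which is the motivation given in the talk for switching to a tree-based importance ranking.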
So that first result means that the socio-cultural side is actually the biggest barrier to downloading. Then we were kind of curious about what happens if you take that out: what is important then? When you take that out, income comes out as important. So what we're looking at is that, while most of us tend to think about these things as technical issues, that we need more training material or more training time, it may actually be that some other factors, like business practices, or government funding, or your company allowing you to do training, are a bigger impediment to adopting any new software, or in this case simply to trying OSGeo-Live. And so, for
those four important ones that I mentioned, the Democracy Index is the first chart there. You can see the blue line on that one, which is sort of a moving average, and it shows you that anything below 6 is about the same. Once you hit 6, that's the inflection point: once you pass a certain amount of democracy, a certain amount of democratic government, you start increasing the number of downloads. The second chart, the boxplot there, is the income rank. My read of this is that for income category 1, which is basically high-income OECD members, if you're in that category you're going to have a lot more downloads than anybody else. Once you get past that, you can see the black middle bars, which are the averages; they're all kind of the same, so you can't really tell them apart all that much. It's kind of like you're either in category 1, or everything else is pretty much the same, except for maybe category 5 on the end, which is countries that I wouldn't expect to even have the computer infrastructure for downloading in a lot of cases. I had some other interesting stuff in there, like this chart along the bottom, which is the downloads by peak speed. You can see an increase in downloads up to a point and then it levels off: basically, once you get past a certain speed, your Internet is fast enough, and it doesn't affect the downloads. I'd like to try and find where the exact threshold is: what's the minimum speed that people need in order to download something. That's going to vary highly for other projects, because OSGeo-Live is huge in terms of size compared to most of the other projects that anybody here is talking about. But this analysis can be repeated on any other project; that's kind of the point of this talk. I happened to study OSGeo-Live in this context, but I think it's an important analysis that we do
on other projects, especially when you want to see who your community is, and to try and figure out what you can do and what kind of incentives would help spread your software to other places. This is a
summary of the results. Mac users like virtual machines. OSGeo-Live was popular with Linux users, an obvious one. Having participants in your country correlates with downloads, which is the same thing as saying that local chapters and local language groups matter. Culture is a big barrier, and I don't know if we address it enough as developers, because we tend to try to avoid socio-cultural issues, I think, as developers. But despite that, there are still, of course, other issues once you eliminate some of the cultural blockers. Then some important things that came out when I was trying to think about what really matters. There's been some discussion that the ability to trial everything, and try it as much as you want for as long as you want, is a huge win for open source. The other thing that I was reading is that the ability to reinvent, which is a core principle of open source, actually matters a lot, even if there aren't a lot of coders, because reinvention also implies that people can adapt software to meet their needs. They don't necessarily use it how it comes out of the box, which is also a huge part of open source: you can change it and use it however you want, and we don't care how you use it. Translation is an interesting one. I don't know whether translation actually matters or not; I just know that the countries that had translators used it more. So there are a few things here that I think could be explored more in depth, and really the only way that I can think of to get
at that is doing user survey questionnaires, which have a little bit of bias, obviously, because the people who had positive experiences with OSGeo-Live are more likely to answer a survey. I'll have to cross that bridge when I get to it; I'm not quite sure how to do that. Then I'm interested in trying to figure out how fast an Internet connection is good enough, since our community relies on the Internet heavily for the version repositories, e-mail communication, IRC, and the website with tutorials. Having good Internet access is obviously a key resource for knowledge diffusion in the open-source world. We don't just send print manuals to places, and we don't have salespeople who take materials to other countries and sit down with people, especially in educational institutions, and convince them to use the software. Then I think there's room to test a lot more specific data: actual household income; there's actually some data out there on English proficiency, since a large part of it is written in English; and it's been suggested that higher education might be a precursor, knowledge that you have to have in order to move into the more technical world. And then, of course, I only analyzed up until about a year and a half ago; there's a lot more data now, so I can start looking at change over time and seeing if there are patterns in the geographic distribution versus the time distribution.
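The open question raised here, how fast an Internet connection is "good enough", starts from simple arithmetic. The sketch below works through the download time for an image of about 4 GB at a few connection speeds. It assumes decimal gigabytes and reads the quoted world-average figure as 3.5 megabits per second, the reading under which the talk's "two and a half hours" holds. The function name `download_hours` and the list of speeds are illustrative choices, with 0.256 standing in for the 256K broadband threshold mentioned in the talk.

```python
def download_hours(size_gb, speed_mbit_s):
    """Hours needed to transfer size_gb (decimal gigabytes) at speed_mbit_s (megabits/s).

    Assumes a perfectly steady connection with no protocol overhead or retries.
    """
    bits = size_gb * 1e9 * 8                 # gigabytes -> bits
    seconds = bits / (speed_mbit_s * 1e6)    # megabits/s -> bits/s
    return seconds / 3600


for speed in (0.256, 1, 3.5, 10):
    print(f"{speed:>6} Mbit/s -> {download_hours(4, speed):5.1f} h")
```

An unreliable connection only makes these figures worse, since an interrupted transfer may have to be restarted.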
That's it for OSGeo-Live. If you want the slides, the database, the R code, and the Python stuff that I used for this project, it's up online, and I'll put up the chapter from my dissertation with it. Hopefully someone can reuse that somewhere, or convince me to do it for your project if you want. I'm open for questions.
(Audience question, inaudible.)
I think there might be a couple of different explanations. One is that they could be downloading it to give to a friend, if they are downloading the full ISO that has the Windows and Mac installers; that seems a little at odds with running a Linux machine, why would you need that? But the other thing is, in my personal experience, I use virtual machines for development environments, and a pre-made development environment is something you can just work with and experiment with pretty easily, and you don't have to install it to your system. Especially since OSGeo-Live has a lot of server stuff, you may not want all the server stuff, which is more than half of it, installed on your desktop or laptop, running on all sorts of ports all the time; that's kind of overkill. So I think it's about having a contained environment for testing ideas, and it's quick and easy: once you've got the virtual machine, you make a copy, boot it, play with it, kill it, make a new one. It's the same kind of people who are into Vagrant and running huge virtual stacks on the cloud, that sort of stuff. I think that's a lot of why Linux users use it.
(Audience question, inaudible.)
For the download numbers, I purely use what comes out of SourceForge's API, so it's all the stuff you can access by clicking on a project and seeing how many downloads they have, with different views such as by country or by operating system. I wrote a Python script that pulls
all that stuff down for a given project, for specific folders, and then puts it into an SQL database that I can query stuff out of in R.
(Audience question, inaudible.)
I did not really find projects delving into it. I may not be finding the right terms, and I may not be finding the right kind of researchers. I think there are probably projects that analyze their own downloads, but I don't think they've gone into the depth of looking at the barrier analysis; that's something I don't think any other project has done. The sort of looking at operating system by country, I suspect that projects have done that, but it's not in the published literature. It's probably projects saying, "By the way, we have this many downloads." QGIS tracks downloads on Windows versus Mac and Linux, but it's not all in one place and not analyzed to see if it's statistically significant or not.
(Audience question, inaudible.)
Release candidates are very low. Usually it's the dev team using the release candidates, plus the translators. We don't really have general people using the release candidates, and they go by really quickly before we build the final. I think the majority of people download the final. We have download numbers on it, and they're small.
(Audience question, inaudible.)
I think the only way to actually go into those is a follow-up survey, which I really wanted to do for this but didn't have time for, because you really have to ask people: did you use it? What did you adopt, or why didn't you adopt? And find out things like: do they work for governments, do they work in educational institutions, do they work for themselves? All of those things are going to give you a robust amount of data to really look into why they're making the choices they did. But honestly, OSGeo-Live isn't necessarily the best software to do that analysis on, because adopting OSGeo-Live just means you tried it. With more end-user things, like if we did a survey like that for
QGIS users, I think we could probably learn a lot, and with QGIS users we might get a pretty robust response.
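The download-collection workflow described in the Q&A (a Python script that pulls per-country and per-operating-system counts for a project folder and stores them in an SQL database for later querying) might be sketched roughly as follows. This is not the speaker's script. The URL pattern in `stats_url` and the `countries` / `oses` keys in the sample payload are assumptions about the SourceForge statistics API that should be checked against its documentation; the project and folder names, the table layout, and all counts are hypothetical. The sketch makes no network request and loads an inline sample instead.

```python
import json
import sqlite3


def stats_url(project, folder, start, end):
    """Build a download-statistics URL for one project folder.

    ASSUMPTION: this URL pattern is illustrative; verify it against the
    SourceForge download statistics API documentation before relying on it.
    """
    return (
        f"https://sourceforge.net/projects/{project}/files/{folder}"
        f"/stats/json?start_date={start}&end_date={end}"
    )


# Inline stand-in for an API response. The key names and the
# [label, count] pair layout are assumptions; the counts are made up.
SAMPLE = json.loads(
    '{"countries": [["Country A", 120], ["Country B", 45]],'
    ' "oses": [["Windows", 90], ["Linux", 50], ["Macintosh", 25]]}'
)


def load(conn, version, payload):
    """Store per-country and per-OS counts for one release in a single table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS downloads "
        "(version TEXT, dimension TEXT, label TEXT, n INTEGER)"
    )
    rows = [
        (version, dimension, label, count)
        for dimension in ("countries", "oses")
        for label, count in payload.get(dimension, [])
    ]
    conn.executemany("INSERT INTO downloads VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, "6.5", SAMPLE)

# Example query: countries for one release, most downloads first.
top = conn.execute(
    "SELECT label, n FROM downloads "
    "WHERE dimension = 'countries' AND version = '6.5' ORDER BY n DESC"
).fetchall()
print(stats_url("example-project", "6.5", "2013-01-01", "2013-12-31"))
print(top)
```

Keeping every release in one table with a `dimension` column makes it straightforward to export the counts to another tool, such as R, for the statistical tests.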

