Implementing Open Geospatial Data Portals with CKAN, pycsw and PublicaMundi: the case

So, welcome again.
I'm going to present the new geodata.gov.gr, which is essentially based on the work that has been done with CKAN, pycsw and an EU project called PublicaMundi. This is
the outline of this talk. We're going to make an introduction to the PublicaMundi project and to CKAN, and I'm going to skip pycsw because I already talked about it before. Then we're going to see what the old geodata.gov.gr was like, how data.gov works, and how it differs from geodata.gov.gr, and I hope to have some time to show you a demo. So again, this is a very generic introduction: this is the definition of open data, the idea that some data should be freely available to everyone to use and republish. The Open Data movement is similar to other open movements such as
open source, open hardware and open science. data.gov is the home of the US Government's open data. It was launched several years ago, revised three years ago, and ported completely to open source software through CKAN. Today it is open source, it's on GitHub, and you can find the source code there. In Portland I gave a presentation about data.gov, so now you'll see the difference in how this evolved into something new. geodata.gov.gr is something that has existed since 2010; it was actually one of the first five national open data portals. It is a spatial data infrastructure that also serves as a national open data catalogue, it is INSPIRE compliant, and it is maintained by the Athena Research Center, which is responsible for maintaining it and providing support and anything else on this platform. The old
geodata.gov.gr was based on customized versions of open source software, and some of its components were built upon MapFish. At some point CKAN came along. CKAN is software from the Open Knowledge Foundation; it is a platform for publishing and sharing open data. It has been very popular in the open data community, and it is also very popular for national portals. Right now CKAN powers the European Open Data Portal, the European Data Portal (this was six months ago), the USA open data portal, the UK data portal, the Australian one, the Greek one and many others; if you go to this site you'll see hundreds of them. It is a powerful system, built on Python and Pylons. It provides the tools to streamline publishing, sharing, finding and using open data, and it was designed to help publishers, which is probably why it was such a success. Instead of asking for 50 metadata fields, CKAN said: OK, just upload your data and give me some minimal information, and it will be available. That was not very compliant with standards, but it was acceptable because publishers could just go there and publish their data, so it was a success story for them. These are the main features of CKAN: somebody can publish and find datasets, store and manage data, and do federation between nodes through harvesting; it has an excellent API and an extension mechanism, so somebody with Python skills can extend CKAN. These are some screenshots of how things are done in CKAN, and this is a search
result and discovery, and many other features which I will also talk about later. So I'm going to skip
the pycsw slides; I'm just going to say that it's an OGC CSW server implementation in Python, it is open source, and it's an OSGeo project. Now let's move to PublicaMundi. PublicaMundi was a project funded by the European Union under FP7, whose goal was to extend CKAN in some new ways regarding open geospatial data. We wanted to do research and development to make CKAN more scalable, to make data more reusable, and to facilitate the publication, discovery and reuse of data. The main idea was that we wanted to add the OSGeo stack on top of CKAN, because CKAN already had some spatial extensions but did not provide direct publication of, for example, WMS or WFS. What we saw is that we should bring the OSGeo stack in directly and integrate it with CKAN, in order to make it easier for users to publish their data. So we extended CKAN using the OGC standards. The source code is open and the links are here; the presentation is also online on the PublicaMundi repository. So what kind of standards did we implement? We tried to use the INSPIRE directive for discovery services, view services and download services, and we added processing services there, because we think that using the data online is also important. We also integrated some new technologies: we went to the Earth observation community and integrated rasdaman to provide WCPS and WCS for raster data, and we integrated the ZOO-Project WPS with CKAN. The well-known OSGeo players, such as GRASS and others, were also used to create processes that ZOO offers on top of them through its WPS support. These are our contributions during the project: we
implemented OGC OpenSearch Geo in pycsw, as I already said. WPS 2.0 was also implemented by the ZOO-Project. The OGC transactional Web Coverage Service was a new specification that derived from the project through work done by the rasdaman group. CSW 3.0 was also implemented, even though it was released half a year later. We were also involved in the GeoDCAT discussions and contributed to them as well. And this is the consortium of the
FP7 project. So, a little flashback on what data.gov was like in 2013; this slide is from 2013. It was first launched in 2009, then it switched to an open source platform, CKAN, in 2012, and then it was integrated with pycsw in 2013 and 2014, with many updates since then. This is the architecture: we see
CKAN with the geodatagov and datagov extensions, using the CKAN API, with OpenLayers and Leaflet as data visualization tools, Solr as the search engine, and pycsw for CSW. That's more or less how data.gov is set up to date. This is how it looks: they changed the look and feel of CKAN completely in order to be able to search by specific organization, so somebody can just look for USGS data and get specific results. It now has 2 million records in there, so I think it's the biggest open data portal so far. There are some specific features that I don't have the time to go through: they support many endpoints, somebody can publish WMS, and they can harvest from many sources. The main idea of data.gov is similar to INSPIRE, but it was enforced from the top down. Instead of expecting many organizations to support the standard, they said: OK, we're going to set up data.gov and we're going to harvest everyone else into one portal, so if you're not compliant with what we want, you're not going to be on data.gov. All the organizations and government agencies in the US responded to that by providing CKAN installations and CSW endpoints, and then suddenly the catalogue was built by itself. data.gov doesn't have any data on site; it harvests everything and holds only harvested data.
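As a rough illustration of the harvesting idea described above, the sketch below builds the paged CSW 2.0.2 GetRecords requests (KVP encoding) that a harvester could send to an agency's CSW endpoint. It is not data.gov's actual harvester: the endpoint URL and the function names are hypothetical, and in practice the total record count would come from the `numberOfRecordsMatched` attribute of the first response rather than being passed in.

```python
from urllib.parse import urlencode


def csw_getrecords_url(endpoint, start=1, page_size=10):
    """Build one CSW 2.0.2 GetRecords request URL (KVP encoding)."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "brief",
        "resultType": "results",
        "startPosition": start,  # CSW record positions are 1-based
        "maxRecords": page_size,
    }
    return endpoint + "?" + urlencode(params)


def harvest_pages(endpoint, total_records, page_size=10):
    """Yield one request URL per page until total_records is covered."""
    for start in range(1, total_records + 1, page_size):
        yield csw_getrecords_url(endpoint, start, page_size)


# Hypothetical endpoint, for illustration only.
ENDPOINT = "https://example.org/csw"
pages = list(harvest_pages(ENDPOINT, total_records=25, page_size=10))
for url in pages:
    print(url)
```

Each URL asks for one page of brief records; a real harvester would fetch every page, parse the returned XML, and store the records in the central catalogue.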
So let's move on to geodata.gov.gr. This is the architecture it became after we did the project. Here we integrated data and web services such as MapServer, rasdaman, GeoServer and ZOO, and of course GDAL and PROJ are there because they are used everywhere in the world; this is the OSGeo stack. We are also based on CKAN with pycsw and Solr, but we also needed some extra custom spatial extensions. We added some analytics there too, so we are able to monitor the usage of our web services through our servers. We also did some more extensions: we provided a processing API and a data API directly over the data that we host, and we made some new additions, like raster and vector stores and some new metadata support. So now, how we
do integration: we have a beta deployment of the portal, which is labs.geodata.gov.gr, and we have the servers in the Ministry of Education because it's a government site, and we are using the Greek cloud stack. We deployed on several clusters, which cover everything from the database to CKAN and GeoServer, and we are using tiles and everything that's supposed to be in an SDI in order for it to be efficient. We're doing clustering for GeoServer, for example, but we're also doing things like using MapProxy and MapServer to serve tiles. So we tried to bring everything into one environment. We have
monitoring tools to be able to see whether some servers stop responding and to spin up new servers. As for the installation, because there are too many components in the new system, we did everything with Ansible, which is an excellent tool for deploying software on many servers and for doing DevOps work in this case. So this is the new thing: this is how we changed CKAN, and we also provide some new extra features which I'm going to go through now. The look and feel of the search page is not much different from what you can find in CKAN, but there are some hidden features. For example, we now do automatic generation of WMS and WFS if somebody uploads some vector data. We redesigned the dataset page, and now when somebody uploads a shapefile we have... sorry, this is in Greek, I will show you the English page later. These are the data, these are the services and these are the metadata, so we now have all of them on one page. Somebody can just upload a shapefile, and what happens is that when you upload the shapefile to CKAN, it will identify that it's a shapefile and then it will take you through some steps in order for you to publish your data. Initially CKAN just asks for a title, a description, a license and the organization; those are the four elements that CKAN asks for. You can still do that, but we added a full INSPIRE metadata editor to CKAN. So if you change this here and say that you want this to go the INSPIRE way, then you get the next stage, which is a full metadata editor inspired by the INSPIRE geoportal, so you can add things like lineage, spatial resolution, conformity and all this stuff. So if you go to
that trouble, which is required for EU organizations, then you have a full XML representation of the INSPIRE metadata, and you're able to publish it directly through CSW. Then we implemented a new dashboard. Before, when you uploaded a file, CKAN was not able to do anything with it; it was just there for download, nothing more. So now we made a new dashboard where, when you upload a file, let's say a shapefile, it will identify what it is and provide you with some options: you can ingest the resource into the database, which is PostGIS, or you can just ignore it, or you can make a copy of the data if you want. The main idea is that you upload a shapefile, then you go to the administration side and you ingest your data into PostGIS. After you do that, the dataset page will immediately and automatically get an endpoint for WMS and WFS through GeoServer or MapServer, and the data will be available on the map directly. We generate the endpoints by linking to the PostGIS tables. Then we have previews: if you go to a dataset and click the preview, you get the map, and you also get the WMS URL and how to link to it from another application using WMS.
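To make the idea of linking to the generated endpoints from another application concrete, here is a minimal sketch that builds a WMS 1.1.1 GetMap request and a WFS 1.1.0 GetFeature request. The endpoint URLs, the layer name `demo:roads` and the bounding box are assumptions for illustration; the real values would be the ones shown on the dataset page.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width=512, height=512):
    """WMS 1.1.1 GetMap; bbox is (minx, miny, maxx, maxy) in EPSG:4326."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty value selects the layer's default style
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return endpoint + "?" + urlencode(params)


def wfs_getfeature_url(endpoint, type_name, max_features=10):
    """WFS 1.1.0 GetFeature returning at most max_features features."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "maxFeatures": max_features,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoints and layer name; the bbox roughly covers Greece.
map_url = wms_getmap_url(
    "https://example.org/geoserver/wms", "demo:roads", (19.0, 34.5, 29.7, 42.0)
)
feature_url = wfs_getfeature_url("https://example.org/geoserver/wfs", "demo:roads")
print(map_url)
print(feature_url)
```

The first URL returns a rendered PNG of the layer, while the second returns the vector features themselves, which is the practical difference between the two service types.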
You also get support for WFS right out of the box, and we do the same for rasters. Obviously you're not going to upload a big TIFF from your browser, right? So instead you can add a link to it; if you add the link, the server will download the file and ingest it into rasdaman, and then it will provide WCS and WMS endpoints based on this data. Of course it will not try to guess what type of data that is, and it will also provide previews. I'm going really fast now; there is not enough time and there are many features. As for the metadata, once you have your data there, or when you upload a file and create a dataset from it, you directly have pycsw there to help you: you have your metadata on your page and you can just download it in whatever profile you want, thanks to pycsw. We also took some time to provide something for developers. We did our own take on providing a mapping API, and we did it for both Leaflet and OpenLayers, so the code with our API is almost identical: only one line changes and you get the same result on either Leaflet or OpenLayers. We also provide a data API: since we now have all the data directly in PostGIS, we created a JSON API where you can query the data directly. And of course, because it's a national portal, we created a new client, a new map application built directly with OpenLayers, with features like this: when you see a layer here, you can select part of the layer and download it through the data API, so somebody who doesn't know how to use GIS can just click and download the data directly through the web. Also, I don't have enough time, sorry, it's too
many slides, too many features. We added multilingual support: we added a UI where somebody can translate data and metadata directly through CKAN. So if you have an English dataset and you want to translate it into Greek or French or anything else, there's a UI where you can do the translation and store the different versions, and then you will get your data in a different language on the dataset page. That's about it. There's no time for the demo; if you want to see it, I think we can check that outside. So, thank you. Thank you for this, and
do we have any questions? Thank you for presenting this. Have you considered GeoNode as well, and how do you see the differences? Yes, this is a good question. I'm actually a GeoNode developer myself, but sometimes you have requirements that are given to you by somebody else, and sometimes you are working with different communities. The open data community has, let's say, a reference implementation: everybody is using CKAN, so most open data portals will use CKAN. GeoNode is becoming a very strong player, but the difference is that GeoNode is even more focused on geospatial data, while CKAN is more generic: for example, you can upload any kind of data to CKAN and it will do its work, and the extensions will work. So it's a platform for open data. GeoNode is also a very good platform, but it is more focused on providing geospatial features, and CKAN was missing those features that were needed, so we had to add them to CKAN. It's a matter of preferences; it's open source. There is also a connection between them, in that both use pycsw, so one can harvest the other. GeoNode is also a vibrant community and it is also moving. This was only a two-year project, but we went as far as we could. As for deployment, the caveat is that, although all the software is available if you go to the PublicaMundi organization, we don't use Docker
yet; we're not porting it to Docker, but we have all the Ansible scripts. If you get the Ansible scripts you can run them on one VM, and that will create a very big and very heavy VM, but it will work. We also provide the actual production setup, so if you have 10 or 15 VMs available you can set up the whole thing by yourself. We spent a lot of time on Ansible to be able to reproduce the system, because failure happens, and we used to do a lot of work to set up a new server; Ansible was a lifesaver for us. So it will be available soon also. One last question: you mentioned that it will be available soon, but the PublicaMundi project has finished; is there an organization or a community behind it? The Athena Research Center maintains the catalogue, so it's a national catalogue. The project has ended but maintenance is still going on. We're not implementing new features right now; maybe if we get some funding, or if somebody is interested, we can also do some new projects. But the main catalogue and everything that is needed for it to work will continue to be supported for many, many years, because the law is behind it, even though Greece is broke at the moment. So yeah, it's still there, and contributions are always welcome. The development team is also doing other stuff, but it's open source, it's there, and anybody can pick it up and continue, or ask us to help. Well, I want to close the session; let's thank the speaker again.
FOSS4G Bonn 2016
Tzotsos, Angelos (OSGeo)
CC Attribution 3.0 Germany
You are free to use and adapt the work or its content for any legal purpose, and to copy, distribute and make it publicly available in unchanged or adapted form, provided that you credit the author/rights holder in the manner specified by them.
DOI 10.5446/20342
FOSS4G, Open Source Geospatial Foundation (OSGeo)
2016
English

Computer Science
PublicaMundi is a successfully completed EU FP7-ICT project aiming to make open geospatial data easier to discover, reuse, and share by fully supporting their complete publishing lifecycle in open data catalogues. PublicaMundi extends and integrates leading open source software for open data publishing and geospatial data management. In particular, PublicaMundi extends CKAN, the leading open data catalogue, into treating geospatial data as "first-class citizens" and providing automatic OGC- and INSPIRE-compliant access to geospatial data, through integration with pycsw, rasdaman, ZOO-Project, GeoServer, MapServer, PostGIS and GDAL. PublicaMundi was recently deployed to serve as the main open geospatial data catalogue of the Greek government. The production system provides multilingual data access to data publishers, open data users, and developers through the main catalogue, an integrated mapping application, and various APIs (CKAN, data, mapping and OGC APIs). This presentation will provide an overview of the production system, the cloud infrastructure used and future developments.

