Birdhouse: A collection of web processing services for climate data


Title Birdhouse: A collection of web processing services for climate data
Title of Series FOSS4G Bonn 2016
Part Number 168
Number of Parts 193
Author Hempelmann, Nils
License CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI 10.5446/20280
Publisher FOSS4G, Open Source Geospatial Foundation (OSGeo)
Release Date 2016
Language English

Subject Area Computer Science
Abstract Processing of climate data is often connected with big data processing, but a frequent problem is that users of the processing outcome are equipped neither with appropriate hardware (computing and storage facilities) nor with the programming experience to perform the processes themselves. Web Processing Services (WPS) can close this gap and offer users a valuable practical tool to process and analyze big data. WPS is an interface for running processes over HTTP, enabling users to trigger specific processes from a website. The processes are predefined, together with access to the relevant data archives where the data are provided. This presentation is an introduction to the birdhouse project, which provides WPS for climate data processing. Besides calling the WPS with Python libraries, birdhouse provides easy-to-use user interfaces (web-based and command-line) to run WPS processes and combine them with climate data. The provided processes range from simple climate metadata checks to complex climate impact models used, e.g., in agriculture or forestry. Birdhouse conforms to the standards defined by the Open Geospatial Consortium (OGC), allowing combination with WPS from other institutions to establish a network of computing providers.
The next talk is about Birdhouse.
Thank you, and thanks for the opportunity to talk about Birdhouse. After the first two talks, which covered the technical solution, this one is about real-case applications.
Birdhouse is hosted at the German Climate Computing Center and was basically built up by the main developer there; I am just the speaker. I am more related to the climate community and am implementing the scientific applications.
This is the basic problem we have in the climate community: there is a very large data volume currently available, and it will keep growing quickly in the future, because model resolutions are getting higher and higher. The current practice is that researchers download the data and process them at home. This will not be feasible in the future, and it is quite difficult already now. That is the motivation to establish a processing service: we need a technical solution that makes it possible to process the data close to, or even inside, the data archive without moving them, so that the researchers just receive the result.
The talks before covered the technical content, so this is basically the same picture again: GetCapabilities, DescribeProcess, and Execute. With a Web Processing Service we divide things into a client side and a server side. On the server side, all the processing is done very close to the data. Somewhere on the Earth there is an expert who executes the processing job, and all the expert needs is internet access. That is basically all.
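The three operations named above can be sketched as plain HTTP key-value-pair URLs. This is a minimal illustration, assuming a WPS 1.0.0 server; the endpoint, the process identifier `subset`, and the input `region=DE` are hypothetical placeholders, not names taken from the talk.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace it with the URL of a real WPS server.
WPS_URL = "https://example.org/wps"


def wps_url(request, **extra):
    """Build a key-value-pair URL for one WPS operation."""
    params = {"service": "WPS", "request": request}
    # DescribeProcess and Execute carry an explicit protocol version.
    if request != "GetCapabilities":
        params["version"] = "1.0.0"
    params.update(extra)
    return WPS_URL + "?" + urlencode(params)


# 1. Ask the server which processes it offers.
capabilities_url = wps_url("GetCapabilities")
# 2. Ask for the inputs and outputs of one process (identifier is hypothetical).
describe_url = wps_url("DescribeProcess", identifier="subset")
# 3. Run the process with one input (input name and value are hypothetical).
execute_url = wps_url("Execute", identifier="subset", DataInputs="region=DE")

for url in (capabilities_url, describe_url, execute_url):
    print(url)
```

The client only needs to be able to open these URLs, which is why internet access is the only requirement on the user's side.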
This is the Birdhouse ecosystem on the server side; the slide is busy on purpose. Here you can pretty quickly understand why we call it Birdhouse: every technical part in there is named after a bird, to get rid of all these very fancy acronyms. There is much more inside already, so I simplified it a little bit.

We have three main WPS inside. Flyingpigeon provides processes for the impact community and for extreme events. Then we have one that is very interesting for the data centers, Hummingbird, which provides data quality checks for the datasets in terms of technical quality: whether all time steps are there, that there is no negative precipitation, whether the metadata are set in the right way, and so on. Then there is Malleefowl, which does a lot of background processing: searching, using internet access to find the right data, and downloading them if they are not already available locally.

What we were also talking about before is the OGC standard. All these birds keep to that standard, so you can link them together, and as a computer administrator you can build up your very individual compute provider, your very individual birdhouse, or a network if you like. As we heard in the talk before, a lot of processes and services already exist, so all we need is the link to the compute provider; then you can register it and you don't have to do the work again. Once a service is there, it is usable by the whole community.

Then we have a WPS graphical user interface, Phoenix; I will give a little demonstration of it later on. There is even a web mapping service, so you can combine the processing with the mapping service, the coverage service, or whatever.

That is why we call it an ecosystem: it is very lively, very individual, moving and changing. Outside of it we have a climate data archive, a very big and very important one from the climate community. That is just one, and I will have a slide on it later, but you can also link in all kinds of archives that are accessible over the web, not only those providing climate data but whatever you need for your process. After your processing and your analyses you have an output, a result, and the result can be published anywhere or made available over the web.
On the slide before there was the server side; on the other side you have the client side. You have three possibilities to execute or submit a process: a graphical user interface; a scripting language, where you have already seen the syntax in the previous talks and can choose the language of your choice, for example Python; or a terminal command. So, depending on your preferences, you can execute the processes as you like, and basically all three possibilities do exactly the same.
When we are talking about the web, we very quickly have to talk about security as well. Here is Twitcher, which is the security utility of Birdhouse. The workflow of Twitcher is that you go to a graphical user interface with authentication, that is, with a username and a password. Several other accounts can be used: if you have a GitHub account you can log in with it, and there are some other possibilities as well. Twitcher inside Birdhouse then automatically generates a token, which you can use later on if you would like to submit a job and process your data from a terminal or in a scripting language.
This is how the token security looks. You can generate a new token at any time, which is another layer of security, and the token expires as well. So even if someone takes a picture of this slide, the token is not valid anymore.
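The idea of a regenerable, expiring token can be sketched as follows. This is not Twitcher's implementation, only a minimal illustration of the concept; the one-hour lifetime, the in-memory store, and the function names are assumptions made for the example.

```python
import secrets
import time

# Assumed lifetime; the talk does not say how long a token stays valid.
TOKEN_LIFETIME_SECONDS = 3600

# In-memory store for the sketch: token -> expiry time (seconds since epoch).
_expiry_by_token = {}


def generate_token(now=None):
    """Create a fresh random token and remember when it expires."""
    now = time.time() if now is None else now
    token = secrets.token_hex(16)  # 32 hexadecimal characters
    _expiry_by_token[token] = now + TOKEN_LIFETIME_SECONDS
    return token


def is_valid(token, now=None):
    """A token is valid only if it is known and has not expired yet."""
    now = time.time() if now is None else now
    expiry = _expiry_by_token.get(token)
    return expiry is not None and now < expiry


token = generate_token()
print(is_valid(token))                                         # freshly issued
print(is_valid(token, now=time.time() + 2 * TOKEN_LIFETIME_SECONDS))  # expired
```

Because validity depends on the expiry time and not only on the token value, a token copied from a photographed slide stops working once its lifetime has passed.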
As we have seen already, what you need on the client side is a little library to connect and an internet connection, and that is basically all. Then you can define your processes: as an expert who knows what you are doing, you give a couple of arguments to modify the process, and in the end you have your output, which can come in various file formats. In this case it is a text file, a NetCDF file, and a graphic. Then you can decide what to do with the result: download it, send it to a colleague, or delete it.
It is exactly the same from the command line. All you need is Birdy, which is a client to call the WPS from the terminal, and it is easy to install. With the usual birdy -h you get the whole syntax, and you can call the services with it.
This is how you execute your process: you connect to your server and provide the token. In this case I am executing the analogs detection process with an argument, the experiment. Then I have to wait a little bit, and I receive my result.
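From a scripting language, the same step (connect to the server, provide the token, execute a process) might look like the sketch below. It only prepares the request and does not send it. The endpoint, the token value, the process identifier `analogs_detection`, the input `experiment=NCEP`, and the use of an `Access-Token` HTTP header are all assumptions made for illustration; check the documentation of the deployed service for the names it actually expects.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration only.
WPS_URL = "https://example.org/ows/proxy/wps"
TOKEN = "0123456789abcdef"


def build_execute_request(identifier, inputs, token):
    """Prepare (but do not send) a WPS 1.0.0 Execute request carrying a token."""
    # WPS key-value encoding joins inputs as "name=value;name=value".
    data_inputs = ";".join(f"{key}={value}" for key, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": identifier,
        "DataInputs": data_inputs,
    })
    # Assumption: the security proxy reads the token from this header.
    return Request(f"{WPS_URL}?{query}", headers={"Access-Token": token})


request = build_execute_request("analogs_detection", {"experiment": "NCEP"}, TOKEN)
print(request.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would submit the job to a real server.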
On the server side, if you would like to become an administrator, you quickly run into the question of dependencies and which software you have to install. Birdhouse comes with an installation mechanism: you clone the appropriate repository, run make install, and start the service. You are completely independent of your local architecture, and make install fetches all the dependencies, everything that is required, in the right version, so that you don't get conflicts between packages, which is sometimes a problem. If you would like to know more, there is even documentation for the installation, which is good news.
One point I already mentioned is the climate data archive, which is a very important data provider for the climate community. It is built up from the Earth System Grid Federation (ESGF). All over the globe there are several data centers: here is the Australian one, and it goes on to the States, Asia, and Europe. All these data centers are connected to each other, so once you enter one of the nodes, in Europe for example, you have access to all the data stored globally. Within the nodes there are several components, and one of them is an index service, so that the available data are harvested and can be searched from the outside.
This is a feature of Birdhouse: you have a search interface where you can very quickly search for your data. You have several facets you can select: you can search for temperature with a resolution of one day in the European domain, and so on. So you can search through your data and then start to process.
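The faceted search described here (variable, time resolution, domain) can be sketched as a filter over index records. This is a toy illustration, not the search interface of Birdhouse or of the data archive; the record fields and values such as `tas`, `day`, and `EUR-11` are assumed examples.

```python
# Hypothetical index records; a real index holds many more facets.
RECORDS = [
    {"variable": "tas", "frequency": "day", "domain": "EUR-11"},
    {"variable": "tas", "frequency": "mon", "domain": "EUR-11"},
    {"variable": "pr", "frequency": "day", "domain": "AFR-44"},
]


def search(records, **facets):
    """Return the records whose fields match every requested facet value."""
    return [
        record
        for record in records
        if all(record.get(name) == value for name, value in facets.items())
    ]


# Daily temperature in the European domain.
hits = search(RECORDS, variable="tas", frequency="day", domain="EUR-11")
print(hits)
```

Each additional facet narrows the result set, which mirrors how a user clicks through the search interface before starting a process on the selected data.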
But Birdhouse is not only connected to the ESGF data; you can basically use all data sources that are available over the web. Then there is Birdfeeder: for example, if you have a THREDDS catalog and a Solr server, you can use Birdfeeder, which is another utility provided by Birdhouse, to create your own index within your Birdhouse, and then this is searchable so that you can find data. Or, if you have local data that you produced recently, you can build up a searchable index for them as well.
I would like to give you some real-case examples now. This is Hummingbird, for the quality checks. There is a lot of text on this slide, but if you look at the headers, each one is a quality check: one from the modeling services, another for the dataset, and so on. So it is about data quality checks in terms of technical things: they check that there is no negative precipitation in the data, whether the values are in the right range, and whether the metadata are set in the right way. This is one of the main motivations behind the German Climate Computing Center building these things up.
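Two of the technical checks mentioned here, no negative precipitation and no missing time steps, can be sketched in a few lines. This is a minimal illustration on plain Python lists, not the checker that Hummingbird wraps; the function names and the sample values are invented for the example.

```python
from datetime import date, timedelta


def negative_indices(values):
    """Positions of physically impossible (negative) precipitation values."""
    return [index for index, value in enumerate(values) if value < 0]


def missing_days(dates):
    """Days absent from a daily series between its first and last date."""
    if not dates:
        return []
    present = set(dates)
    day, last = min(present), max(present)
    missing = []
    while day <= last:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Invented sample: one negative value and one missing day (3 January).
precipitation = [0.0, 2.5, -0.1, 4.0]
days = [date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 4), date(2016, 1, 5)]

print(negative_indices(precipitation))
print(missing_days(days))
```

A dataset passes these two checks when both functions return an empty list.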
There is another WPS, called Flyingpigeon, which provides the processes used in the climate impact community. This is a picture of Paris a couple of months ago, when there was a flood in Paris. In the climate community there is a development right now towards climate services; services are the big thing. When one of these events happens, a lot of people get very nervous and try to understand how it could happen: is it natural, or is it due to climate change? How can we explain it? For that we need facilities and infrastructures that are operational and quick, where you can quickly select your data, run an analysis, and have a quick and respectable answer for the public.
This is one of the processes. You can select data, for example from reanalyses, and calculate weather regimes over a given bounding box for a specific time, a specific season, winter maybe, and then calculate the percentages of the different weather regimes, the different weather situations, to answer your scientific questions. These are two processes: first you process your weather regimes, and then, when you have your statistical training, you can go to another model or another dataset to check whether that dataset represents the weather regimes in the right way. This is useful for model evaluation.
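The last step described here, the percentage of each weather regime in a season, reduces to counting labels once every day has been assigned to a regime. The sketch below assumes that classification has already happened; the regime names and the sample labels are illustrative, and the classification itself (the statistically demanding part of the process) is not shown.

```python
from collections import Counter


def regime_percentages(labels):
    """Share of days, in percent, assigned to each weather regime."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {regime: 100.0 * count / total for regime, count in counts.items()}


# Invented daily labels for a short period; real input would cover a season.
daily_regimes = ["NAO+", "NAO+", "blocking", "NAO-"]
shares = regime_percentages(daily_regimes)
print(shares)
```

Comparing such percentages between a reanalysis and a model dataset is one simple way to see whether the model reproduces the regimes with similar frequencies.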
Here is what the output looks like. You can see the different kinds of output formats: you have a graphic, and you have an R workspace, which comes in handy if you don't like the graphic, because then you can create your own. Scientists are sometimes really picky and would like to have it in a very special way.
This is another one, on atmospheric circulation. You see, the processes implemented in Flyingpigeon are quite complex already. You have sea surface pressure stored in an archive, and the process does the data fetching, so you don't have to care where the data are physically stored right now; maybe that has changed, or whatever. Then it calculates the analogs, whatever that is scientifically, and the output is a big text file with a lot of numbers, which is quite uninteresting to look at. So an engineer developed a JavaScript-based D3 application with which you can visualize this text file in an interactive way. On this slide there is a movable slider, and all the graphics adapt according to the selected area.
This is another aspect that I would like to introduce to you: Birdhouse is not only connected to climate data but also to non-climate data, depending on the process you are providing. This is a map from the Global Biodiversity Information Facility (GBIF), which stores species distribution information.
There is an appropriate process integrated in Flyingpigeon with which we can calculate the future distribution of specific trees. This is how it works. You have the climate data, and the process is connected to the GBIF database. What you have to provide is the taxonomic name of the tree you are interested in, and then you have to say which climate indices are responsible for its distribution: the temperature in summer, the precipitation in winter, or whatever. The species distribution process then basically does the rest: fetching the data, training the statistics, and giving out a probability over time. Then you can draw your own conclusions.
This is another feature integrated in the system: we have a mapping service. If you have an output, you can display it directly. This is an output of the subset process, where I selected Germany and France, and this is the result, which you are able to display online.
Some information, if you are interested: everything is on GitHub, and there is nice documentation. We also have two mailing lists, one for general information and one for the developers, which are not used so much for now. But we have a Gitter chat with very high traffic, where, if you have a question, you can get in contact with the developers in real time. There is also a demo of the graphical user interface, which you can try out if you are interested. I wanted to give a demonstration, but I think I will stop here.
I would like to thank you for your attention, and if you have questions, please come up.
Does anyone have any questions?
Hi, I have a question directly related to the build, regarding the conda package manager: why don't you supply pre-compiled packages via conda, or have you already developed that?
Well, this is a question for the architecture developer. This is
not a straight question about having been told what of his mother wanted colonies to us what was the condition is met the article is to do what they do there have been many current OK I can tell that they're going to pop something really big or open source this week which is it that in understanding this in lab if any of you know that there are a lot the unit you have to be a read last number not get but that there really going to partners that the local things but for your if you want to say something what is opened a time published and no 1 will say a lot of don't have from my those on they can you can say something about the way that you publish and what you have I'm not looking the time on written in your skills in life but there have you go over to the last and we are cooperating with them so that they get there things published but the problem is that there 2 right of the document a need be used to learn it will be a few hundred pages 1 of the many more questions that concerning
