
Analysis of Big Earth Data with Jupyter Notebooks


Formal Metadata

Title
Analysis of Big Earth Data with Jupyter Notebooks
Series Title
Number of Parts
27
Author
License
CC Attribution 3.0 Germany:
You may use, modify, reproduce, distribute and make the work or its content publicly available in unchanged or modified form for any legal purpose, provided that you credit the author/rights holder in the manner specified by them.
Identifiers
Publisher
Year of Publication
Language
Producer
Production Year: 2020
Production Location: Wicc, Wageningen International Congress Centre B.V.

Content Metadata

Subject Area
Genre
Abstract
Growing volumes of Big Earth Data force us to change the way we access and process large volumes of geospatial data. New (cloud-based) data systems are being developed, each offering different functionalities for users. This lecture is split into two parts:
(i) (Cloud-based) data access systems: This part highlights five data access systems that allow you to access, download or process large volumes of Copernicus data related to climate and atmosphere. For each data system, an example is given of how data can be retrieved. Data access systems that will be covered:
- Copernicus Climate Data Store (CDS) / Copernicus Atmosphere Data Store (ADS)
- WEkEO - Copernicus Data and Information Access Service (DIAS)
- Open Data Registry on Amazon Web Services
- Google Earth Engine
(ii) Case study: Analysis of Covid-19 with Sentinel-5P data: This part showcases a case study analysing daily Sentinel-5P data from 2019 and 2020 with Jupyter notebooks and the Python library xarray, in order to analyse a potential Covid-19 impact observed in 2020.
Transcript: English (automatically generated)
Okay, so once you're logged in, you can get to different platforms, but we actually want to go to another one. The JupyterHub today is sponsored by the ADAM platform, and I just wanted to say thank you again, because specifically for teaching, JupyterHub is very convenient.
So now I ask you to go to https://openjupyterhub.adamplatform.eu, and you should come to an interface that looks like mine.
Any problems? Okay, good. Then you can sign in with the single sign-on, and you should now be in the JupyterHub.
So you should have the same interface. Okay, good. Basically, my two sessions are here in these folders: the first one is Analysis of Big Earth Data and the second one is Dashboarding with Jupyter and Voilà. For today's session we enter the first folder, Analysis of Big Earth Data, and then you see a list of Jupyter notebooks. I got the impression from the small survey I did in the morning that it would be helpful to give a short overview of the Jupyter and JupyterHub interface, just to understand how you can actually work with it. So I will start with the notebook Intro to Python and Jupyter, but before we start, what I also want to say is that the sequence of the course follows the order of the notebooks. So if you want to start, you should open the index notebook; it gives you a bit more detail on the course: again, the access to the JupyterHub, where I think everyone is now, what this lecture is about, which I think I already covered this morning, and the lecture outline. Yes?
Yes, so please go to https://openjupyterhub.adamplatform.eu. Is this big enough, or shall I make it a bit bigger? Okay, better? Even bigger? Okay. So you have to go here, and then you should get an SSO (single sign-on) button, and once you are registered and logged in to the platform, you should also be able to access the JupyterHub. Okay, good. Before we start: I really see this as an interactive session, so don't apologize if you have questions; this is really an open session. We are a small group, and if there is anything you don't understand or want to know, then please shout; we have enough time. Good. So if everyone is there, then we can go to the index notebook, 00_index.ipynb.
We open the notebook, go to the lecture outline, and then I will start with a short overview of Project Jupyter, why Jupyter notebooks are great and why I think you should try them out as well. Good. The official documentation says that Project Jupyter exists to develop open-source software, open standards and services for interactive computing across dozens of programming languages. It started mainly with Python, but now it supports more than 40 different programming languages. That is why I said it is set up today with Python, but it is also easy to set it up with an R kernel, and you can even combine different programming languages in one notebook. That's really great. Just to understand: who has actually worked with Python before? Okay, about half. Okay, that's not too bad. So if we talk about Jupyter, it started with Jupyter notebooks, but now it is much more; Jupyter is not only notebooks.
There is also JupyterHub, which we use today. JupyterHub is basically a server, and it is great for teaching and learning, because once you are logged in to the JupyterHub, this is your environment: you can make changes, you can add cells, you can change the code, and you don't change the overall setup of the course. It is your learning environment, and that's quite good. Also, for teaching, you can set up the environment on the JupyterHub in advance, with all the libraries you need, so you are ready to teach what you want to teach rather than spending a long time installing packages and libraries. And there is JupyterLab, which is becoming more and more a software development environment. That is an installation on your local machine, similar to RStudio or Spyder, where you have a real development environment for Jupyter notebooks, which makes it very convenient to develop and work with them. Yes, exactly, so you program in Jupyter notebooks, but at the same time, while you program, you can also easily share your work as notebooks. That makes it much easier to combine everything, rather than having different programs for the different things you want to do.
Okay, good. This is what I already said: Jupyter notebooks support over 40 programming languages and can easily be shared on GitHub. If you search GitHub for notebooks, I think you get millions of hits at the moment; it seems there is a new fashion, because it is so easy to use Jupyter notebooks. But my credo is that, precisely because it is so easy, we are also going towards the death of the Jupyter notebook, because everyone just dumps code into a notebook and thinks, okay, my code is reproducible because I have a Jupyter notebook — but that's not the case. In order to be reproducible you also have to invest a bit of effort, not only start a Jupyter notebook and share it. There is an entire ecosystem: you can share notebooks on GitHub, and there is nbviewer, if you don't know it. You can basically paste a GitHub or GitLab link there and then you see your notebooks nicely rendered. We can actually try the example here: I just copy-pasted the GitHub link for the index notebook of today's session, and this is how you can share a nicely rendered Jupyter notebook with nbviewer. But this is static, not interactive: the links work, but you can't execute code in nbviewer. Still, it's nice for sharing your research with colleagues.
Good. Installing Jupyter is very easy: you can use Anaconda, or you can also install it with pip. So here it is conda install jupyter notebook, or you can use pip install to install Jupyter, and once you have installed it, you can run it from the terminal: you go to a folder, you run jupyter notebook, and then your Jupyter server starts. We can actually also showcase this — it is probably a bit too small; I should have prepared this before, I'm sorry, I just have to open my terminal. Okay, here. So I just opened my Python environment, and before I run Jupyter notebook I go to a more meaningful folder. So now I'm in the folder with today's content, and if I now run jupyter notebook, basically a local server starts and I have the same Jupyter notebooks on my local machine. I close this again and go back to the JupyterHub. Good. So who has worked with Jupyter notebooks before? Okay, a bit less than those who use Python. Okay, good.
So what you already see here: you are already using a Jupyter notebook. Just for a bit of a better understanding: at the top you have a menu and a toolbar, which give you buttons to execute things, so you can run cells, you can add cells, you can cut cells, you can save your notebook. I think most important is that in the notebook you have different types of cells. Anything which is documentation — text or an image — is a Markdown cell. Here, for example, this cell is Markdown; if we open it, you see that you can combine Markdown and JavaScript in order to document your Jupyter notebooks. By default, if I open a new cell, I get a code cell, and you see here in the interface which cell type you have. So this is code; if I want to change it to Markdown, I can just select it here, and then you can enter your Markdown. Another thing I think is important to know is that there is a difference between an active and a passive cell, I would say: if it is highlighted in green, the cell is active, which means you can actually start coding. I can also make it passive, then the colour is blue, and if I try to type, I can't enter anything. So I really have to make the cell active, and then I can also execute it.
As with everything, there are some useful keyboard shortcuts. For example, Escape switches to command mode, so it switches between an active and an inactive cell: if you are in an active cell and press Escape, the cell becomes inactive. B inserts a cell below, A inserts a cell above, and there are some other useful shortcuts which you can have a look at when you start using Jupyter notebooks. I think another nice thing with Jupyter notebooks is the cell magics, which make it very easy to do different operations. If you enter %lsmagic, you get a list of all the magic commands that are available in Jupyter notebooks, and there are a lot. I just listed some here which might be useful while you work on and develop your Jupyter notebooks; for example, to see and also to set environment variables you can use %env.
Within Jupyter notebooks you don't have to go to the terminal: you can also install and list libraries. You can also easily write a file, with %%writefile followed by the name of the file you want. In this example we say we want to write the file hello_world.py, and in the file we want to add print('Hello World'). Then it is in the file: if we go here, we see that a file hello_world.py was created, and we can also load the Python file back with %load and see what is in there. You can also use a counter for how long your code runs, and there is a magic to get your visualizations directly in the Jupyter notebook, although I think that is already the standard now.
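A few of these magics in a minimal sketch (each fenced block is meant as its own notebook cell; the %%-style cell magics must be the first line of their cell):

```python
# List all available line and cell magics
%lsmagic

# Show environment variables (they can also be set, e.g. %env MY_VAR=value)
%env
```

```python
%%writefile hello_world.py
print('Hello World')
```

```python
# Load the file we just wrote back into the cell
%load hello_world.py
```

```python
%%time
# Report how long this cell takes to run
total = sum(range(1_000_000))
```

```python
# Render plots directly inside the notebook (largely the default nowadays)
%matplotlib inline
```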
As I already said, with Jupyter notebooks it is all about reproducible research and sharing your research: you can share them with nbviewer, you can share them on GitHub, and there is also MyBinder. Binder is basically similar to JupyterHub — or I think it is similar to Docker: you can set up a Binder which already has the whole environment needed to run and execute the notebooks, you share the link, and people can start the Binder environment and run the notebooks. There is some more information here if you are interested in Jupyter in general, JupyterHub and JupyterLab; you can have a look later. Are there any questions so far? Yes?
Yes, there are two ways. There is a package, it's called pi2r, and you can basically do two things. You can either just run your R environment: if we look here — I'll just take this down — you see this is the kernel, and it shows you that this one is based on Python 3, but you can also set up your Jupyter notebook on an R kernel, and then your notebook is basically an R environment, it just understands R. Or, if we want to do what I think is called polyglot programming, where we combine different programming languages within one workflow, then with that package you can even specify for each cell which programming language the cell should understand. Exactly. I can't promise, but I was thinking that maybe for one of the evenings I could set up one small example of how this polyglot programming between two languages could look. So I will make an announcement if I manage to set up a small example.
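As a minimal sketch of this per-cell language mixing, here is one way it can look with rpy2, a widely used Python–R bridge (the package named in the session may be a different one, and R plus rpy2 need to be installed in the environment). Each fenced block is meant as its own notebook cell, and the %%R cell magic must be the first line of its cell:

```python
# Cell 1: load the R magic provided by rpy2
%load_ext rpy2.ipython
```

```python
# Cell 2: create an object in Python
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [2.1, 3.9, 6.2, 8.0]})
```

```python
%%R -i df
# Cell 3: everything in this cell is R; -i df imports the pandas DataFrame into R
summary(lm(y ~ x, data = df))
```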
Yes? — No, you can do both. If you have conda, then I would recommend conda install jupyter notebook. On the JupyterHub you don't have to install anything, because the environment is already there; but if you want it on your local machine, then yes, you can use conda. Okay, good. So then I would start with the real lecture. Are there any questions so far on Jupyter notebooks? Okay, good. Then we go back to our index file — but no, I'll do that later. Okay, good. So now we can even stay here, and we go to the next Jupyter notebook.
As I said this morning, I will go over different data access systems: what each data access system is about, what data you can get there, and a short example of how you can retrieve data. Depending on the data access system, that is either downloading data or interactively loading data from the online platform and visualizing it. It shall just give you an overview of different, mainly Copernicus, data. Who is familiar with climate data and atmospheric composition data from Copernicus? Okay, not so many. Okay, good, so I hope it will be helpful. Who has heard about the Copernicus Climate Data Store? Okay, one. Good. So basically, the Copernicus programme has six different services, and two of these services are run by the European Centre for Medium-Range Weather Forecasts: the services on climate and on atmosphere monitoring. The climate service provides climate reanalysis data and also seasonal forecast data, as I said this morning, and you can access the data on the Copernicus Climate Data Store. It is a web interface where you can either browse and then download the data, or — because with climate data you are mostly interested in much more than one time step — you can also use an API to download the data. And, as I also said this morning, to make it easier to develop applications based on this climate data, there is also the Climate Data Store Toolbox; it has a Python interface and is based on xarray.
It helps you to develop applications. So how does the web interface look? We go to cds.climate.copernicus.eu and — is it big enough or shall I make it bigger? — you see here the menu, and I'm already in the Datasets section. Even if you are not logged in, you can search the datasets, and these are all the datasets you can get. For example, the most popular, most prominent dataset is the ERA5 climate reanalysis. It goes back to 1979 and will soon even go back to 1950, and you have over 130 variables as hourly data, on a roughly 25-kilometre spatial resolution, available until almost near real time. If you look for ERA5, you see here different subsections of the data: the data is provided either hourly or as monthly averages. Specifically for climate applications, rather than downloading a long time series of hourly data, it is sometimes quite useful to directly download the monthly aggregated data. But there is much more data available.
For example, there are fire danger indices available from the Copernicus Emergency Management Service, and there is river discharge information if you have more hydrological applications, so you can really have a look there. A very prominent, popular dataset is also the seasonal forecast data; that is either daily data up to six months ahead or monthly statistics on seasonal forecasts. Just to understand: all this data is part of Copernicus and it is openly available, but it is not like real forecast data. Yes, the seasonal forecast is global; all the data is global. ERA5 is global at 25 kilometres, and the seasonal forecast data has 10 kilometres, if I'm not wrong. So ERA5 climate reanalysis is global, 25 kilometres, for ocean and land. Then there is a subset of ERA5: you see here ERA5-Land, and this subset has, I think, around 50 variables; it is only for land, but at a better spatial resolution of 9 kilometres.
Okay, so this is the interface, and you can browse through the datasets. Let's say we are interested in the ERA5-Land monthly averaged data: you get some information and then information on how you can download the data — this is the web interface, as I just said — and also general documentation. If we go to Download data, you can select the data you are interested in, for example monthly averaged reanalysis and 2 metre temperature; you can also select multiple variables and retrieve them all at once. These are all the variables; we can select a year, a month and a time, and then here you can either say you want the whole region, or you can already subset it geographically, if you say you just want it for Europe, for example. The data is then available either in GRIB, which is a format mainly used to share data between meteorological organizations — it is very standardized and has a long history for sharing data, which is good, but if you are not coming from a meteorological background the GRIB format is not so easy to understand. That is why the data is also available in NetCDF. Who is familiar with NetCDF data? Okay, also quite a few. Then there are the terms of use: it is open data, you just have to agree to the Copernicus licence. And then you see two buttons here: this one shows the API request, and this request can basically be copied and pasted into a Python environment, and then you can download the data you are interested in. I will show you in a bit how you can do it programmatically — or actually, I'll do it now. Basically, this is just a help for you: you can easily browse for the data you are interested in, copy-paste the request, and go to your Python environment.
Maybe I should first go over how it works and how you can set it up. Okay, we come back to that. Good. So this is the Climate Data Store, and then, as I said, there is the other service, the Atmosphere Monitoring Service, and they have also set up an Atmosphere Data Store specifically for the atmospheric data, for example on air quality. It has the same principle; it just offers data on air pollution, on greenhouse gases, on climate forcings, et cetera. And if you go to atmosphere — okay, this is a broken link — you go to ads.atmosphere.copernicus.eu. The thing is I have to make it bigger each time. So it has the same interface, and you can also browse through the datasets here. There is, for example, also a reanalysis dataset for air quality: CAMS is the abbreviation for the Copernicus Atmosphere Monitoring Service. There is reanalysis data, there is solar radiation information, and there are also air quality forecasts for Europe available. It has the same interface, so you can browse through the information you are interested in, like nitrogen dioxide, carbon monoxide, particulate matter, et cetera, and at the bottom you also have an API request with which you can then get the data. Okay, we come back to that as well. The only difference of the Atmosphere Data Store, which we will see later, is that there is no toolbox yet, so we don't have that interactive editor, similar to Google Earth Engine, where you can load the data directly in the editor and develop some applications. Good.
Yeah, as I said: go there and have a browse through the data. On the Climate Data Store there is climate reanalysis data, seasonal forecasts, fire indices and river discharge information; on the atmosphere side there is the global reanalysis, global and regional analysis and forecast data, the Global Fire Assimilation System (GFAS) data, which provides fire information, and greenhouse gas fluxes. In the future, all the data from the Atmosphere Monitoring Service will be available on this Atmosphere Data Store; for now there is a subset of data available, but it will evolve. So the question is: how can we now retrieve data, how can we interact and work with these data systems, specifically if you don't only want to download one layer or one variable? There is an API for Python, the CDS API. It is a Python library and you can install it, but before that you have to self-register on the web interface.
So if you go here — I'm logged in, but I think I can also log out, let's see — you can log in or register here and create an account. Once you are registered, you have to log in to the web portal, and then you can go to the "API how to" page. Okay, let's do it: I just sign in here, and once you have installed the library you can go to this "API how to" page; it explains it again. Okay, there seems to be a bug, but let's see if it works. Usually, when you are logged in, you should see here a specific key and token which has been generated for you — ah, here, now we see it. So once you are logged in, you go to this page and you can copy-paste this information. Then we can already use our magic command — it has to be %%writefile — because in the home directory we want to write a file called .cdsapirc, and we want to add this information to that file. Because I did it before, it is overwriting; if you do it, it should be new. It would actually be interesting to know whether you have to do it yourselves, or whether, since on the JupyterHub you are working on my image, you can already use my credentials — if someone wants to try it and let me know, that would be very helpful. Then we can install the CDS API — on the JupyterHub it is already installed — and import it.
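A minimal sketch of this configuration step (the UID and key below are placeholders; use the personal values shown on your own CDS "API how to" page; writing the file with plain Python is equivalent to the %%writefile magic used in the session):

```python
import os

# Placeholder credentials -- replace <UID> and <API-key> with the values from your CDS profile
cdsapirc = (
    "url: https://cds.climate.copernicus.eu/api/v2\n"
    "key: <UID>:<API-key>\n"
)

# Write the configuration file .cdsapirc into the home directory
with open(os.path.join(os.path.expanduser("~"), ".cdsapirc"), "w") as f:
    f.write(cdsapirc)
```

```python
# Install (if needed) and import the CDS API client
%pip install cdsapi
import cdsapi
```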
And here is basically an example. We can easily copy-paste the API request from one of the dataset pages — let's go here, for example. The request shown there is not very useful because I didn't select anything, but the API request which is shown can be copy-pasted into the cell here. This example retrieves one time step, the 1st of January 2019, of the 2 metre air temperature variable from the ERA5 reanalysis, in NetCDF format, because by default you would get a GRIB file; so you have to specify the format key if you want NetCDF. Also, natively the ERA5 data is on a 0 to 360 degree grid, so if you want it on a grid running from -180 to 180 degrees longitude, you also have to specify the area key. Then we can say yes, we want to download the data, and we pack it into a function. In Python — if this is too fast for anyone, please let me know, I can also explain the syntax in more detail — you define functions with def followed by the function name; here we don't give the function any variable or parameter, we just want it to execute our retrieve request. Then we can execute the cell: you can either click the Run button or use Shift+Enter, which might be more convenient.
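Put together, the retrieval described here could look roughly like this (a sketch assuming the ERA5 single-levels dataset and an illustrative output file name; the request dictionary is essentially what the CDS web interface generates for you):

```python
import cdsapi

def retrieve_era5_t2m():
    """Download one time step of ERA5 2 m temperature as NetCDF."""
    c = cdsapi.Client()   # reads URL and key from ~/.cdsapirc
    c.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'variable': '2m_temperature',
            'year': '2019',
            'month': '01',
            'day': '01',
            'time': '00:00',
            'format': 'netcdf',              # default would be GRIB
            'area': [90, -180, -90, 180],    # N, W, S, E -> longitudes from -180 to 180
        },
        'era5_t2m_20190101.nc')

retrieve_era5_t2m()
```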
Okay, so now this is incomplete; let's see, maybe we don't need this one. Let's try it again — it's always the danger of live sessions. Okay, so now it does work. It informs you a bit about what the retrieve is doing: they welcome you, they send the request, and because we downloaded a very small dataset, just for the purpose of showing it today, downloading the data was quite fast. If we now go back to our overview here, we see that there is a downloaded NetCDF file available. So now we have NetCDF files, and in Python there are different libraries you can use to open them.
One library that has become very popular in recent years is xarray, specifically for large volumes of modelled and organized data; xarray has, I think, become the standard, and it is also very easy to handle NetCDF data with it. There are other libraries too, like netCDF4, for example, with which it is also quite easy to load NetCDF data, and if you use GRIB there is even a driver, a Python interface, that supports GRIB together with xarray. I'll just show you here, as a small example, how to import xarray and how to open a NetCDF file with it. We import the library with import xarray as xr, and then xarray has the function open_dataset: you just provide the path to the NetCDF file and the data is loaded directly. This is the structure of an xarray Dataset: it gives you the dimensions of your data — xarray is also very powerful in handling multiple dimensions, not only up to three but up to five or even more. It also shows you which coordinates our data has: here we see the longitude, latitude and time information, and also that our longitude information has now been shifted to a -180 to 180 degree grid. We see the data variables: xarray Datasets, like NetCDF files, can actually hold multiple variables. In this example we have just the 2 metre air temperature, but we could also retrieve ten different variables, and then our xarray Dataset would have ten different variables.
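A small sketch of exactly this step (the file name follows the retrieval example above; 't2m' is the short name ERA5 usually uses for 2 m temperature in the NetCDF file, so treat it as an assumption and check your own dataset):

```python
import xarray as xr

# Open the NetCDF file we just downloaded
ds = xr.open_dataset('era5_t2m_20190101.nc')
print(ds)                      # dimensions, coordinates and data variables

# Access one data variable of the Dataset
t2m = ds['t2m']
print(t2m.dims, float(t2m.mean()))
```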
Just to go back to the retrieve command: in case we actually want to load more than one variable, we just specify a list here — in Python you specify a list with square brackets — and say we are not only interested in 2 metre temperature but also in total precipitation; then we can specify the different variables here. The same also works for years, for months, for days and for time steps. So if you want more time steps or more variables, you just specify a list with the information you want to download.
Now, because I promised it: there is also an R package available, because we really have to make an effort to bring together not only the Earth observation and climate communities, but also the Python and the R communities. So it is not only restricted to Python: for R users there is the package ecmwfr. It is now very mature — I follow it on GitHub and I have the impression it is commonly used by R users — so you also have a proper interface to retrieve the data within R, and it has the same structure: you also have to provide this retrieval request which we have here, it's just that you work in an R environment. Say it again?
You mean that on the website you don't get it? Oh, yeah — the suggestion on the website is for Python, but the retrieve request is here. This one is not very useful, so let's make a more meaningful example. Let's go to ERA5; we click some information just to have some more — yeah, I can actually also showcase this one. We enter two years, two months and a time, and if you now go to "Show API request": the example is for Python, but what you will also need for the R package is this retrieve command. So basically, instead of importing cdsapi you just load your ecmwfr package and you use the retrieve command that you put together. This also showcases the example that we now chose two different variables, and the API request already shows you the list of variables you want, and the same for the years and the months. Okay, so far so good?
Yes, it is possible. There is this package I mentioned, pi2r, which is like an interface between the two programming languages, and it is supposed to work so that for each cell you specify which programming language the cell should support: you can say, okay, this cell is supposed to be Python, and then you write your Python code, and the next cell is supposed to be R. As I was saying before, I will try to set up a small example, and maybe, if I am successful, we can go over it in one of the evening sessions. It is called polyglot programming: you are basically not in one environment anymore, but you bring two different programming languages together. The important thing is that you can also use an object you created in Python and then load it in R. So if I am successful in developing one meaningful example, I will make an announcement and we can look at it in one of the evening sessions in the next days. Are there any other questions? I see there is a question — shall I also check on Zoom?
Oh, they don't see the demonstration — is it okay for everyone online? I'm sorry if it was very abstract. So shall I wait, or what do you think — does it actually also work online? Okay, perfect, good. So yes, it is also on. All right. Okay. So if there are no questions, then I would go to the next data access system, which is WEkEO.
Who has heard of WEkEO before? No one? Okay, good, then I will try to shed a bit of light into the dark, because in the future, if you want to access Copernicus data, you will probably hear a lot about WEkEO. In the entire Copernicus sphere we now have a lot of very nice open data available, but at the same time it is so much data that we are forced to implement and set up new data access systems, because we just can't download it anymore. Is it big enough? Otherwise I can also make it a bit bigger, but this should be fine. So WEkEO is a DIAS; DIAS is the abbreviation for Data and Information Access Service, and it is basically a cloud computing environment, a cloud computing service, to access and process Copernicus data. It is still work in progress — the first stable release was at the beginning of this year — but it will mature over the next years, and it will become one of the services to use if you want to access and at the same time process Copernicus data. As I said, Copernicus has six different services, and WEkEO so far offers data from four of them: from the services land, marine, atmosphere and climate. And you can already access different types of Sentinel data: Sentinel-1, Sentinel-2, Sentinel-3 and Sentinel-5P. Let's just open the web page. The idea, of course, in the long term is that we don't download data anymore: we have a hosted processing machine in the cloud, we can directly access the data there, we can run our analyses and our experiments, and then we don't have to download large volumes of data. So you can have a look here: if we go, for example, to Data, an interface opens — this is actually already my example. Okay, good.
Basically, if you go here and you click on the plus sign, you get a catalogue of the types of data that are available — shall I make it bigger again? Maybe a bit, that's probably better. You can browse by service: let's look, for example, at climate data — oh, there is again our ERA5 data, so you can get ERA5 data, and you can also get seasonal forecast data — but not only for climate. Let's go to the marine service, for example: there is data from the marine service, like global ocean data, global surface chlorophyll, et cetera. Let's go back; we can also browse by Sentinels. If you are only interested in Sentinel-5P data, for example — it offers so far data from TROPOMI — we can get information on the data and also the temporal extent and the spatial extent. We can add it to the map so that we can already see what the data looks like; you get the different types of variables that are available, such as nitrogen dioxide. I already set up that example, but we can, for example, also get carbon monoxide and add it to the map — it is still loading. Okay, once it has finished loading it will show you a layer of Sentinel-5P carbon monoxide and also nitrogen dioxide. Let's go back to our information: it also gives a dataset ID here, and this ID we will need later on, because it is an important ID if you want to retrieve data. Good. Now, let's go back.
Are there questions so far? Okay, good. So we covered this one; now the question is how to retrieve data from WEkEO. Actually, I can also show you an example here — let me log in quickly and then I can show you how WEkEO looks if you want to use it. Okay, now I'm logged in: this is the data, so you can browse the data and see what is available. Let's go back. Okay, now I'm signed out again. Okay, good. So if I go to wekeo.eu and I sign in — or if you do it — then you will come to this interface, and there is a button "Go to my dashboard". If you go there, you get your profile, and this is the interface where you can manage your profile, but also your cloud processing tenants and virtual machines, and you can also access the data directly from a JupyterHub environment on WEkEO. So let's open the JupyterHub on WEkEO — it might take a while; maybe we let it load and I just continue. Of course, to try it out you don't have to be afraid that once you log in you already have to pay something: there is a free version which allows you to try things out, and then you can also go to a paid version if you think it will be useful for you. So let's just go to a small server option; okay, we let it load and we come back to it. So how to access data? As I said, you can download data from WEkEO via an interface, or you can directly load the data on WEkEO into a virtual machine, and then you can also use Jupyter notebooks in order to communicate with your virtual machine and to process data.
The way you access data is called the Harmonised Data Access, the HDA API. It is a single, REST-based protocol that allows you to subset and download datasets from WEkEO. There is a step-by-step tutorial here, a guide on how you can search for and download data with the HDA API, and I will come back to it. We will make use of some functions which are stored in a separate notebook, but I will show you how you can work with these functions. It requires six steps in total: first of all — which we kind of already did — we search for datasets on WEkEO, so that we know the ID of our dataset; the next step is then to get the dataset collection ID; then we have to get a WEkEO API key based on your registration credentials, username and password; then we can initialise the WEkEO HDA API; we can load a data descriptor file, or we can also just have a data request; and then we can download the data. Just to go back:
has it started now? Okay, so now I'm here on the JupyterHub on WEkEO. If you see here — I'll just make it bigger; it loaded, now it's a bit too big probably — this is similar to the JupyterHub environment we are in: you have this interface, you can browse — I'm actually in one folder here — and there are some notebooks already provided, so you can browse what is available. This will become much more mature over the next months, so there will also be examples provided if you want to get started, and some example notebooks. But basically, once you understand how Jupyter notebooks work, we can also just start our own notebook, based on Python 3, we can load a new one, and then we can get started. And the examples we showcase now on our JupyterHub you can also easily run on WEkEO: retrieve data, and once you have the data, you can work with it on WEkEO — you simply have a better processing environment there in case your local machine is not powerful enough. Good. So the first step in Python is always to load some libraries: you load a library with import and then the library name, so we just execute this one. And now we come to loading our helper functions.
If we go to our interface here, you see a notebook called hda_api_functions, and we can open it. It contains all the functions which are used in order to retrieve data with the help of the HDA API. Let's go to one function: basically, it is just a notebook full of functions. As I said before, you define a function with def; here, for example, is the function to generate our API key. The function takes the username and the password, and it returns a base64-encoded API key, which you need to authenticate to WEkEO. A good thing is that this also helps you to modularise with Jupyter notebooks, so that you don't have everything in one notebook, which might sometimes disturb your workflow: you can really outsource your functions and then call them in a different notebook. The way you do it is with the library ipynb: with this library we load the entire notebook hda_api_functions and we import all the functions from this notebook.
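A minimal sketch of this import (the ipynb package imports code from a notebook file in the same folder; ipynb.fs.full loads the full notebook, ipynb.fs.defs only its definitions):

```python
# Import everything defined in hda_api_functions.ipynb (same folder as this notebook)
from ipynb.fs.full.hda_api_functions import *

# If you don't know what one of the imported functions does,
# a trailing question mark opens its docstring
generate_api_key?
```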
So if we execute it and I open a new cell — okay, now it works — we can go through it. A good thing is also that, in case I don't know what a specific function is doing — let's say we haven't seen this notebook, and the workflow shows us that the function generate_api_key is used, but we don't actually know what the function does — we can use the question mark: we enter the function name generate_api_key with a question mark and we execute it, and the docstring for that function is opened. So if the function is documented with a docstring, you see here the description of the function, what it does and what it returns. This is quite useful when you outsource functions to other notebooks. Yes, ipynb is a package, yes. No — probably I loaded it before in some other notebook, but apparently it works like this; to be honest, I don't know exactly, but I can check. It is a package, yes. I don't know what "fs" and "full" mean; I would have to go to the documentation, but I can look it up. So exactly, this one is basically the command,
and then here, this is the name of your notebook; I think it has to be in the same folder as your notebook, and then it can read it. Yes, yes — so the question is about getting the data just for specific points, for Sentinel-5P points.
To be honest, for WEkEO I'm also not sure, because this is always the big question: either you tailor a system for large volumes of data, and then you have global data but point retrieval is a bit difficult. With WEkEO, I'm not sure if you can retrieve just point information, like time-series information, but you could certainly narrow it down to a very small region, so that you don't have to download so much data, and then you can still take the time series out for one specific point. I'm not sure if it allows you to specify only a point, so I think you have to specify a small area of interest — it has to be a four-corner area, but it can be, I don't know, one by one or two by two degrees, yes. Okay, yeah, it's not optimal, sorry.
It's the nature of live events. Okay, are there any questions so far? Yes — so basically, if I sign out: you go to wekeo.eu and then you can sign in here, and if you don't have an account yet you can also register there, exactly. Sorry, say it again? To be honest, I don't know; there is certainly a limit, so I guess you can't just fire up a hundred parallel requests and retrieve the data — they will probably limit it somehow, but I don't know the exact numbers. So far so good? Okay, good, then we go to the HDA API example. So yes, we were already on this interface:
here we can browse for datasets, we can search for datasets and get more information, and as I already said before, if we go to the documentation we get the dataset ID, which we need. So let's say we are interested in Sentinel-5P TROPOMI data: we got the dataset ID, and now we define it here as a variable dataset_id and we execute it. The next step is to get the WEkEO API key. I'm quite happy that I remembered to delete my username and password, because this also happens quite often. So I pre-generated it already: basically the function generate_api_key takes your username and password, which you can specify here, so I can add my username and my password, and if you then run the function, what you get back is, I think, a base64-encoded API key — and my API key is this one. So we have to store this API key.
We store it, and the next step is then to initialise the HDA API request. In order to initialise it, we first specify a download directory path — in our example, because it is just an example, it is simply the home directory. With the dataset ID, the API key we just generated and the download directory path, we can call the init function, which basically initiates a dictionary with all the information that the Harmonised Data Access needs in order to retrieve the data. If we run it and just open and look at what is in this HDA dictionary, you already see that it creates a link to the broker endpoint where the data is, it checks whether we accepted the terms and conditions, and it gives the access token address — so all the information, so that WEkEO understands that we followed the guidelines and we are ready to go and download data.
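As a compact sketch of these first steps (the helper functions and their signatures follow the hda_api_functions notebook shown in the session, so treat the exact names, argument order and dataset ID as illustrative):

```python
import os
from ipynb.fs.full.hda_api_functions import *

# Illustrative values -- use your own WEkEO credentials and the dataset ID from the catalogue
dataset_id   = "EO:ESA:DAT:SENTINEL-5P:TROPOMI"
api_key      = generate_api_key("my_username", "my_password")   # base64-encoded key
download_dir = os.path.expanduser("~")                          # download into the home directory

# init() builds the dictionary (broker endpoint, headers, paths, ...) used by all later HDA calls
hda_dict = init(dataset_id, api_key, download_dir)
```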
Once we have initialised, and basically communicated that we want to get some data, we have to request an access token, and this access token is then also stored in the HDA dictionary. The dictionary is basically our documentation of the request: it holds all the information WEkEO needs in order to retrieve the data we are interested in. So if we use the function get_access_token and just provide the dictionary we initiated, and return the dictionary again, we see that we now have an access token which is valid for one hour. This access token that is printed here — if we now print our dictionary again, we should see it here; there is an entry in the dictionary with the access token. As part of this HDA API process, we basically just build up our dictionary, which then holds all the information needed to download data. The next step — in my case it has probably already happened — is that we also have to accept the terms and conditions: it is open data, so we can use it, but we have to agree to the licence. In my case, because I already accepted it, it just stores that information; in your case, once you do it for the first time, it should tell you that it was successfully accepted.
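These two steps in a minimal sketch (again with the helper functions from the hda_api_functions notebook, names as used in the session):

```python
# Request an access token (valid for one hour) and store it in the dictionary
hda_dict = get_access_token(hda_dict)

# Accept the Copernicus terms and conditions for this dataset (only needed once)
hda_dict = acceptTandC(hda_dict)
```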
Good. The next step is that we load a data descriptor file, and then we can request the data. For the Harmonised Data Access API, the information about the data we are interested in is stored in a JSON file — the API understands JSON — and all the information about the data we want, so the collection ID, the spatial subset and also the time period, has to be in JSON format. We can, for example, define specific keys in the JSON: the dataset ID, the date range, the bounding box values, and also a string choice value, for example what type of data it should be — offline, near real time or non-time critical, et cetera. On the JupyterHub we also have an example of such a JSON file: if you go back to the overview of the JupyterHub, you have a file called s5p_data_descriptor.json, and this is basically — yeah, it is big enough — a JSON where, as I already said, we specify the information. Here I think I created a bounding box for Europe, or at least around Europe; the dataset ID is Sentinel-5P; the date range, because it is an example, is one day, the 8th of April, for one hour; and then we say we don't want Level 1 but Level 2 data, for example. So all the information about your data you store in this JSON-encoded file, and if you have this file, you can easily open it with the json.load function. What this cell does is say: okay, we want to open this JSON file as f, and then with the function json.load we want to load our object; then we can see what our object looks like. If we execute it, the information we just saw in the JSON file is now in the data object here.
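A sketch of such a descriptor and of loading it (the key names follow the HDA JSON request schema used in the session's s5p_data_descriptor.json; the concrete dataset ID, bounding box and choice names are illustrative):

```python
import json

# Load the descriptor from the JSON file provided on the JupyterHub ...
with open('s5p_data_descriptor.json') as f:
    data = json.load(f)

# ... or define the same request inline as a Python dictionary
data = {
    "datasetId": "EO:ESA:DAT:SENTINEL-5P:TROPOMI",
    "boundingBoxValues": [{"name": "bbox", "bbox": [-15.0, 35.0, 35.0, 70.0]}],   # roughly Europe
    "dateRangeSelectValues": [{"name": "position",
                               "start": "2020-04-08T11:00:00.000Z",
                               "end": "2020-04-08T12:00:00.000Z"}],
    "stringChoiceValues": [{"name": "processingLevel", "value": "L2"}],
}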
Alternatively, you don't have to store it in a JSON file — the file is mainly convenient if you want to reproduce a data request or use the same request multiple times, then it is probably easier to outsource it — you can also just define the data object directly in a cell here. So now we have our data descriptor, and now we have to initiate a request — yes?
Yeah, exactly, yes. And no, this token — let's go back here. So basically, you have a token, which was here, I think. This token basically communicates to WEkEO that you now want to request something; the API key, however, is really based on your username and password. So if you enter your username and your password here and run the function generate_api_key, then you get your base64-encoded API key, and with this API key we initiate the dictionary — but it is really based on your username and password. Exactly, yes. The API token we actually have here is an access token. Yes, you can also specify it yourself: if you don't want to get the access token programmatically and you have it from the WEkEO website, you can specify a separate key — it is then called access token — and there you specify the token you got from the website. Any other questions? Are we already lost, or is it still good to follow? Yeah, okay.
These are long sessions, but let me know if something is too fast. Okay, good — we are almost there, which is already a good sign. So yes, we specified our data request in a JSON file, and the next step is to initiate this request. For this we can use the function get_job_id, because it basically assigns a job ID to our request. It is the same pattern again: we extend our dictionary with the job ID WEkEO provides for our request. If we execute it — we pass the dictionary we already built up and the data object we stored — we see that the query was successfully submitted, we have a job ID, and the status is completed. Our dictionary now holds the job ID which WEkEO uses to run the job. With this job ID we can then run the function get_results_list, which creates a list of the data files available for the data period we specified and the geographical bounding box, and again we want to store this results list in the dictionary. We see here — it is running — that because I specified it, since it is just an example, for only half an hour on the 8th of April, there are exactly two items available for this time period; it tells us "total items: 2", and we see the information, the file names: Sentinel-5P data also stores the time period in the file name, and one is from around 11:34 and the other from around 11:39. So there are two items, and then, one step before we download the data, we create an order ID: based on these files we specified and want to download, we now create order IDs, and with these order IDs we can then download the data. For this we can use the function get_order_ids, and again we store the order IDs in our dictionary. Oops — yeah, okay. So now we have our two datasets we want to retrieve, we have order IDs, and the status is completed. If we just check the HDA API dictionary again, it is now much longer than at the beginning, because we stored quite a bit of additional information along the way: we have the dataset ID, we have the API key, we have the access token, we have our job ID, we have the results which are valid for our chosen period and data type, and, very importantly, we also have the order IDs. With this information we can basically use the function download_data: we pass the HDA dictionary, WEkEO understands that it has to translate the order IDs and download the data, and we can run it. And — oh wow, this is quite fast today, okay.
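The remaining steps, continuing from the hda_dict and data objects above, could look roughly like this (helper functions and the order of calls as shown in the session; exact signatures may differ):

```python
# Submit the request and store the returned job ID in the dictionary
hda_dict = get_job_id(hda_dict, data)

# List the files that match the requested period and bounding box
hda_dict = get_results_list(hda_dict)

# Create order IDs for those files ...
hda_dict = get_order_ids(hda_dict)

# ... and download them into the download directory defined in init()
download_data(hda_dict)
```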
Yeah, and the data is now downloading. Okay, one file is downloaded, and there is one other problem I will have to investigate — but we see there should be one Sentinel-5P file here, yes, and you see that we now have one Sentinel-5P file. Are there any questions about WEkEO? What is my experience? My experience is none — I haven't paid for WEkEO yet — but we can go there; there are different t-shirt sizes.
There are prices: if you go to wekeo.eu, you go to Pricing, and then, basically, you get virtual machines, processing tools, free networking, support, and I think here you also see the different t-shirt sizes. So depending on what size you want, you can have more or less RAM, block storage, file storage, load balancers, GPUs, et cetera. And the prices: the prices are per year, with the per-month equivalent in brackets — I can also make it a bit bigger. The upper one is per month and the lower one is per year, then given per month. Ah, okay, what I understand is: if you go on a monthly flat rate, you pay more than if you sign up for a year right away, and if you pay yearly, it comes down to only about half of the normal monthly fee.
I think any feedback is highly welcome, because they want WEkEO to be used, and I think specifically this kind of feedback is very good, because it should also be accessible for smaller companies, universities and researchers. At the moment they are about to start — the public release of the first version 1.0 was just in March or April — and they have a YouTube channel, and they just started a series of training events where they provide some small tutorials on what WEkEO is and how it works, so something like what we actually did today: that you have a JupyterHub interface and that you can access data with the HDA API, to understand it a bit more. So they now have basic trainings, and it will also get a bit more advanced in the future. Yeah, okay, good. So if there is no other question, then we continue to the next one. I just wanted to say that I am giving you an overview of the data services — it is not that I have any preference. I think it is always helpful to know where data is available, and the idea here is that I provide you some examples,
and then you can choose yourself what system might be of interest for you. So the next one is the AWS Open Data Registry. Who has heard about it here? Okay, only one.
Okay, so yeah, because when I heard about it, I think two or three years ago, I thought, oh, wow, this is great. So basically, yeah, it's also Amazon Web Services, AWS is cloud computing from Amazon, and it has a strong focus on geospatial data sets.
So you really see that they have an entire open data registry with geospatial data, and it will just, I think, grow more in the future. So there is definitely a sector
where you can benefit to use AWS with geospatial data. There is this registry, what I said, on open data on AWS. Okay, if you go there, you see like, yeah,
there are 69 data sets at the moment, or I think even more, on different satellite data: Sentinel-2, Landsat-8, also ERA5 data is available,
Sentinel-3 data. So it's quite powerful, because there are also two options again: if you work in a cloud-based processing environment on AWS, then the data already sits in an S3 storage bucket, which you can just load
into your virtual machine on AWS, or you can use one of the Python libraries or APIs to actually download data from the storage space there. And in the example today, I will show you
an example of how you can actually retrieve Sentinel-5P data from AWS and how it works, and the examples I developed for the case study on COVID-19 actually make use of Sentinel-5P data I retrieved
from Amazon Web Services. And you can have a look yourself, there's a variety of geospatial data, of course other satellite data as well, including Sentinel-1, 2, 3 and 5, and also the ECMWF ERA5 climate reanalysis.
Also worth checking out: AWS has an Earth on AWS program, which is basically a program where they are specifically interested in building large-scale applications based on open geospatial data. So if you work as a researcher, or you need to process data
and you have an idea how to make use of the data that is available in the open data registry, you can also apply for cloud credits for your research project. So it could be of interest for some of you here.
Okay, how to retrieve data? There is an SDK for Python, it's called boto3, and it allows you to access data in AWS S3 storage buckets; S3 is the name of the object storage service
on the AWS cloud. And so below I showcase a small example of how the boto3 library can be used to download Sentinel-5P data.
We start again with importing libraries, so boto3 and botocore. And then the first step is to initiate the boto3 client with the function boto3.client, and then we define the bucket of interest.
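Roughly sketched, this first step could look like the following; the unsigned-access configuration and the bucket name (which I only show a bit later in the registry entry) are assumptions on my side, since public Open Data buckets can usually be read without credentials:

```python
import boto3
from botocore import UNSIGNED
from botocore.client import Config

# Name of the public Sentinel-5P bucket (assumed; it appears in the registry entry shown below)
bucket = 'meeo-s5p'

# Anonymous (unsigned) S3 client, since the open data bucket is publicly readable
client = boto3.client('s3', config=Config(signature_version=UNSIGNED))
```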
The bucket of interest: let's go back and see how this is actually structured. Let's open the example for Sentinel-5P data. And this is how it looks. You see a description of the files, and you see here already,
and this is also the reason why I download from here, that the level-2 data is already aggregated to a daily level. So instead of several granules per day, it's already aggregated per day, which is quite helpful.
And basically we just have to find, where's the bucket name here? Hang on, this is the resource name.
Okay. So here in boto3, let's go back here again: we specify the bucket name, which in this case is meeo-s5p.
You can see it here, although it's not called out explicitly, I think; you basically have to strip off the part before it,
and then you see the name of the bucket, which is meeo-s5p. You have different streams: near-real-time or non-time-critical information, offline data,
but it's all in one bucket in different folders. Okay, good. So basically, we initiate our client. We say that we are interested in an S3 client, and this is just the standard procedure
to configure your client. We also specify the bucket we are interested in, and we can execute the cell. Good, the next step is then to create a paginator and iterate over the results of the API request. Paginators are a feature of boto3
that act as an abstraction over the process of iterating over the entire result set of a truncated API operation. So it's something specific to boto3: you first have to define
or create a paginator, and on this paginator you can then call the function paginate in order to iterate over the different listings returned by the API request. We can do this. So the first step is here: we create our paginator.
For this we use the client we just defined together with the function get_paginator, and on the paginator object we then call paginate. We specify our bucket and also a delimiter
in order to understand the folder structure. And if you then have a look at how the result looks, it doesn't help us a lot, because the paginate result is just a page iterator. And so in order to see the whole structure of the data
that is organized in the S3 bucket, we have to iterate over the object. We can do this with the following command: we iterate over our results page iterator
and look at the common prefixes in order to get the folder structure. And if you do this, we see that, yes, we have different streams of data available. The next step is that, let's say, in our case we are interested in the offline data
and in the already daily aggregated level-3 data. So instead of only the Delimiter keyword we used here, where we just used a slash, we can also pass
more detailed information so that we get directly to the data we are interested in. So we use a prefix saying that we are interested in the cloud-optimized GeoTIFFs, the offline stream, and then the level-3 daily aggregated NO2 data.
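A rough sketch of these two listing steps, reusing the client and bucket from above; the exact prefix string is only illustrative, the real folder names come from the listing itself:

```python
# 1) List the top-level "folders" (common prefixes) of the bucket
paginator = client.get_paginator('list_objects_v2')
result = paginator.paginate(Bucket=bucket, Delimiter='/')

for page in result:
    for cp in page.get('CommonPrefixes', []):
        print(cp['Prefix'])

# 2) List only the offline, daily aggregated level-3 NO2 products;
#    the prefix below is illustrative, take the real one from the listing above
prefix = 'COGT/OFFL/L3__NO2___/'
pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
```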
So we use this prefix and then we paginate again over our page iterator. Printing it gives the same as before, so it's not very useful on its own, but we can iterate over this object in order to see what data is actually available. We can do this by iterating over the contents
of each page of the page iterator. And since each resulting object is a dictionary, we get a much cleaner output when we just select one dictionary key. So we go over the pages of the page iterator
we defined, and for each object in a page we retrieve the key of the dictionary. And if we do this,
we get a list of the files that are available. And I can actually show you the difference here: if we don't select the dictionary key directly, but just print
each entry that is listed in the contents, then we see that each entry is actually a small dictionary. Is that readable? Yes. So it's a dictionary: we have the key,
which is actually the file name, and we also see when it was last modified, the size, et cetera. But this is quite messy, and it's not so easy to see what the different types of data are. You also see, for example,
that for each day there are several entries, although you would think, okay, there's only daily data, so why do we have several entries? And this is the reason why, if we print just the key from the dictionary, we get a much cleaner output.
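As a small sketch of this step, assuming the standard list_objects_v2 response structure:

```python
# Printing the whole entry shows a small dictionary per object:
# Key (the file name), LastModified, Size, ETag, ...
for page in pages:
    for obj in page.get('Contents', []):
        print(obj)

# Printing only the 'Key' gives a much cleaner list of the available files
pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
for page in pages:
    for obj in page.get('Contents', []):
        print(obj['Key'])
```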
And so we get here this cleaner output. And now we see, for example, that for each date, say for the 28th of July, we actually have three or four different entries.
We have the general, unprocessed data, but then we also have masked data where cloud information is already flagged out: a quite conservative version
where a mask filter of 75% was applied, and a slightly less conservative one where a mask filter of 50% was applied. So this is already a processing step which you would otherwise need to do yourself for Sentinel-5P data. And here, because it's already aggregated to a daily level,
you can get the data per day with clouds, or rather non-valid pixels, already removed. And basically you can then start actually using the data rather than doing
extensive processing steps before doing any analysis. Good. So basically now we have our list of files that is available. So if we want to check the length of the selected files...
oh, now I changed the cell type, okay, I don't want markdown here. So why is this not selected_files?
What did I do here? Okay, hang on, I do it again. Okay. So now, basically,
as an example we are just interested in the data that used the very conservative cloud filter. So we filter all the file names that end with mask75_4326.
Basically, from all the data that is available, we keep just the data we are interested in and store it in an object called selected_files. And if we print it, we see that for each day, so the 5th of March, 6th of March, 7th of March,
we now have one daily file. And if we check how many files there actually are, we see that we have 163 files for 163 days. And as an example, we just want to download one of them.
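A minimal sketch of this filtering step; the exact suffix (including any file extension) is an assumption and should be taken from the listing above:

```python
# Collect all object keys from the paginated listing
all_files = []
pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
for page in pages:
    for obj in page.get('Contents', []):
        all_files.append(obj['Key'])

# Keep only the daily files produced with the conservative 75% cloud mask
suffix = 'mask75_4326'
selected_files = [key for key in all_files if key.endswith(suffix)]
print(len(selected_files))  # in the demo this gives 163 files, one per day
```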
So let's say we want to get index 161, which is the 6th of August this year. If we want to download it, we select it from our list of file names. We can first print it,
to make sure we understand what the data is actually called. Then we split the key by the slashes, because in the end we just want the final file name
and not the entire S3 path. And then, with the client we defined at the beginning and the function download_file, we pass the bucket of interest, the key and the file name, and we can actually download it directly.
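Sketched roughly, reusing the client, bucket and selected_files from above:

```python
# Pick one daily file from the list, e.g. index 161 (6 August in this example)
key = selected_files[161]
print(key)

# Keep only the file name, not the whole S3 path, as the local file name
filename = key.split('/')[-1]

# Download the object from the bucket into the current working directory
client.download_file(bucket, key, filename)
```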
So here we just download one example. And it's always good to know, this I forgot to mention: this circle here, if you execute a cell
and the cell is actively executing, then it is filled, and once the cell has finished it is empty again. So that's always a good check to see if a cell is still running. And if we now go back to our overview,
we should have another Sentinel-5P file here, and it's a masked data set with the 75% cloud filter. Okay, are there questions on the AWS Open Data Registry?
Yeah, is it really AWS who puts the data there? The company I work with, MEEO, they added the Sentinel-5P and also the Sentinel-3 data there,
and also the ERA5 data; so how it works is that it's actually companies who upload the data there and set up an S3 bucket. If you have more specific questions, I can ask my colleague and give you feedback or let you know offline.
We have some questions. Yes, considering that you can also get Sentinel-5P elsewhere... No, that's why I said it really depends on you.
At the moment, the fact is that AWS is just much more mature than WEkEO. So exactly, the idea of this lecture is to provide you with different pathways, and it's your choice what you then use.
The difference between WEkEO and Sentinel-5P on AWS is that on AWS it's already pre-processed data, so it's level 3: cloud-optimized GeoTIFFs, which are daily aggregates. So depending on what you want to do,
you can save some processing steps which you would need when you retrieve the data from WEkEO. So the original
source is ESA and Copernicus, but it was not ESA who uploaded the data to AWS; the way the open data registry works is that there are always third-party providers who provide the data. WEkEO is basically, well,
ESA is not involved there either; it's EUMETSAT, ECMWF and Mercator Ocean, but it's the Copernicus sphere. So it's still Copernicus data that is provided there, yeah.
Yeah, probably WEkEO, because once the data is generated it will also be available on WEkEO, whereas on AWS, because there is always a dependency
on a third-party provider, there can sometimes be delays. So what we see here: now it's the 6th of August, and I think the last entry was the 6th or 7th of August. There is a bit of a delay, because the third-party provider also has to retrieve the data once it is available, and they have to process it.
So there's always a bit of a delay, but I think so far they have been quite good at maintaining the updates. And I think, I don't know exactly, but probably once they upload the data as part of this open data registry,
there's an agreement between Amazon and the third-party provider to make sure that the data is maintained.