Facing the challenge of climate change with xarray and Dask

Video in TIB AV-Portal: Facing the challenge of climate change with xarray and Dask

Formal Metadata

Facing the challenge of climate change with xarray and Dask
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
Facing the challenge of climate change with xarray and Dask
EuroPython 2017 - Talk - 2017-07-12 - Anfiteatro 1. Rimini, Italy

In recent years climate change has become one of the most important topics. For any period longer than a few days science is not able to provide comparable forecasts, but a lot of useful information about future climate conditions can still be gained on time scales of a few months to several years. Climate forecast and climate projection data are quite complex to analyse and represent. The Python science ecosystem proves extremely effective as a platform to retrieve, analyse, process and present this type of data. The backbone of the platform is the n-dimensional array library xarray, which provides the perfect mix between pandas data structures and dask performance and parallelization. Reliable climate forecasts and climate projections are now available from the Copernicus Climate Change Service, operated by ECMWF, which will become the central hub for the European effort in studying and mitigating climate change impacts. The service also provides access to an open cloud platform, the CDS Toolbox, that is based on the Python 3 xarray/dask/pandas stack. In this talk I will present how to retrieve, analyse, process and display climate data in a generic use case with xarray and with the Copernicus CDS Toolbox.
I am Francesco, and I will talk about how to deal with climate change with Python, and more specifically with xarray and Dask. This is a strip from xkcd about climate change. It shows the Earth's temperature over the last 22 thousand years compared to the average of the period 1961 to 1990. We can see that over the last 20 thousand years the temperature increased by 4 degrees, and that in the last 40 years alone it increased by 1 degree.
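The idea of expressing temperature relative to a 1961 to 1990 average can be sketched with xarray. This is only an illustration under stated assumptions: the yearly values are synthetic, and the names `temperature`, `baseline` and `anomaly` are hypothetical, not taken from the talk.

```python
import numpy as np
import xarray as xr

# Synthetic yearly temperatures with a made-up linear warming trend.
years = np.arange(1951, 2017)
temperature = xr.DataArray(
    14.0 + 0.01 * (years - 1951),
    coords={"year": years},
    dims=("year",),
    name="temperature",
)

# Reference value: the mean over the 1961 to 1990 baseline period.
# Label-based slices in xarray include both end points.
baseline = temperature.sel(year=slice(1961, 1990)).mean()

# Anomaly: how far each year is from the baseline average.
anomaly = temperature - baseline

print(float(anomaly.sel(year=2016)))
```

By construction, the anomaly averages to zero over the baseline period itself, which is a quick sanity check for this kind of calculation.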
Scientists have developed several models to forecast the situation over the next hundred years, and they developed essentially three types of scenario. The best case assumes immediate and massive action to limit emissions, but it seems that we are a bit late for this scenario. In the optimistic scenario we will have 2 degrees more globally in the next hundred years. On the current path, which is worse, we will have 4 degrees more than the average between 1961 and 1990: the same increase that took 20 thousand years.

To monitor the situation we have a lot of data. The main sources are satellite observations and in situ observations, that is, stations on the surface, aircraft and radiosondes. These data are processed with models developed by scientists to provide global gridded data. The global gridded data are reanalyses, weather forecasts, seasonal forecasts and climate projections. Reanalyses are meteorological models applied to the past. ERA-Interim is the one provided by ECMWF, and it starts from 1979, so it is a window on our past. Then we have the weather forecasts, which run the model up to 10 days ahead, and the seasonal forecasts, which run the model for the next two seasons, so up to 6 months. Finally we have the climate projections, which we already spoke about in the first slide, and for which there are several scenarios.
I show three scenarios in this chart, and we can see the same situation as in the first slide.

Let's look at the size of the ERA-Interim reanalysis data. We have a spatial resolution of about 80 kilometres, a vertical coverage up to 64 kilometres in 60 levels, a temporal coverage from 1979 to today, many variables, and a sub-daily temporal resolution. If we calculate the overall size of the dataset we find about 100 terabytes, so we strongly need a tool that can handle data of this size.

Python provides plenty of scientific and numerical libraries, and the main library to deal with this kind of data is xarray. xarray extends traditional arrays by adding dimension names and coordinate indexes to NumPy arrays. Moreover, its Dask integration provides a great tool to handle huge datasets.

A lot of climate data is served by ECMWF at this link, and we can download it through the interfaces shown in this slide. In this case we selected two variables, the 2 metre temperature and the precipitation, from the ERA-Interim database, from 1979 to 2016.

Once we have the NetCDF file we can open it with xarray as an xarray Dataset, which is an extension to N dimensions of the pandas DataFrame. If we print the dataset we can see that it has the coordinates latitude, longitude and time, and the data variables, which are precipitation and temperature. Every data variable depends on all the coordinates; the dependencies are shown in parentheses. First, we can select a variable from the dataset.
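The structure of such a Dataset can be sketched as follows. This is a hedged illustration, not the speaker's code: the values are random, the grid is much coarser than ERA-Interim, and the variable names `t2m` and `tp` and their attributes are assumptions. With a real file, the Dataset would come from `xr.open_dataset` instead of being built by hand.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the downloaded NetCDF file (assumed layout).
time = pd.date_range("1979-01-01", "2016-12-01", freq="MS")  # monthly steps
latitude = np.arange(90.0, -90.1, -30.0)   # 7 values, north to south
longitude = np.arange(0.0, 360.0, 30.0)    # 12 values

rng = np.random.default_rng(0)
shape = (time.size, latitude.size, longitude.size)
dims = ("time", "latitude", "longitude")

ds = xr.Dataset(
    data_vars={
        # Each entry is (dimensions, values, attributes).
        "t2m": (
            dims,
            280.0 + 10.0 * rng.standard_normal(shape),
            {"long_name": "2 metre temperature", "units": "K"},
        ),
        "tp": (
            dims,
            0.01 * rng.random(shape),
            {"long_name": "Total precipitation", "units": "m"},
        ),
    },
    coords={"time": time, "latitude": latitude, "longitude": longitude},
)

# Printing shows the coordinates and, for every data variable,
# the dimensions it depends on in parentheses.
print(ds)

# Selecting one variable by name returns a DataArray.
t2m = ds["t2m"]
print(type(t2m).__name__, t2m.dims)
```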
Selecting a variable returns an xarray DataArray, which is an implementation of a labelled multi-dimensional array. Now we can see the attributes attached to the data: in this case the name of the temperature variable, the long name, 2 metre temperature, and the units, kelvin. A simple way to think about it is that the xarray DataArray is the extension to N dimensions of the pandas Series.

xarray provides a lot of methods to perform operations on data, and one of the simplest is selection. We can use the name of a coordinate to select data. In this case I selected January 2016. The data is monthly, so if we select one month we get data that depends only on latitude and longitude, and so we have a map. If we use the plot method of xarray, it recognises that the data depends on latitude and longitude and automatically produces a map plot.
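A minimal sketch of the attribute lookup and the label-based selection described here, assuming a small synthetic monthly DataArray (the grid and the values are made up; the name `t2m` is an assumption):

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2015-01-01", "2016-12-01", freq="MS")  # 24 months
latitude = np.array([60.0, 45.0, 30.0])
longitude = np.array([0.0, 10.0, 20.0, 30.0])

# Running integers make it easy to see which slice gets selected.
values = np.arange(24 * 3 * 4, dtype=float).reshape(24, 3, 4)

t2m = xr.DataArray(
    values,
    coords={"time": time, "latitude": latitude, "longitude": longitude},
    dims=("time", "latitude", "longitude"),
    name="t2m",
    attrs={"long_name": "2 metre temperature", "units": "K"},
)

# Metadata travels with the array.
print(t2m.name, "|", t2m.attrs["long_name"], "|", t2m.attrs["units"])

# Select by coordinate label, not by position: the time dimension is
# dropped and a 2D field over latitude and longitude remains.
january_2016 = t2m.sel(time=pd.Timestamp("2016-01-01"))
print(january_2016.dims)

# january_2016.plot() would draw this 2D field as a map-like image
# (requires matplotlib, so it is left out of this sketch).
```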
We can also select a point using latitude and longitude. Using the sel method, we have to specify the method used to interpolate the data; I chose the simplest, nearest neighbour. Now the data depends only on time, so we have a time series, and xarray will plot it as a time series.

We can also perform more complex operations, like the computation of a climatology. A climatology is essentially an annual cycle, so we compute it from the time series. We can use the groupby method to group the data by month, and then we have to take the mean over time. xarray will create a new coordinate called month. If we plot the result for the point I chose, which has the coordinates of Rimini, we can see the difference between winter and summer.

The integration of xarray with Dask is crucial to handle large datasets such as climate data. The data is divided into many small pieces called chunks, each of which is presumed to be small enough to fit into memory. Operations on Dask arrays are lazy: operations queue up a series of tasks mapped over blocks, and no computation is performed until you actually ask for values to be computed, for example to print the data, to plot it, or to save it to disk. At that point data is loaded into memory and the computation proceeds. The actual computation is controlled by a multiprocessing or thread pool, which allows Dask to take full advantage of multiple processors.

To open a dataset using Dask we just have to add the chunks keyword to the open_dataset function, and we have to specify over which coordinates we want to chunk. In this case I selected latitude and longitude: Dask will create a chunk every 200 values of latitude and every 200 values of longitude. Time does not appear, so only one chunk will be used along that dimension. We can represent the Dask workflow as a task graph.
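The nearest-neighbour point selection, the monthly climatology and the lazy, chunked variant can be sketched together. This is an illustration under stated assumptions: the data is a synthetic annual cycle, the Rimini coordinates are approximate, the chunk size of 10 is chosen to suit the small grid, and the Dask part is skipped when Dask (an optional dependency of xarray) is not installed. With a real file, chunking would be requested as `xr.open_dataset(path, chunks={"latitude": 200, "longitude": 200})`.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("1979-01-01", "2016-12-01", freq="MS")
latitude = np.arange(30.0, 61.0, 1.0)    # 30 to 60 degrees north
longitude = np.arange(0.0, 21.0, 1.0)    # 0 to 20 degrees east

# Synthetic 2 metre temperature: a pure annual cycle, coldest in January.
cycle = 285.0 - 10.0 * np.cos(2.0 * np.pi * (time.month.values - 1) / 12.0)
values = cycle[:, None, None] * np.ones((1, latitude.size, longitude.size))

t2m = xr.DataArray(
    values,
    coords={"time": time, "latitude": latitude, "longitude": longitude},
    dims=("time", "latitude", "longitude"),
    name="t2m",
)

# Grid point nearest to (approximately) Rimini; only time remains.
point = t2m.sel(latitude=44.06, longitude=12.57, method="nearest")

# Monthly climatology: group the time series by calendar month and
# average each group. The result has a new "month" coordinate.
climatology = point.groupby("time.month").mean("time")
print(climatology.values.round(1))

try:
    import dask  # noqa: F401  (only needed for the chunked variant)
except ImportError:
    lazy_climatology = None
else:
    # Split the array into chunks of 10 x 10 grid points; time is not
    # listed, so it stays in a single chunk.
    chunked = t2m.chunk({"latitude": 10, "longitude": 10})
    # The same operations only build a task graph here ...
    lazy_climatology = (
        chunked.sel(latitude=44.06, longitude=12.57, method="nearest")
        .groupby("time.month")
        .mean("time")
    )
    # ... and values are produced only when explicitly requested.
    print(lazy_climatology.compute().values.round(1))
```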
The graph is drawn with a function provided by Dask. This is the NetCDF file, and these are the chunks that Dask created to open the dataset. If we select an exact latitude and longitude that fall in one chunk, Dask will import only that chunk, and the others will not be imported. In this way we have a huge saving of memory, because Dask loads only one chunk to perform the computation.

The Copernicus programme is the world's largest Earth observation programme, led by the European Commission in partnership with ESA. It aims at achieving a global and continuous Earth observation capacity. In this context the Climate Data Store will be at the centre of the Copernicus Climate Change Service, and it will provide climate information on past, present and future in terms of essential climate variables and climate thematic indicators. The Climate Data Store will be a distributed system, and it will simplify access to climate data through a unified web interface. It will contain observations, reanalyses, projections and seasonal forecasts. It will also provide a software platform called the Toolbox, which will allow developing applications for users with all the information in the CDS. The Climate Data Store services are designed to meet the needs of several types of users, such as policy makers, experts and scientists. At B-Open Solutions we are in charge of the development of this toolbox.
The Toolbox has essentially three types of users. Developers are the current and future developers of the system. Experts are expert in analysing climate data, and they will actually use the Toolbox to create and publish custom climate applications using the custom climate tools that we developed. End users use the web applications developed by the experts: they do not interact with the Toolbox directly, but only through the application made by the expert. The application submits its work to the compute servers provided by ECMWF.

This is the expert interface to develop applications. On the left we have the resources from the CDS: a list of examples and the list of climate data available there. On the right there is the editor, where experts actually write the application in Python using xarray and the custom climate tools provided by the CDS and developed by us. Below there is a preview of the end-user application. If we run an application, it is submitted to the servers and the results are shown below.

Let's see an end-user application. This application investigates the impact of climate change on wine production. We use the ERA-Interim data for the past and the climate projections for the future. This is the situation between 1979 and 1986. Every colour corresponds to the optimal conditions for the growth of a group of grape varieties, so we have different colours. For example, red marks the typical wines that we produce in Italy, and another colour marks the Sicilian wines. If we move on from 1986 towards the future, we can see that the areas are moving north. This is the situation today, and we can see that if we go into the future the movement continues. This is the first step, between 2006 and 2026.
We can go on through the next 100 years. This is the scenario that I discussed in the first slide, and we can see that the varieties keep moving north. In the last years of the projection there is a huge production of wine in France and also in Germany, and Champagne will be produced in England. OK, thank you.
We have plenty of time for questions.

Thank you for your talk. I use pandas and NumPy quite a lot. Could you give more information about the differences between xarray and NumPy? What are the key differences?

Sorry, can you speak louder?

Can you tell me a little bit more about the key differences between xarray and NumPy?

OK. xarray has plenty of methods, more than NumPy. The main difference is that in NumPy you can select data from the array only by positional indexing, while in xarray we have labelled indexes. You do not have to keep track of the shape of the array or the number of its dimensions: you can forget about the order of the indexes of your array and simply select with the label. Thanks to this, we have a lot of methods to perform operations on indexed arrays. There is also the integration with Dask, which can optimise these operations. In NumPy you do not have Dask; you can use Dask arrays, which are very similar to NumPy arrays, but xarray has this integration built in.

Good, thank you. More questions?
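The difference between positional and label-based selection in this answer can be sketched as follows. The array, its coordinates and all names are made up for illustration.

```python
import numpy as np
import xarray as xr

# The same values as a plain NumPy array and as a labelled DataArray.
values = np.arange(24).reshape(2, 3, 4)  # axes: time, latitude, longitude

da = xr.DataArray(
    values,
    coords={
        "time": [2015, 2016],
        "latitude": [50.0, 40.0, 30.0],
        "longitude": [0.0, 10.0, 20.0, 30.0],
    },
    dims=("time", "latitude", "longitude"),
)

# NumPy: you must remember the axis order and the position of each value.
by_position = values[1, 0, 2]

# xarray: select by dimension name and coordinate label instead.
by_label = da.sel(time=2016, latitude=50.0, longitude=20.0)

# The order of the dimensions no longer matters for the selection.
reordered = da.transpose("longitude", "time", "latitude")
by_label_reordered = reordered.sel(time=2016, latitude=50.0, longitude=20.0)

print(int(by_position), int(by_label), int(by_label_reordered))
```

All three lookups return the same element; only the NumPy version depends on knowing that time is axis 0 and latitude is axis 1.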
Thanks for your talk. As I understand it, Dask aims to be a sort of distributed computation framework, right?

Yeah.

So it could be compared with, for instance, Spark. Can you tell us what the advantages of using Dask are with respect to Spark?

Dask already has integration with xarray out of the box, so we use xarray on top of Dask, and we are developing on Dask. We did not try Spark, but we are pretty happy with Dask.

Thanks.
It is really good to see climate change in one of our presentations. I just wanted to ask how accessible this data is to casual users like myself, who may just want to experiment with it.

OK. ECMWF provides some free data, and I have the link here; this is the link. Not all the data is free, but the data from the reanalyses is free and you can just download it. You need to have an account, a free account, on ECMWF, and this is the environment to download the data.

More questions?
This is actually a comment on the answer. I am working with Copernicus, so we have these data available. The reality is that there are reanalyses, which are what we know about the past, and projections, which are what we think will happen in the future under each scenario. These datasets are hard to use: first, they can have a large size, and second, they are very different from one another, even for things that are nominally the same, such as surface temperature. You need to know how each quantity is called and how it is measured, because there are some differences. This is the reason the Copernicus programme is trying to build a central repository that gives free access to the data. On top of that there is a layer so that you can easily go from one dataset to another, even one produced by someone else in a completely different way, and reuse your code. These are complex things, but as a researcher you should be able to get what you need from the database. Between the end of this year and the beginning of next year, it should be possible to use all of this data in the system much more easily than now.

I have one question on the research side: how large is the computational infrastructure you need to analyse this data, in numbers of machines?

The infrastructure is not ready now. We are the only developers and the only users of this infrastructure at the moment, so I cannot say what it will be in the future, because there is not much right now.

How many machines did you need to generate these plots?

No, no, this is one machine, a normal one.

Excellent. Are there any more questions? Then let's close by thanking Francesco again.