
yeoda: your earth observation data access


Formal Metadata

Title
yeoda: your earth observation data access
Alternative Title
yeoda - providing low-level and easy-to-use access to manifold earth observation datasets
Title of Series
Number of Parts
351
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
In recent years, several Python packages (e.g. xarray, rasterio) have evolved around more basic software libraries such as netCDF4 or GDAL for accessing geospatial data. These packages make it possible to work with all kinds of data formats (e.g. GeoTIFF, NetCDF, Zarr), provide the data in array format (NumPy, xarray), and constitute a fundamental part of any scientific analysis or operational task. However, they do not offer full flexibility when working with Earth Observation (EO) datasets. The multidimensional complexity of EO data (i.e. space, time, bands) is often resolved by distributing dimensions across many files, so the data are not always easy to access. An important step towards streamlining EO data access has been the Open Data Cube (ODC) toolbox, which utilizes predefined dataset configurations and file-based indices stored in a database. With this setup, ODC enables easy and uniform access to multidimensional geospatial datasets. Still, users are often confronted with a great variety of data formats and with files distributed over different systems. This can pose a hurdle when working with ODC, especially if one wants to process a new stack of geospatial data, where the extra overhead of a database can stall swift progress. The yeoda ("your earth observation data access") Python software package aims to close this gap by offering an interface similar to ODC's while allowing users to interact with geospatial data on a lower level. It relies on two other Python software packages developed by TU Wien: geospade (definition of geospatial properties of a dataset, e.g. geometries) and veranda (read/write access to a variety of raster and vector data formats, e.g. GeoTIFF). This modular setup ensures a clear separation of concerns, specifically between geospatial operations and I/O tasks, yielding a homogenized interface independent of the actual data format.
For example, geospatial operations on tiled EO raster datasets can be easily performed across tile or file boundaries. Data access is then realised in veranda, which combines geometric properties with I/O objects listed in a table. On top of geospade and veranda, yeoda acts as a communication layer between files stored on the file system and data objects by adding further dimensions to the data table, such as common metadata or file name entries. Thus, one can filter multiple files by their attributes (e.g. time, bands, variable names, satellite platform) before accessing the data. Hence, yeoda guarantees the necessary freedom to apply arbitrary algorithms to manifold data formats, while simultaneously supporting scalability by means of parallelised I/O operations. While ODC is of tremendous value for accessing EO datasets through large-scale operational services, yeoda introduces a new level of data interaction, making it an indispensable tool for the EO user community. In light of recent advancements in interoperable cloud-based processing via the openEO API, yeoda could be utilized as a slim back-end library to lower the hurdle of sharing new EO datasets and to foster scientific exchange.
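The idea of filtering files by attributes before any data is read can be illustrated with a minimal, standard-library sketch. It is not the yeoda API: the file naming scheme, the example paths, and the helpers `parse_filename`, `build_file_table` and `filter_by_dimension` are all assumptions made up for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_filename(filepath):
    """Split a file name into dimension values.

    Assumed (hypothetical) naming scheme: <variable>_<YYYYMMDD>_<band>_<tile>.tif
    """
    stem = PurePosixPath(filepath).stem
    variable, date_str, band, tile = stem.split("_")
    return {
        "filepath": filepath,
        "variable": variable,
        "time": datetime.strptime(date_str, "%Y%m%d"),
        "band": band,
        "tile": tile,
    }


def build_file_table(filepaths):
    """Build a table (list of rows) with one row of dimension values per file."""
    return [parse_filename(fp) for fp in filepaths]


def filter_by_dimension(table, name, predicate):
    """Keep only the rows whose value in dimension `name` satisfies `predicate`."""
    return [row for row in table if predicate(row[name])]


# Made-up example paths; no file is opened at any point.
filepaths = [
    "/data/SIG0_20220101_VV_E048N012T6.tif",
    "/data/SIG0_20220101_VH_E048N012T6.tif",
    "/data/SIG0_20220113_VV_E048N012T6.tif",
    "/data/SIG0_20220113_VV_E048N018T6.tif",
]

table = build_file_table(filepaths)
vv_rows = filter_by_dimension(table, "band", lambda band: band == "VV")
first_of_january = filter_by_dimension(
    vv_rows, "time", lambda t: t == datetime(2022, 1, 1)
)
selected = [row["filepath"] for row in first_of_january]
```

Only the paths left in `selected` would then be handed to an I/O layer, which is what keeps the filtering step cheap compared with opening every file.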
Keywords
Transcript: English(auto-generated)
All right, good morning everyone. My name is Claudio Navacchi, and I'm presenting yeoda, which stands for "your earth observation data access". It is an open-source Python tool developed by the GEO department at TU Wien.
As you can see on this map, the current set of tools that can deal with geospatial raster or Earth observation data is quite extensive. On the left side, we have some rudimentary packages for file-based access, like GDAL or netCDF4, which work quite well for single files or datasets, but not for homogeneous access across different data collections.
On the right side, we have some higher-level data cube tools, which either rely on packages on the left side or implement their own software or data architecture to enable user-friendly, performant, multi-file access to predefined data collections.
However, those data cube tools are either black boxes, like Google Earth Engine, or they introduce strong dependencies on databases or servers that need to run in the background. yeoda instead cherry-picks from both sides by offering flexible and transparent access to file-based data cubes while still maintaining a close relation to the files on disk. It has only two prerequisites: a Python environment and files on disk.
So how does yeoda actually work? yeoda relies on several in-house developed software packages that wrap around the basic packages you've seen before. At the bottom, we have geospade, which defines basic geometries and tiles and how they relate to each other. On top of geospade, we have veranda, which unites those abstract geometries with actual I/O classes to deal with all kinds of data formats.
And finally, on top, we have yeoda, which adds dimensional context by interpreting file names with the help of geopathfinder, or by extracting metadata information from the files.
With this setup, yeoda accepts either datasets or files as input to initialize its data cube classes. Those data cube classes can then be used to filter the data spatially or by the predefined dimensions.
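To make the spatial filtering step more concrete, here is a small sketch of selecting files across tile boundaries by bounding box. It only mimics the concept in plain Python and is not how geospade, veranda or yeoda implement it; the `Tile` class, the `select_bbox` helper, the tile names, the extents and the file paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Axis-aligned tile extent in some projected coordinate system (assumed)."""
    name: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, bbox):
        """True if the tile and bbox = (min_x, min_y, max_x, max_y) overlap.

        Strict comparisons are used, so merely touching edges do not count.
        """
        min_x, min_y, max_x, max_y = bbox
        return (
            self.min_x < max_x
            and self.max_x > min_x
            and self.min_y < max_y
            and self.max_y > min_y
        )


def select_bbox(file_table, tiles, bbox):
    """Return the file paths whose tile overlaps the bounding box."""
    return [
        filepath
        for tile_name, filepath in file_table
        if tiles[tile_name].intersects(bbox)
    ]


# Hypothetical 100 x 100 unit tiles laid out next to each other.
tiles = {
    "A": Tile("A", 0, 0, 100, 100),
    "B": Tile("B", 100, 0, 200, 100),
    "C": Tile("C", 0, 100, 100, 200),
}

# One (tile name, file path) pair per file; paths are made up.
file_table = [
    ("A", "/data/A/scene_1.tif"),
    ("B", "/data/B/scene_1.tif"),
    ("C", "/data/C/scene_1.tif"),
]

# The region of interest spans the border between tiles A and B.
region = (50, 10, 150, 90)
hits = select_bbox(file_table, tiles, region)
```

Because the region straddles two tiles, two files are selected; a subsequent read would have to mosaic their pixels, which is the kind of work the talk attributes to the layers below yeoda.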
In the end, you can read or write the data in different data formats. So if you want to dive into the world of yeoda, please check out the documentation, feel free to contribute on GitHub, or try some examples with the images we provide on Docker Hub. And if you don't get the QR codes, I also have some handouts here if you're interested. Thank you.
Thank you, Claudio.