Python: Winding Itself Around Datacubes

Formal Metadata

Title
Python: Winding Itself Around Datacubes
Subtitle
How to Access Massive Multi-Dimensional Arrays in a Pythonic Way
Title of Series
Number of Parts
611
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2017

Content Metadata

Subject Area
Genre
Abstract
While Python has developed into the lingua franca of data science, there is often a paradigm break when accessing specialized tools. In particular, for one of the core data categories in science and engineering, massive multi-dimensional arrays, out-of-memory solutions typically employ their own, different models. We discuss this situation using the example of the scalable open-source array engine rasdaman ("raster data manager"), which offers access to and processing of petascale multi-dimensional arrays through an SQL-style array query language, rasql. Such queries are executed in the server on a storage engine that uses adaptive array partitioning and on a processing engine that implements a "tile streaming" paradigm, which allows processing of arrays massively larger than server RAM. The rasdaman query language has acted as the blueprint for the forthcoming ISO Array SQL and for the Open Geospatial Consortium (OGC) geo analytics language, Web Coverage Processing Service, adopted in 2008. Not surprisingly, rasdaman is the OGC and INSPIRE Reference Implementation for their "Big Earth Data" standards suite.
Recently, rasdaman has been augmented with a Python interface that allows transparent interaction with the database (credits go to Siddharth Shukla's Master's thesis at Jacobs University). Programmers do not need to know the rasdaman query language, as the operators are silently transformed, through lazy evaluation, into queries. Arrays delivered are likewise automatically transformed into their Python representation.
The presenter is Principal Architect of rasdaman, editor of several "Big Data" standards, and co-chair of "Big Data" relevant working groups in several high-impact bodies. In the talk, the rasdaman concept will be illustrated with the help of large-scale real-life examples of operational satellite image and weather data services, and sample Python code will be demonstrated live.
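The lazy-evaluation idea mentioned in the abstract, where Python operators are silently turned into a query, can be illustrated with a minimal sketch. This is not the actual rasdaman Python interface: the class `LazyArray`, its methods, and the collection name `mr` are hypothetical names chosen for illustration, and the generated string only approximates rasql syntax. No database is contacted; the sketch only shows how operator overloading can accumulate an expression that is rendered as a query on demand.

```python
class LazyArray:
    """Hypothetical lazy expression node (an assumption, not the rasdaman API)."""

    def __init__(self, expr, collection):
        self.expr = expr              # query fragment built so far
        self.collection = collection  # name of the array collection

    @classmethod
    def from_collection(cls, name):
        # "c" is the alias used for the collection in the generated query
        return cls("c", name)

    def _binary(self, op, other):
        # Combine with another lazy array or a plain Python scalar
        rhs = other.expr if isinstance(other, LazyArray) else repr(other)
        return LazyArray(f"({self.expr} {op} {rhs})", self.collection)

    def __add__(self, other):
        return self._binary("+", other)

    def __mul__(self, other):
        return self._binary("*", other)

    def __getitem__(self, key):
        # Python slices exclude the upper bound; the sketch assumes the
        # query language uses inclusive "lo:hi" intervals, so subtract 1.
        slices = key if isinstance(key, tuple) else (key,)
        dims = ",".join(f"{s.start}:{s.stop - 1}" for s in slices)
        return LazyArray(f"{self.expr}[{dims}]", self.collection)

    def to_query(self):
        # Nothing is evaluated until the query text is requested
        return f"select {self.expr} from {self.collection} as c"


# Build an expression with ordinary Python operators, then render it
subset = LazyArray.from_collection("mr")[0:100, 0:100]
print((subset * 2 + 1).to_query())
```

In a real client, the rendered query would be sent to the server only when the result is needed, and the returned array would then be converted into a Python object.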