Scientific Data in the Cloud - Oct 18

Australian Research Data Commons (ARDC)

Readey, John

Formal Metadata

Title

Subtitle

HDF5 in the Cloud

Series Title

Tech Talk

Number of Parts

Author

Readey, John

License

CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and copy, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.

Identifiers

10.5446/42938 (DOI)

Publisher

Australian Research Data Commons (ARDC)

Release Year

2018

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Webinar/Tutorial

Abstract

Processing Structured Scientific Data in the Cloud

The HDF5 file format has been used extensively in the HPC community for storing scientific data (e.g. multi-dimensional arrays). Unfortunately, the traditional HDF5 library doesn't work well for applications running in the cloud. To address this, we've developed a service-based implementation of HDF5, HDF Kita. Kita uses object-based storage (e.g. AWS S3) and runs as a cluster of Docker containers. In combination with the service, JupyterHub enables users to easily run notebooks in the cloud that can use an unlimited amount of data and take advantage of the parallelization capabilities of the Kita server.