
Scalable geospatial processing using dask and mapchete


Formal Metadata

Title
Scalable geospatial processing using dask and mapchete
Number of Parts
156
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Dask is a flexible parallel computing library that seamlessly integrates with popular Python data science tools. With its task graph and parallel computation capabilities, Dask excels at managing large-scale computations both on the local machine and on a computing cluster. Mapchete, an open-source Python library, specializes in parallelizing geospatial raster and vector processing tasks. Its strengths lie in its ability to efficiently tile and process geospatial data, making it a valuable asset for handling vast datasets such as satellite imagery, elevation models, and land cover classifications. This talk delves into the integration of these two technologies, showcasing how their combined capabilities can be used to conduct large-scale processing of geospatial data. It will also show how we at EOX are currently deploying our infrastructure and which challenges we face when using it to process the cloudless satellite mosaics under the EOxCloudless product umbrella.
Transcript: English (auto-generated)
Who of you has already used Dask somewhere? Tried it out, yeah? And two of you have used mapchete? OK, thanks, Tina. Good. So, a short introduction. I am from a company called EOX. We are based in Vienna. That's a larger town next to the Danube river.
My team mainly works on processing large archives of satellite imagery, mainly Sentinel-2. So we are producing exploration-ready products out of it, and cloudless mosaics. Who of you has already heard of the Sentinel-2 cloudless layer?
OK, yeah, thanks. For those of you who didn't know it, you can go to this website, s2maps.eu. There you can view the global 10-meter cloudless mosaic of Sentinel-2.
And you can also use the WMS under non-commercial terms. So you can load it into QGIS and use it as an additional layer, for example. So, what is mapchete? mapchete is an open-source tool written in Python. You can find it on GitHub.
But what is it exactly? Mainly, it's a processing engine, first and foremost. So it runs custom processes on potentially very large geospatial data. And it's also a set of command line tools that help you to achieve this. How does it work? It's very simple. We are using the WMTS tile matrix system
to chop up the space into smaller chunks and process each of the chunks individually. So there's not much magic in there. But yeah, it's a very powerful system. And what's also important, it allows you to save the processing recipe. So once you have a processing output,
and you have the configuration and the code next to it, you can always reproduce what you have processed, which can be quite useful at times. Why is this useful? Well, large or even global data cannot be processed at once. I mean, those of you who deal with raster imagery have probably already had the situation,
more than once, that your laptop feels too small for the data you're going to convert or process. So mapchete should help in achieving large-scale processing. And yeah, all the processing steps and recipes are preserved.
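To make the tiling idea above concrete, here is a small, self-contained sketch of such a tile pyramid. It assumes the global geodetic grid (two tiles by one tile at zoom 0), which is one of the grids mapchete supports; the function names are illustrative and not mapchete's API.

```python
# WMTS-style pyramid: the world is chopped into independently processable
# chunks. Assumes the global geodetic grid (2 x 1 tiles at zoom 0).

def grid_shape(zoom):
    """(rows, cols) of the geodetic tile pyramid at a given zoom level."""
    return 2 ** zoom, 2 ** (zoom + 1)

def tile_count(zoom):
    rows, cols = grid_shape(zoom)
    return rows * cols

def tile_bounds(zoom, row, col):
    """(west, south, east, north) in degrees for one tile."""
    rows, cols = grid_shape(zoom)
    tile_size = 360.0 / cols  # tiles are square in degrees on this grid
    west = -180.0 + col * tile_size
    north = 90.0 - row * tile_size
    return west, north - tile_size, west + tile_size, north
```

Each tile can then be handed to a worker on its own, which is exactly what makes the approach scale.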
So what do we need for this? First and foremost, we need, of course, a process. For example, a hillshade would be a process. And we need a configuration for it. That's basically what points to the process, to the inputs, and to the output format. Very simple. How does a process look? I hope you can see it.
I wasn't able to make it any larger. It should be written in Python. And it can be either a Python file or a module, with the only constraint that it has to have a function called execute. And this execute function should have some input arguments and some keyword arguments.
With that, mapchete will use the configuration and map whatever you define in the configuration to your processing function. And within the function, you can do whatever you want. So you can use familiar tools like NumPy if you're dealing with raster data, or Shapely if you're dealing with vector data.
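As an illustration of the execute convention just described, here is a schematic, self-contained process file. A real mapchete process receives mapchete objects and reads its inputs according to the configuration; plain nested lists stand in for raster data here, and the z_factor parameter is a made-up example.

```python
# Schematic mapchete-style process file. mapchete looks for a function
# named `execute`; input arguments and keyword arguments are mapped from
# the configuration. Nested lists stand in for the raster data a real
# process would read via mapchete.

def execute(dem, z_factor=1.0):
    """Toy process: vertically exaggerate an elevation tile."""
    return [[value * z_factor for value in row] for row in dem]
```

Inside the function you are free to use NumPy, Shapely, or whatever else, as the talk says.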
So it's completely up to you what happens inside. The configuration is also very simple. We are using the YAML syntax. And the only things you have to define are, of course, the process itself, the inputs, the outputs, then some additional data if you want,
and the processing parameters. And with these two things, the process output is perfectly reproducible. In addition to this, what do we need? We need a set of commands, of course. So mapchete ships with a set of command line tools which will help you.
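A configuration in roughly the shape described above might look like this. The key names follow the general structure mentioned in the talk but should be checked against the mapchete documentation; paths and values are placeholders.

```yaml
# Illustrative mapchete configuration; check the docs for the exact schema.
process: hillshade.py          # the file containing execute()
pyramid:
  grid: geodetic               # the WMTS-style tile pyramid to use
zoom_levels:
  min: 0
  max: 12
input:
  dem: path/to/elevation.tif
output:
  format: GTiff
  path: output/
  bands: 1
  dtype: uint8
z_factor: 2.0                  # custom parameter, mapped to execute()
```

Keeping this file next to the process code is what makes the output reproducible.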
The most important one, of course, is execute. If you're finished with your process, you can simply run mapchete execute and point it to the process file. And then you have some additional flags. So if you just want to do a subset of what you're intending to do, then you can add the bounds parameter, for example,
or only process certain zoom levels. Data formats: internally, we are using Rasterio and Fiona. They're both based on GDAL and OGR. So basically everything that can be read by them can be read by mapchete. But we also have some special extensions.
For example, we can read and write Zarr archives. And we have an extension called mapchete EO which allows us to read from satellite archives. The default format is called tile directory. It's a bit of a special one,
because when you're producing a large output that wouldn't fit into one GeoTIFF anymore, you have to find a solution for how to achieve this. And again, here the WMTS system comes to the rescue. So it's basically a tile directory. It's very much like a map cache you would have, but with the one exception
that you're not restricted to using PNGs or JPEGs; you can use GeoTIFFs, for example, or also a vector format like FlatGeobuf. And with GeoTIFF, you have the advantage that you're not restricted to the eight-bit data range; you can use whatever you want, so floats as well.
And you can store global high-resolution output with this. It's also really nice because you can do regional updates. And it allows you, of course, to write and update in parallel, which is important for scalability. So if you have a lot of workers writing to the same output, then you also want the output format
to be able to handle this. If you had a single file, you would have to find some locking mechanism and then have the workers write in sequence, which would not be that efficient. For this tile directory, we worked on a STAC extension.
It's called the tiled-assets extension. And basically, it replaces the URL to each single TIFF with a schema. So you find it like here: you have zooms, rows, and columns.
And since GDAL 3.6, I believe, we also have a GDAL driver. So you can imagine that this STAC JSON, which is already a STAC item, is like a VRT on steroids. If you were using VRTs: a VRT is an XML GDAL format which points to every TIFF file
you want to add to your VRT. The STAC file is more efficient because it doesn't store the single parts, but only the schema to the parts. And because it's a GDAL driver, it can be loaded into QGIS, and even from S3.
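The core idea of the tiled-assets extension can be shown in a few lines: one href template stands in for thousands of per-tile URLs. The template variable names below follow the WMTS URL-template convention; treat the exact spelling as illustrative and check the extension specification.

```python
# One href *template* replaces the list of every single tile URL; a reader
# expands it per tile. Variable names follow the WMTS URL-template style.

def resolve_tile_href(template, tile_matrix, row, col):
    return (template
            .replace("{TileMatrix}", str(tile_matrix))
            .replace("{TileRow}", str(row))
            .replace("{TileCol}", str(col)))

template = "s3://my-bucket/mosaic/{TileMatrix}/{TileRow}/{TileCol}.tif"
```

This is why the STAC file stays small no matter how many tiles the output has.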
So, for example, you would see it like this in QGIS. Let's move on to parallelization. mapchete internally creates tasks and then decides on how to process these tasks. The most simple case would be
no parallelization, of course, so it runs sequentially. It can also do it in parallel using Python's concurrent.futures and its multiprocessing capability. And in addition, it can use Dask. So, what is Dask?
For those of you who don't know it yet, Dask is a Python library for parallel and distributed computing. It's very nice because it replicates the concurrent.futures API, so you can have both of them in your code in parallel, and it abstracts things really nicely.
Also, a nice feature of Dask is that you can have task graphs. So, for example, your tasks can depend on each other, which also helps if you're processing a lot of things. Dask will then help you to efficiently decide which tasks should be done when and where.
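A Dask task graph is, at its core, a mapping from keys to values or to (function, *dependencies) tuples. This toy evaluator, written without Dask itself, shows the idea with a pyramid-style dependency where two base tiles feed one overview tile; a real Dask scheduler additionally decides where and when each task runs.

```python
# Toy Dask-style task graph: each key maps to either a plain value or a
# (function, *dependency_keys) tuple. This mini-evaluator just resolves
# dependencies recursively.

def get(graph, key):
    task = graph[key]
    if isinstance(task, tuple) and callable(task[0]):
        func, *deps = task
        return func(*(get(graph, dep) for dep in deps))
    return task

# Tile-pyramid flavour: two base tiles feed one overview tile.
graph = {
    "tile-a": [[1, 1], [1, 1]],
    "tile-b": [[2, 2], [2, 2]],
    "overview": (lambda a, b: sum(map(sum, a + b)) / 8, "tile-a", "tile-b"),
}
```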
And if you look at this Dask graph, it should remind you of something. Yes, exactly, it's a tile pyramid again. So, if you're building a large tile pyramid with a lot of overviews, then you can use the task graphs, for example. That comes in really handy. So, how do we integrate it? As I told you before, we have these three versions:
sequential, concurrent.futures, and Dask. Internally in the code, we of course abstract it away using executor classes. To go into detail, the Dask executor wraps around Dask and handles all the task execution.
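The executor abstraction can be sketched with only the standard library, since Dask deliberately mirrors the concurrent.futures API; a Dask-backed executor would look the same from the caller's side. Class and method names here are illustrative, not mapchete's actual classes.

```python
# One interface, interchangeable backends: the caller does not care whether
# tiles are processed sequentially or in parallel. Only stdlib
# concurrent.futures is used; a Dask executor would slot in the same way.

from concurrent.futures import ThreadPoolExecutor

class SequentialExecutor:
    def map(self, func, items):
        return [func(item) for item in items]

class ConcurrentExecutor:
    def map(self, func, items):
        with ThreadPoolExecutor() as pool:
            return list(pool.map(func, items))

def process_tiles(executor, tiles):
    # Stand-in "process": double each tile value.
    return executor.map(lambda tile: tile * 2, tiles)
```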
Internally, Dask will then connect to a service called the Dask scheduler. It can be on your local machine, in another process, or it can be an external service. We are using mapchete internally, deployed as a service,
and we call this mapchete Hub. So, mapchete Hub is basically the mapchete processing engine at the core, and we are adding a deployment infrastructure around it. This allows us to do asynchronous processing, because you can imagine, if you do a large-scale thing, it will not finish early, so you would need to wait for some time,
for a couple of hours, for example. So, this is why asynchronicity is very important. We use it for EOxCloudless and EOxMaps, and also for some ugly use cases. And the interface we implemented is OGC API Processes; I say that with a caveat, because we did it like two years ago,
and the specification evolved, so we have to review that eventually. But I would say it's like 95% OGC API Processes. So, this is how it works. We have mapchete Hub; at the core,
it has the processing engine, and then a Dask client. It's connected via a REST interface, we're using FastAPI for this, and we deploy it on Kubernetes. In Kubernetes, we have a Dask Gateway, a package maintained by the Dask community,
and through the Dask Gateway, the client can request a Dask cluster. And it works this way: the user, or we ourselves, posts the job. mapchete Hub then looks at the job configuration, the inputs and the outputs, and determines the tasks
it has to do. It internally builds either a Dask graph, if dependencies are required, or applies task streaming. And then, when it's finished with that, it requests a Dask cluster, and Kubernetes will handle all the autoscaling. So, if you have a small job, it will just request one worker; if you have a large job,
it will request whatever you set as the maximum. And of course, it then sends the tasks to the cluster and waits until they're ready. In the meanwhile, it tracks the progress and stores it in a MongoDB in the background. So basically, you post a job, you get an ID back, and then you can poll the job,
and it will tell you the percentage of progress. And in the end, it cleans up and is finished, in the best case. But of course, the best case doesn't always happen. We have been using Dask for three years now; before that, we had a system using Celery.
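The post-a-job, poll-by-ID flow described above reduces to a simple pattern: posting returns an ID immediately, and progress is polled against a status store (MongoDB in the talk, a plain dict here). This is purely illustrative and not mapchete Hub's actual interface.

```python
# Asynchronous job pattern: post a job, get an ID back right away, poll
# for progress while workers report completed tasks into a status store.

import uuid

class JobStore:
    def __init__(self):
        self.jobs = {}

    def post_job(self, total_tasks):
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"done": 0, "total": total_tasks}
        return job_id  # returned immediately; the work happens elsewhere

    def report_done(self, job_id, count=1):
        self.jobs[job_id]["done"] += count

    def progress(self, job_id):
        job = self.jobs[job_id]
        return 100.0 * job["done"] / job["total"]
```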
We really had a steep learning curve and went through a lot of pain in order to get the output ready, because we struggled with a lot of random exceptions. Dask is really not great at providing you useful information about what went wrong. So, it sometimes took us a long time
to figure out what went wrong. What we found out: we had a lot of connection errors, things we cannot do much about, because it's deployed on AWS. And we also got random workers killed; eventually we found out, oops, they were running out of memory. So, yeah, we learned through the years
how we could massage everything so it goes as smoothly as possible. We also found out that large task graphs would cause the cluster to fail. And, very importantly, the memory consumption. So, when you write a mapchete process,
you should keep in mind that it runs on some machine, and if you're initializing a large array in, I don't know, float32, then this machine may break, and Dask may know nothing about it and throw you a random error. So, it's not intelligent enough to point you to the fact that, hey, you're using too much memory. It just says: hey, I've killed the worker because something went wrong.
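The speaker's warning suggests a back-of-envelope check before allocating big arrays in a process: estimate the memory up front rather than letting Dask kill the worker. A minimal sketch, with dtype sizes hard-coded:

```python
# Estimate array memory before allocating it in a worker. A float32 raster
# covering the whole world at 10 m resolution is tens of TiB, which is
# exactly why the data has to be tiled in the first place.

DTYPE_BYTES = {"uint8": 1, "uint16": 2, "float32": 4, "float64": 8}

def array_gib(width, height, bands=1, dtype="float32"):
    return width * height * bands * DTYPE_BYTES[dtype] / 2 ** 30
```

A single 1024x1024 float32 tile is a harmless 4 MiB; the global 10 m raster (roughly 4,000,000 x 2,000,000 pixels) is not.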
It's really annoying, I can tell you. But if everything runs smoothly, then you really have a nice system where you can do large things and great things. So, here you see a Grafana dashboard
with the hardware consumption for our cloudless mosaics. And a nice feature of Dask is also that every Dask scheduler ships with a dashboard. So, if you have the URL of the dashboard, you can log into it and see what's happening on the cluster.
And in the best case, you end up with a cloudless mosaic, in this case of Estonia. So, I was faster than expected because I'm already finished. I would like to point out that we are also having a talk about MuServer later in the afternoon in this very room.
So, thank you very much. All right, thank you for your talk. Now we have the chance to ask some questions.
Thank you very much. I'm interested in these connection errors. We have similar problems as well. Can you give some insight into what they're due to?
Is it due to connections between the file server and the processing node, or what is it about in your case? If only we knew. I mean, in some cases it was, for example, that the task graph was too large
and the specs of the scheduler were too narrow; then the scheduler would simply become unresponsive and you get a connection error. The same in some cases with the workers. As you can imagine, we are reading a lot of data. So, like Andrea showed you before, a lot of Sentinel products at the same time.
So, we have a lot of I/O strain on the workers. This could also cause the workers to become unresponsive. The task scheduler, of course, always checks on the workers and sends out pings: hey, are you alive? And if they don't respond, then yeah. There are some internal retry mechanisms, and we played around with that, but eventually it could always happen.
So, if I understand well, it's rather between the scheduler and the processing nodes that this connection is lost, right? Yeah, or from the client to the scheduler as well, if the scheduler dies. So, basically every part of the whole system can fail,
and if one essential part of the system fails, like the scheduler, then your job is gone and there's nothing you can do about it except retry. But we also managed to max out the AWS S3 request rate sometimes, because when you have like 100 or 200 workers
writing at the same time to S3, then even Amazon crumbles.
Yeah, thanks for the great presentation. You mentioned at the beginning that with mapchete you can do raster and vector data. I'm just wondering if you could give an example of how you used it for a large vector data set. You mentioned Shapely, so would it be things like intersections, buffers, generalizations? Yeah, for our maps, I mean, it's already a couple of years back,
we used it to generate global contour lines, for example. So, we had global, what was it, 30-meter elevation data, and we used this tile-based approach to extract the contour lines and write them into a PostGIS database afterwards. But yeah, of course, the vector capabilities
are a little bit limited due to the fact that you're processing tiles. So, if you have features larger than a tile, they get split up, and this is not always what you want.
Yeah, thank you for this. At the moment, I use, for example, hillshading with gdaldem hillshade, but I saw your example. Do you basically support all the parameters that we can use in gdaldem hillshade? Well, we don't have access
to the internal gdaldem hillshade implementation, so I used a Python-based approach. Are you referring to the three parameters you need? Yeah, of course, there are the ordinary parameters like azimuth, but I also use some other parameters,
like multidirectional and compute edges. Well, you can call the hillshade function multiple times with multiple parameters. Okay. You would have to write a custom process for this, but yeah, it's definitely possible. We do the same; we have a custom hillshade on our maps as well,
and I was using the same approach. Thank you, we'll try it out. Yeah.
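The workaround suggested in the answer, calling a single-direction hillshade several times and averaging the results, can be sketched as follows. The hillshade function here is a stand-in stub so the combination logic is runnable; a real process would plug in an actual hillshade implementation (e.g. in NumPy).

```python
# Approximate a multidirectional hillshade by averaging several
# single-direction hillshades, as suggested in the Q&A.

def hillshade(dem, azimuth, altitude=45.0):
    # Stub: a real implementation would compute illumination from slope
    # and aspect; here we just scale by azimuth so the combination is
    # testable without any raster libraries.
    return [[cell * (azimuth / 360.0) for cell in row] for row in dem]

def multidirectional_hillshade(dem, azimuths=(225, 270, 315, 360)):
    shades = [hillshade(dem, az) for az in azimuths]
    n = len(shades)
    return [
        [sum(shade[r][c] for shade in shades) / n for c in range(len(dem[0]))]
        for r in range(len(dem))
    ]
```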
This tile-based approach for data processing seems really interesting, but for me, there's always the question: is there anything to facilitate the development of algorithms and data processing pipelines when you have something that requires
not only local pixel information, but also the neighborhood around it, or perhaps some global things that need to be calculated on the whole image or some larger region, so that you have spillover of data from one tile to another just to get the result in a single tile? Yeah, that's an excellent question.
I mean, the hillshade example is a good example for this. So yeah, we have, I cannot show you, a parameter called pixelbuffer that you can define when you define your process pyramid, and it would add a buffer of that size to every process tile and clip it away afterwards.
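The pixelbuffer mechanism in miniature: run the neighborhood-dependent operation on the tile plus a margin of extra pixels, then strip the margin so only the tile itself is written. Names are illustrative; in mapchete you set pixelbuffer on the process pyramid as described above.

```python
# Read the tile plus a margin of neighbouring pixels, apply the operation,
# then clip the margin away so only the tile itself is kept.

def with_pixelbuffer(padded_tile, buffer, operation):
    """Apply `operation` to a padded tile, then strip `buffer` px per edge."""
    result = operation(padded_tile)
    return [row[buffer:-buffer] for row in result[buffer:-buffer]]
```

This is enough for neighborhood operations like hillshading; truly global statistics still need a separate pass, as the answer notes.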
I mean, you can use it within reason. Getting the whole image information? Theoretically, you have the path of the image somewhere, and you would have to open it in Rasterio and then get the information out of it. But the near neighborhood, yeah, it's there. Creating the hillshade
was the first use case for this tool, so having the pixelbuffer was also one of the first features. I would have a question. I found it interesting that you used OGC API Processes.
So just out of interest, are you satisfied with the standard? And the things you did differently, would you suggest adding them to the standard? We added one thing for us, which was the posting of processes. The ideas for that were there like two years ago,
but there was no specification for it. I heard that, meanwhile, there is one; I have to review this. But this was the major deviation we had from the standard. All right.
Okay, then thanks again, Joachim. A warm applause for him. Thanks. Thanks.