
Don't Copy Data! Instead, Share it at Web-Scale


Formal Metadata

Title
Don't Copy Data! Instead, Share it at Web-Scale
Title of Series
Number of Parts
183
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor, and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers
Publisher
Release Date
Language
Producer
Production Year: 2015
Production Place: Seoul, South Korea

Content Metadata

Subject Area
Genre
Abstract
Since its start in 2006, Amazon Web Services has grown to over 40 different services. Amazon Simple Storage Service (S3), our object store and one of our first services, is now home to trillions of objects and core to many enterprise applications. S3 is used to store many kinds of data, including geo, genomic, and video data, and facilitates parallel access to big data. Netflix considers S3 the source of truth for all its data warehousing. The goal of this presentation is to illustrate best practice for open or shared geo-data in the cloud. To do so, it showcases a simple map tiling architecture, running on top of data stored in S3, and uses CloudFront (CDN), Elastic Beanstalk (Application Management), and EC2 (Compute) in combination with FOSS4G tools. The demo uses the USDA's NAIP dataset (48TB), plus other higher resolution city data, to show how you can build global mapping services without pre-rendering tiles. Because the GeoTIFFs are stored in a requester-pays S3 bucket, anyone with an AWS account has immediate access to the source GeoTIFFs at the infrastructure level, allowing for parallel access by other systems and, if necessary, bulk export. However, I will show that the cloud, because it supports both highly available and flexible compute, makes it unnecessary to move data, pointing to a new paradigm, made possible by cloud computing, where one set of GeoTIFFs can act as an authoritative source for any number of users.
Transcript: English (auto-generated)
Can everybody see this? We're having some small non-cloud-related technical difficulty. This is all on-prem, does that make sense?
So I'm going to go ahead and get started, it's 11 o'clock. My name is Mark Korver, I'm with Amazon Web Services, I'm part of the Solution Architecture team on the public sector side of Amazon, that means I work with our government customers and our education customers, and I am the geospatial lead on the Solution Architecture
specialist team. So I'll talk for about 20 minutes today, or maybe 15 minutes, I have Kevin here who's going to keep an eye on the clock, and then Kevin Bullock from DigitalGlobe will
talk after me right on schedule, we're going to keep going here. So I hope you can see this, I'm sorry it's a little bit small. My message today is very simple, and as you can see in the title, it's about how we shouldn't be copying data, and instead we should be sharing data, especially if it's open data,
and because of the cloud we can share it at any scale we want. And that's what I call one of the very different architectural possibilities that the cloud affords versus on-prem deployment, especially of big geo data, okay, I don't know what
happened there, I'm going to close that. So I just want to cover, I will spend most of my time just quickly showing a demo, I'm going to race through this, and so I won't bore you with slides too much.
So just a couple of review points, so data copy is expensive, storage costs, we have network costs, compute costs, and then if we follow kind of the old world principles of what we call, at least in the U.S., clip and ship model, which means you go to some portal, you look at some catalog, you discover the data, having discovered
the data, you typically download the data, and then work with the data, then the cost to distribute, update the distributed copies becomes expensive. So there's all these costs that we have to deal with on a day-to-day basis in the kind of traditional clip and ship model of big geo data, and generally the idea is
if the data gets large enough, then you can't get it all, so you have to have some way to go get some small piece of it, download it to your on-prem workstation or to your server or to your notebook, and then work with it there. Hopefully when you're working with it, we're using something like QGIS and some
open source tools, but you still have to go through this kind of mini ETL process about getting that data. So, as you all know, we live in a, or we have been living in a world of silos. This is a slide that I borrowed from Stanford University's library website.
And it's a very simple idea, right? We have all these data centers run by vendors, run by our government customers, and they all have all kinds of interesting data at the bottom level, but they're generally siloed. So they're generally siloed for security reasons, they're siloed for economic reasons,
and the larger that data gets at the bottom of that map there, or that image, or the diagram, the harder it is to get the data out from that silo for a variety of reasons. And one of the key points here is, as one of the largest providers of cloud services
in the world now, we're seeing a huge migration of customers moving from on-prem facility to the cloud. And generally, what's happening is that we see silos moving to cloud, but they maintain siloed architecture.
And so, especially in geospatial, we have a lot of customers that are running core systems on us now, and increasingly a number of customers that I see, that I'm talking to every day, that have exactly the same data stored in their cloud, right next to somebody else's
cloud architecture. So we see that as a bug, if it's not, you know, if there aren't license considerations and if it's open data, then those customers should probably be sharing one copy of the data. And that's generally my talk today, and I'll show you a practical example of how you can do that.
So what makes cloud storage different? Well, it's not siloed in a data center. And you can provision, in real time, very, very granular access to exactly that one GeoTIFF, exactly that one LAS file, whatever you want, via, you know, simple kind of static
methods or federated methods. It's your choice. There's a lot of flexibility there. And then the last point, which I can't emphasize enough, is because you're in the cloud, because you're not on-prem, you can offload the variable component of cost, which is network out. So network egress, you can offload to whoever is making the request for the data.
So what remains? Well, somebody still has to pay for storage, but for example, with the particular storage service that I'll be showing today, which is called Simple Storage Service, you only pay for what you actually store. So you don't have to do things like, you know, I might use two terabytes this year,
so I'm gonna get two terabytes of NAS storage. You just store what you need today, we charge you for what you have stored this month. We actually prorate it on a daily basis. So what's possible with cloud architecture is that you can now store what you typically
would have on POSIX file systems, on some file system deep down in your network, deep down in your data center. You can share that storage to any number of actors that you want, because it's not
your problem. It's our problem to make that data available via the network. So all you have to worry about is allowing network access at the object level. And if you had to pay for network out, then it would be a problem, but you can actually offload the network egress portion to the requester.
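The egress offload described here is what S3 calls Requester Pays. A minimal sketch of reading from such a bucket with boto3 (assumed installed) — the bucket and key names are illustrative placeholders, not the talk's exact ones:

```python
# Sketch: reading from a Requester Pays bucket with boto3.
# Passing RequestPayer="requester" acknowledges that the *caller's*
# account is billed for the request and the data egress.

REQUESTER_PAYS_ARGS = {"RequestPayer": "requester"}

def fetch_geotiff(bucket: str, key: str) -> bytes:
    """Download one object; egress charges go to the requesting account."""
    import boto3  # imported here so the sketch stays importable without the SDK
    s3 = boto3.client("s3")
    resp = s3.get_object(Bucket=bucket, Key=key, **REQUESTER_PAYS_ARGS)
    return resp["Body"].read()
```

Without that parameter, S3 refuses the request on a Requester Pays bucket, which is exactly the protection against "somebody else's bill" the talk describes.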
So here we have something called Simple Storage Service. Here we have many actors. So these could be virtual machines, this could be our Lambda service, this could be our managed Hadoop cluster, anything, right?
Whatever you want to run, whatever code you want to run. And this is your account. You pay for storage, but for example if you want to let other actors from other accounts access the storage, it's just a matter of setting access control lists for whatever data objects you want in here. So generally the idea is you have infinite network, horizontal network access here.
So there can be any number of actors horizontally on top of your data, and that you could not do if it was in your data center.
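Setting the access control list on an object, as just described, is a one-call operation. A hedged sketch with boto3 — the canned ACL shown matches the authenticated-read setting demonstrated later in the talk; bucket and key names are placeholders:

```python
# Sketch: opening one object to other AWS accounts by setting its ACL.
# "authenticated-read" lets any authenticated AWS account read the object,
# while the bucket owner keeps full control.

CANNED_ACL = "authenticated-read"

def share_object(bucket: str, key: str) -> None:
    """Grant read access on a single object to all authenticated AWS users."""
    import boto3  # kept inside the function so the sketch imports cleanly
    s3 = boto3.client("s3")
    s3.put_object_acl(Bucket=bucket, Key=key, ACL=CANNED_ACL)
```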
So it's a very simple concept. It's not a file system. All it is, and it's not FTP, all it is is HTTP. That's all I'm talking about, right? So in a sense we're going back to kind of HTML 1.0 days and talking about using object stores rather than file systems.
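To underline the "it's just HTTP" point: a public S3 object is reachable with nothing but a standard-library HTTP client. A sketch, with a placeholder bucket name:

```python
# S3 is "just HTTP": a public object answers a plain HTTP GET.
# The helper builds the virtual-hosted-style URL; no SDK, no FTP.

from urllib.request import urlopen

def object_url(bucket: str, key: str) -> str:
    """Virtual-hosted-style URL for an S3 object."""
    return f"https://{bucket}.s3.amazonaws.com/{key}"

def fetch_public(bucket: str, key: str) -> bytes:
    """Plain HTTP GET of a publicly readable object."""
    with urlopen(object_url(bucket, key)) as resp:
        return resp.read()
```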
And the one comment I like to make here is that, so I've worked with a lot of customers, not just geospatial customers, but customers in an education space, customers doing genomic studies, customers doing Alzheimer's brain research, customers doing pharmaceutical research,
et cetera, et cetera. The larger the system is, the more kind of embarrassingly parallel compute the system is, the more the core infrastructure relies on simple storage service S3. In fact, Netflix has a famous comment where they say, they see the object store as their
source of truth. They actually treat the object store more like a database than just an object store. So I'm going to stop there. I will turn this thing off, and I'm going to jump over to a browser, and I need to
make this smaller, move it over a bit, and I'm going to show you a very simple demo. So here, excuse me, let me reload this. So all this is is Leaflet. All it is is base layers, right? So open source JavaScript library, I'm not doing anything here other than image tiles.
And the idea here, and it operates just like you'd expect, I'm going to go from, this is the city of Oakland's data, and I'm going to the NAIP data, which is the United States Department of Agriculture data, USDA NAIP data.
So this is a coast-to-coast, what we call a CONUS set, a one-meter-per-pixel data set, very well known in the United States. There are no copyright restrictions. I can download it, play with it, do anything I want with it. I could try to sell it to you, but you shouldn't buy it because it's free, that
kind of thing. And also, if I move this thing, you'll see that my demo's actually working, and if I really do have a network connection, you see the gray tiles coming in. All that's going on is that, based off of, right now it's a 75 terabyte set, it's
in real time, re-projecting a bunch of GeoTIFFs and creating JPEGs on the fly. So this is a real-time map tiling architecture that's using MapServer and GDAL on some Ubuntu instances in the background, and then tiling this data in real time.
Whoops, I didn't mean to zoom in. And I'll show you what's happening on the back end by opening up Firebug, and move this over a bit, and you can see that as I move this thing, it's first going to this thing called naptms.s3, so it's going to the S3 bucket to see whether the JPEG
exists or not. If the JPEG does not exist, you can see that it's doing a redirect to the tiler, which is running on one of our platform services called Elastic Beanstalk. And we can open this guy up and pop it into a new tab, and it does exactly what
you'd expect. It creates a little JPEG. But it's doing this by sourcing somewhere between 218,000 and 219,000 GeoTIFF files that are sitting in S3. They are not on EBS, they are not on our new Elastic File System, they are shared
across n number of virtual machines on S3. And so I'm using parts that I've actually been using for many years, open source parts. I'm doing maybe a couple paragraphs of code to deploy this in what I call a cloudy fashion.
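The check-then-redirect tile flow just described — ask the S3 bucket for the cached JPEG, and fall back to the tiler on a miss — can be sketched roughly like this; the host names are invented placeholders, not the demo's real endpoints:

```python
# Sketch of the tile routing the demo performs: a hit is served straight
# from the S3 bucket, a miss is redirected to the tiler app that renders
# (and caches) the JPEG from the source GeoTIFFs.

def tile_key(z: int, x: int, y: int) -> str:
    """Standard z/x/y key under which a rendered tile is cached."""
    return f"{z}/{x}/{y}.jpg"

def route_tile(z: int, x: int, y: int, cached: set) -> str:
    """Return the URL a tile request should resolve to."""
    key = tile_key(z, x, y)
    if key in cached:  # cache hit: serve the static JPEG from the bucket
        return f"https://tiles.example.s3.amazonaws.com/{key}"
    # cache miss: redirect to the tiler running behind the load balancer
    return f"https://tiler.example.com/render/{key}"
```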
For example, if I can get this to work, I'm just putting it into debug mode, and
you can see all that it's doing is taking this TMS name, JPEG, and then rearranging that into a WMS request right here. You can see, for example, that it's running in US East, it's on Amazon Web Services elastic load balancer, behind which I can have any number of EC2 instances I want.
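The TMS-to-WMS rewrite described here amounts to turning a z/x/y tile address into an EPSG:3857 bounding box. A sketch of that math, with a placeholder WMS endpoint:

```python
# Sketch: converting a TMS tile address into the WMS GetMap bounding box
# the tiler requests. EPSG:3857 (Web Mercator) spans +/- ORIGIN metres.

ORIGIN = 20037508.342789244  # half the Web Mercator world width, in metres

def tms_to_bbox(z: int, x: int, y: int):
    """TMS tile (y counted from the south) -> (minx, miny, maxx, maxy)."""
    tile = 2 * ORIGIN / (2 ** z)  # side length of one tile at this zoom
    minx = -ORIGIN + x * tile
    miny = -ORIGIN + y * tile
    return (minx, miny, minx + tile, miny + tile)

def wms_url(z: int, x: int, y: int) -> str:
    """Build the WMS GetMap request for one 256x256 tile."""
    bbox = ",".join(f"{v:.6f}" for v in tms_to_bbox(z, x, y))
    return ("https://tiler.example.com/wms?SERVICE=WMS&REQUEST=GetMap"
            f"&SRS=EPSG:3857&WIDTH=256&HEIGHT=256&BBOX={bbox}")
```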
And if I show you that part, I can go to the console, and you can see I have a couple of, I think four, three, three, four extra larges. These are virtual machines running in the cloud, and I can modulate the scale of that
just by going to the auto-scaling part of the console, finding my MapServer group, and hitting the edit button, and for example, I change these to 10, et cetera, and
if I remember to hit the save button, then within a few minutes, about two, three minutes, I'll have 10 Ubuntu instances running MapServer and GDAL, and I didn't have to do any ETL work around the 50 terabytes of data. That's all been, it's all embedded in the Amazon Machine Image that I'm running,
which I'm, by the way, happy to share with anybody in the room, okay? Now, remember, this is what we're pretty used to seeing with the number of large commercial search engine portals now; this kind of slippy map concept has been around since, what, 2005?
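The console steps just described — find the Auto Scaling group, edit the desired count, save — can also be scripted. A hedged sketch with boto3; the group name and bounds are placeholders:

```python
# Sketch: resizing the tiler fleet programmatically instead of through
# the console. Changing DesiredCapacity launches or terminates instances
# within the group's min/max bounds.

def clamp_capacity(desired: int, lo: int, hi: int) -> int:
    """Keep the requested fleet size inside the group's min/max bounds."""
    return max(lo, min(hi, desired))

def scale_tilers(group: str, desired: int, lo: int = 1, hi: int = 20) -> None:
    """Ask the Auto Scaling group for `desired` MapServer instances."""
    import boto3  # kept inside the function so the sketch imports cleanly
    autoscaling = boto3.client("autoscaling")
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group,
        DesiredCapacity=clamp_capacity(desired, lo, hi),
    )
```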
The idea here is anybody with a credit card can deploy this national, if not global, back end, maybe not for the whole year with a lot of machines, but you can most definitely do it for a few hours just to play with this, right? That's within your individual researcher's scope now, which is very different from
if you did this in an on-prem environment, and the reason is, very simple, it's not your data, it's shared, and so now I'm putting up, let me make this smaller, this is a vendor-provided tool, the vendor is CloudBerry, so I'm running on Windows here, so that's
probably one of the better Windows tools for this. It's an S3 client, so now I'm looking, not the browser, but using a client dedicated to S3 and a couple of other things, and I'm looking, I'm gonna go look for the data that I'm using under the hood for those JPEG images. So that, on the right-hand side, is our public data account, and in the public data
account, there's all kinds of data, right? Behavioral sciences data, genomic data, Alzheimer's research data, et cetera, et cetera, and part of that, the data I put in here is somewhere, here we go, aws-naip, so NAIP, and if you have this client, you can go look at this data, you can see
this is US data, so the state abbreviations come up right away, so let's go look at California, here's the 2002, 2014 data, remember, this container is endless, so I can have 2016, 2018, on and on and on, and I never had to worry about running out of storage,
right, it keeps going forever, and the other more important part is, if you know the name of this bucket, everybody in the room has access to the bucket, you do need an AWS account, which is free, but you have access to, right now there's 75 terabytes
of the data, and it follows our open data best practice pattern, and what that is, is two things, one is very simple, I have to give you access to the data, so I'm gonna drill down into the data, here's the four-band original data, these are US FIPS codes,
and here's the data, in one set, there's about a quarter of a million of these files, just under 200 megabytes, and so these are from the prime contractor, this is the original TIF data, if I go open the ACL for this, you'll see that it allows read by authenticated users, okay, so that means that if you have an AWS
account, and you can remember aws-naip, you can gain access to all the US data, okay, and it's as simple as that there, but remember I mentioned that as the owner of this data, I might not want to pay for your
taking the data out, right, downloading the data out, especially if you wanted to DDoS my bucket, right, because you didn't like me, let's say, right, and you had some, you know, machine process that kept downloading petabytes of data, I would cry, because it would be my bill, right, now I can take care of that by a
feature that's been available in S3 from the beginning, it's called Requester Pays, and if I right-click this, go on properties, and hit the Requester Pays tab, you'll see that it's turned on, that means that if you are another account,
and you request this data, and you download it, for example, you know, to my notebook here, right, then the requesting account pays for the request and the data egress, and that allows me, if I was a data owner, so for example, if I was the United
States Department of Agriculture, which owns the USDA data, that allows me to share it with the world without expense, so I can have petabytes of data, you know, I could be NOAA with petabytes of data, and I could make it available to everybody on the planet, they could even DDoS me via their own Requester Pays requests, but
then they would pay, so I don't care, right, it's as simple as that. How are we doing, about seven minutes? Okay, so I'm giving you a couple of views, right, one is the very familiar slippy map, which is a JPEG, 256 by 256, like we use every day,
right, which is a derivative of the geotiffs that you were looking at just a second ago, which is available, you can build those as quickly as you want, as a function of your auto scaling size, min-max limit, right, so you have all the flexibility, so you can be very embarrassingly parallel, or just a
little bit embarrassingly parallel about that process, right, and then the other view is, so I'm using a client, and there's, you know, there's open source, there's command line, there's Python tools, there's Java tools, there's all kinds of tools for you to gain access to S3, S3 has been around since 2006, so
you can choose any, you know, any client or any language you want to get access to S3, and then the last thing I want to show is, so how about these machines, how's Mark doing the machines, so here is PuTTY, so this is SSH
into one of the Ubuntu instances that are running MapServer and GDAL, and I think I have to restart this, okay, so that's where my demo died, but you can see that when I, before I did this, I'm using another open
source project called yas3fs, which is a Python project, you can find it on GitHub, it uses our botocore, and it allows you to mount any bucket you want,
and just make it available now to GDAL, right, so I can run Map Server GDAL on this instance, this instance does not have, you know, 75 terabytes of data immediately on it, but this package will go get it, put it in, put it on some SSDs that look local to the machine, and then manage
your cache intelligently in the background, so that allows me to spin up one instance, or spin up a hundred instances, within minutes, and then basically provide a slippy map for the United States, Korea, or the whole globe, if I wanted, or I should say if I had the data, I probably have to talk to DigitalGlobe
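A minimal sketch of invoking yas3fs as described here: the basic form is `yas3fs <s3-url> <mountpoint>` (consult the project's README for the exact cache options), and the bucket and mount paths below are illustrative:

```python
# Sketch: mounting an S3 bucket with yas3fs so MapServer/GDAL can read
# GeoTIFFs through an ordinary local path. yas3fs fetches objects on
# demand and caches them on local disk.

import subprocess

def yas3fs_cmd(bucket: str, mountpoint: str) -> list:
    """Command line for the basic yas3fs invocation."""
    return ["yas3fs", f"s3://{bucket}", mountpoint]

def mount(bucket: str, mountpoint: str) -> None:
    """Run yas3fs; raises if the mount command fails."""
    subprocess.run(yas3fs_cmd(bucket, mountpoint), check=True)
```

After the mount, GDAL and MapServer see the bucket contents as files, which is how one machine image can serve 75 TB it never copied.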
for all the global data, but I could now if I wanted to do that. So I'm going to stop now, excuse me, and I want to leave a couple minutes for questions, any questions, I know I ran through a bunch of different things, please feel free to grab me afterwards, happy to share the data,
happy to share the machine image, and the specific techniques that I'm using here, it should all be very familiar to many of you in the room, any questions?
Yeah, generally yes, so we do have a public data program; what I showed you just now is not part of it, but an example that is, is Landsat 8, that's our latest large open data, public data project that I
assisted a little bit on, that's where we're piping data from a bunch of USGS FTP servers, we're putting it in the same S3 bucket, and then making that available; you can use the same tool to go look at all the Landsat 8 data. Now, kind of the caveat here with the public data program is that we're
very, so we're interested in a type of data, and it has to be the kind of data that obviously facilitates interest in using our virtual machines, right, but more importantly, we want to make sure that it's properly curated and maintained over time, so generally that needs to be whoever the source is, right, not somebody in between, and they need to have
a good, you know, a relevant business model that makes sense for, you know, we want to be able to trust that organization to maintain that public data set over time so that it doesn't become stale and old.
Ordnance Survey, yes, so you know, so we're happy to, you know, if you want to, I don't know if it's public or not, but Ordnance Survey is
actually a customer already, and so they've been a user, I've helped them a little bit on that, but yeah, that's a, you know, I think a good example, right, so we're looking for, you know, other projects that look like Landsat 8, or for example, the NAIP data, where we
can work with the data owners to make it more easily available. That might be public, public-public, or that might be just the data owner's S3 bucket with requester pays turned on. The two general patterns, I would suggest looking at S3 with requester pays turned on, see whether your business model or the data owner's
business model makes sense there, and then as a next stage, there would be, you know, potentially consideration for public data, okay, any other questions? We've got one minute left, so one last question.
Hi, so far you've talked about the raster data, how about the vector data, how about sharing vector data in large scale, and how about spatial indices for that? Yeah, so we have a couple of projects going on, we actually have one, and you know, maps might be talking about it, but the idea there, for example, was getting the OSM data in a tiled pattern on S3,
so instead of a large, I can't remember what it's called, the global blob, the binary file, you know, there's techniques where we can tile that, put it in S3, so you don't have the ETL, you don't have to have the database to do a web-scale vector-based service,
so happy to talk to you more about that after too, so the same general idea applies for both vector imagery and things like point cloud or LiDAR data. Thank you very much.