Sat-utils: Landsat, Sentinel and the use of open raster data

Video in TIB AV-Portal: Sat-utils: Landsat, Sentinel and the use of open raster data

Formal Metadata

Sat-utils: Landsat, Sentinel and the use of open raster data
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Open satellite data from the US and EU have provided scientists and businesses with a wealth of data, but it can be difficult to easily access and process it. Recent efforts to put Sentinel-2 data on AWS S3 along with Landsat-8 have made it easier to build tools to access both data sources. At Development Seed, we are building tools called sat-utils to process and access open raster data like Landsat and Sentinel. We've expanded development on the tools to be a suite of Python libraries and command line tools for querying, downloading, managing, and processing other remote sensing data. It's been two years since we launched the first sat-util, landsat-util, which has proven to be a valuable tool with a growing user base. sentinel-util is a tool that will provide the same easy access to data that landsat-util provides. We will discuss the processing for turning spectral band data into usable products such as color-corrected RGB images, radiance data, top-of-atmosphere reflectance, and various indices. We will also demonstrate the APIs we have available for open raster data, sentinel-api and landsat-api, which our client utils use for searching available metadata.
All right, well, let's go ahead and get started with the 1:30 Earth Observation panel. My name is Eddie Pickle, from RadiantBlue Technologies, and I'm very happy to introduce the three presentations that we have today. Starting things out will be Matt Hanson from Development Seed, talking about sat-utils: Landsat, Sentinel and the use of open raster data. Then [unclear] Giannecchini and Andrea Aime will be talking about serving earth observation data with GeoServer, addressing real-world requirements. And finishing up will be Jonas Eberle, who will talk about standards-compliant geoprocessing and Earth observation time series data access and analysis. So I'm going to turn it over to Matt.

OK, thank you, and welcome. As Eddie said, I'm Matt Hanson with Development Seed, and I'm here to talk about sat-utils. My co-presenter Alireza was not able to be here, so it's just me.

So, we're in the age of satellite imagery. As you saw in the keynote this morning, the Sentinel data is bringing online terabytes and terabytes of data that everybody's interested in. We have the Landsat series of sensors, which have a huge historical archive. There's MODIS, and there are plenty of others that have open data freely available for everyone. We also have a variety of commercial data that you can buy, SPOT, RapidEye and WorldView among them, which range in price from about a dollar per square kilometer to 35 dollars per square kilometer. So looking at Germany, which is 357,000 square kilometers, this quickly gets out of hand if you want to scale up and do any sort of imagery analysis on a large scale, and even for small states in the US this becomes prohibitively expensive. With various new commercial companies such as Planet, UrtheCast, Astro Digital and Terra Bella, which is Google's and was previously Skybox, this cost may come down. But at Development Seed, the "development" in Development Seed
stands for international development, and it's also sort of a play on words. But we are mostly interested in open data. Everything that we do is open source, and we're particularly interested in using open data for development.

There are problems, however, in accessing this data. Searching and filtering through metadata catalogs to get the data that you want, with no clouds, performing analysis on it, and actually getting the data is sometimes difficult. Historically, Landsat has had a website that you can go through, EarthExplorer, which is time-consuming, and there's no programmatic way to use it. And then, to actually get the data and download it, you have to know all of the details about the data, such as the no-data values and the various gains to apply if you want to convert it into, let's say, top-of-atmosphere reflectance. It is a little bit easier with Landsat data than it has been previously.

The various commercial companies are starting to come up with their own solutions for this. Planet Labs has a browser where you can search and access Landsat data as well as their own data, and people are creating APIs for this as well as providing cloud-based solutions for people to run their own algorithms. DigitalGlobe has something to do this. But these are still commercial platforms. What we wanted to do at Development Seed was to have a way for scientists and analysts, who generally want to do things on a small scale, to get data, query data, download it and process it into useful products quickly and easily, without any hassle.

So a couple of years ago we made landsat-util, which has gotten a pretty good reception from people, and it allows you to search, download and process Landsat data. landsat-util is a command-line utility that has three subcommands: search, download and process. Search allows you to issue a query for a lat and lon and a variety of other attributes that you can search on, such as cloud cover, or even a given polygon.
And this will return JSON and a link to the thumbnail, so that you can query, get a bunch of scenes, look at these thumbnails and pick out which ones you want. Then you issue those scene IDs to the landsat download command, and these are downloaded from the AWS dataset. If you've used Landsat, you're probably familiar with the initiative that's been happening, where Planet started to copy basically all of the Landsat data onto AWS for easier access, because of the aforementioned issues using EarthExplorer in a programmatic way. But older data (and actually I think this might have changed now), data from before 2015, wasn't on S3, and so in that case landsat-util will download it from Google Earth Engine.

Once you have these, you can process the data into a few different products: a three-band color image using whatever three bands you want, which will stretch each of the bands and produce a color image or a false-color image, depending on the bands you want; an NDVI product; or a pansharpened product. However, it's important to note that landsat-util was never really intended for any sort of rigorous scientific purposes, in that it doesn't apply the sun geometry factors to actually get to a real top-of-atmosphere reflectance from the Landsat data, and it stretches the images. It's very much a visual process.

So that led us to start thinking about how we can improve this and also extend it to other sensors. landsat-util is a single self-contained repository, so Alireza and I talked about how we could take it and break it up into smaller repos and smaller libraries that could each be used individually by people, specifically for what they want to do. So we have several, and I'll go over each one of those in a little bit more detail. Then we would also take all these libraries, and this is what would form a new landsat-util, or the new sentinel-util, or a modis-util, or other utilities for any other
open data.

So, sat-api. In order to build an API, of course, you need a metadata catalog, and so there are a few different steps here. The first step is to actually harvest the metadata. We have two repositories for that. I should mention, if it isn't clear there, that on GitHub, at github.com/sat-utils, all of the repositories that I'm talking about are there, and they're all publicly available. So these are the Landsat metadata and Sentinel-2 metadata repositories. They collect the metadata and then they write it out using some sort of writer function that you can provide. At Development Seed we use Elasticsearch, so we have an Elasticsearch writer function. The metadata is collected every day using an AWS Data Pipeline that is scheduled to run every day, collect metadata and write it to Elasticsearch. But you could provide other writers, and there's one for S3 that we ended up not using. So then, after you harvest the metadata and it's in Elasticsearch, you need to create an API.
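As a rough illustration of the harvest-then-write design described above, here is a minimal Python sketch. It is not the project's code, and every name in it (`harvest`, `make_list_writer`, `make_jsonl_writer`, the record fields and scene IDs) is a hypothetical stand-in. It only shows how a harvester can stay independent of the storage backend by taking the writer as a function.

```python
import io
import json


def harvest(records, writer):
    """Send every metadata record to the writer and return how many were written."""
    count = 0
    for record in records:
        writer(record)
        count += 1
    return count


def make_list_writer(store):
    """Writer that appends records to an in-memory list (stand-in for a search index)."""
    def write(record):
        store.append(record)
    return write


def make_jsonl_writer(fileobj):
    """Writer that emits one JSON document per line (stand-in for a file or object store)."""
    def write(record):
        fileobj.write(json.dumps(record, sort_keys=True) + "\n")
    return write


# Hypothetical scene metadata; a real catalog carries many more fields.
SCENES = [
    {"scene_id": "scene-a", "satellite": "landsat-8", "cloud_cover": 12.5},
    {"scene_id": "scene-b", "satellite": "sentinel-2", "cloud_cover": 3.0},
]

# The same harvest loop feeds two different backends.
index = []
written_to_index = harvest(SCENES, make_list_writer(index))

buffer = io.StringIO()
written_to_file = harvest(SCENES, make_jsonl_writer(buffer))
```

Swapping the backend means passing a different writer; the harvest loop itself does not change, which is the property the talk attributes to the writer-function approach.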
Now, there are three sat-api repositories, and they're all written in Node. We have sat-api-lib, which is the base library that you can use, and this is essentially just a wrapper around Elasticsearch. sat-api-express is a way to use the library to access Elasticsearch by creating your own endpoints. And then sat-api is what we use, and this is an Amazon serverless implementation that uses API Gateway and Lambda functions.

To use the API, we actually have a few different public APIs. These are in fact publicly available, but if you notice the small print, there is no guarantee that this will work, because we've made it publicly available but it isn't intended to be production-ready. However, we have the code, and if you want a production version that you can count on all the time, you can install it yourself using your own Amazon account.

So, these public APIs. There's the top one, and the second one is to get a response in GeoJSON format, which is primarily, I think, what most people would end up using. And then we also have a count endpoint, where you can get counts of queries. What I mean by counts is a count of all the data that meets that query. So for example, this query here will return all the scenes intersecting a polygon. It doesn't specify a satellite, and this is an API for both Landsat and Sentinel, so this would return both of them. The second URL here would get you all of the scenes by date, basically a histogram of the count of all scenes by date. And these are all the attributes that
you can search for in that query, using the API. So any of these are searchable: you can do to and from dates, and cloud cover, and pretty much any other attribute. Actually, this might not in fact be a complete list.

There are some open imagery browsers out there using our sat-api. We have Libra, which is Landsat only, and you can go to that now. It's an imagery browser that uses the API, so you can search, you can enter in fields through the GUI and get a list of all the scenes as well as their download links from S3, and download them through there. Vincent at Remote Pixel has created his own open browser, which is great; we love open software. We have not updated Libra to use Sentinel-2, so he kindly did that. And also the Astro Digital platform, if you're familiar with Astro Digital, uses our API as well, except they actually have production versions of it on their own AWS accounts, not using our public endpoints.

OK, but if you don't like writing curl commands or putting stuff into a browser, we have the sat-search Python library, which is just a library for querying the API, and there's a little
example here of how you use it to query. This particular case is very simple: it's just taking a scene ID and querying on that. But you don't have to have a scene ID; you can give it a polygon, cloud cover, a date range, all the regular stuff. It's just a Python library for you to use in your own applications.

sat-download is another very small library, and really it's not even necessary, because you can actually get the URL from sat-search, so you could download it yourself. But what sat-download does is you can give it a scene ID and it will actually go query the API, get the link and then download it. So it's just a quick and easy way, a small library, to download a specific scene for Landsat or Sentinel-2. It downloads from either AWS or Google Earth Engine; you can actually specify which source you want, and in fact the USGS as well. There's an example of using it.
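The kind of query described above (scene ID, date range, cloud cover, polygon) can be sketched as plain URL construction. This is a sketch under stated assumptions rather than the sat-search API: the endpoint `https://example.com/satellites` is a placeholder, and the function name and parameter names (`scene_id`, `date_from`, `date_to`, `cloud_max`, `intersects`) are invented for illustration. Only Python's standard library is used.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder endpoint, not a real service.
BASE_URL = "https://example.com/satellites"


def build_query_url(base_url, scene_id=None, date_from=None, date_to=None,
                    cloud_max=None, intersects=None):
    """Build a search URL from whichever filters were supplied (hypothetical names)."""
    params = {}
    if scene_id is not None:
        params["scene_id"] = scene_id
    if date_from is not None:
        params["date_from"] = date_from
    if date_to is not None:
        params["date_to"] = date_to
    if cloud_max is not None:
        params["cloud_max"] = cloud_max
    if intersects is not None:
        # A polygon is passed as compact GeoJSON text in a single parameter.
        params["intersects"] = json.dumps(intersects, separators=(",", ":"))
    if not params:
        return base_url
    return base_url + "?" + urlencode(params)


# Example: one summer of scenes with at most 20% cloud cover over a small polygon.
polygon = {
    "type": "Polygon",
    "coordinates": [[[7.0, 50.6], [7.2, 50.6], [7.2, 50.8], [7.0, 50.8], [7.0, 50.6]]],
}
url = build_query_url(BASE_URL, date_from="2016-06-01", date_to="2016-08-31",
                      cloud_max=20, intersects=polygon)

# Decode the query string again to confirm that the filters survive URL encoding.
decoded = parse_qs(urlparse(url).query)
```

A client library would wrap a step like this, send the request, and hand back the scene metadata; the request itself is left out here so the sketch runs without network access.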
Now, sat-process. I have an asterisk here because, as whoever's giving a talk knows, you write these abstracts like six months before the talk, right? And so, OK, well, where am I going to be six months from now? So you say certain things that maybe you don't quite finish, and that's kind of where we're at. So this is very much a work in progress. With sat-process, you can look at the code and you can see what point we're at, and there are a few different branches there, but it very much has yet to be completed.

sat-process is for processing satellite data locally, on your computer, into a variety of different products. We've defined general products to try and make it sensor-agnostic, and I'll show you a little table of what I mean in a second. The idea is that you could create true-color, false-color, any sort of colorized versions of these products using whatever bands you want, as well as a variety of indices. We do have the implementations in our processing library to do the ACCA cloud-masking algorithm that Landsat 7 used, as well as Fmask, which is another cloud-masking algorithm. Other potential products, which we do have code for although it has yet to be implemented in sat-process, would be principal components and an RX detector, and there are other types of transforms.

So for example, there are different color schemes here that you could do by giving it different bands, without having to worry about which bands are which for Landsat 8 or 7, or for whatever sensor you're using. You could just say, I want, you know, color for land/water, and sat-process would actually just generate it. The goal here is to make this simple, without people needing to worry about these sorts of data-specific details. This is what I mean here: what we do is we just specify which bands are which numbers for different sensors. Yes, there are differences between the red band on Landsat 7 and on Landsat 8, but if you do an NDVI calculation,
you're going to use the same exact thing, and so we just abstract that out by using band names rather than band numbers.

We also have a sat-testdata repository, which is useful if you're a developer and you actually want to test algorithms on things. The goal was to have a small repository, without having to download large scenes, where we have a subset of Landsat data that's mostly covered with clouds, or a Landsat scene that's all no-data except for a small portion, that sort of thing, for both Landsat and Sentinel, as well as vector shapes, GeoJSON, covering those regions. For example, if you want to do mosaicking and you have two Landsat tiles and a vector that crosses the boundary between them, this is the sort of thing we put in this test data. I believe that the one up there now actually doesn't have most of this stuff, so again, this is a work in progress, but I think it's a pretty useful thing if you do any sort of development with satellite data.

We also have sat-utils. sat-utils is actually just another repository, which is not intended to be used on its own; it's just a base library for creating command-line applications for sensor-specific things. In that we have code using Python's click to generate the command line and the argument parsing, and to automatically figure out what arguments to put in there based on what products may be available for the sensor, which can vary by sensor. So sat-utils will lead to downstream projects here: a new landsat-util, which is still to be called landsat-util, and a sentinel-util, and these will provide search capabilities, downloading and processing. In landsat-util right now, each of these steps is really self-contained: you first search, and then you have to manually go in and figure out the scenes you need.
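To make the band-name abstraction and the top-of-atmosphere correction mentioned in this talk more concrete, here is a minimal Python sketch. It is not code from sat-process; the function names and the `BAND_NUMBERS` table are hypothetical. The band numbers themselves are the standard ones for Landsat 7 ETM+, Landsat 8 OLI and Sentinel-2 MSI. The reflectance function follows the general form of the Landsat 8 rescaling (multiply by a gain, add an offset, divide by the sine of the sun elevation), and the coefficients used in the example are illustrative; real values come from each scene's metadata.

```python
import math

# Band numbers for a few common band names, per sensor.
BAND_NUMBERS = {
    "landsat7": {"blue": 1, "green": 2, "red": 3, "nir": 4},
    "landsat8": {"blue": 2, "green": 3, "red": 4, "nir": 5},
    "sentinel2": {"blue": 2, "green": 3, "red": 4, "nir": 8},
}


def toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Rescale a digital number to reflectance and correct for the sun elevation angle."""
    return (mult * dn + add) / math.sin(math.radians(sun_elevation_deg))


def ndvi(red, nir):
    """Normalized difference vegetation index; returns 0.0 when both inputs are zero."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def ndvi_for_pixel(sensor, values_by_band_number):
    """Look up red and NIR by name, so the caller never deals with band numbers."""
    bands = BAND_NUMBERS[sensor]
    red = values_by_band_number[bands["red"]]
    nir = values_by_band_number[bands["nir"]]
    return ndvi(red, nir)


# The same reflectances stored under different band numbers give the same NDVI.
landsat7_pixel = {3: 0.1, 4: 0.5}
landsat8_pixel = {4: 0.1, 5: 0.5}
ndvi_l7 = ndvi_for_pixel("landsat7", landsat7_pixel)
ndvi_l8 = ndvi_for_pixel("landsat8", landsat8_pixel)

# Illustrative gain and offset with a 30 degree sun elevation.
reflectance = toa_reflectance(10000, 2.0e-5, -0.1, 30.0)
```

The design point is that sensor differences live in one lookup table, while index calculations such as NDVI are written once against band names.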
Then you download those and process them. One of the goals here is to make this all more seamless, so that you can make a query and save the query, and then you could even possibly drop things out, and then you download all the stuff in the query, or you process all the scenes in your query. The idea is to have project files, so you would say: OK, I want all of the scenes in this area for every summer for the last seven years, give those to me, and now I want to process those into [unclear].

So, the future. Has anybody seen this picture before? When I first saw it, I thought it was really cool. This is what, supposedly, in 1956 they thought the future was going to look like. All those people back then, they didn't know anything about the future [unclear].

So, in the future, we have to finish up sat-process and sat-utils and release landsat-util and sentinel-util. The new landsat-util will essentially replace the old landsat-util. And again, these are geared more towards scientific use, so the algorithms here will actually provide top-of-atmosphere reflectance; these are not just visual products. That's the goal, but I don't have a timeline for it.

I'd also like to mention, for anybody who might be in DC on October 24th, that we have Sat Summit, which we started last year, sponsored by us and Mapbox, where we have talks on open satellite data and its use in global development. And there's a
website that's exceptionally useful if you're new to the remote sensing world. It's a great site to browse: the Sat Summit landscape site, landscape.satsummit.io. It has a lot of basic information on the trade-offs in resolution, temporal resolution and spatial resolution, as well as different band combinations that are useful for different applications. It has some interactive widgets as well, where you can actually look at images and bands for different things, such as detection of roads, and you can try different color schemes as well. So it's a good first stop if you want to know more about remote sensing in general. And that's it. How am I doing on time?

You have time for some questions.

You mentioned early on in the talk that some of the algorithms didn't necessarily have scientific rigor to them, but you also mentioned that you were intending this to be used for scientific analysis. Are you planning to go back and document the algorithms, find published papers associated with them, and stuff like that?

That's a good question. The old landsat-util was really for visual purposes, and so this new landsat-util will replace that. As far as documentation goes, it's a good question, because a lot of this stuff is kind of accepted to be in the public domain, such as NDVI, where we don't necessarily need that. But yes, we will provide documentation for things such as the ACCA algorithm. So it's a fair point. Next question?

If you're particularly interested in marine data, and I didn't know whether the satellites covered the sea, which would you recommend [unclear]?

So, there's
MODIS. The MODIS sensor actually has a lot of products for the ocean; there's the MODIS ocean color series of data. It is open; all MODIS data is open. [unclear] I don't know of any other ocean sensors. OK.

Good presentation, Matt. Do you have plans to also integrate the other Sentinel missions? I'm thinking of Sentinel-1 and Sentinel-3. Lots of SAR imagery has been available from Sentinel-1 for years already, and the Sentinel-3 data is almost ready now.

Yeah, that would be great to do, but almost everything we've done is really optical, in the visible and near-IR bands. SAR [unclear] usually requires a lot of processing, and I'm not familiar with [unclear]. You and I can talk about that.

We have time for a couple more questions. Anybody? Well, thank you very much for sharing this.
Matt will be hanging around here afterwards if you guys have questions. First of all