
Analysis Ready (Meta)Data


Formal Metadata

Title
Analysis Ready (Meta)Data
Subtitle
Moving ForewARD with Analysis Ready Metadata
Title of Series
Number of Parts
351
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
The term Analysis Ready Data started as a way to describe a Landsat product that would efficiently allow time-series-based analysis by providing a consistent, grid- and pixel-aligned product corrected to surface-based measurements. Since then it has come to mean a wide range of things, but without a clear set of standards on how to characterize ARD there is little to no interoperability among datasets that call themselves ARD. The Analysis Ready Metadata initiative uses the SpatioTemporal Asset Catalog (STAC) spec as the vehicle for describing well-characterized data. This goes beyond the basic geospatial and temporal characteristics captured in the core STAC spec and into detail about the processing level of the data, corrections that have been applied, and spatial and measurement uncertainties. Having data well characterized through its STAC metadata enables discovery of usable data, automated processing using interoperable workflows, and tracking of the provenance of derived products. The CEOS ARD (previously CARD4L) specifications require certain metadata and processing for a dataset to be compliant, and this STAC metadata can be used to automatically assess a dataset's potential to meet those requirements. This talk will cover elements of STAC, ARD, and the CARD4L family of product specifications.
Keywords
Transcript: English (auto-generated)
Hello, everybody. My name is Matt Hanson. I'm the geospatial engineering lead at Element 84, and today I'm going to briefly talk about analysis-ready data and what it means, and I can skip to the end and tell you that it actually doesn't mean anything at all. Everybody uses this term, ARD, and they go, "Oh, our data's ARD," and guess what?
There's no actual standard saying what ARD is. A little while ago, there was a series of ARD workshops, and I've been involved with CEOS on defining ARD, and nobody has agreed on it yet, and still a lot of satellite companies say that their data is ARD. So first I want to start with a cautionary tale about publicly available data.
So it turns out that when you actually make data publicly available, people use it, and they use it in ways that you just didn't expect, and scientists actually don't much care for that. So here we see that once the NASA data and the ESA data were made free
and made available on the AWS platform, everybody downloaded this data and used it, and very few of these people, I think, actually downloaded and looked at the Landsat 8 user handbook. If they had, they would have seen that the data that was actually distributed wasn't corrected for the sun angle, which means
that you couldn't actually compare this data across multiple days, and yet people were calculating NDVI and looking at it across days and time series. I went to the Landsat science team meeting a few years ago and said, "You know that people are using this data that way?" and they said, "Oh, they shouldn't do that." Well, so the Landsat program started releasing ARD.
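To make the sun-angle point concrete, here is a minimal sketch of the top-of-atmosphere reflectance conversion described in the Landsat 8 handbook: the rescaled digital number is divided by the sine of the sun elevation. The function name, the digital number, and the two sun elevations are illustrative assumptions; the rescaling coefficients are typical of Landsat 8 scene metadata but should be read from each scene's own metadata file.

```python
import math


def toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Convert a Level-1 digital number to sun-angle-corrected TOA reflectance.

    `mult` and `add` are the per-band reflectance rescaling coefficients and
    `sun_elevation_deg` is the scene sun elevation, all taken from metadata.
    """
    # Rescale the digital number; this value is NOT yet corrected for sun angle.
    uncorrected = mult * dn + add
    # Dividing by sin(sun elevation) applies the sun-angle correction.
    return uncorrected / math.sin(math.radians(sun_elevation_deg))


# Assumed, illustrative values (not taken from a real scene).
REFLECTANCE_MULT = 2.0e-5
REFLECTANCE_ADD = -0.1
dn = 10000

# The same digital number observed under two different sun elevations.
for label, sun_elevation in (("low sun", 30.0), ("high sun", 60.0)):
    uncorrected = REFLECTANCE_MULT * dn + REFLECTANCE_ADD
    corrected = toa_reflectance(dn, REFLECTANCE_MULT, REFLECTANCE_ADD, sun_elevation)
    print(f"{label}: uncorrected={uncorrected:.3f} corrected={corrected:.3f}")
```

The uncorrected value is identical for both dates while the corrected values differ, which is why uncorrected scenes from different days are not directly comparable.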
The USGS actually coined the term and came out with this ARD project, and the idea was that data from all of the Landsat satellites, the whole constellation, could be assembled into a time series. So the scenes were tiled in a regular projection, and they were atmospherically corrected and geo-registered, and there was a cloud mask so that you could create these
time series. Well, it wasn't long before other people started using this term ARD to just mean, "Oh, you can just use this data and compare it to other data sets," but again, the problem is that there's actually no standard associated with this. So CEOS, which is the Committee on Earth Observation Satellites, does have
an ARD specification. This is mostly out of Geoscience Australia, and they have all these requirements. It's pretty complicated, and you have to go through a really complex process to have your data evaluated. A couple of years later, Chris Holmes wrote this blog post about defining what analysis-ready data is, which is pretty much the same type of
thing as Landsat ARD, which is that, okay, this data is level two data, it's atmospherically corrected, it's been BRDF corrected and geo-registered, and it's aligned to this common grid. At this satellite interoperability workshop, the general agreement among all the people there was that there actually was no simple definition of ARD because it was completely
dependent on the application. So I want you to think of applications such as a cloud scientist who's looking at clouds, right? They don't need the cloud mask. And some AI applications actually don't even need level two data. So I would like to present an alternative view of what ARD should be, which is that it's
not actually about having certain requirements on that data. It doesn't matter if it's level one or level two, or if it's been geometrically corrected to some sub-pixel accuracy. None of that matters, right? What matters is that you actually have characterized this data within the metadata.
Because the workflows all have different requirements, and the workflows themselves can actually validate the metadata they need in order to produce a valid output. So if a workflow can work on level one data, right, then that's ARD, right? That's analysis ready, because my workflow can just take that data and produce
what it needs to. And so I want people to start thinking about cloud-native geospatial workflows working on metadata and validating what it is that they need, and about data producers actually providing all of the metadata that's needed so that algorithms can automatically, programmatically
determine whether or not that data is valid for the process. And there's a lot of metadata to consider: the processing level, the geometric and measurement accuracies, calibration, and cloud masks. But as long as you characterize this, then that's really all that's needed.
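The idea of a workflow validating the metadata it needs can be sketched as follows. This is only an illustration under stated assumptions: the helper `missing_requirements`, the two example workflows, their thresholds, and the sample property values are all hypothetical, and the property keys merely follow the naming style of STAC extensions rather than any requirement set defined in the talk.

```python
def missing_requirements(properties, requirements):
    """Return the reasons an item's metadata fails a workflow's requirements.

    `properties` is a dict of item metadata; `requirements` maps a property
    name to a predicate that returns True when the value is acceptable.
    """
    problems = []
    for key, is_acceptable in requirements.items():
        if key not in properties:
            # The data is not characterized for this property at all.
            problems.append(f"{key}: not present in metadata")
        elif not is_acceptable(properties[key]):
            problems.append(f"{key}: value {properties[key]!r} not acceptable")
    return problems


# Hypothetical workflows, each declaring only what it actually needs.
ndvi_time_series = {
    "processing:level": lambda level: level == "L2",
    "eo:cloud_cover": lambda percent: percent <= 20,
    "view:sun_elevation": lambda degrees: degrees > 0,
}
cloud_study = {
    "processing:level": lambda level: level in ("L1", "L2"),
}

# Hypothetical metadata for a single level one scene.
item_properties = {
    "processing:level": "L1",
    "eo:cloud_cover": 12.5,
    "view:sun_elevation": 41.3,
}

print(missing_requirements(item_properties, ndvi_time_series))
print(missing_requirements(item_properties, cloud_study))
```

The same level one scene is rejected by the first workflow and accepted by the second, so whether data is "analysis ready" is decided per workflow from the metadata alone.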
So stop thinking about analysis-ready data. Start thinking about analysis-ready metadata. Thank you.