State of STAC
Formal Metadata
Number of Parts | 156
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/68548 (DOI)
FOSS4G Europe 2024 Tartu, 127 / 156
Transcript: English (auto-generated)
00:00
Hello. I hope you're all doing well after lunch and not too tired after the party yesterday. Anyway, this talk is about the state of STAC. First question: who has heard about STAC before? Maybe I can just skip the introduction, I guess?
00:21
Let's see. Well, I'll do it anyway. It's short. So what is STAC? First of all, this is the logo here. You can see it here. I don't have it on the slide. And it means SpatioTemporal Asset Catalog. So that's the long form of STAC. And it's a suite of specifications and software. The specifications are of two kinds: static STAC,
00:42
the STAC specification, and an API specification. And it has a large ecosystem around it. The STAC specification itself describes geospatial data, which usually also has a temporal component. And that is described through interlinked static JSON files.
01:00
So you put them, for example, in a cloud bucket. And then they are just JSON files that link to each other and in that way form a network that you can browse through. The STAC API specification adds dynamic elements to these static catalogs that you can deploy on cloud
01:21
buckets. It adds, for example, search, aggregation, transactions, everything that makes it dynamic in the end. The specification is extensible. Both parts are extensible. There are various extensions that you can reuse.
01:42
And both specifications have also been submitted as an OGC community standard. It is going through that process right now. The question mark is there because it has not fully passed all stages yet. But we are confident that this will go through and be
02:00
a community standard in the end. The current version is STAC 1.0. And we are in the state of STAC talk here, so what we are also discussing is what is actually new in the spec version 1.1. Everything that is yellow on the slides is new since the last FOSS4G in Kosovo,
02:24
so that you have a feeling of what has changed and what already existed before. How does this work? Oh, yeah. So a catalog looks like this here, or could look like this. You always have an entry point, a catalog.
02:40
And then you usually subdivide that into different collections. It is very open how you do that. You could also subdivide it into further subcatalogs and then have two subcollections there or so. You usually end up with item files that contain assets, which are the actual data files
03:03
that you want to read in the end. And yeah, so what are these different entities? First of all, a catalog is a very simple concept. It is also only present in static STAC because there is otherwise no good way to subdivide
03:25
these large files. If you have a collection with hundreds and thousands of files, that gets very large. So you usually want to subdivide that. So if you, for example, want to expose Sentinel-2 Level-2A data, then putting all these different granules
03:40
that you have into a single collection may make sense. But that's thousands of files. That doesn't scale very well in JSON. So you would subdivide them, for example, into folders or virtual folders, the catalogs, for example by specific days. So then you have catalogs for each day. And that makes the entities that you
04:03
expose in JSON much smaller. It's a bit comparable to what you have in APIs with pagination. Collections are a little more advanced in the sense that they add additional metadata for things
04:20
like data sets. You usually try to expose homogeneous data sets where the properties are very similar to each other. So for example, Sentinel-2 Level-2A would be one. And then, for example, Level-1C would be another collection. And you add the metadata there that describes the whole collection, like keywords,
04:42
extents, license, provider information, and so on. The item is then the last part of the equation. This is where the individual data files, the individual captures of Sentinel-2, for example, are described. You specify the date and time at which it was captured.
05:01
You specify the location as a geometry in GeoJSON. You give all this additional information like the viewing angles, the projection information, and so on. Then you have assets in items and in collections, which point to the individual data files.
05:20
So for example, if you have a Sentinel-2 capture, and all the bands are individual Cloud Optimized GeoTIFFs, then each of these files would be a specific asset. So the green band of a Sentinel-2 capture as a COG file is one asset, and then the blue band would be a separate asset, and so on. Yeah, these are usually meant for machine consumption.
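As a rough sketch of the item and asset structure described here, the following Python snippet builds a minimal item as a plain dictionary and lists its data files. The identifier, coordinates, URLs, and the helper name `asset_hrefs` are invented for illustration; only the general shape (a GeoJSON feature with a capture time in `properties` and a map of `assets`) follows the description in the talk.

```python
import json

# Hypothetical item for one Sentinel-2 capture; ID, coordinates and URLs are invented.
COG_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-s2-capture",
    # Location of the capture as a GeoJSON geometry
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[26.6, 58.3], [26.8, 58.3], [26.8, 58.4],
                         [26.6, 58.4], [26.6, 58.3]]],
    },
    "bbox": [26.6, 58.3, 26.8, 58.4],
    # Date and time at which the data was captured
    "properties": {"datetime": "2024-07-01T09:30:00Z"},
    # One asset per data file, here one COG per band
    "assets": {
        "green": {"href": "https://example.com/s2/green.tif",
                  "type": COG_TYPE, "roles": ["data"]},
        "blue": {"href": "https://example.com/s2/blue.tif",
                 "type": COG_TYPE, "roles": ["data"]},
    },
    "links": [],
}


def asset_hrefs(stac_item):
    """Return a mapping from asset key to the URL of the data file."""
    return {key: asset["href"] for key, asset in stac_item["assets"].items()}


print(json.dumps(asset_hrefs(item), indent=2))
```

A processing script would typically only need the `href` values, which is why assets are described as being meant for machine consumption.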
05:41
That's what you would want to download to your computer in the end or use for processing in the cloud. And then there are links, which are relatively similar and present in all these entities, in catalogs, collections, and items. They are usually more like related resources that you may want to consume as a human
06:02
or want to get additional information from, but that are not necessarily needed for processing or download. It could be documentation. Links are also used to form the hierarchy between the different STAC entities, so links between catalogs, items, and collections.
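To illustrate how links form the hierarchy, here is a small sketch that walks a static catalog held in memory. The URLs, IDs, and the function name `walk_items` are hypothetical, and the JSON files are replaced by a dictionary so that the example runs without network access; the relation types `child`, `item`, and `parent` are the ones STAC uses for hierarchical links.

```python
# Hypothetical static catalog, keyed by the URL each JSON file would live at.
documents = {
    "https://example.com/catalog.json": {
        "type": "Catalog",
        "id": "root",
        "links": [
            {"rel": "child", "href": "https://example.com/s2-l2a/collection.json"},
        ],
    },
    "https://example.com/s2-l2a/collection.json": {
        "type": "Collection",
        "id": "s2-l2a",
        "links": [
            {"rel": "parent", "href": "https://example.com/catalog.json"},
            {"rel": "item", "href": "https://example.com/s2-l2a/a.json"},
            {"rel": "item", "href": "https://example.com/s2-l2a/b.json"},
        ],
    },
    "https://example.com/s2-l2a/a.json": {"type": "Feature", "id": "a", "links": []},
    "https://example.com/s2-l2a/b.json": {"type": "Feature", "id": "b", "links": []},
}


def walk_items(docs, href):
    """Yield the IDs of all items reachable from href via child and item links."""
    doc = docs[href]
    if doc["type"] == "Feature":
        # Items are the leaves of the hierarchy.
        yield doc["id"]
        return
    for link in doc["links"]:
        # Only follow links that lead downwards; parent links are ignored.
        if link["rel"] in ("child", "item"):
            yield from walk_items(docs, link["href"])


print(list(walk_items(documents, "https://example.com/catalog.json")))
```

A real client would fetch each `href` over HTTP and resolve relative URLs, which this sketch leaves out.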
06:21
Exactly. So there are various extensions in the ecosystem. For the static STAC specification, they usually define additional fields. For example, the projection information is a field in an extension. The view incidence angle, for example, is a field, and so on. There are 75 right now, I think, or 76.
06:44
At least the ones that are listed in the official table. And new ones are, for example, for accuracy information and altimetry. Authentication is a recent addition, which helps to tell the consumer which pathway or which steps
07:02
to go through to actually authenticate and then download the data. There is an extension that describes how to implement CEOS-ARD requirements in STAC. There is a domain-specific extension for InSAR data. Machine learning was recently completely redone, so as to expose models. And there is a new product extension
07:21
which describes product types and timeliness and things like that. STAC 1.1 is on the horizon; that is being worked on right now. There will be additional fields in the so-called common metadata model. The common metadata model is a central metadata model in STAC where you can define fields that can be used
07:42
in any of these entities that we saw before. At the beginning of STAC, we often just said, well, this can be used in items or this can be used in collections, and then we had these specifications very narrowly defined. Then people said, oh, I want to use it somewhere else, and so we came up with this common metadata model, so all these fields can be used anywhere,
08:02
which means, for example, that before you could only use keywords in collections. Now you can also assign keywords to items. You can assign keywords to assets. You can assign keywords to links and use them for search or whatever. The roles have been made more generic; before, they were only in assets. Now you can, for example, also assign roles to links
08:22
so that you can clearly identify what purpose they have. There was an effort to align further with OGC API Records. That led to a change in the license field, for example, and the actual change there is that beforehand we had two values that could be put there
08:42
in addition to SPDX license identifiers, which were proprietary and various. Proprietary was not very well received in the community, because even for open data you had to use it if there was no SPDX license identifier, and that gave the wrong impression of what it actually is.
09:02
So now everything is just using other, and you put a link to your license information, and then you can just get it from there. One minor thing that we had in the links is that you could only request links via a GET request.
09:20
So that's how normal links in HTTP work, where you click a link and then it requests it. But there are other things that you may want to use, like POST requests for sending larger information, or sending headers with authentication details, et cetera. So that can now be expressed in links if you want to use that. That is especially helpful for transactions or something like pagination.
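A minimal sketch of how a client might turn such a link into an HTTP request is shown below. It assumes that the link carries the extra information in fields named `method`, `headers`, and `body`; the URL, the token, and the function name `request_from_link` are invented for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical "next page" link; URL, token and body values are invented.
next_link = {
    "rel": "next",
    "href": "https://example.com/search",
    "type": "application/geo+json",
    "method": "POST",
    "headers": {"Authorization": "Bearer <token>"},
    "body": {"collections": ["s2-l2a"], "limit": 10, "token": "page-2"},
}


def request_from_link(link):
    """Build (but do not send) an HTTP request from a link object."""
    # Links without an explicit method stay plain GET links.
    method = link.get("method", "GET")
    body = link.get("body")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = dict(link.get("headers", {}))
    if data is not None:
        headers.setdefault("Content-Type", "application/json")
    return urllib.request.Request(
        link["href"], data=data, headers=headers, method=method
    )


request = request_from_link(next_link)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would actually perform the request; for paginated search results a client would repeat this for every `next` link it receives.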
09:43
There is now an implicit inheritance between item properties and asset properties. That means that you don't need to repeat all the information in the assets all the time. You can just put it in the item properties, and if clients don't find the information in the asset, they go one level up and find it there.
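The lookup rule described here can be sketched in a few lines of Python. The item content and the helper name `asset_field` are made up for illustration; `gsd` and `platform` are used as example fields that could appear in either place.

```python
# Hypothetical item: most metadata sits in the item properties,
# and only one asset overrides a value.
item = {
    "properties": {
        "datetime": "2024-07-01T09:30:00Z",
        "gsd": 10,
        "platform": "sentinel-2a",
    },
    "assets": {
        "green": {"href": "https://example.com/s2/green.tif"},
        "scl": {"href": "https://example.com/s2/scl.tif", "gsd": 20},
    },
}


def asset_field(stac_item, asset_key, field, default=None):
    """Look a field up in the asset first, then one level up in the item properties."""
    asset = stac_item["assets"][asset_key]
    if field in asset:
        return asset[field]
    return stac_item["properties"].get(field, default)


print(asset_field(item, "green", "gsd"))  # falls back to the item properties
print(asset_field(item, "scl", "gsd"))    # uses the value given in the asset
```

With this rule, the value only has to be written once in the item properties unless an individual asset differs.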
10:01
That should help with deduplicating data in the metadata. There's a little more in the changelog, which you can go through if you click the link here that is titled 'and much more'. The biggest change in STAC 1.1, which I didn't say anything about yet, is the bands.
10:22
That is actually the main driver of STAC 1.1. And that is removing a historical thing that we got in STAC due to how it just worked out. The EO extension and the raster extension both had an array construct which
10:41
identified bands. So for example, you could have eo:bands with a name and the wavelengths. And then in raster:bands, there was no name for the bands because it was already in eo:bands. But in the end, you ended up with two array structures that you needed to look through to get all the information.
11:02
So now it's just called bands and it is in the STAC core specification. And the extensions actually just add fields to it that can be used specifically for these different extensions. The wavelength, for example, which is specific to spectral bands, is in the EO extension, and raster-specific fields
11:22
like the spatial resolution are in the raster extension. It also means that if you just have a single band, you actually don't need to use the bands array, because due to the common metadata model you can use these fields outside of bands and just use them in assets, for example. That also makes the whole concept a little simpler.
11:42
And to make this a little more transparent, let's look at an example here. So before you had eo:bands, right? And raster:bands, two different arrays, but they describe the same band in the end. You had the name in eo:bands, the description, the common name for the spectral band, the wavelength and the full width at half maximum.
12:01
But then you had raster:bands at the same level, and you needed to match them via the array index, so the first array element in eo:bands was parallel to the first element in raster:bands. That was relatively difficult and just not very intuitive.
12:20
The new thing is that now you have a single array with the bands construct. Some of the fields are in the core, like name, description, data type, nodata, statistics. These are all here. You don't need to prefix them anymore. And then you have some fields that are EO or spectral related, and raster related, that have a prefix here.
12:42
So there is a direct mapping. You can pretty much just rename them: data type goes in as it is. Spatial resolution gets a raster prefix, raster:spatial_resolution. And that's pretty much it. So it's more or less unifying the arrays and renaming the fields, and that's it.
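The renaming described above could be sketched roughly as follows. This is an illustration under assumptions, not an official migration tool: the set of unprefixed core fields is taken from what is listed in the talk (the actual specification may define more), the exact field spellings such as `common_name` and `center_wavelength` are assumed, and the function name `merge_bands` and the sample values are invented.

```python
# Fields that, according to the talk, live unprefixed in the core band object.
CORE_FIELDS = {"name", "description", "data_type", "nodata", "statistics"}


def merge_bands(asset):
    """Return a copy of the asset with eo:bands and raster:bands merged into bands."""
    eo = asset.get("eo:bands", [])
    raster = asset.get("raster:bands", [])
    merged = []
    # The old arrays were parallel: entries with the same index describe the same band.
    for index in range(max(len(eo), len(raster))):
        band = {}
        for prefix, source in (("eo", eo), ("raster", raster)):
            if index < len(source):
                for key, value in source[index].items():
                    # Core fields keep their name, all others get the extension prefix.
                    new_key = key if key in CORE_FIELDS else f"{prefix}:{key}"
                    band[new_key] = value
        merged.append(band)
    result = {k: v for k, v in asset.items() if k not in ("eo:bands", "raster:bands")}
    result["bands"] = merged
    return result


# Hypothetical asset in the old layout with two parallel arrays.
old_asset = {
    "href": "https://example.com/s2/green.tif",
    "eo:bands": [{"name": "B03", "common_name": "green",
                  "center_wavelength": 0.56, "full_width_half_max": 0.036}],
    "raster:bands": [{"data_type": "uint16", "nodata": 0, "spatial_resolution": 10}],
}
new_asset = merge_bands(old_asset)
print(new_asset["bands"])
```

The sketch only covers the mechanical part, unifying the arrays and renaming the fields; details such as moving single-band fields directly onto the asset are left out.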
13:00
It hopefully makes implementation simpler and allows us to express bands in various ways. You can reuse them in other extensions as well. The STAC API specification itself is, well, an API that is HTTP based, JSON based, based on OGC API Features,
13:22
and defines in the core just the landing page and then adds item search, collections, and features on top. It is based on OGC API Features, so those are the same endpoints. The difference here is that in OGC API Features, the geometries that you get are the actual data that you want to retrieve.
13:41
And in STAC, the geometries that you get via the items endpoint are metadata about the actual data files. So that's the main difference there. The API is extensible as well. It has various extensions that you can reuse. It was also submitted as a community standard. And we're also working on a new version there, which might be 1.0.1 or 1.1 depending on which changes get in.
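As a hedged illustration of item search, the snippet below assembles a JSON body that a client could send in a POST request to an API's search endpoint (assumed here to live at `/search`). The collection ID, bounding box, dates, and the function name `build_search_body` are made up, and nothing is sent over the network.

```python
import json


def build_search_body(collections, bbox, start, end, limit=10):
    """Assemble the JSON body for an item search request."""
    return {
        "collections": list(collections),
        # Bounding box as [west, south, east, north]
        "bbox": list(bbox),
        # Closed time interval, written as start/end
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


body = build_search_body(
    collections=["s2-l2a"],
    bbox=(26.6, 58.3, 26.8, 58.4),
    start="2024-07-01T00:00:00Z",
    end="2024-07-02T00:00:00Z",
)
print(json.dumps(body, indent=2))
```

The response to such a search would be a list of items, so the geometries in it describe the footprints of the data files and are not the data itself.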
14:05
That update waits for STAC 1.1, and then afterwards the STAC API will be updated. As for the API extensions, there are 15, I think, right now, or maybe 16. The new ones that are there are very collection related.
14:21
The focus in STAC before was more on item search, but now there is collection search as well. There are collection transactions. Transactions means you can update the collections via an HTTP API, change the keywords, the description or the title or something like that. And then free-text search was added, also based on OGC API Records,
14:41
which you can use to just freely search in any text through the API. We're waiting for further changes in the OGC APIs. We are waiting for them to release CQL2 and transactions in the final, stable state,
15:02
and then we follow that and just adopt it as well. So these are not marked as stable right now but will be in the future, once that is through in OGC. There are various new tools in the ecosystem. I didn't know people were still using ASP.NET,
15:20
but there is now a STAC tool for ASP.NET. Welcome, but I'm not sure how many people use that. There is a new Rust client similar to PySTAC: PgSTAC-RS, or what is it called? No, it should be PySTAC-RS. I think that's a typo. There are Julia-based clients for the Julia community
15:42
that is growing: STAC.jl and STACCube.jl. rstac, which was previously an API client only, is now available in version 1.0 and can read and create static STAC catalogs as well. In the JavaScript ecosystem, we have a new version of STAC Browser, which for example implements authentication.
16:02
So before, you had to have a fully public, freely accessible catalog. That is not the case anymore. You can, for example, authenticate through OpenID Connect. There is ol-stac, which is a mapping plugin for OpenLayers that can easily render
16:21
the STAC elements that you throw at it, and stac-js as a simple drop-in helper if you are handling STAC entities in JavaScript. There is a new tool to simplify the downloading of STAC assets in Python. Before, it was a little bit difficult to actually get the data to your computer
16:41
through PySTAC itself. That is now simplified through this command line tool. PySTAC 1.1 simplifies how you can interact with extensions, which might help some people because before it was a little bit difficult. It will also implement the bands changes already, and so on.
17:02
stac-fastapi has been updated. There is a new backend. Before, I think, there were Elasticsearch and PostgreSQL with pgstac. You can now use MongoDB if that's your backend of choice. And stac-task is a new tool that is meant to run algorithms effectively
17:22
over a large set of data sets. For more, see STAC Index; we have a long list of tools there that you can use. For data sets, this is just an extract of new data sets that were added and made available in STAC. I'll not go through them here;
17:40
just go through them in STAC Index and see whether there's anything of interest for you. A little bit of a timeline here. So we started STAC in 2017, released STAC 1.0 in 2021, and the new version, STAC 1.1, is meant to be released
18:01
in quarter two of 2024, and afterwards, hopefully in quarter four of 2024, we will have the STAC API updated. Yeah, what will be the plan for the future in STAC? Well, we discussed STAC 1.1 and the STAC API updates.
18:22
Then of course, after those have been released, we will try to update the ecosystem as much as possible to the latest specification versions. Due to the low number of changes, I think it is mostly updating the bands-related things; everything else should go relatively smoothly. There should not be a lot of breaking things in there.
18:42
I think ideally none. Yeah, and then we wait for OGC to release the other specifications that we base our API extensions on. And then, because all STAC sprints before were in the US, there are plans to host a STAC sprint in Europe.
19:03
Who's interested? Let us know. We would love to do that at the end of the year or the beginning of next year, to have something here in Europe for people interested in STAC. The PSC will be updated a bit and be more inclusive. It has been about five core contributors in the past, but we will open that up in the future
19:22
and see which role we actually take there and how we can further stabilize STAC to make it more well-governed, let's say. And then of course there is the OGC community standard that hopefully will soon be a thing. There are a couple of resources that you can click and go through if you download the slides
19:41
from the FOSS4G website. I want to welcome everyone to the STAC community call, if you're interested. That's every second Monday at 5 p.m. Central European Time, 18 o'clock Estonian time. If you want to join that, there is a link to the Google group.
20:00
If you join that, then you'll get an invite by email, and I would be happy to see you there. Thanks. I have a question.
20:21
What's the status of the STAC item collection spec? It was pulled out of the 1.0 release. Is it gonna be added back in at 1.1? I mean, the item collection spec is part of the API specification. There is a fragment that we point to now
20:40
in the STAC specification as well, but it's not part of the core static specification, of course. So we just point to it in the STAC API specification and make it more visible so that users can actually see it and find it if they look for it in the wrong place, which happens; I mean, it's hard to navigate all the repositories. There is also a collection collection, which is pretty much the equivalent
21:01
of the item collection for collections. So the item collection is a list of items, pretty much a GeoJSON FeatureCollection, and the collection collection is the equivalent. They both live in the STAC API and are part of that release cycle, yeah. It's just, you can't validate the item collection
21:21
because it's not part of the spec, so it's only part of the API, so. Yeah, I mean it should be validated through the API validator, right? But it might not do that right now. I'm not so into the API validator, yeah. Thanks.
21:41
Other questions? Thanks so much, both of you. With any standard or specification, what's the estimation?
22:01
I mean, are we gonna add and add new fields and more things and it will again become just so big that somebody wants a more lightweight specification? Just asking. Yeah, that's of course a risk that may happen if you have extensible specifications.
22:21
We tried to look after that a bit and tried to at least make it so that there are not like three different extensions for SAR or three different extensions for EO that different people come up with, right? That's one thing of course to do, but otherwise, I mean, as it's extensible and everyone can do what they want,
22:41
you can't really limit that from the top. I mean, that's the difference from, like if you write specifications top down, then you can just say that's it and then you're limited by it. If you write them bottom up, then you can just put everything in there. It allows everything, but then of course, it may make it very complex in the end to support.
23:02
There is a goal right now to do something like profiles of extensions so that you can have best practices: if you have spectral Sentinel-2 data, right, they recommend that you should implement extensions A, B, C, and D, right? And then it hopefully guides you
23:23
towards the right extensions and helps to avoid the proliferation a bit. More questions? Okay, well, thank you very much. There was another question.
23:43
From experience from the field, so to say, will the STAC API be more prevalent than static STAC JSON catalogs, or is it still okay to use static JSON catalogs? It depends a bit on the use case.
24:03
The larger the static catalogs are, the harder it is to go through them and find the corresponding data, right? Because there is no search on top of that. But it's totally fine to have static catalogs in cloud buckets if that's the easiest way to expose it. Do it.
24:20
It's better than having nothing, right? So that's the way to go. And then it might just be that also others just index the data into an API if they find it useful. It at least gives the common layer of, like the common language, and then people can do whatever they want with it in the end.
24:40
I mean, yeah. Thank you for the presentation. So I wonder what's the STAC community position regarding validators? I mean, I saw there are some validators referenced.
25:03
Are they, in a way, evaluated? Are there some validators that are considered official? How do you manage the quality of the validators? Thank you. That's a pretty good question and actually something that people need to be aware of.
25:21
STAC is written a little differently from OGC standards, where there are a lot of requirements and recommendations listed that you can go through one by one and check in test suites. STAC validation primarily relies on JSON Schema, and you can't check everything in JSON Schema that we expressed in the specification.
25:41
So if you validate using one of the tools, then it just goes through the JSON schemas and checks things. But if it says it's valid, it may actually not be valid still because there are some things that are not checked. So that's a bit misleading, of course. It's more like the validator is actually more a tool to figure out what is wrong
26:01
rather than whether it's completely valid. But there is no official validator as such. I think there is one in Node and one in Python; those are the primary ones to use. The Python one has one additional tool that is called stac-check, which does a bit of additional best practice checking
26:22
in the background as well. But there is no test suite as in OGC that checks for full compliance right now. That would be something that is, I guess, valuable. But then, of course, where do you stop? If you start that for all the extensions,
26:40
then you have quite a long way to go. Okay, I think we have to cut it off there because it's time to switch over. Thank you.