ESA User Services Powered By Open Source
Formal Metadata
Title: ESA User Services Powered By Open Source
Title of Series: FOSS4G Nottingham 2013 (part 27 of 95)
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported
DOI: 10.5446/15527
Production Place: Nottingham
Transcript: English (auto-generated)
00:12
Yes, hello everybody. Good afternoon. I'm going to talk about the European Space Agency and how they use open source in one particular context. So let me briefly introduce the
00:24
team. That's Thomas, who is sitting here in the third row. He is famous for the MapCache project and is a MapServer committer. And then it's Fabian and myself from EOX in Austria. We are also MapServer committers and we're part of the project I'm talking about.
00:44
So what I'm going to talk about is, okay, a brief introduction to the whole ngEO project, as it is called. That's the next generation user services from ESA. I will briefly introduce the component that we are delivering; that's the Browse Server.
01:01
Talk about some MapCache enhancements that Thomas did, mostly or exclusively maybe. And after all this boring stuff, I will come to the fun part and try to give a live demonstration. Let's see how this is going to work out. Yes, okay. The European Space Agency, they are trying to make a renewal of their
01:24
existing user services. So they have big archives of satellite data, and they're still growing, and they want to provide access to those archives, or they need to provide access to these archives. So what they want to have with this new version is fully online data access to the services. What data is it? In the future, you may have heard of the GMES programme,
01:49
the big European initiative; it's the Global Monitoring for Environment and Security. It got renamed a couple of months ago to Copernicus, but nobody's using that name. So that will be in there. That is a lot of the Sentinel satellites,
02:04
Sentinel-1, -2, -3, and big data. And I think Sentinel-2 is comparable to the latest Landsat mission, sort of. And of course, they have all their legacy missions. They already have like the whole Envisat archive and others. And third-party missions where they buy data in and so on and so forth. Another big objective of this project is that they want to
02:26
have a fully centralized configuration management thingy. So we needed to add this to the software. Okay, I'll come to that in a second; you'll see it on the architecture diagram.
02:41
Nope. Another big objective is that it should be a generic system usable for other PDGSs, payload data ground segments, as ESA calls them. So ESA wants to have the software to provide to other satellite mission operators if they want to reuse it. Okay, that's for the ESA project. Our part in this whole story is the so-called Browse Server.
03:07
So the users come to the archive, they search for data and they want to view what they have. So that's what the browse images are for. And what we're using, we're providing
03:20
the standardized access to those browse images. So, a big number of images. They're all two-dimensional, but you have the time dimension, of course. So what we need is to access single images in time, but also to have multiple images, like a whole month, and so on; typical queries on satellite archives. So in the middle, you can see what we call the Browse Server.
03:46
We are reusing a software stack that's well known, I think: it's MapServer in the middle, for the raster processing, input, output, everything. And we have on top of that, for the
04:00
metadata management, what we call EOxServer. So that's written in Python and it's accessing MapServer internals via MapScript. And we store the metadata in EOxServer. So we don't have a map file; we have all this in the Python library. And this stack provides an internal WMS
04:22
interface that is then used by MapCache to feed the cache. In the end, MapCache serves WMTS and WMS, so Web Map Tile Service and Web Map Service, for the viewing in the web client.
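To make that stack a little more concrete, here is a minimal sketch of driving MapServer from Python via MapScript with no map file on disk, which is the general approach described above; the layer name, data path and request parameters are made up for illustration, and this is not the actual EOxServer code:

```python
# Minimal sketch: configure a MapServer raster layer from Python via MapScript,
# with no map file on disk. Names, paths and the request are made up for
# illustration; this is not the actual EOxServer code.
import mapscript

map_obj = mapscript.mapObj()                        # empty in-memory map
map_obj.setProjection("init=epsg:4326")
map_obj.setExtent(-180, -90, 180, 90)
map_obj.setSize(256, 256)
map_obj.web.metadata.set("ows_title", "browse")     # minimal OWS metadata
map_obj.web.metadata.set("ows_enable_request", "*")
map_obj.web.metadata.set("ows_srs", "EPSG:4326")

layer = mapscript.layerObj(map_obj)                 # appended to the map
layer.name = "browse_example"                       # hypothetical layer name
layer.type = mapscript.MS_LAYER_RASTER
layer.status = mapscript.MS_ON
layer.data = "/data/optimized/browse_example.tif"   # hypothetical pre-processed GeoTIFF

# Dispatch a WMS GetMap request against this in-memory configuration,
# i.e. the kind of internal WMS call MapCache would make while seeding.
req = mapscript.OWSRequest()
for key, value in [("SERVICE", "WMS"), ("VERSION", "1.1.1"), ("REQUEST", "GetMap"),
                   ("LAYERS", "browse_example"), ("STYLES", ""),
                   ("SRS", "EPSG:4326"), ("BBOX", "-1.3,52.9,-1.0,53.0"),
                   ("WIDTH", "256"), ("HEIGHT", "256"), ("FORMAT", "image/png")]:
    req.setParameter(key, value)

mapscript.msIO_installStdoutToBuffer()              # capture the raw response
map_obj.OWSDispatch(req)
response_bytes = mapscript.msIO_getStdoutBufferBytes()
```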
04:44
A bit of a complication is that we also need to take authorization into account. So that's a feature that we will have to implement. It's not yet done, but we'll use Shibboleth, if somebody knows that, for the authentication. And then we have the authorization, which goes to another component of the ngEO system. And we have to cache the decision so that we can be really performant also with authorization.
05:06
Another objective, as I said, is the central configuration. So it's the ngEO controller component that sends us new layer configurations, for example. So we have a new dataset, we have a new satellite mission that we want to add; here is the configuration as XML, configure it. Or we
05:23
want to have an additional projection for this dataset, please add this. And of course, how does the data come in? There's an ngEO feed component where we have defined another XML interface (ESA loves XML these days) that describes the data that are to be ingested.
05:44
First, we get the browse images themselves. I'll come to that also in a second. And then it gets triggered via an XML sent to us, which includes all the metadata information we need. Okay, let's see, did I forget anything? Entirely based on open source; that's
06:14
the one-line version, I would say. I mentioned pre-processing. So we get images, we get various different kinds of images, mainly JPEGs. Basically, there are three ways how we get the
06:26
geo-referencing information. So I start with the last one, because it's the easiest: it's already geo-referenced, we win. The other option is that we get a regular grid of tie points. Okay, that's encoded in XML. You can imagine how it looks, a bit ugly, but okay, it works.
06:43
And then the most common way we get the images is with a footprint. So we have an XML structure; it says the first pixel is at this position on Earth, and then it goes around, and we have to pre-process that image. So what we are doing in the pre-processing step is several optimizations. First of all,
07:02
we need to know the footprint in geographic coordinates as a polygon. Then we reproject the image already into the projection that we need to serve in the end, in order to be performant. We add, of course, an alpha channel, because usually the reprojection means we have these black triangles. I guess everybody has seen those.
07:22
And we add internal optimizations like tiling and overviews, and store GeoTIFFs in the end. We also add compression, because the limiting factor here is the storage space. Because we seed everything into the cache, we don't have the big problem that we need to be performant on the WMS interface.
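As an illustration of this pre-processing step, a rough sketch with the GDAL Python bindings might look as follows; file names, target projection and overview levels are assumptions, and the real ingestion chain does more (footprint and tie-point handling, for instance):

```python
# Rough sketch of the described optimizations with the GDAL Python bindings
# (illustrative only; paths, projection and settings are assumptions).
from osgeo import gdal

gdal.UseExceptions()

src = "browse_raw.tif"          # hypothetical geo-referenced input
dst = "browse_optimized.tif"    # pre-processed result served via the internal WMS

# Reproject into the projection we serve in the end, adding an alpha band so the
# "black triangles" around the rotated footprint become transparent.
gdal.Warp(
    dst, src,
    dstSRS="EPSG:4326",
    dstAlpha=True,
    format="GTiff",
    creationOptions=["TILED=YES", "COMPRESS=DEFLATE"],  # internal tiling + compression
)

# Add overviews so lower zoom levels can be read quickly.
ds = gdal.Open(dst, gdal.GA_Update)
ds.BuildOverviews("AVERAGE", [2, 4, 8, 16])
ds = None  # flush and close
```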
07:40
Okay, now coming to what was added to MapCache. First of all, as I already said, it's the time dimension support. So as I said, we have archives of lots of satellite images, and we want to be able to access each single image, but also intervals,
08:06
time intervals of images that are merged together. So what we have here in the MapCache configuration (I assume you're all familiar to some extent with MapCache): you can add a time dimension, and you say you have a SQLite file. And in this SQLite file,
08:24
I store all the times, the start and end dates, for which I have an image. And then you provide the query for how to access this. So, in this case, that's just the formatting for the time. So you select from the time table, then you get some parameters, like which tileset we are in,
08:44
and the current time. So that's what we get from MapCache as parameters for the query. In addition, we have the bounding box of the current tile that is requested,
09:01
so we use that as well. And then, for performance reasons, or so that it ends eventually, we also have a limit; in this case, we only want to merge 100 tiles. So in the end, what this query does: we get a WMTS request that says give me everything for this particular day, so we have a start time at midnight and an end time at midnight again for this day.
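Such a request could look roughly like the following WMTS KVP GetTile request; the endpoint, layer and tile matrix set names are invented, and only the TIME parameter is the point here:

```python
# Illustrative WMTS GetTile request with a TIME dimension (endpoint, layer and
# tile matrix set names are made up; only the TIME parameter is the point).
from urllib.parse import urlencode

params = {
    "SERVICE": "WMTS",
    "VERSION": "1.0.0",
    "REQUEST": "GetTile",
    "LAYER": "browse_example",
    "STYLE": "default",
    "FORMAT": "image/png",
    "TILEMATRIXSET": "WGS84",
    "TILEMATRIX": "8",
    "TILEROW": "52",
    "TILECOL": "254",
    # One particular day as an ISO 8601 start/end interval: midnight to midnight.
    "TIME": "2013-09-17T00:00:00Z/2013-09-18T00:00:00Z",
}
url = "https://example.com/mapcache/wmts?" + urlencode(params)
print(url)
```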
09:26
And for the tile we get the bounding box. From this query, we get back all the time entries that we have, but OK, of course, only the last 100. And then MapCache goes into the tile cache, merges all these tiles, and serves them as one back to the client.
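Conceptually, that lookup boils down to a small table of per-image start and end times (plus footprints) that is filtered by the requested interval and the tile's bounding box. A toy version, with an illustrative schema and query rather than MapCache's actual ones, could look like this:

```python
# Toy version of the time-index lookup described above (illustrative schema and
# query, not MapCache's actual ones).
import sqlite3

conn = sqlite3.connect("time_index.sqlite")
conn.execute("""CREATE TABLE IF NOT EXISTS times (
                  start_time TEXT, end_time TEXT,
                  minx REAL, miny REAL, maxx REAL, maxy REAL)""")

def lookup(req_start, req_end, bbox, limit=100):
    """Return at most `limit` of the newest entries overlapping the requested
    time interval and the bounding box of the requested tile."""
    minx, miny, maxx, maxy = bbox
    return conn.execute(
        """SELECT start_time, end_time FROM times
           WHERE start_time < ? AND end_time > ?
             AND minx < ? AND maxx > ? AND miny < ? AND maxy > ?
           ORDER BY start_time DESC LIMIT ?""",
        (req_end, req_start, maxx, minx, maxy, miny, limit),
    ).fetchall()

# Everything for one particular day within a tile's bounding box:
entries = lookup("2013-09-17T00:00:00Z", "2013-09-18T00:00:00Z",
                 (-1.3, 52.9, -1.0, 53.0))
```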
09:45
I see questions, I guess. Any questions directly on this? No? OK. Can you make sure that your time values in that SQLite dataset directly match what you've
10:04
ingested into the cache? Yes. Not a direct relationship. Yes, exactly. As you put new stuff into the cache, it's not automatically going into that SQLite dataset. No, that SQLite file has to be maintained externally. Yeah, exactly. But I would put it the other way around: you have first the SQLite file and then you seed
10:20
everything into the cache. Because in the cache, you can then, of course, use other functionality. And this is, for example, very important for us: the read-only functionality. As you can imagine, each satellite image covers only a small portion of the Earth and we don't want to seed the whole world. But still, the layer itself needs to serve the whole world. So that's why we use a bounding box in the seeding request.
10:44
And with the read-only setting, we make sure that for all tiles that are not in the cache, it is assumed that they are empty and empty images are returned. So we have a fully pre-seeded cache. Given the possibility that we include the bounding box in the query,
11:01
we wouldn't necessarily need to make it completely pre-seeded, because for the seeding step it would always go first to the query. But in the beginning, we didn't have this. By the way, all these functionalities are available in the version that was released four or five days ago. Another interesting feature is this
11:25
max-cached-zoom configuration. So browse images, for example, have a certain resolution, and it makes no sense to go beyond that resolution in seeding the cache or caching this data. So what we do is, like, zoom level 10, or what is it here,
11:41
eight, is configured as the highest zoom level that is actually cached. And afterwards, zoom levels nine, ten, and so on are computed from zoom level eight. You don't need them; there's no new information anyway. Oops.
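The idea behind that can be sketched with a bit of tile arithmetic: for a requested zoom level above the highest cached one, find the parent tile at the maximum cached zoom and the pixel window inside it that has to be cropped and upscaled. A schematic sketch, ignoring the grid's row/column orientation details:

```python
# Schematic tile arithmetic for serving zoom levels above the highest cached one:
# locate the parent tile at the max cached zoom and the pixel window inside it
# that would be cropped and upscaled (grid orientation details ignored).
TILE_SIZE = 256

def parent_window(z, x, y, max_cached_zoom=8):
    if z <= max_cached_zoom:
        return (z, x, y), (0, 0, TILE_SIZE, TILE_SIZE)
    scale = 2 ** (z - max_cached_zoom)   # requested tiles per cached tile, per axis
    parent = (max_cached_zoom, x // scale, y // scale)
    sub = TILE_SIZE // scale             # size of the window in the cached tile
    off_x = (x % scale) * sub
    off_y = (y % scale) * sub
    return parent, (off_x, off_y, off_x + sub, off_y + sub)

# Tile (10, 1017, 210) would be cut out of cached tile (8, 254, 52) as a 64x64
# window and upscaled to 256x256; no new information is needed.
print(parent_window(10, 1017, 210))
```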
12:08
Another interesting optimization is because we're using SQLite caches. In SQLite caches, Thomas did a very nice trick: for single-colour images, he's actually storing just one, well, it's nine bytes, but just the one value of that colour.
12:21
And then, on the fly, it's automatically expanded to a one-colour PNG image. That's very nice when you want to save on disk space.
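The trick can be illustrated roughly like this; it is a toy sketch of the concept, not MapCache's actual implementation or byte layout:

```python
# Toy sketch of the single-colour trick: uniform tiles are stored as a few bytes
# and expanded back to a PNG on read (not MapCache's actual byte layout).
import io
from PIL import Image

def compact_tile(png_bytes):
    """Return a tiny blob for uniform tiles, else the original PNG."""
    img = Image.open(io.BytesIO(png_bytes)).convert("RGBA")
    colors = img.getcolors(maxcolors=1)   # None unless the tile has one single colour
    if colors:
        _, rgba = colors[0]
        return b"\x01" + bytes(rgba)      # 5 bytes: marker + RGBA value
    return b"\x00" + png_bytes

def expand_tile(blob, size=256):
    """Re-create a PNG on the fly from a compacted blob."""
    if blob[:1] == b"\x01":
        img = Image.new("RGBA", (size, size), tuple(blob[1:5]))
        out = io.BytesIO()
        img.save(out, format="PNG")
        return out.getvalue()
    return blob[1:]
```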
12:47
Okay, let's go to the fun part. Please turn off your internet connections. Yay. Okay. So by the way, that's also our submission to the Opening up the Map challenge; it's called Changing Times. If you want to vote, go ahead. So what you can see here,
13:02
so I selected a time, already pre-selected, where an image of Nottingham is visible. So of course you can change the timeline, the time interval that you're showing. We're then showing this warning that only the latest hundred time values that are available are shown. But as you can see, that's all live, that's really done on the server. It's
13:25
merging all the tiles into one and sending up to 100. Okay. So you can scroll through the story
13:41
and it tells some story here. You can read through it later if you want. Okay. Now, yeah, change the time. As you see, we lost some images because they're simply not in the time interval we are now showing. Okay. That, by the way, that's Vienna,
14:02
where we are from, just if you're interested. Yeah, I just wanted to show you that it's really working, that MapCache is really well-performing. It's coming live from servers in Austria, or maybe Germany, I don't know. Okay. Another interesting thing is
14:30
what, what are you seeing here? Maybe I should give you those numbers. That's a dataset from ESA. It's around 18,000 images. The cache is around 10 gigabytes,
14:42
an SQLite cache. So this is an overview of the whole dataset. It's seeded as one layer covering all 18,000 images. So this is not done now, the merging is not done on the fly; that's pretty easy. But if we switch back to the,
15:03
that's done live now. Oops. Okay. So you see, even on this low connection, nice.
15:31
I think I should give credit to MapBox, from whom we borrowed this idea with the story. Okay, let's go back to the presentation.
15:45
Yay. Okay. I can skip the backup screenshots; I think it worked. Conclusions. Every presentation has to have conclusions. Okay, good on time. So we see that there are mature open source software building blocks
16:02
that can be readily reused for operational software. That's our main conclusion. In the project, the quality review, as it is called, for the first version of all the components is just about over. Our component is finished, albeit a bit late, but then it should really start going into operations anytime soon.
16:27
That's another very interesting conclusion. As we all know, open source software allows for easy adaptations. If you need, like I said, these four enhancements for MapCache, it was basically one talk with Thomas and we did it. That's very cool.
16:47
Yeah. And of course, that's the important part, that all this new functionality, sponsored by ESA in the end, is now available to everybody. So you can immediately use it; it's already released. And as we heard already in the keynote, for example, about MapStory,
17:02
time access is important. I have to acknowledge ESA; they sponsored this, they gave us the money. Okay, done that. So thank you, everybody. Any questions?
17:32
Could you talk a little bit about scalability? Like, is the intention that this will handle millions of scenes?
17:43
That's the idea, yes. So, yeah, that's definitely the requirement; I mean, the Sentinel missions will acquire millions of scenes and they want to handle it with that system. So, can you talk at all about how that's going to be accomplished, or is accomplished? We will see. So at the moment, what we've done is this 18,000
18:03
images cache, and we didn't have any big problem. I must say as well that the seeding step is the slowest part of it. Then there is also the issue that we sometimes have to have distinct time intervals in
18:21
the cache, so they are not allowed to overlap. So if we get overlapping time intervals, we merge them into one. And of course, then we have to un-seed what we had already, at least in the overlapping parts of the cache. And this is particularly slow because we're using an SQLite cache. So the un-seeding is the really slow part,
18:42
but we think in the next evolution of the system, we will make the seeding completely asynchronous. At the moment we seed right after each ingest; we will make it asynchronous. So then we can wait for several images and do only one seeding step if we need to merge some of them. Do you know when the last image comes in and you need to start
19:04
seeding? I mean, do you seed as each image comes in, or do you seed once you have all the images? No, as I said, the process is that for each image that we receive, the ingestion is triggered, and we run through the whole pre-processing and the seeding. And,
19:22
as I said, in the next evolution, we want to switch this so that we just run through the pre-processing, since we have the internal WMS interface, and then we seed only, I don't know, once an hour or whatever. And then how big do you expect each cache to be? Huh, good question, actually. I don't know; I mean, there is this limit with,
19:46
wow, around one terabyte at the moment. I think we will go in, we're going to, okay. That's good to know. But I guess for the time axis, we,
20:06
that won't work at the moment. But it's a SQL-like interface. Is it? Ah, okay. We need to talk. I have a quick question. Yes. I mean, is it possible to identify the source of each image with the tool you've just shown us?
20:26
What do you mean by source? For example, is it possible to know whether the image is coming from RapidEye or GeoEye or...? Ah, okay. So what I've been showing you in the live demonstration is just a simple OpenLayers interface; that's not the interface
20:41
that is of the ngEO project, of the user services, because we are doing just the Browse Server. So the actual interface of the user services will be of a different kind, also based on OpenLayers, of course. But there you will have to log in for certain datasets. And there you will definitely see which datasets, so which satellite missions, you're searching.
21:02
And so the results are exactly for those missions that you're searching. And there is even the possibility that, so, you're also searching against the catalog and you can highlight single scenes and they get highlighted also in the map. And that's why we had this important requirement that we need to be able to access
21:21
single images on the WMTS interface. No, those images are all in that cache. In fact, you have 18,000 separate pyramids, but just with lots of blanks, because it's just hitting the
21:46
bounding box of the images. So for each scene, you have the complete pyramid from zoom level zero to whatever zoom level you have configured, like eight was in the example.
22:01
But only in the area where you actually have data. Oh yeah. Oh yeah. In the operations. Okay. Yeah. We are implementing the generic system; we are not responsible for the operations in the end. So we hand this over. Okay. Yes, sure. In the end, to put it short, it adds the EO-WCS, or the Earth Observation
22:37
extension of WCS, to MapServer. What does that mean? It adds, like, the storage of metadata. So
22:46
you have additional metadata in the EO coverages, as we call them. In particular, you have the time start and end and the footprint. And what we add as well is an additional operation in WCS, DescribeEOCoverageSet, where you can make a spatio-
23:05
temporal search on those coverages. So that's why we need the time and footprint. And then we have two data types for how to group coverages. One is the dataset series; that's what we use here. That's for inhomogeneous grouping, but you still have one layer, or one coverage, you can
23:23
query, one dataset series you can query, to retrieve all these sub-coverages. And in WMS, you have one layer and all the sub-images are layers of their own, but you can still retrieve the single images via the time axis. And the other one is
23:42
for mosaics. And if you want to have a more elaborate answer, we have a talk tomorrow at, I think, 11:30, somewhere; I don't know where.
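As a rough illustration of that DescribeEOCoverageSet operation, a KVP request against an EO-WCS endpoint might look something like the sketch below; the endpoint and dataset series identifier are invented, and the exact parameter names should be checked against the EO-WCS specification and the EOxServer documentation rather than taken from here:

```python
# Rough illustration of an EO-WCS DescribeEOCoverageSet request (endpoint and
# dataset series id are invented; check parameter names against the EO-WCS spec).
from urllib.parse import urlencode

params = [
    ("service", "WCS"),
    ("version", "2.0.1"),
    ("request", "DescribeEOCoverageSet"),
    ("eoid", "example_dataset_series"),   # hypothetical dataset series identifier
    # spatio-temporal subsetting: time interval plus a lat/long window
    ("subset", 'phenomenonTime("2013-09-01T00:00Z","2013-09-18T00:00Z")'),
    ("subset", "Lat(52,54)"),
    ("subset", "Long(-2,0)"),
]
url = "https://example.com/eo-wcs?" + urlencode(params)
print(url)
```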
24:03
And by the way, I was supposed to mention this: in the background of this map that I was showing there, you see the nice terrain layer. That's the submission of York into the Opening up the Map challenge; I simply reused it. Okay. You talk about satellite scenes. So who splits up the satellite data into scenes in the first
24:21
place? Because a lot of scientific users actually like to have the continuously acquired swath. So when you talk about all these missions, are you assuming that somebody is splitting up the swath into scenes before you get the data? Yeah. I mean, I know, for example, that for Sentinel-2, they are defining
24:41
a hundred-kilometer grid, and they split it up, but they already talked to us that they want to have a feature in the software so that we then merge it again in the browse images. Anyway, those are ESA's policies. So we have well-defined, or most of the time well-defined, interfaces that we are simply implementing. Of course, we are discussing
25:04
and asking, oh, you want to do it that way, couldn't we make it a little different, can't we discuss it, but yeah. Big agencies have interfaces defined and they want to use them. Thank you.