Raster Data In GeoServer And GeoTools: Achievements, Issues And Future Developments
Formal Metadata
Number of Parts | 183
License | CC Attribution - NonCommercial - ShareAlike 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/32103 (DOI)
Production Year | 2015
Production Place | Seoul, South Korea
Transcript: English (auto-generated)
00:05
I work for GeoSolutions, an Italy-based company that provides support and services for GeoServer, GeoTools and, to a lesser extent, GeoNetwork and so on. And we are core contributors to all of these projects.
00:22
Today I'm going to talk about raster data in GeoTools and GeoServer: how we process raster data, the recent achievements, the issues and the future developments. So, first a bit of an introduction to the technology stack. This is more or less what our stack looks like.
00:41
At the very bottom we have JAI and JAI Image I/O, which are the base libraries in Java to do serious image processing. On top of them we have ImageIO-Ext, JAI-Ext, and JAI-tools. We build GeoTools on top of them to add spatial referencing and so on
01:01
because all the lower level libraries, well, besides GDAL, are unaware of geography. They are just image processing libraries. And then on top of GeoTools we have GeoServer providing all the services we are used to and all the OGC services. So, the base element is Java Advanced Imaging.
01:24
Java Advanced Imaging is a very nice library to perform pure Java image processing. It's tile-based, which is very important to process very large images because you don't ever need to load the full image in memory,
01:40
but you can handle and process it piece by piece. It has integrated tile caching. It's easily extensible so you can add your own operators and I'll show you in a moment that we extensively leveraged this extensibility. There's the ability to plug in a library that provides native code acceleration
02:01
for the common operations. It supports multi-threading in the computation of the tiles. It has one big issue. It was developed by Sun, now Oracle, on behalf of NASA, which is a very good start, but then the development ended. So, what do we do about it?
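The tile-based idea described above can be sketched in a few lines. This is only an illustration in plain Python, not JAI code: the names `iter_tiles` and `process_by_tiles` are hypothetical, and the whole image is held in a list here for simplicity, whereas a real tile-based pipeline would read and write each tile lazily.

```python
from typing import Callable, Iterator, List, Tuple

Raster = List[List[float]]  # rows of pixel values, a stand-in for a real image


def iter_tiles(width: int, height: int, tile: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x0, y0, x1, y1) tile bounds that cover a width x height image."""
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            # Tiles on the right and bottom edges may be smaller than `tile`.
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)


def process_by_tiles(src: Raster, tile: int, op: Callable[[float], float]) -> Raster:
    """Apply a per-pixel operation one tile at a time."""
    height, width = len(src), len(src[0])
    out = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in iter_tiles(width, height, tile):
        # Only the pixels of the current tile are touched in this step.
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = op(src[y][x])
    return out


image = [[float(x + 10 * y) for x in range(5)] for y in range(3)]
doubled = process_by_tiles(image, tile=2, op=lambda v: v * 2)
```

Because each tile is independent, the tiles could also be handed to separate threads, which is the kind of parallelism the talk refers to.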
02:22
Well, we built JAI-Ext. JAI is a fully pluggable system with interfaces for basically everything, and it allows you to plug in your implementation in place of the default ones. So, we built JAI-Ext on top of JAI
02:41
to replace the most common JAI operations that we use, in order to support the concept of no data. That was not available in JAI, which was meant just for image processing, while no data is more about representing physical phenomena and the lack of them. We added full support for processing ROI,
03:01
region of interest, that is, masks. There was a notion of ROI in JAI, but it was not properly implemented in all operations. We made several fixes. We added support for digital band masks into it and so on. So, that's our current active development of JAI operators.
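To make the no-data concept concrete, here is a minimal sketch in plain Python. It is not the JAI-Ext API: the sentinel value `NODATA` and the function names are assumptions chosen for illustration. The point is only that an operation must propagate or skip missing pixels instead of treating the sentinel as a real measurement.

```python
from typing import List, Optional

Raster = List[List[float]]

NODATA = -9999.0  # hypothetical sentinel marking "no measurement here"


def add_with_nodata(a: Raster, b: Raster, nodata: float = NODATA) -> Raster:
    """Add two rasters; the result is no data wherever either input is no data."""
    return [
        [nodata if (va == nodata or vb == nodata) else va + vb
         for va, vb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def mean_ignoring_nodata(raster: Raster, nodata: float = NODATA) -> Optional[float]:
    """Average only the valid pixels; return None if there are none."""
    valid = [v for row in raster for v in row if v != nodata]
    return sum(valid) / len(valid) if valid else None


a = [[1.0, NODATA], [3.0, 4.0]]
b = [[10.0, 20.0], [NODATA, 40.0]]
total = add_with_nodata(a, b)
average = mean_ignoring_nodata(total)
```

A naive sum would have produced values such as -9979.0, which would then pollute any later statistics or rendering.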
03:26
Then we have JAI-tools. This is a separate project, which is, by the way, probably scheduled to merge with JAI-Ext. This one provides high performance raster processing, extending JAI, but it basically builds on top
03:41
of it instead of replacing bits as JAI-Ext does. So, we have a number of high level operations, such as contour extraction, vectorization, ROI based on JTS geometries and so on, and we have a nice package called Jiffle that implements raster algebra.
04:01
So, it has its own little language to do all the pixel manipulation, it integrates nicely with the JAI processing chain, and on some projects we have exposed it as a WPS process so that everybody can use a high performance library.
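As a toy analogue of a raster algebra language, the sketch below compiles an expression given as text once and then evaluates it for every pixel. It is plain Python, not the raster algebra package from the talk, and its syntax is just Python's; the function name `apply_expression` and the vegetation-index style expression are illustrative assumptions. `eval` is used for brevity and should not be exposed to untrusted input.

```python
import math
from typing import Dict, List

Raster = List[List[float]]


def apply_expression(expression: str, bands: Dict[str, Raster]) -> Raster:
    """Evaluate a per-pixel expression over named bands of equal size."""
    # Compile the text once; the resulting code object is reused for every pixel.
    code = compile(expression, "<raster-algebra>", "eval")
    allowed = {"__builtins__": {}, "sqrt": math.sqrt, "abs": abs, "min": min, "max": max}
    first = next(iter(bands.values()))
    height, width = len(first), len(first[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Bind each band name to the value of that band at this pixel.
            pixel = {name: band[y][x] for name, band in bands.items()}
            row.append(eval(code, allowed, pixel))
        out.append(row)
    return out


red = [[0.1, 0.2], [0.3, 0.4]]
nir = [[0.5, 0.6], [0.3, 0.8]]
index = apply_expression("(nir - red) / (nir + red)", {"red": red, "nir": nir})
```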
04:22
One nice little bit about this one is that it literally compiles all the way down to native code if it's executed long enough because this library parses the text that you have and generates byte code out of it, Java byte code,
04:40
and then the just-in-time compiler, if it sees that there is a hot section in that code, will compile it down to native code. So, it's a really nice way to have a blazing fast raster algebra. Then we have JAI Image I/O. JAI Image I/O builds on top of JAI to connect with the data sources, so the raster data,
05:05
and it allows for deferred data loading. So, not just loading everything in one shot but loading tile by tile from the disk, allowing large-scale processing without having large-scale memory. It is extensible. It has some accelerated code and so on,
05:21
and again, it's a very nice library, but its development stopped. So, we made ImageIO-Ext, which builds on top of the same interfaces as Image I/O but replaces most of the implementations. So, we have our own TIFF reader that can do BigTIFF. We have an integration with TurboJPEG, with Kakadu, and with GDAL for several data sets.
05:43
We built a fast, pure Java PNG encoder that's actually faster than the native PNG encoder that's available in Image I/O. And let's say every time we add the ability to read a new format, we put it in ImageIO-Ext. Then on top of all this, we build GeoTools,
06:02
which adds the geographic concepts on top of the imagery: the representation of georeferenced grids, reading and writing georeferenced formats. So, for example, in ImageIO-Ext, we can read TIFF and the tags, but we don't know what the tags mean. Here in GeoTools, we interpret the tags and derive the georeferencing from them.
06:23
We do all the rendering, the reprojection, all the processing, and so on. And this forms the base for GeoServer. GeoServer adds, on top of all the rendering, image processing, and so on, all the OGC services.
06:42
So, everything that we do today in GeoServer, via WMS, WMTS, WCS, and WPS, actually goes through the whole stack down to GeoTools, which in turn calls all the Ext projects, which in turn are built on top of the base projects.
07:03
So, recent achievements, what's new? JAI-Ext has been worked on quite a bit. We have done quite a bit of performance tuning and bug fixing in it. We have managed to make it run mosaics of images, even if the images are in different color models,
07:22
which was a common requirement from our user base: to have a mosaic of images that are some gray, some color, some paletted, and so on, and put them together in a single mosaic without having to re-convert them into a single color model. And it's new, it has just been integrated into GeoTools,
07:43
but we have some positive reports. We have Tom Kunicki from some weather institute in the United States. He has been using JAI-Ext to replace JAI, and he has been noticing some very significant speedups
08:01
in their processing chain, which flows at 30 gigabits per second. Basically, they have seen speedups between 3 and 68 times, depending on the type of interpolation that they are using. So we are very pleased with this.
08:21
The JAI-Ext integration was included for the first time in GeoTools 14 and GeoServer 2.8. At the moment, it's not enabled by default, because we still have to iron out a few issues. So if we enabled it by default now, we would reap some benefit, but we would have two or three regressions that we still have to iron out.
08:42
So it's something that you can enable by passing a system variable to the virtual machine. So it's there for early testers to play with. And once we fix those few issues, we hope to enable it by default, either in GeoServer 2.9, or later in the GeoServer 2.8 series, when we feel it's ready.
09:05
So what are the benefits of the JAI-Ext integration in GeoTools and GeoServer? Well, end-to-end no data support, which is very important for scientific data, full mask support in all operations, heterogeneous mosaic support, and a number of bug fixes over JAI.
09:22
As I said, the JAI project is basically dead, but we managed to replace the operations and thus fix some bugs. And also, as shown by the comments from Tom Kunicki, we have very interesting speedups in certain operations.
09:41
Not all of them, don't expect 60 times faster everywhere. It happens in some operations. Another recent achievement is raster data masking. Since we now have a full chain supporting ROIs, we started supporting vector masks for raster data,
10:02
so that you can tell us, using a shapefile, or a WKB file, or a WKT file, where in the image the good data is, and separate it from the rest, from the no data. Now, we could do it with the no data support too, but depending on the compression,
10:20
that's not always the best option. Like if you JPEG compress your data, or JPEG 2000 compress your data, at the border between the good data and the black, let's say the bad data, you have an area in which some black gets into the good data, some good data gets out in the black, and well, if you try to display it,
10:41
you will find a blackish outline around your imagery, which is due to the lossy compression algorithm. If you give us a proper outline instead, we are able to cut out the exact pixels where we want to show the data.
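Cutting out exact pixels from a vector outline can be sketched as follows. This is a plain Python illustration, not the GeoTools implementation: it assumes the outline is already expressed in pixel coordinates, tests each pixel center with a standard ray-casting point-in-polygon check, and uses hypothetical names (`rasterize_mask`, `apply_mask`).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_mask(polygon: Sequence[Point], width: int, height: int) -> List[List[bool]]:
    """A pixel is kept when its center falls inside the outline."""
    return [[point_in_polygon(col + 0.5, row + 0.5, polygon) for col in range(width)]
            for row in range(height)]


def apply_mask(raster: List[List[float]], mask: List[List[bool]], nodata: float) -> List[List[float]]:
    """Replace every pixel outside the mask with the no-data value."""
    return [[v if keep else nodata for v, keep in zip(r, m)] for r, m in zip(raster, mask)]


outline = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
mask = rasterize_mask(outline, width=5, height=4)
masked = apply_mask([[7.0] * 5 for _ in range(4)], mask, nodata=-9999.0)
```

Since the decision is made per pixel from the outline, compression artifacts near the border never become visible, whatever values the pixels outside the outline hold.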
11:01
Vector outlines are nice, they are compact and so on, but they have a little problem. They rescale linearly, while pixels scale down by integers instead. You cannot have half or a quarter of a pixel, it's just a pixel. So when you are scaling down, say you start with 10 pixels and you scale down three times, you have to decide whether you end up with three pixels
11:23
or four pixels, you cannot have 3.3 pixels. And the vector outline instead scales down linearly. So how do we match one to the other? It's complicated. It's nicer to also have support for raster masks.
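The arithmetic of the 10 pixel example can be written out directly. This is a small Python sketch with hypothetical function names; it only shows that the raster has to round while the vector outline does not, which is where the mismatch comes from.

```python
import math
from typing import Callable


def scaled_pixel_count(pixels: int, factor: float,
                       rounding: Callable[[float], int] = math.ceil) -> int:
    """Rasters can only hold whole pixels, so a rounding rule has to be chosen."""
    return max(1, rounding(pixels / factor))


def scaled_vector_extent(extent: float, factor: float) -> float:
    """A vector outline scales linearly and keeps the fractional part."""
    return extent / factor


exact = scaled_vector_extent(10.0, 3.0)             # 3.333...
floor_px = scaled_pixel_count(10, 3.0, math.floor)  # 3 pixels
ceil_px = scaled_pixel_count(10, 3.0, math.ceil)    # 4 pixels

# Whichever rounding is picked, the outline edge lands inside a pixel,
# off by a fraction of a pixel from the raster edge.
mismatch_floor = exact - floor_px
mismatch_ceil = ceil_px - exact
```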
11:41
Raster masks are black and white raster data, normally embedded in your files, and GDAL does that, for example. And they have the property of scaling down just as the data they are masking because they are also made of pixels. They can compress very, very well
12:01
so they don't increase the size of your data a lot. And again, they are used for the same purposes, just that they scale down exactly as the data they are supposed to mask. Oh, and when you have overviews in the file, you can have overviews of the masks as well.
12:21
We have a number of meteorology and oceanography specific improvements, such as support for rasters in the longitude range between zero and 360 degrees. Normally, the data would be minus 180 plus 180, but many meteorological institutions
12:42
like to have their data in zero to 360 instead. And that poses a bit of a problem when trying to process or render that data. This is not just a decision of some mad scientist. Sometimes you have the sensors that are really collecting data on both ends of the dateline.
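The two longitude conventions and the dateline problem can be illustrated with a short sketch. This is plain Python with hypothetical function names, not the GeoTools code; `split_at_dateline` assumes its input satisfies 0 <= west <= east <= 360.

```python
from typing import List, Tuple


def to_minus180_180(lon: float) -> float:
    """Map any longitude in degrees into the [-180, 180) convention."""
    return ((lon + 180.0) % 360.0) - 180.0


def to_0_360(lon: float) -> float:
    """Map any longitude in degrees into the [0, 360) convention."""
    return lon % 360.0


def split_at_dateline(west: float, east: float) -> List[Tuple[float, float]]:
    """Express a 0..360 longitude range as one or two -180..180 ranges."""
    if east <= 180.0:
        return [(west, east)]                    # entirely in the eastern hemisphere
    if west >= 180.0:
        return [(west - 360.0, east - 360.0)]    # entirely in the western hemisphere
    # The range crosses the dateline: it becomes two pieces in -180..180.
    return [(west, 180.0), (-180.0, east - 360.0)]


pieces = split_at_dateline(170.0, 200.0)
```

A sensor footprint from 170 to 200 degrees is one continuous strip in the zero to 360 convention, but it becomes two separate pieces in minus 180 to 180, which is why rendering and reprojecting such data needs special handling.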
13:01
So the data is seamlessly going through it and we need to be able to process it. So we added support for it. This image that I have is from another sensor very close to the dateline, somewhere in the Pacific. And it's physically gathering data
13:21
that is crossing the dateline. So we need to be able to display it properly and reproject it properly, which is even more complicated. Displaying it is just fine. Much of scientific data is in the NetCDF format and many scientists want to have their data
13:43
in as precise a representation as possible. So they normally choose local projections. And when I say local projections, I don't mean they choose the typical projection of the state they are in. It's centered on the very weather station
14:00
where the data comes from. So it's completely custom. It's not part of the official EPSG database. We added support for reading that and for representing it with custom codes in a seamless way so that if you prepare a database of these custom projections, then they flow quite nicely up from the data
14:22
into the user interface and out of the services. So again, this image shows reading the headers from a NetCDF, mapping it into the right projection and showing it in output. Along with that, for NetCDF, we have a number of improvements
14:41
for the NetCDF output format, which is a WCS output format. It's the output format that allows you to download the multidimensional data. So in WCS, you can select a cube of data, longitude, latitude, a bounding box, and then an elevation range, and then a time range. So you have a four-dimensional cube that you want to represent and download.
15:01
You can do that with a NetCDF output format. And we added support for the standard names in the climate and forecast convention, custom data packing. Data packing is this notion that you don't want to have the true numbers of your physical phenomenon, but you provide an offset
15:21
and a multiplying factor, and then you just keep integer numbers. It's for compression purposes. We added support for unit of measure conversion, maybe your data is in feet but you want to publish it in meters, and for NetCDF-wide attributes.
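Data packing with an offset and a multiplying factor can be sketched like this. The code is plain Python, not the GeoServer NetCDF encoder; it follows the usual relation unpacked = packed * scale_factor + add_offset, and the temperature values, the chosen scale and offset, and the function names are illustrative assumptions.

```python
from typing import List

FEET_TO_METERS = 0.3048  # exact by definition of the international foot


def pack(values: List[float], scale_factor: float, add_offset: float) -> List[int]:
    """Store each value as an integer: packed = round((value - offset) / scale)."""
    return [round((v - add_offset) / scale_factor) for v in values]


def unpack(packed: List[int], scale_factor: float, add_offset: float) -> List[float]:
    """Recover approximate values: value = packed * scale + offset."""
    return [p * scale_factor + add_offset for p in packed]


temperatures_k = [250.0, 273.15, 300.37]
scale, offset = 0.01, 273.15
packed = pack(temperatures_k, scale, offset)
restored = unpack(packed, scale, offset)

# Unit conversion is a separate, simpler step: feet to meters.
elevations_m = [feet * FEET_TO_METERS for feet in (0.0, 1000.0)]
```

The integers need fewer bytes than floating point numbers and compress better, at the cost of a rounding error of at most half the scale factor.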
15:44
Some other bits that we did: we extended the advanced projection handling to rasters. I gave a presentation, probably badly named, on this topic. It was Mapping Beyond 3857, where 3857 is Web Mercator. Advanced projection handling is about dealing
16:01
with the vagaries of the various projections when you have a global data set and you want to display it in each and every projection possible and available. And this improves the representation of raster data sets when you reproject them into some unfriendly projections. This is a global rain data set
16:22
that was projected in polar stereographic, before the cure and after the cure. We have improved the contrast enhancement quite a bit. So if you are contrast enhancing images, now you can control in more detail the algorithm used and the parameters for it.
16:42
So here I have an image of Seoul, and I made two different contrast enhancements, one with a stretch to min max to emphasize more or less the human visible features, and another with clipping, which basically shows me only the clouds instead.
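The difference between the two enhancements can be sketched with a pair of small functions. This is plain Python with hypothetical names and made-up sample values, not the GeoServer styling engine: one function stretches the full data range to 0..255, the other first clips to a chosen range and then stretches that range.

```python
from typing import List


def stretch_min_max(values: List[float], out_min: int = 0, out_max: int = 255) -> List[int]:
    """Linearly map the data minimum and maximum onto the output range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [out_min for _ in values]
    return [round(out_min + (v - lo) * (out_max - out_min) / (hi - lo)) for v in values]


def clip_then_stretch(values: List[float], lo: float, hi: float,
                      out_min: int = 0, out_max: int = 255) -> List[int]:
    """Saturate values outside [lo, hi], then stretch that range onto the output range."""
    clipped = [min(max(v, lo), hi) for v in values]
    return [round(out_min + (v - lo) * (out_max - out_min) / (hi - lo)) for v in clipped]


samples = [0.0, 256.0, 510.0, 2000.0]
stretched = stretch_min_max(samples)
clipped = clip_then_stretch(samples, lo=0.0, hi=510.0)
```

With the min max stretch a single very bright value compresses everything else into the dark end, while clipping spends the whole output range on the chosen interval and saturates the rest.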
17:04
Just a quick note about what we are working on at the moment. We have a very nice feature that just missed the feature freeze of GeoServer 2.8 by like one week, but there was nothing we could do about it. In GeoServer, the release model is time boxed, so when it's time to cut, we cut,
17:21
and we don't look back. So far we never had the ability to apply hill shading or shaded relief to a digital elevation model or whatever surface you want. We are about to add it. So you have the ability to control the relief factor and other bits,
17:42
and here is an example of a global digital elevation model over Korea with hill shading applied to it. As you can see, the mountains are very much more visible. At the moment, we are using a fixed sun position, but the algorithm inside the chain actually is able to take the position of the sun,
18:02
azimuth and elevation, into account, so we are planning to expose those as vendor parameters because SLD doesn't have such a notion, but it would still be nice to put the sun in the right position when we hill shade a map. Finally, we are looking for help and funding
18:22
to build a full JAI replacement. As I told you, we already replaced most of the meat, most of the implementation of the operators in JAI, but we are still relying on the interfaces and some basic classes from JAI. We would like to become completely independent from that
18:43
and have our own fully maintainable version of JAI, possibly licensed in a different way, and so we are looking for help and funding to get there because it's a huge amount of work.
19:03
So if anybody here wants to chip in, please do. And with this note, I complete my presentation. Is there any question?
19:21
One here, one there. Just to clarify, the JAI-Ext work you're doing, it increases the performance significantly in some cases
19:41
for working with raster data. Does it also affect just normal WMS rendering for vector data? No, for vector data, no. Because JAI is not involved at all in a WMS request that only involves vector data.
20:00
It speeds up, or it may speed up, we still have to test it, cases in which you are publishing raster data, because we use JAI for cropping, scaling down, reprojecting, yeah. So those are the cases that get the improvement. And according to Tom Kunicki, the use case that was improved by this much
20:24
was processing floating point data. So actually grids with pressure, concentration of some pollutant or whatever. Stuff like digital elevation models, basically. Yeah, excellent.
20:45
For working with those formats where you have both GDAL data drivers and native JAI-Ext, do you recommend using one over the other for performance, for flexibility? Generally speaking, linking to a native library from Java
21:01
is calling for trouble. So if there is a pure Java version of the code that I can run, I would suggest using that. For example, while we could read GeoTIFF using GDAL, we suggest not to. There is a performance reason.
21:20
You have to cross the JNI barrier going from the Java way of managing the CPU registers and so on to the C way. And when you go back, Java cleans up the registers, which comes at a price. And the second issue is that many of the native libraries were not built from the beginning
21:40
to support multi-threading. They got there bit by bit. But under heavy stress, they can still crash. And if the native library attached to a Java process goes down, it takes down the entire process. So this is not CGI. If GeoServer ever goes down, it stays down. It's not going to get restarted by Apache.
22:03
So there's a risk. Thanks.