geotiff.js - efficient GeoTIFF exploration in the browser and server
Formal Metadata
Number of Parts | 351
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/69023 (DOI)
Production Year | 2022
Transcript: English (auto-generated)
00:00
Thank you. Yeah, geotiff.js, efficient GeoTIFF exploration in the browser. So geotiff.js is actually a project that's quite dear to my heart. I've been working on it, in collaboration with many, many others, for some time now. So I'm really happy to give a talk about it here. First, a quick outline.
00:22
I'll do it like Daniel. So I'd like to ask you, who knows about GeoTIFF? OK, the other way around. Who does not really know about GeoTIFF? OK, no one really. I see. So you're all very informed. I will bore you with the details anyhow.
00:41
First, there is a thing called the header, which is basically the first part of your GeoTIFF. And then there's a linked list of all the IFDs of the image. Since a GeoTIFF can be comprised of many images, they are all organized in so-called IFDs (image file directories). They store the metadata for the particular image, and they have references to the image data.
01:04
There is no actual inherent structure required. Who of you has a computer science background? OK, quite some. So at least what I learned in computer science is that a linked list is basically the worst-performing
01:20
structure there is. Maybe you have a similar opinion about that. But this is actually where Cloud Optimized GeoTIFF comes in and really improves on that. Because Cloud Optimized GeoTIFF says, well, it's just a TIFF, but we basically define how it is internally structured. So you have the header in the beginning, and then you have immediately the IFDs in a particular order,
01:42
and then you have the image data. It is way more efficient, at least if you have a reader that takes advantage of that, and I'm going to talk about how geotiff.js takes advantage of it.
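To make the header and the IFD linked list concrete, here is a minimal sketch that decodes the 8-byte header of a classic TIFF and then follows the chain of IFD offsets. It is not geotiff.js code; the function names `parseTiffHeader` and `listIfdOffsets` and the hand-built sample bytes are assumptions made for illustration, and BigTIFF is deliberately not handled.

```typescript
// Sketch only: classic TIFF layout (not BigTIFF). Names are hypothetical.
type TiffHeader = { littleEndian: boolean; firstIfdOffset: number };

function parseTiffHeader(bytes: Uint8Array): TiffHeader {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Bytes 0-1: byte-order mark, "II" = little endian, "MM" = big endian.
  const bom = view.getUint16(0, false);
  let littleEndian: boolean;
  if (bom === 0x4949) littleEndian = true;
  else if (bom === 0x4d4d) littleEndian = false;
  else throw new Error("Not a TIFF file: unknown byte-order mark");
  // Bytes 2-3: magic number 42 for classic TIFF.
  if (view.getUint16(2, littleEndian) !== 42) {
    throw new Error("Not a classic TIFF file");
  }
  // Bytes 4-7: byte offset of the first IFD.
  return { littleEndian, firstIfdOffset: view.getUint32(4, littleEndian) };
}

function listIfdOffsets(bytes: Uint8Array): number[] {
  const { littleEndian, firstIfdOffset } = parseTiffHeader(bytes);
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const offsets: number[] = [];
  let offset = firstIfdOffset;
  // Each IFD: uint16 entry count, 12 bytes per entry, then a uint32
  // pointing to the next IFD. A next-offset of 0 ends the linked list.
  while (offset !== 0) {
    offsets.push(offset);
    const entryCount = view.getUint16(offset, littleEndian);
    offset = view.getUint32(offset + 2 + entryCount * 12, littleEndian);
  }
  return offsets;
}

// Hand-built sample: header plus two IFDs with one (zeroed) entry each.
const sampleTiff = new Uint8Array(44);
const sampleView = new DataView(sampleTiff.buffer);
sampleView.setUint16(0, 0x4949, false); // "II"
sampleView.setUint16(2, 42, true); // magic number
sampleView.setUint32(4, 8, true); // first IFD at byte 8
sampleView.setUint16(8, 1, true); // IFD 1: one entry
sampleView.setUint32(22, 26, true); // IFD 1: next IFD at byte 26
sampleView.setUint16(26, 1, true); // IFD 2: one entry
sampleView.setUint32(40, 0, true); // IFD 2: end of list

const sampleIfdOffsets = listIfdOffsets(sampleTiff);
```

In a plain TIFF each hop of this loop may point anywhere in the file, which over HTTP can mean one request per IFD. With the Cloud Optimized layout the IFDs sit right behind the header, so a reader can usually fetch them all with one read of the start of the file.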
02:03
All right, so what actually is geotiff.js? So it is a pure JavaScript reader and also writer for Cloud Optimized GeoTIFF. So I put this in parentheses because I will come back to that later, because there are actually more uses for it outside the geo sphere.
02:20
It aims to be feature complete, whatever that means. I mean, the TIFF spec has basically been fixed for like 20 years now, but there are several extensions. For example, GeoTIFF is an extension to TIFF. Now, Cloud Optimized GeoTIFF adds small bits and pieces, but there's some evolution. So I guess many of you have seen Even's talk about GDAL,
02:40
so you all know there's a new compression algorithm now that we have to implement in geotiff.js, and I'm really looking forward to it. Pull requests are welcome, actually. And it also aims for efficiency. So I come from a computer science background, so efficiency is also very close to my heart. And then there's also a great deal of ease of use.
03:04
So I want to make it as open and easy to use as possible for anyone. But I also want to give some, I call it, expert functionality, so some functions that are really there for expert users who really want to exploit it and really want to get the best performance,
03:21
the best information out of your images. It also aims to be extensible. Coming back to the compression thing, this is actually something that you can provide as plugins. So I hope that, yeah, I said it already, pull requests are welcome. And geotiff.js was really, really lucky
03:43
because the initial commit was in 2015. And right after that, something happened. It really took off. So it was really unexpected, but somehow GeoTIFFs and Cloud Optimized GeoTIFFs were back in vogue and everyone was using them. So geotiff.js was really lucky to be in the right spot
04:01
at the right time. Right. I'd like to talk a bit about the architecture and what actually makes it efficient, or at least what I consider efficient. You have abstraction classes. So it's easy to understand. You have an abstraction class for a GeoTIFF. There's also an abstraction class that combines multiple GeoTIFFs, the MultiGeoTIFF.
04:23
So this is useful for cases when you have the overviews, so the lower-resolution images, in an external file. So this is a common use case. So for this case, we have the MultiGeoTIFF class. Then a level beneath that, which is actually
04:40
abstracting the IFDs that we have talked about earlier, is the GeoTIFFImage. What you can see here is what I would consider the expert API. So you basically have to construct the source, which is an abstraction of where the file is. Then you can construct the GeoTIFF from that source.
05:01
But there's actually an easy way to use it, which is just this fromUrl. Because 90% of the time, you're basically loading it from some web server. So this is the fast track to that. And then you can actually access the image beneath that. Since there can be more than one, you can also pass in the index.
05:27
There it is. OK. The next thing, which we already touched upon, is the source. So it stands in between your TIFF and geotiff.js. So your TIFF may be stored in some HTTP service,
05:43
some S3, maybe it's a local file. Maybe it's something that you want to be able to drag into your browser, a blob or something. So this is the source. This is the abstraction that allows you to actually access the raw byte data underneath. For the most common use cases, we already provide the sources, as we've already seen.
06:02
For example, the HTTP source has some tricks that we are actually using. So for example, HTTP has a concept called range requests, for when you just want to get portions of the file. When you just send a GET request to a resource, you will always get the full file.
06:23
But for huge TIFFs, you don't actually want that, because they can be in the order of gigabytes. You don't want to download those at once. So you're using range requests in order to just get portions of the file. And since you're using TIFFs, you actually know which portions you're actually interested in.
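As an illustration of what such a request looks like on the wire, the sketch below builds the value of the HTTP `Range` header for one or several byte ranges. The helper name `buildRangeHeader` and the `ByteRange` type are hypothetical and not part of geotiff.js; only the `bytes=first-last` syntax with inclusive end positions comes from the HTTP specification.

```typescript
// Sketch only: hypothetical helper, not geotiff.js API.
type ByteRange = { offset: number; length: number };

// Build the value of an HTTP Range header for one or more byte ranges.
// Range ends are inclusive, so 1024 bytes from offset 0 is "0-1023".
function buildRangeHeader(ranges: ByteRange[]): string {
  if (ranges.length === 0) {
    throw new Error("at least one range is required");
  }
  const parts = ranges.map(
    ({ offset, length }) => `${offset}-${offset + length - 1}`
  );
  return `bytes=${parts.join(",")}`;
}

// One range: the first 1024 bytes of the file.
const singleRange = buildRangeHeader([{ offset: 0, length: 1024 }]);

// Several disjoint ranges combined into one request.
const multiRange = buildRangeHeader([
  { offset: 0, length: 16 },
  { offset: 4096, length: 1024 },
]);
```

A server that honours the header answers with status 206 (Partial Content); for several ranges the body is a multipart response. A server that ignores the header simply returns the whole file with status 200, so a client has to check the status code.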
06:42
There's also a possibility, if you know ahead of time which portions of the file you're interested in: you can make a multi-range request, which combines multiple ranges of the file that you're interested in, and you get back a single multipart response. geotiff.js supports that, so you can even scale down
07:01
the number of requests that you actually need to send. Because even if it's disjoint sections of the file that you want to access, you can download them in a single request. There is a caveat. The caveat is that not every web server actually supports it. So this is an opt-in feature, so you
07:20
have to enable it with this maxRanges setting. Right. Then there's something which comes back to a topic that we just had in the previous talk. Yes, geotiff.js does cache. And this is the so-called blocked source that we're using.
07:42
So this is basically a wrapper around any source that's there. So for example, you can put it in front of your S3 source or HTTP source. So it basically wraps this source in this blocked source. So you are having a block cache, which is very, very
08:02
good, because sometimes you may need to read a block twice or multiple times, and this way you have the block already cached. So it's good for performance and it's good for the amount of data transmitted. It uses an LRU cache, which is a least-recently-used cache.
08:22
So in order to not bust your computer's memory, you can say, I only want to store the 100 most recently used blocks. So you don't run out of memory anytime soon with that.
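The idea of a least-recently-used block cache can be sketched in a few lines. This is an illustration of the technique, not the cache that geotiff.js ships; the class name `LruBlockCache` and its methods are made up for the example, and it relies on the fact that a JavaScript `Map` iterates its keys in insertion order.

```typescript
// Sketch only: a tiny LRU cache keyed by block number. Hypothetical names.
class LruBlockCache {
  private blocks = new Map<number, Uint8Array>();
  private maxBlocks: number;

  constructor(maxBlocks: number) {
    this.maxBlocks = maxBlocks;
  }

  get(blockId: number): Uint8Array | undefined {
    const block = this.blocks.get(blockId);
    if (block !== undefined) {
      // Re-insert so this block becomes the most recently used one.
      this.blocks.delete(blockId);
      this.blocks.set(blockId, block);
    }
    return block;
  }

  set(blockId: number, block: Uint8Array): void {
    this.blocks.delete(blockId);
    this.blocks.set(blockId, block);
    if (this.blocks.size > this.maxBlocks) {
      // The first key in iteration order is the least recently used block.
      const oldest = this.blocks.keys().next().value as number;
      this.blocks.delete(oldest);
    }
  }

  has(blockId: number): boolean {
    return this.blocks.has(blockId);
  }
}

// Keep at most two blocks in memory.
const blockCache = new LruBlockCache(2);
blockCache.set(1, new Uint8Array([1]));
blockCache.set(2, new Uint8Array([2]));
blockCache.get(1); // block 1 is now the most recently used
blockCache.set(3, new Uint8Array([3])); // evicts block 2
```

A source wrapper would consult such a cache before issuing a request and only fetch the blocks that are missing.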
08:40
Again, you can do it like this. You can construct your HTTP remote source and then wrap it in a blocked source. But again, it's easier if you just use this constructor function, and then you have it as is. Right. Then we're talking about actual raster access.
09:00
So how do you actually get the raster data for your images? And there are actually two ways to do it. One is just dealing with the raw data. And since TIFFs are also fairly often used to store RGB data, there's also a second way called readRGB.
09:21
Again, you can simply call the simple function with no options, and then you will get all the data in just one go. But you can provide it with many, many options if you're so inclined. So for example, with the window, you can specify the image window you want to read out of. You can also specify the width and height.
09:41
In this case, it will resample. You can also specify that you're just interested in one or two bands of it. You can also specify if you want to have the image data interleaved or you want to have it in separate arrays. So all of that you can specify. OK.
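The difference between separate band arrays and interleaved data can be shown with a small conversion function. This is a sketch of the two layouts only, assuming 8-bit samples; the function name `interleaveBands` is hypothetical and does not come from geotiff.js.

```typescript
// Sketch only: convert band-separate arrays (all R, all G, all B)
// into one pixel-interleaved array (R, G, B, R, G, B, ...).
function interleaveBands(bands: Uint8Array[]): Uint8Array {
  const bandCount = bands.length;
  if (bandCount === 0) return new Uint8Array(0);
  const pixelCount = bands[0].length;
  for (const band of bands) {
    if (band.length !== pixelCount) {
      throw new Error("all bands must have the same number of pixels");
    }
  }
  const interleaved = new Uint8Array(pixelCount * bandCount);
  for (let pixel = 0; pixel < pixelCount; pixel++) {
    for (let band = 0; band < bandCount; band++) {
      interleaved[pixel * bandCount + band] = bands[band][pixel];
    }
  }
  return interleaved;
}

// Two pixels with three bands each.
const red = new Uint8Array([1, 2]);
const green = new Uint8Array([3, 4]);
const blue = new Uint8Array([5, 6]);
const rgbInterleaved = interleaveBands([red, green, blue]);
// rgbInterleaved holds 1, 3, 5, 2, 4, 6
```

Interleaved data is convenient when the result goes straight into something like a canvas, while separate arrays are easier for per-band calculations.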
10:00
The readRGB method is very similar, in a sense, because it's actually built on top of readRasters. But it does some calculations for you, because there are multiple color spaces. There may be a color map inside of the TIFF. So when you're just interested in the RGB values and nothing else, you can call that,
10:21
and it will do a lot for you. There are several options that you can actually provide, but it may also be interesting to see what you don't need to provide, because it's sometimes interesting what the library actually does for you. OK, and what you don't have to supply is a lot of stuff which is stored in the TIFF
10:42
and that geotiff.js can make use of. So is it in tiles or strips? What's the internal compression? Is there a predictor applied or not? Is it pixel interleaved, or is it stored in planar configuration? Does it have internal overviews?
11:00
Does it have overviews at all? Does it use color spaces other than RGB, or does it use color tables? All of this is taken out of your hands. You don't have to deal with it, it is handled automatically. So again, it's easy to use for non-expert people, which is good for spreading your work, basically.
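One of the things a reader works out on your behalf is which tiles of a tiled TIFF a requested window touches. The sketch below shows that calculation under stated assumptions: the window is given as `[left, top, right, bottom]` in pixels with exclusive right and bottom edges, tiles are numbered row by row from the top left as in the TIFF specification, and the name `tilesForWindow` is hypothetical rather than geotiff.js API.

```typescript
// Sketch only: which tiles of a tiled image intersect a pixel window?
type PixelWindow = [left: number, top: number, right: number, bottom: number];

function tilesForWindow(
  win: PixelWindow,
  tileWidth: number,
  tileHeight: number,
  imageWidth: number
): number[] {
  const [left, top, right, bottom] = win;
  // Number of tile columns, counting a partially filled last column.
  const tilesAcross = Math.ceil(imageWidth / tileWidth);
  const firstCol = Math.floor(left / tileWidth);
  const lastCol = Math.floor((right - 1) / tileWidth);
  const firstRow = Math.floor(top / tileHeight);
  const lastRow = Math.floor((bottom - 1) / tileHeight);
  const tileIndices: number[] = [];
  for (let row = firstRow; row <= lastRow; row++) {
    for (let col = firstCol; col <= lastCol; col++) {
      // Tiles are numbered left to right, top to bottom.
      tileIndices.push(row * tilesAcross + col);
    }
  }
  return tileIndices;
}

// A 1024 pixel wide image with 256x256 tiles; the window spans
// three tile columns and two tile rows.
const neededTiles = tilesForWindow([200, 0, 600, 300], 256, 256, 1024);
// neededTiles holds 0, 1, 2, 4, 5, 6
```

Each tile index maps to a byte offset and byte count stored in the IFD, and those are exactly the ranges a reader would then request from the source.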
11:21
Right, another important thing is the decoding. So this is the part where geotiff.js is actually using a plugin-based system, because there's a finite set of already specified compression
11:42
algorithms that you can use. But as we have seen today, it's always possible that a new one gets invented and it will be there and you have to support it. So you can simply define your own codec, so your own compression algorithm, or implement one,
12:02
and then you need to specify the number under which it shall be registered. This number is then looked up from the compression tag inside of the GeoTIFF. So this is the reason why it's actually composable and extensible.
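The lookup by number can be pictured as a small registry that maps the value of the TIFF compression tag to a decoder function. This is a sketch of the pattern and not the actual geotiff.js plugin interface; `registerDecoder`, `decodeBlock` and `decoderRegistry` are hypothetical names. The tag values used (1 for uncompressed data, 32773 for PackBits) are the ones defined in the TIFF specification.

```typescript
// Sketch only: a decoder registry keyed by the TIFF compression tag value.
type Decoder = (compressed: Uint8Array) => Uint8Array;

const decoderRegistry = new Map<number, Decoder>();

function registerDecoder(compressionTag: number, decoder: Decoder): void {
  decoderRegistry.set(compressionTag, decoder);
}

function decodeBlock(compressionTag: number, data: Uint8Array): Uint8Array {
  const decoder = decoderRegistry.get(compressionTag);
  if (decoder === undefined) {
    throw new Error(`No decoder registered for compression ${compressionTag}`);
  }
  return decoder(data);
}

// Compression 1: no compression, the bytes are passed through.
registerDecoder(1, (data) => data);

// Compression 32773: PackBits, a simple run-length scheme.
registerDecoder(32773, (data) => {
  const out: number[] = [];
  let i = 0;
  while (i < data.length) {
    // Interpret the control byte as a signed 8-bit integer.
    const control = (data[i++] << 24) >> 24;
    if (control >= 0) {
      // Copy the next (control + 1) bytes literally.
      for (let k = 0; k <= control; k++) out.push(data[i++]);
    } else if (control !== -128) {
      // Repeat the next byte (1 - control) times.
      const value = data[i++];
      for (let k = 0; k < 1 - control; k++) out.push(value);
    }
    // control === -128 is a no-op.
  }
  return Uint8Array.from(out);
});

// 0xFE is -2: repeat 0xAA three times. 0x02: copy three literal bytes.
const packed = new Uint8Array([0xfe, 0xaa, 0x02, 1, 2, 3]);
const unpacked = decodeBlock(32773, packed);
```

Supporting a new compression scheme then only means registering one more function under its tag value, without touching the code that reads tiles or strips.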
12:20
Decoding, decompression, is really CPU heavy. So this is why we implemented it in threads, or web workers, as they're called in the web platform. So, as you can see down there, you can create a GeoTIFF pool, which is a web worker
12:41
pool, a thread pool. And when you pass this pool to the readRasters function, you make use of multiple CPU cores. So it can speed up the image decoding a lot. Sometimes you have to use tricks, because I am lazy
13:03
and I am also bad with maths and compression is usually dealing with maths a lot. So you have to be really ingenious in order to make use of what is already there. And for the life of me, I could not find a decompressor for the WebP format. So I'm actually constructing an image, a browser image,
13:24
and just load the tile of the TIFF directly inside of this browser image and then read the bytes back out of it. So it uses the browser capabilities. So this is why it's not usable on Node.js, for example. All right.
13:44
So that was basically all the features that geotiff.js is comprised of. I'd like to talk more about geotiff.js, the open source project. So when I started it, I based it off some project that someone else wrote. So it was basically, as I said, at the right time
14:02
at the right place. But now there are more than 35 contributors. And just in the past year, there were 50 merged pull requests. And something interesting happened. So there were fewer and fewer requests for new features, and more of a shift towards maintenance.
14:20
So more like bug reports: this does not work, this should be done this way, and so on. So I've seen a curve of adoption of geotiff.js in various frameworks. And every time this happens, I get back a slew of bug reports and issues. And it's really valuable information
14:41
to see how it's actually being adopted and where the pain points still are. But it somehow gives me the impression that I think it might come to a spot where it's actually done, because there's no additional feature that you can implement, apart from your compression algorithm.
15:01
Pull requests welcome. So there are many downstream projects. Quickly going ahead: this is just what I pulled from npm. So we have 47 dependents. I don't know them all. I know some of them, and they're all cool and great. But the one I really fell in love with
15:21
was the OpenLayers integration, because it's done really, really well and ties into their ecosystem. So it's been adopted by OpenLayers since version 6.7. But we've seen that there are also many, many other non-geo applications where this is used. So for example, one is a medical application, where you can view body scans of some sort or, I think,
15:44
microscopy images, which is also a really cool use case. In 2016, I wrote this small app, which is called the COG-Explorer. It was built for a version of OpenLayers which was really old, and it was not really easy to integrate with.
16:03
So we made a custom WebGL render pipeline and did one hack after the other just to make it work. In the end, it worked nicely, and you can still use it. It has not really been updated since then. But it basically had 700 lines of complete awfulness. So it's really, really bad.
16:20
Don't look into it. But now, in the last year, I wrote this small app, which is called the DEM app, because you basically can visualize a DEM. It's built using a quite modern version of OpenLayers. And just quickly going forward: this is all the code I needed to load the GeoTIFF.
16:42
So you basically provide the URL to your TIFF, and that's basically it. And this was really awesome. What you can do is load a DEM inside of this web view, and you can basically define a couple of render modes, so shaded, contours, slope shade. And then you have some of these nice sliders
17:00
on the right-hand side, and you can see how the image is changing. This is a GIF that I just recorded. It feels really choppy. If you do it yourself, you'll see it's butter smooth. It's not because of geotiff.js. It's because OpenLayers is awesome. And it's really easy to set up and to comprehend.
17:25
What now? As I said, I would consider it somehow feature complete. I like that it's going more and more in the back end. It's adopted by many projects, but it's going also more and more in the back ends. And I'm happy for that.
17:40
And every time there's an adoption, I get feedback, which is also really great. So I can work out the last issues there might be. There are always ideas to throw everything in the garbage and start anew. We don't do that, no, of course. There are ideas for refactoring.
18:02
So there's some experience that we gained over time on how we could make the architecture simpler and thus better. And since JavaScript is such a fast-moving ecosystem, it's always easy to say, OK, I should have done this or that. And also, I think the last talk hinted at it.
18:22
I think it's a field that's just starting. There's a broader JavaScript raster data ecosystem happening. And now even the lines between front end and back end, browser and server, are starting to blur, which is also really cool. So I think there is a wave emerging and something
18:48
is happening. And I'm really looking forward to it. And yeah, I'd like to thank you for your attention, and I'd be happy to take any questions. Thank you very much.