
Breaking the curse of raster processing software-as-a-service


Formal Metadata

Title
Breaking the curse of raster processing software-as-a-service
Number of Parts
295
Author
Iván Sánchez Ortega
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
The emergence of software-as-a-service platforms for geoprocessing of large raster datasets provides a tempting and fast way to try new raster algorithms or indicators. As with any other SaaS platform, the downside is the inability to run the same process on our own computers, be it because of recklessness during systems design or because of a deliberate intent to create vendor lock-in. This talk will analyze such a case of vendor lock-in in raster geoprocessing SaaS, and a means of running the same raster processes efficiently on the end user's (or operator's) hardware (so-called "edge computing" for the buzzword-inclined). Moving away from SaaS in this case even provides extra benefits, such as better data cacheability and real-time algorithm tweaking.
Transcript: English (auto-generated)
Yeah, so hi everybody. My name is Iván Sánchez Ortega. I want to remind everybody that I'm not a GIS person: my education is in computer science, not geography, not topography.
So please bear in mind that I view things a bit differently than other people in this room. I'm going to talk about a couple of stories. In March 2016 there was this small company called Mapzen, which was sponsoring FOSS4G and was quite well known in the industry.
They made a WebGL real-time hypsometric tint and hillshading library, together with lots of other fancy things to do in the browser. What they did is this: because you cannot really load raster data in the browser (you cannot have the full GDAL suite there), they packed the data into three of the four channels of a PNG image.
So the red, green and blue channels of that PNG image carry a packed value for the elevation data, and you have to do a small mathematical operation to get the actual elevation out of those values. It's not really complex, but it's kind of weird to need that kind of transformation when all you have is an image. Still, it was the only way to have elevation in an image that worked in any browser by 2015. So the web browser reads that elevation data and, based on it, finds a color between the stops of the hypsometric tint, looks at the elevation of the neighboring pixels and calculates the slope, all in the browser. And that works. And it works beautifully. It's a work of art. And all of these things were made in the browser, in 2016, on the fly, given PNGs with packed elevation data.
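For a concrete idea of what that small mathematical operation looks like, here is a minimal sketch of a fragment shader that unpacks elevation from such a tile. It assumes the Mapzen "Terrarium" packing (elevation in meters = R * 256 + G + B / 256 - 32768) and made-up uniform and varying names; the talk does not spell out the exact encoding, so treat this purely as an illustration.

    // Sketch only: unpack elevation from an RGB-packed terrain tile and shade by height.
    // Assumes Mapzen-style "Terrarium" packing; uTerrainTile and vTextureCoords are
    // placeholder names, not necessarily the ones the original library used.
    const unpackElevationShader = `
      precision highp float;
      uniform sampler2D uTerrainTile;   // PNG tile with elevation packed into R, G, B
      varying vec2 vTextureCoords;

      void main() {
        vec3 rgb = texture2D(uTerrainTile, vTextureCoords).rgb * 255.0;
        // Terrarium encoding: elevation in meters
        float elevation = (rgb.r * 256.0 + rgb.g + rgb.b / 256.0) - 32768.0;
        // Trivial hypsometric shading: map 0..3000 m onto a gray ramp
        float shade = clamp(elevation / 3000.0, 0.0, 1.0);
        gl_FragColor = vec4(vec3(shade), 1.0);
      }
    `;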
Okay, so just remember that for a second: we have had the technology to do this, in real time, in the web browser, for three or four years. In fact, you could switch between visualizations in real time, with no new queries to the web server, using the same digital elevation model.
They even did this fun experiment where the web browser asked for permission to use your webcam, and whatever object you placed in front of the webcam would become the spherical map for the actual hillshading. I clearly remember one of the Mapzen boys (because they were all men) grabbing an orange, putting it in front of the webcam, and all of the hillshading would take on the surface of the orange, with the tiny dots and everything. It was incredible to look at. So, at that point: we have the technology. So fast forward a bit to August 2016.
I was at FOSS4G Bonn, as many of you were, I guess, or I hope. I was giving a talk about challenges of mapping formats. The problem is that the more I worked on that talk, the more it became a talk about video games, about how good video games are at rendering things in real time, given some abstract data in some weird format with some weird requirements, right? And I learned that video game technology is amazing, absolutely amazing, and that we GIS professionals are smug. Very. Especially since I came to the GIS field from computer science, at FOSS4G 2010, knowing nothing, I could see a bit of this smugness; I suffered it in first person. So we are like this, right? We have 15 values per pixel with our ultra-super-different sensors and our satellites; don't try this at home without specific software, right? Meanwhile, video game people are like: oh, that's cute. Because to render one image in Doom, you have to have a depth map, a normal map (which includes the bump map for every surface), plus a specular map (which is the direction that light would reflect in that pixel), from the previous frame, to calculate the blur mapping, to know how much work you have to do for that specific pixel, and that's just to calculate the surface refraction map at the bottom. For the full thing, there are a lot of other operations going on. So if you think that your raster processing is complex, I'm sorry to say this, but you don't have a clue of what complex pixel processing is, okay? And the cool thing about video games, if you realize it, is that because of video games we have specialized hardware to do this, which is GPUs. GPUs are awesome at doing this: manipulating pixel values and angles and vectors and 2056 values per pixel and things like that, resampling in real time. They are awesome. Video game hardware is absolutely awesome at doing this. So fast forward to July 2017.
I was in Paris, and I was just going around the booths, you know, grabbing the stickers and asking the usual questions about the pamphlets, and I tried one of the services there, Sentinel Hub. I signed up for the three-month trial and I started playing with it, and it's a good tool, okay? You can go there, you click on an area, you get one of the Sentinel data frames, and then you see this list of possible channels of data, the ones that are supposedly too complex for me to handle at home, remember? And I have an issue with this, okay? I have a real issue with this.
Changing layers from visible color to NDVI to false infrared means more requests to the web server, even though the web browser already has the data. If I have requested a false color image, which is bands 8, 4 and 3, and NDVI is bands 8 and 4, then I already have the information for bands 8 and 4, so, from a computer-science perspective, why do I need to request another image? I know that some GIS professionals will tell me, well, there's the precision issue, because 8-bit PNG, blah, blah, blah, but from my perspective there's no real excuse not to do it this way, okay? So this is how it looks today. When the user wants to see false color, the browser requests false color from the server; the server gets the data, renders it if it's not already rendered, then sends the image back to your browser.
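To make the band arithmetic concrete: if the browser already holds bands 8 and 4 as textures, NDVI is one line of math per pixel, (B8 - B4) / (B8 + B4). A minimal sketch of such a fragment shader, with made-up uniform names (the talk shows its own version on screen, which is not reproduced here):

    // Sketch only: compute NDVI from two single-band textures already in the browser.
    // uBand8, uBand4 and vTextureCoords are illustrative names.
    const ndviShader = `
      precision highp float;
      uniform sampler2D uBand8;   // near infrared (Sentinel-2 band 8)
      uniform sampler2D uBand4;   // red (Sentinel-2 band 4)
      varying vec2 vTextureCoords;

      void main() {
        float nir = texture2D(uBand8, vTextureCoords).r;
        float red = texture2D(uBand4, vTextureCoords).r;
        float ndvi = (nir - red) / max(nir + red, 0.0001);  // avoid division by zero
        float t = ndvi * 0.5 + 0.5;                         // remap -1..1 to 0..1
        gl_FragColor = vec4(1.0 - t, t, 0.0, 1.0);          // crude red-to-green ramp
      }
    `;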
When the user changes the image mode, the request goes again through the browser: the browser goes again to the server, the server reads the same data, does some processing on the server, and throws the image back at the browser, which displays it to the user. Now, I'm going to put little pictures here so you can understand this thing: every time you make a request to a web server, that's time; and every time a web server works, that's money, okay? And I'm a cheapskate. I don't want to pay money. I don't know if you all agree on this; I don't want to pay more money than I need. Fine. Now, I realize there's an issue with this, because my incentive as a user is to not pay money. As a user, I would like the server to do as little work as possible, but I do realize that as a service provider you want as much money as you can get. So service providers do not have an incentive to change this way of working, okay? The software-as-a-service model works in the sense that the more that software runs, even if that run is useless, the more money you get, okay? And I don't like that. I'm sorry, I'm a cheapskate. Okay, so we need to
pivot from cloud computing to edge computing. Is that cool-sounding? Right? Well, in case you don't understand these terms, here's a little dictionary. Pivot means to change. Cloud computing means using other people's computers, because remember, there is no cloud, there are only other people's computers. Can I get everybody to say this? There is no cloud, only other people's computers. Yes. And there's this edge computing idea, which means using the end user's computer, which is fancy-sounding, okay? So what I want to do is this.
Now you take the photo, you make the tweet, it's funny, haha, fine. Yes, come on, take the photo. So it should look like this. This is how I want things to look as a user who wants to pay as little as possible. I want to see false color and I tell the browser; the browser requests the raw data in bands 8, 4 and 3; the browser reads that data and gives me the data, not an image; and then the browser calculates the visualization, okay? Because I already paid for my computer's graphics card. So calculating false color is already included in the cost of my computer or my laptop or my cell phone; I don't have to pay for that, because I already paid for it. Then, when I want a different visualization, I already have the data; I don't have to request anything more. There's less money being spent and less time to respond. That's it. That's the whole thing. You don't need to re-request data to calculate NDVI or false infrared if you already have it. As long as you have asked for the data once, at some resolution, you can recalculate the visualization in the browser. And you will tell me: no, that's impossible, it's too difficult. We have had the technology to do that since 2016, okay? So, how to do this in practice? This is where I just say fancy things about myself, with my Leaflet plugins. How are we on time? I will do a live demo, on a computer that is not my own, in front of you, with an internet connection that is not reliable. What could possibly go wrong?
So we go here, and if this loads, which it does, and this doesn't have scroll, yay, I can do this thing, and I will choose the edge detection because it has, yes.
So I'm doing edge detection on the fly on top of real imagery. That's fine, but I don't want to do that; I just want the original imagery, right? Whoa, yes, underscore. And I want to do... what did I miss here? Oh, capital F, thank you. Yes, thank you.
So look at this, don't blink: I'm going to flip two channels of an RGB image on the fly, without making any request to a web server. Are you ready for this? Whoa. Did you miss that? Okay, here I go again, okay? Ready? Whoa. And I'm not really requesting images.
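That channel flip is essentially a one-character change in the fragment shader's swizzle; re-compiling the shader and re-drawing the already-loaded tiles is all the "reprocessing" that happens. A sketch, with placeholder names:

    // Sketch only: swap the red and green channels of an already-loaded tile texture.
    // uTexture0 and vTextureCoords are placeholder names.
    const swapChannelsShader = `
      precision highp float;
      uniform sampler2D uTexture0;
      varying vec2 vTextureCoords;

      void main() {
        vec4 pixel = texture2D(uTexture0, vTextureCoords);
        gl_FragColor = pixel.grba;   // reordering the .rgba swizzle swaps red and green
      }
    `;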
We can do this on the web client; we don't need to do it on the web server. We can save a lot of computing cycles with this, okay? And since we still have time: you'll tell me, well, but this only works with pre-rendered 8-bit-per-channel PNG images, I cannot use this with my GIS data. Okay, enter the thing that I have been working on for the last two weekends, because it has taken me about two weekends. And this is going to load, I want to show it here, and hopefully it will load properly. So this is loading GeoTIFFs, and hopefully this works; it might not, because of the laptop configuration. Yep, apparently something's fucked up here. But we have the technology to read GeoTIFFs in the web browser, so it is possible to read full-precision 16-bit-per-channel GeoTIFFs, and 32-bit-per-channel floating-point GeoTIFFs. I'm not aware of libraries to read other geospatial data formats, so later I can take questions and talk about it. But right now I have only worked with PNG and GeoTIFF, and it works. I can do this in real time.
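As a rough illustration of that GeoTIFF path (the talk does not show the actual code): geotiff.js can fetch a window or a subset of samples of a cloud-optimized GeoTIFF and hand back typed arrays, which can then be uploaded as a floating-point WebGL texture. The sketch below assumes WebGL 1 with the OES_texture_float extension; the URL and names are placeholders.

    // Sketch only: read one band from a (cloud-optimized) GeoTIFF with geotiff.js
    // and upload it as a float texture. Names and URL are illustrative.
    import { fromUrl } from 'geotiff';

    async function loadBandAsTexture(gl, url) {
      const tiff  = await fromUrl(url);                 // fetches via HTTP range requests
      const image = await tiff.getImage();
      const [band] = await image.readRasters({ samples: [0] });  // first band, as a TypedArray

      gl.getExtension('OES_texture_float');             // float textures in WebGL 1
      const texture = gl.createTexture();
      gl.bindTexture(gl.TEXTURE_2D, texture);
      // Non-power-of-two rasters: clamp and avoid mipmaps
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
      gl.texImage2D(gl.TEXTURE_2D, 0, gl.LUMINANCE,
                    image.getWidth(), image.getHeight(), 0,
                    gl.LUMINANCE, gl.FLOAT, new Float32Array(band));
      return texture;
    }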
Since we are good on time, I can also show this one, which is just to show that I can do raster manipulation in real time, 30 times per second, and it's not using up any of my CPU. This is doable, okay? And because this is Leaflet, you can zoom in and you have the usual stuff, but this is smooth and doesn't reload anything. So, you know, stop writing a domain-specific JavaScript subset with predefined variables for your band data, and start writing standard GLSL with WebGL, because that's fun, and you can actually hire video game people to write the shaders for you, and it's not that difficult. The code at the top, which I took from one of the Sentinel Hub demos about the uncertainty of NDVI, with some weird mathematical function, I just translated into the thing at the bottom, and it's quite trivial to do. So it's not difficult to use. It's just that we need some kind of incentive to make it happen, and that's what I'm worried about. There's no real incentive to stop this from happening, because software as a service is money. It's a lot of money, and service providers don't want to get out of it, unfortunately. So, that's it. That's all I have. Thank you.
All right. Thank you, Iván. So, actually, regarding the GeoTIFF library from EOX, Fabian will be talking about it tomorrow. I also think it's really great to retrieve the data to the client and work with it directly. I was watching you write the shader; it's so much fun trying to debug a fragment shader, so I'm not sure every user wants to do this, but, yeah, I think it's... I mean, I have seen that Sentinel Hub did a contest for Sentinel Hub scripts, and the complexity of writing such a script is pretty much the same as writing a shader.
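For a flavor of that comparison (both snippets are illustrative, not taken from the contest): a Sentinel Hub style custom script returning a band combination, and roughly the same logic expressed as standard GLSL running on the user's GPU.

    // Illustrative only. A Sentinel Hub style custom script for false color,
    // using predefined band variables and running on the provider's servers:
    //   return [2.5 * B08, 2.5 * B04, 2.5 * B03];
    //
    // Roughly the same thing as GLSL, with placeholder single-band textures:
    const falseColorShader = `
      precision highp float;
      uniform sampler2D uBand8;
      uniform sampler2D uBand4;
      uniform sampler2D uBand3;
      varying vec2 vTextureCoords;

      void main() {
        float r = 2.5 * texture2D(uBand8, vTextureCoords).r;
        float g = 2.5 * texture2D(uBand4, vTextureCoords).r;
        float b = 2.5 * texture2D(uBand3, vTextureCoords).r;
        gl_FragColor = vec4(r, g, b, 1.0);
      }
    `;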
So, yeah, I mean, if you have things ready and you're presenting them the way you're doing, I think it gives you quite a lot of potential to work with the data. So I think that's really great. And we're doing great on time, so you have a lot of time to talk with him directly.
So, if there are any questions, just let's go there.
No, the question is whether I have had a look at geotiff.io, and what my thoughts on it are. And I have to say I haven't had a deep look at geotiff.io; at some point it has crossed my browser window, but that's it. What I have been using is the geotiff.js library, which allows you to query a window of a cloud-optimized GeoTIFF, so it only asks the server for the right data, and I haven't worked with that, unfortunately. The work I have been doing is just to make sure that the output from geotiff.js can be passed on to the WebGL functionality. I hope that answers the question. Okay, yeah, so the question is whether I think
that doing the processing in the browser adds complexity to the system. I think the answer is yes, it adds complexity to the system. At the same time, it gets rid of processing time on the server, so it's a question of whether you want to spend the extra effort doing edge computing, right? Is edge computing a thing you're interested in, from a buzzword perspective? So yes, from a buzzword perspective, edge computing will add more complexity to the system, but it will offload load from some other parts of the system. That's kind of the point of it, right? It's not a silver bullet, of course. Maybe, depending on what targets you are looking at, you want to keep doing raster processing in the server but offload as much as possible to the client. It depends on the business case, right? As I said, the real problem here is not whether I can do it or not, or that it becomes more complex; the problem is that there's no incentive. That's the issue I feel is more important. Is it more complicated? Yes. Are you willing to spend the time to do that, to relieve your servers and pay less money to your service providers? Maybe, yeah. So you have to implement an algorithm, I guess, for raster processing, right? Either in the browser or in the server. And the cool thing about this kind of technology, about GL, is that the same shader, the same shader function doing the processing, the thing I was typing in real time, can work on the client, in any browser, and it can also run on servers. Servers can handle WebGL, or can handle GLSL, as long as they have a graphics card attached. If that's the case, servers will begin using the GPU instead of the CPU. But I think we face, once again, the issue of incentive.
Doing things on the CPU is more costly, which means more income, okay? I want somebody in this room to prove me wrong about this, okay? Tell me I'm wrong. Maybe if I can just jump in for a second. I think what we often see is that it depends a lot on the use case you're working on. If you have a situation where you really want to look at all of the data and you want to be able to manipulate it and work with it, this, I think, makes the most sense. But if you have a specific use case, a service that is just giving you the NDVI or something that has already been specified, I think the server-side approach makes sense, because you're transferring less data. You can, I don't know, cache things. But yeah. I see your point, but you have to remember also why tiled maps took over the world in the era of Google Maps, right? How many people here remember WMS in, like, 2005?
Okay. And why did tiled maps take over the scene? Because with tiled maps, you can cache things. So every time a user is looking around the same area, that user will fetch the same tile, and the WMS server will not need to process it again, right? The same thing happens when you are looking at compositions of bands. If you can make something that will serve the same set of data to a lot of different users for a lot of different use cases, I think you end up caching less data than if you cache all the visualizations for all the use cases.
I think there was a comment, question. I mean, I want you to prove me wrong, okay? If service
providers are interested in passing me the savings, implement this in WebGL and slash down the cost
by 50 times, which is the performance boost that you get from doing GPU instead of CPU. I want all raster software as a service to be 50 times less costly for me as an end user if service providers are really willing to pass the savings to me, the customer.
And I'm sorry if I'm being overtly negative. I know that my style of presentation can be seen as a bit aggressive. I know that I'm histrionic. Yes, I do it on purpose. I'm not offending anyone, I hope. So benchmarking. I haven't done any benchmarking yet.
And in regard to the other question, which is that if you don't load the full dataset, you might end up downloading, like, eight channels just to show three, right? That's where knowing what to cache comes into play. And there's the format for cloud-optimized GeoTIFFs, so you query only the data you want. The goal here is to query only the data you want. So instead of one GeoTIFF with eight channels, you want eight different GeoTIFFs with one channel each, so you can query only the ones you want. WebGL can handle eight channels by default, in WebGL 1. Well, not eight channels: eight textures, and for each texture you have four channels, RGBA. In WebGL 2 you have, like, a thousand; you can have data cubes of 2,000 by 2,000 by 2,000 pixels, so you can load as many channels as you want. It's not a problem to load them in memory. And in regard to fetching them to cache, it's just a matter of knowing what to cache. What am I going to make as my API?
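The "eight textures" figure is the minimum number of fragment-shader texture units that WebGL 1 guarantees; the actual limits of a given machine can be queried at runtime, roughly like this (a sketch):

    // Sketch only: query how much band data a WebGL context can hold at once.
    const gl2 = document.createElement('canvas').getContext('webgl2');
    if (gl2) {
      // Number of textures one fragment shader can sample from (at least 8 even in WebGL 1)
      console.log('texture units:', gl2.getParameter(gl2.MAX_TEXTURE_IMAGE_UNITS));
      // Maximum edge length of a 3D texture ("data cube"), WebGL 2 only
      console.log('max 3D texture size:', gl2.getParameter(gl2.MAX_3D_TEXTURE_SIZE));
    }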
Right? How many data endpoints do I want my API to have? It's just a design question. I haven't done any benchmarking. If I had the resources and the time to do it, I would be happy to. For what? Sorry. I would be happy for somebody to hire me and take the
responsibility of making it production ready. Because I don't know what production ready is. This thing that you have seen doing the fancy color changing, it's a weekend project for me. Okay. I know this is possible. I want just to poke everybody in this room to make it happen.
I can make it happen, given time. But hey, the code is out there. It's easy to find: Leaflet.TileLayer.GL; just look for that name and you will find the library. You can look at it, you can learn WebGL, and anybody can make this production ready, given the time. I don't have any incentive to make it production ready right now. Provide me with an incentive and I will be happy to. You can run this on 97% of laptops as of today, according to WebGL Stats. I usually run this on an Intel 910 graphics card, and I have run this on Samsung Galaxy 4 cell phones. So it's 97% of platforms. And yes, basically, if the device is newer than 2012, this should run on it. All right. So I think we'll go to the next presenter. Thank you. Thank you.