
A Tensor Based Framework For Large Scale Spatio-Temporal Raster Data Processing


Formal Metadata

Title
A Tensor Based Framework For Large Scale Spatio-Temporal Raster Data Processing
Title of Series
Number of Parts
295
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
In this paper, we address the curse of dimensionality and scalability issues while managing vast volumes of multidimensional raster data in the renewable energy modeling process in an appropriate spatial and temporal context. Tensor representation provides a convenient way to capture inter-dependencies along multiple dimensions. In this direction, we propose a sophisticated approach of handling large-scale multi-layered spatio-temporal data, adopted for raster-based geographic information systems (GIS). Moreover, it can serve as an extension of map algebra to multiple dimensions for spatio-temporal data processing. We use the multidimensional tensor framework to model such problems and apply computational graphs for efficient execution of calculation processes. In this approach, spatio-temporal data can be represented as non-overlapping, regular tiles of 2-D raster data, stacked according to the time of data capture. As a case study, we quantify the spatio-temporal dynamics of solar irradiation calculations and 2.5-D shadow calculations for cities at very high space-time resolution using the proposed framework. For that, we chose TensorFlow, an open source software library developed by Google using data flow graphs and the tensor data structure. We provide a comprehensive performance evaluation of the proposed model against r.sun based on GRASS GIS. Benchmarking shows that the tensor-based approach outperforms r.sun by up to 60%, concerning overall execution time for high-resolution datasets and fine-grained time intervals for daily sums of solar irradiation [Wh.m-2.day-1]. Precisely, the main characteristics of the proposed framework include defining, optimizing and efficiently calculating mathematical expressions involving multi-dimensional arrays (tensors); transparent use of GPU computing such that the same code can be run either on CPUs or GPUs; implicit parallelism and distributed execution with high scalability offered by data-flow based implementation.
Moreover, the Python implementation of the proposed model makes it GRASS GIS ‘Add-on’ compatible.
Transcript: English (auto-generated)
So welcome to this very last talk of the session, which will be about the tensor-based framework for large-scale spatio-temporal raster data processing. So, Sukriti.
Hello everyone. I'm Sukriti Bhattacharya. I have been working on this topic for some time at the Luxembourg Institute of Science and Technology, in the geocomputation group, along with my colleagues Christian Braun and Ulrich Leopold, who is sitting over there. So basically, these are the research questions on which we based our research.
So actually we are trying to tackle the curse of dimensionality in Earth observation data especially. Curse of dimensionality means that when you have a huge amount of data with different dimensions, it creates a scalability problem when you are doing computation. Then we are also trying to find a single data structure where you can keep spatial and temporal information together. And the main concerns were time and scalability. So to address these questions, we proposed a sophisticated way of handling large-scale spatio-temporal data in raster-based GIS, and basically we adopt the concept of tensors. So we have tensor algebra, and we implement it using TensorFlow, which is an open source library from Google. I will come to TensorFlow, and to a detailed description of tensors, in the next slides. And then we did some case studies on our proposed framework: we quantify the spatio-temporal dynamics of solar radiation calculation and 2.5-D shadow calculation for large cities at a high space-time resolution.
Okay, so what is a tensor? A tensor is basically a mathematical object which is a kind of generalization of matrices to higher dimensions. A tensor can be characterized by three parameters: the first one is the order, that is, the number of dimensions; then the shape of the tensor; and the third one is the type of value it contains.
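As an illustration of the three parameters just named (order, shape and value type), here is a minimal NumPy sketch. NumPy stands in for TensorFlow only to keep the example small, and the shapes and the helper name `describe` are arbitrary choices for this example, not taken from the slides.

```python
import numpy as np

# Four objects of increasing order; all shapes are arbitrary example values.
scalar = np.float32(3.5)                               # order 0
vector = np.array([1.0, 2.0, 3.0], dtype=np.float32)   # order 1
matrix = np.zeros((3, 4), dtype=np.float32)            # order 2
tensor = np.zeros((2, 3, 4), dtype=np.float32)         # order 3: two 3 x 4 matrices


def describe(x):
    """Return (order, shape, value type) of an array-like object."""
    x = np.asarray(x)
    return x.ndim, x.shape, x.dtype.name


for name, obj in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    print(name, describe(obj))
```

The same three attributes exist on TensorFlow tensors, so the description carries over unchanged.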
So in the picture, in figure one, you have a scalar, a vector, a matrix and then a tensor. The dimension is 3-D here, and the shape is, basically, matrices clubbed together, and these are the different components of a tensor that you can use during your computation.
And now, what is TensorFlow? As I told you before, TensorFlow was developed at Google. Basically TensorFlow was designed for big data handling in deep learning, but it is general enough that you can use it in different scientific calculations. On the right-hand side, in figure three, you have the basic TensorFlow architecture. You can use different front ends to access the distributed execution engine. We use the Python front end, and then you can deploy your code irrespective of CPU or GPU; it is very transparent. Here we use only the CPU. And this is the block diagram of a TensorFlow program. First you build the computational graph, then you can allocate memory dynamically, and then you create a session; inside the session you can execute the graph, and you can execute that particular graph on CPUs or GPUs without changing the code; it is very transparent, as I said. After that you close the session. Okay, so this is the basic kind of
philosophy behind the entire thing. You have the temporal information on the x-axis, and these are basically the observations. Consider, say, the azimuth angle: the azimuth angle changes according to the Sun position, so at t1, t2, t3 you have different azimuths for the same location. So what you can do is create a tensor, a 3-D tensor: along one axis you have the temporal information, and you can club all these matrices together to build a single data structure.
Using that philosophy, we basically model solar irradiation. In figure 7 you can see that total solar irradiation is the sum of direct, diffuse and ground-reflected irradiation. And this is the basic architecture of our tool. You are collecting the data, the DSM data and the atmospheric data, and you are building the tensor there. This dotted box is the main TensorFlow program, where you are calculating the solar angles, and you can see the diffuse radiation, the beam radiation and the ground-reflected radiation being summed into the total radiation. So what have we done? We needed to rewrite all these equations. They are not new; all these mathematical equations are already present in the astronomy literature. What you have to do is rewrite them according to the tensor data structure, because at the end of the day you are going to execute all these equations in TensorFlow.
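The idea of clubbing per-time-step rasters into one 3-D tensor and summing the radiation components can be sketched as follows. This is a NumPy illustration under stated assumptions, not the authors' TensorFlow code: the function names, the constant sample values, the masking of the beam component by a 0/1 shadow raster, and the rectangle-rule daily sum are all assumptions made for the example.

```python
import numpy as np


def stack_over_time(rasters):
    """Stack equally shaped 2-D rasters into a (time, rows, cols) tensor."""
    return np.stack(rasters, axis=0)


def daily_total(beam, diffuse, reflected, shadow, step_hours):
    """Sum the three components per time step, then integrate over the day.

    Assumption: shadow holds 1 where a cell is shaded, and shading removes
    only the beam component. Inputs are in W/m^2, the result is in Wh/m^2/day.
    """
    total = beam * (1.0 - shadow) + diffuse + reflected
    return total.sum(axis=0) * step_hours


# Hypothetical data: 4 time steps on a 2 x 3 grid, constant values for clarity.
steps, shape = 4, (2, 3)
beam = stack_over_time([np.full(shape, 500.0)] * steps)
diffuse = stack_over_time([np.full(shape, 100.0)] * steps)
reflected = stack_over_time([np.full(shape, 20.0)] * steps)
shadow = np.zeros((steps,) + shape)
shadow[:, 0, 0] = 1.0  # the top-left cell is shaded at every time step

result = daily_total(beam, diffuse, reflected, shadow, step_hours=0.5)
print(result)
```

Every operation acts on the whole (time, rows, cols) tensor at once, which is what allows a data-flow engine to parallelize it without per-pixel loops.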
You can see here there is a small part called shadow calculation, which is the heaviest and most expensive operation. Look at it as a pixel-based operation: consider a raster, and if a particular location is one, that means it is in shadow. Now imagine it for a big city: to calculate for each and every point whether it is in shadow or not is a heavy and exhaustive operation. But we rewrote that shadow calculation, and we have even published a few papers on it separately. So this is the basic principle of shadow geometry from a DSM.
Basically we are calculating the horizon angle and the altitude angle. I'm not going into the detail, but if you look here, this is the module for shadow calculation. Here we use TensorFlow and PySpark to accelerate the overall operation. You can see in the PySpark module there is an algorithm called Bresenham's line-drawing algorithm.
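To make the role of Bresenham's algorithm concrete, here is a minimal pure-Python sketch. The `bresenham` function is the standard integer line algorithm; the `in_shadow` test built on top of it is a simplified illustration of comparing terrain elevation angles with the Sun's altitude, and its name, signature and the tiny DSM are assumptions for this example, not the authors' module.

```python
import math


def bresenham(r0, c0, r1, c1):
    """Return the grid cells on the line from (r0, c0) to (r1, c1), inclusive."""
    cells = []
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    step_r = 1 if r1 >= r0 else -1
    step_c = 1 if c1 >= c0 else -1
    err = dc - dr
    r, c = r0, c0
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            break
        e2 = 2 * err
        if e2 > -dr:   # step along the column axis
            err -= dr
            c += step_c
        if e2 < dc:    # step along the row axis
            err += dc
            r += step_r
    return cells


def in_shadow(dsm, cell, toward_sun, sun_altitude_deg, cell_size=1.0):
    """True if any cell on the line toward the Sun rises above the Sun's altitude."""
    r0, c0 = cell
    for r, c in bresenham(r0, c0, *toward_sun)[1:]:
        distance = cell_size * math.hypot(r - r0, c - c0)
        rise = dsm[r][c] - dsm[r0][c0]
        if math.degrees(math.atan2(rise, distance)) > sun_altitude_deg:
            return True
    return False


# Hypothetical one-row DSM with a 10 m obstacle in the middle.
dsm = [[0.0, 0.0, 10.0, 0.0, 0.0]]
print(bresenham(0, 0, 2, 5))
print(in_shadow(dsm, (0, 0), (0, 4), sun_altitude_deg=30.0))
print(in_shadow(dsm, (0, 3), (0, 4), sun_altitude_deg=30.0))
```

Each call handles one pair of coordinates independently of all others, so the pairs can be distributed across workers.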
Bresenham's is a very well-known algorithm from computer graphics, so we use it, and the overall approach is significantly faster than the already existing one; significantly means more than 70% faster. But it is a 2.5-D shadow.
So this is the visualization of the computational data flow graph. This is called TensorBoard; it comes with TensorFlow, so you can see your actual implementation as a graph. Here the nodes are basically tensors and the arrows are the flow. TensorFlow has inherent distribution and parallelism properties, so we use them. We are using only the CPU; as I said, we could use a GPU and then our code would execute even faster, but we use only the CPU for benchmarking purposes. So this is the TensorBoard representation of the shadow calculation only. I can generate the graph for the overall calculation, but it will be really big, so I just put the shadow part here.
So we apply our approach to Esch-sur-Alzette, the second largest town in Luxembourg. You can see here that the DSM ranges from 279 meters to 426 meters, and the grid size is 1874 × 1828.
So this is the result, for the 22nd of December at 8:30 in the morning. This is the shadow for the whole of Esch, and this is a part of Esch, maybe from here, I believe. Then you have the beam, diffuse and ground-reflected radiation. We basically compare our result with GRASS r.sun. r.sun is a module that performs the same thing that we have done, so our purpose was to replicate it, but in a more scalable and faster way.
You can see here that the x-axis is the temporal resolution in minutes, and the graph is the percentage of improvement with respect to GRASS. And that is the configuration of my computer. We are not using any GPUs; I have 12 cores. So on average I am getting a 60 percent benefit.
And this is the conclusion. This is just the beginning. What we are trying to do is replicate the map algebra functions using TensorFlow, mixing different APIs. We will call it not a cube but rather an n-dimensional map algebra. And because we already have all these facilities built in, we don't need to work on distribution, we don't need to work on parallel computation; it's already there, and we can even use GPUs to accelerate it more. That's it. This is part of the SECuRe project, funded by the Enovos Foundation, Luxembourg.
Half past eight in the morning.
Yeah, so the performance was based on that, no. Okay, so I put only one picture here, but we did it for several times and several days in a year. And again, that is the temporal resolution; this is one slab of the temporal resolution. We did it from sunrise to sunset, and the performance is based on the overall calculation from sunrise to sunset.
What's the size of the data set?
Yeah, so this is the size of the data set: the grid size is like this, and you can change the temporal resolution in the code, like one minute, ten seconds, anything. So you can increase the temporal resolution yourself. And this is the actual spatial resolution.
By the temporal resolution, yeah. I meant the temporal resolution.
Okay, so the temporal resolution could be anything; you can set it there.
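A small sketch of what an adjustable temporal resolution means in practice: generating equal-interval time steps between sunrise and sunset. The function name and the sunrise and sunset times are placeholders chosen for illustration; in a real run they would come from the solar geometry for the location and date.

```python
from datetime import datetime, timedelta


def time_steps(sunrise, sunset, resolution):
    """Return equally spaced timestamps from sunrise to sunset, inclusive."""
    steps = []
    t = sunrise
    while t <= sunset:
        steps.append(t)
        t += resolution
    return steps


# Placeholder times; the year and the sunrise/sunset values are not from the talk.
sunrise = datetime(2017, 12, 22, 8, 30)
sunset = datetime(2017, 12, 22, 16, 30)

coarse = time_steps(sunrise, sunset, timedelta(minutes=5))
fine = time_steps(sunrise, sunset, timedelta(minutes=2.5))
print(len(coarse), len(fine))
```

Halving the interval roughly doubles the number of rasters stacked along the time axis, which is why execution time is reported against temporal resolution.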
That is why, if you look at the final graph, here you have it in minutes: 1.5 minutes, 2.5 minutes, 5 minutes. These are equal-interval temporal resolutions from sunrise to sunset. Thank you.
Yep. Hi, I was just wondering how much faster it is to run on GPUs.
Basically, we didn't run it on a GPU. The reason is that we were trying to benchmark it against r.sun, so it would be unfair to use a GPU and compare it with r.sun. But I believe it will be faster for sure. The shadow calculation is mainly what we are really proud of. I can tell you: in GRASS you have r.horizon, if I remember correctly. Just give the same input to r.horizon and it will take, I am sure, more than a few hours to calculate the shadow, and in our case it took two minutes, so you can imagine how fast it is. But for that I will give credit to PySpark as well: we use PySpark, but not a GPU.
It would be really interesting to see what happens if we use GPUs here. Any questions?
Is this working? Yeah. How much of a difference would it make if you just cut it up into tiles, sufficiently overlapping to take into account, let's say, the size of the shadows, and then parallelized it that way in GRASS r.sun? At least for the difference you saw here, I had the feeling you would probably get the same. Really?
Okay, so it's about an 80% difference now. But you are saying that you would use r.horizon in a parallel environment, right? Yeah, of course, but I'm not doing any parallel execution. I'm just running r.horizon from the command prompt without any kind of parallel processing.
Of course, if you have GPUs it will be faster, so I'm not comparing at that level. But yeah, you're right, you would get the same thing if parallel execution is possible in GRASS. I'm not doing any parallel execution here.
Yeah, so again, Spark is parallel at the data level. We are not incorporating any kind of HPC. Even Python multiprocessing you can call parallel in that sense.
But we are not using any HPC at all; we just use Spark to accelerate the Bresenham algorithm, because the Bresenham algorithm works pairwise: it takes a pair of coordinates and calculates all the possible points that join those two coordinates. So we use Spark to run all those pairs of coordinates in parallel, but without using any added hardware.
Any other questions?
No, okay. Well, then thank you very much. Thank you so much