
High performance computing in Python


Formal Metadata

Title
High performance computing in Python
Title of Series
Number of Parts
57
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Producer
Production Place
Wageningen

Content Metadata

Subject Area
Genre
Abstract
Software requirements: opengeohub/py-geo docker image (gdal, rasterio, eumap, scikit-learn). What are the possibilities to improve the performance of computation in Python? This tutorial shows how to perform NumPy operations using multicore processing, how to accelerate Python functions using Numba, how to calculate fast numerical expressions using NumExpr, and how to use the TilingProcessing class to distribute raster operations across multiple cores.
Transcript: English (auto-generated)
Good morning, we are in day two of the training sessions in the context of the Open Data Science Europe workshop, and now I will speak about Python, but of course in the same context. I will explain mainly how we are optimizing our computing process for raster datasets, and how we are processing and organizing all the raster layers that we are making available in the web viewer. I will present mainly two examples: accessing local data that we prepared previously in the VirtualBox image, and accessing data through Cloud Optimized GeoTIFF. At the end of this presentation we will have a short use case for Cloud Optimized GeoTIFF, as a kind of invitation and introduction to the afternoon session, which will be specifically about Cloud Optimized GeoTIFF in Python. Before presenting the examples, I would like to introduce the specifics of parallel problems and how we can understand them. There is a specific subfield or category of problems called embarrassingly parallel, and most raster processing tasks are classified as this type of problem. Considering that, I will discuss and present the possibilities to optimize a raster processing workflow, and mainly how we are doing it in the context of the project.
I will discuss these two libraries, BLAS and LAPACK. For sure, if you work with data science in R or in Python you are using them as low-level libraries: these are the libraries used for most raster processing and mathematical operations. I will show one of the benchmarks that we prepared, and how you can optimize your processing workflow without changing anything in your code. Later, in the hands-on practical part, I will show how to optimize a temporal reduction and a numerical operation, the computation of NDVI. The last part of the presentation covers our production workflow and how we use our tiling system to split the process across different servers, write the output, and later put all the data together, generating one single Cloud Optimized GeoTIFF.
First, it's good to introduce this type of problem. In the computer science field you have the embarrassingly parallel problem. The name is a bit confusing to me, but "embarrassingly" means that you can split the problem into truly independent parts that do not communicate with each other at all. If you think, for example, of remote sensing data, of raster data processing, that's exactly it: you can take one pixel, split the work, send it to different processing units, servers, CPU cores, do some operation, and later just collect the result. Most problems related to raster data and its computation are like that. But there is also the nearly embarrassingly parallel case. In remote sensing we are dealing with thousands and thousands of pixels, so you actually need to send the data to some processing unit, and afterwards you need to get the result of the processing back. This second level of classification, nearly embarrassingly parallel, is related to those two steps at the beginning and the end of the processing: you need to send the data, split it across different servers or processing units, do some operation, and collect the results at the end. You still don't have any communication between the processes: after you send the data, the processing units can do all the work without communicating with each other, and as I said, this covers most raster processing. Here is an example: this is an image at five kilometers of spatial resolution, and for this image we have one big time series; it's an AVHRR dataset and you can perform several operations on it, like trend analysis or trying to identify break points. All the pixels we are seeing here amount to about 96 million, and because we are dealing with an embarrassingly parallel problem, if you hypothetically had a huge computer with 96 million cores you could send each pixel to its own processing unit and just collect the results. This example shows that it's not easy, but it is relatively straightforward to define ways to split the processing and to optimize it.
Of course you can always buy more hardware or get more servers, but you need to be aware that in the remote sensing field, when dealing with raster data processing, you always need to move this data around, so data access is also a bottleneck and can be an issue. Considering that, these are mainly the possibilities to optimize your raster processing workflow. First, increase the number of cores: okay, you can buy more servers, but if you have, I don't know, 100 servers reading from the same place, you will create a bottleneck in accessing the data. So you also need to improve the data transfer speed; there are hardware solutions for that, like improving your network speed or creating direct connections between the servers without passing through a network switch, but this is more hardware-related. Considering the software part, the processing code, you can develop new algorithms or functions, or you can use a drop-in replacement. What is that? You can mainly keep the same code that you already implemented in Python or R and just replace a low-level library, using a more optimized function that does the hard work for you. It's a kind of new algorithm or new function, but for your use case it has no impact on the code. That's one way to optimize. The other way is to really do the hard work and reimplement your code, although most of the frameworks available for that try to minimize your reimplementation effort, because of course it's always better to just do a drop-in replacement, change some specific library, and improve your processing speed.
In this training session I will show some examples of how to do this in Python, in our notebook, but these two libraries are available for different languages, including R. You have BLAS, the Basic Linear Algebra Subprograms; it's written in C, and these are basically low-level functions that do basic vector and matrix operations. Since this is a C library, everything built on top of it ends up using it, so for sure you are using these libraries in your workflow in some way, because I assume everyone watching this and participating in this training session processes raster data at some point. And you have the other library, LAPACK, the Linear Algebra PACKage; it's written in Fortran 90 and it can solve linear equations, least-squares solutions of linear systems, and different problem types. It's more specific, but some temporal analysis uses it under the hood. What is interesting about these libraries is that today they are kind of a standard: you have the same methods, the same signatures, the same parameters, and you can just replace the code underneath. The standard defines the methods and parameters, and you can change the way the operations are actually performed, so you can do a drop-in replacement really quickly, just swapping in a different implementation.
Considering these two standards and these two libraries, we have several implementations. Some of them were discontinued, but these four are the mainstream implementations of BLAS and LAPACK today. You have Intel MKL; it's not an open-source solution, but Intel provides it for free, and of course it works better with Intel hardware, it was designed specifically for Intel processors. You have OpenBLAS, which is an open-source implementation; it's optimized as well, but of course they don't have the same budget as Intel, so in most benchmarks MKL is the best one. AMD is now creating one implementation for BLAS and another for LAPACK, so it's actually two libraries, where MKL does both implementations in one single package. It's not exactly unstable, but it's difficult to set up, not as straightforward as MKL, although AMD is trying to provide a good implementation too. And you have an implementation of these libraries that runs on GPU, provided and supported by NVIDIA; it's a nice implementation as well, and if you have a GPU you can run, for example, a regular NumPy operation or a regular matrix operation in R on your GPU just by replacing this library. Is it cost-effective, way cheaper? Per processing unit, yes, but GPU is more tricky: if you buy a good CPU you use the memory of the workstation, but a GPU has its own memory, so you actually need an additional step to send the data from the computer memory to the GPU memory, and you need to manage that. You have CUDA, which is a library to do this, but these packages have problems with memory: if you are dealing with large objects in memory, you will probably need to handle this type of problem yourself. Still, it's a nice solution and we are slowly trying to use it more.
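Before swapping implementations, you can check which BLAS/LAPACK backend your NumPy build is linked against:

```python
import numpy as np

# prints the BLAS/LAPACK libraries NumPy was built against
# (e.g. MKL or OpenBLAS) and where they live on disk
np.show_config()
```

The thread count these backends use can typically be capped with environment variables such as OMP_NUM_THREADS or MKL_NUM_THREADS, set before Python starts.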
This is a benchmark that we prepared; by the way, it was executed in R. Mainly, we implemented a harmonization approach, including a gap-filling operation and a trend analysis, for this AVHRR data for the whole world at five kilometers. What is interesting is that we used exactly the same code: we changed nothing in our code and just swapped the implementations, MKL, OpenBLAS, and BLIS plus libFLAME from AMD. You can see that the BLIS and libFLAME run didn't finish because of some bug in the middle of the process, but if you compare MKL and OpenBLAS, MKL is three times faster, and we got that just by changing the R environment to access different implementations of these libraries.
What we will do now is use the data we prepared in the context of the project and reduce the computation time using different alternatives to NumPy: mainly Bottleneck, NumExpr, and Numba. At the end I will show how we are doing our tiling processing, but that's more for production work; for the exercise we will compare these three implementations against a regular NumPy implementation. I was talking about these BLAS and LAPACK implementations, and it's important to emphasize that our environment is already set up to use MKL; for example, if you download the Microsoft version of R, I think they already ship it with MKL, so maybe you are already using MKL and the speed you get comes from it. In the context of this training session we are already using MKL, and we will improve on it using drop-in replacements inside Python: we will replace the engine that does the processing in NumPy with other implementations that rely, for example, on C and C++. Before entering the practical part, I would like to check if there is any question about this first, more conceptual, introduction. Let me check the chat here.
Okay, so I will open the VirtualBox here. To answer the question: no, it depends. In our case we were using a specific raster processing workflow, and a lot of the code was written in R, but it was the same code: we just replaced the last part, the low-level functions. For other types of processing it could change, but my suggestion is that there is no silver bullet, you need to do some kind of benchmark specifically for your workflow. That's why we did it for ours, because we are dealing with this trend analysis, processing this data in data cube formats. If you are doing another type of spatial processing, I don't know, some sort of segmentation that deals more with space and needs to work with a big matrix rather than a single pixel through time, using different pixels across the x and y dimensions, maybe it's different, because you may be using different functions of these libraries, so it's difficult to generalize. My suggestion is really the drop-in replacement: you can just keep the same code and change the library at the low-level implementation. We will do it here in Python, but again, these implementations are not language-specific, they can be used from different languages; that's the main advantage. And yeah, it's important to do a benchmark to understand which library is best for your application.
First I will clean up here, and let's go back. If you open a terminal, at this point you already know that we have this ODSE workshop folder, and inside this folder we have the code; we already have the repository for the training sessions. The first thing you need to do is a pull to get the most updated code. I already did it, but in your case you should receive some more content, because we are updating this repository and adding the content for the training sessions; at the end of all the training sessions we will keep the repository with the presentations as well, and everything will be here. So the first step is to get the updated code. I will keep the terminal open, open my presentation again inside the VirtualBox, and open JupyterLab here, so it will restart. If it's your first time, you will see this; you can enter the same folder where you got the new code, open the Python training sessions, and we will execute this notebook, "High performance computing in Python". The first thing I will do here: I don't need to see the file browser, and there is a way to hide it, yes, Ctrl+B, okay.
Mainly, in this training session we will use these functions. Bottleneck is a collection of fast NumPy functions written in C, and importantly, Bottleneck and all these other implementations handle NaN (not-a-number) values. If you remember the other Python training sessions: when you read the data and you have no-data, for example a pixel that we are not mapping, like in the ocean, it is read as NaN in Python. So when you perform an operation across time, with the data organized in a data cube structure, and you have no-data values, you need to ignore them, otherwise your whole computation will be compromised. It's important that all the operations we will show here ignore the no-data values so they don't affect your calculation. If you click here you can see all these functions, but I will not do that now. You have NumExpr, which is more of an expression evaluator; it optimizes memory usage and you can define your calculation, as in NumPy, using different mathematical expressions; we'll use it to calculate NDVI. And you also have Numba. Numba, for me, is the most robust framework of the ones we have here; it's a just-in-time Python compiler, so using this framework you are able to translate a function that you write in Python into compiled code, just by writing Python, and it will perform the operation like compiled C code. It's a fast way to optimize your workflow, but you need to change some of your functions: if you have an operation to do, like NDVI, you need to create a new function and implement it, but in most cases that's doable. There are a lot of other functionalities, it's a really robust framework, and here we will use it specifically to present two concepts of the library. We will use array reduction, where we will optimize a median calculation across different years, and we will also use the vectorize function, which actually creates a C function just in time. What you need to do first is set up the main Python modules and libraries.
You can see here that we are using NumPy; I will clean up the output here, so we restart the kernel and clean the output. The first thing is really to load the modules and libraries: you can see we are loading NumPy and pathlib to deal with the file paths, and we are using three functions from eumap, find_files, read_rasters and save_rasters, to find rasters, read them in parallel, load them into memory, and put them in a NumPy array in a data cube structure; and we use the plotting helper just to see the result. In this VirtualBox you have two tiles available, one for Greece and another for the Netherlands, specifically for Wageningen. Here we just set two variables for the data, and we'll use this raster directory variable to access the input data for the demonstration.
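Roughly, the notebook setup looks like the sketch below; the eumap import paths follow the library's documentation but may differ between versions, and the tile path is illustrative:

```python
import numpy as np
from pathlib import Path

# eumap helpers used in this session (import paths assumed from the docs)
from eumap.misc import find_files
from eumap.raster import read_rasters, save_rasters

# illustrative location of the local tile shipped for the session
raster_dir = Path('~/data/netherlands_tile').expanduser()
```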
First we will access the Landsat data. As I presented yesterday, this is a temporal dataset: you have several files, each file representing one season of one year, so for each year we have four files, and in the file name you can see the time frame used to produce it. In this data, produced for all the Landsat scenes, for each season we calculated three percentiles: the median, the 25th percentile, and the 75th percentile. We used it a lot in the Python training sessions, but here we will use one of the spectral bands to calculate our reduction. Before reading the data, I would like to explain what this array reduction is. It's an approach where you have a multi-dimensional array; here we are dealing with a three-dimensional array, two dimensions for space and one dimension for time, so it's a data cube, but in NumPy you can have multiple dimensions, ten dimensions, it doesn't matter. If you loaded, for example, all the Landsat data, you would have four dimensions, because you have the spectral dimension with all the spectral bands, the temporal dimension, and two dimensions for space. You can put everything in one single array and do the same operations we are doing here.
Mainly, array reduction is a way to reduce one of these dimensions considering a specific operation. Here is a textbook example just to explain: you have a two-by-two input array with these numbers, and if we perform a maximum operation on axis 0, the first axis, we get as output a new array, now with one dimension less. You can apply this concept to any kind of raster data, you just need to find a way to put the data together, and in eumap we are providing the functions to do that specifically for temporal data.
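In plain NumPy, the textbook example reads:

```python
import numpy as np

arr = np.array([[1, 2],
                [3, 4]])    # input shape: (2, 2)

# reducing axis 0 with a maximum drops that dimension
out = np.max(arr, axis=0)   # -> array([3, 4]), shape (2,)
```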
When you find the files, create the file names and put them in a list, the function will read the files in temporal order, because that's the same order we have in the file names. We will do it now: I will execute the code, and I'm providing a function to show some information about it; we actually used this function a lot yesterday. You have this raster directory here, and I will execute it in an individual cell; it's better if I just get the first results. You can see that in the raster folder we defined at the beginning of the training session we are just looking for the Landsat red band, percentile 50. We have this object here, it has 84 files, and I will print just the first five: you can see it starts from the first date we have, considering the product that we prepared, which is the winter of 2000, and it's sorted. Inside this function, a eumap function that we developed, we sort the files, and you just need to send the files to read_rasters; then you have a data structure, a multi-dimensional array of 1000 by 1000, the size of our tiles, by 84 dates. Here we have the whole Landsat red band loaded in memory, and you can see that we are using float32.
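As a sketch — the glob pattern is an assumption based on the file naming described above, and read_rasters' exact signature may differ between eumap versions:

```python
# find the red-band percentile-50 files; the sorted file names
# match the temporal order encoded in them
files = find_files(raster_dir, '*red*p50*')

# read all rasters in parallel into one (1000, 1000, 84) float32 cube;
# no-data values arrive already replaced by np.nan
data = read_rasters(raster_files=files)
print(data.shape, data.dtype)
```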
If you check these values, and I'll do it here just to show them, you can see that these files were saved as byte; I'll just get the first image. But when you load the data, you need to have it in float, so you use more memory. Most of these libraries are optimized for float, and the other point is that in Python you can only use NaN as a value inside an array with a floating-point data type. We optimized the files, so these are files saved as byte, with values between 0 and 255, but to deal with them in memory and to optimize the processing in Python you need to convert to float, mainly for those two reasons: most of these libraries are optimized for it, and you need to manage the NaN, which is essentially the no-data value. This read_rasters function reads the data, puts it together, and since all these files have metadata informing the no-data value, it already replaces the no-data value, in this case with NaN. It manages that for you.
It's 320 megabytes in memory at the moment? Exactly. And how big was it on disk? We can check; we'd actually need to check each file separately, but let's get just one. I will just get the name here, and you could multiply it, although each file will probably differ a bit because it depends on the values, but it will be something like that. One thing that's important: when you save the file as GeoTIFF you have compression, a compression algorithm that reduces the file size, and you have the data type, which uses less space, just eight bits, whereas here in float32 we are using many more bytes. Those are the two main reasons. You can use this function to see the data type, and I'm just printing the size of the array here.
For all our benchmark calculations here I will use %%timeit; it's a magic command specific to Jupyter, and you can define the number of repetitions. That's really important, because when you do a benchmark you need to execute it multiple times; otherwise the result depends on circumstances, and in this case even my computer is doing other things, because other software is running. You will get a better and more reliable estimate if you use multiple repetitions. What I'm doing here is five repetitions, and I will use NumPy's nanmedian and reduce the temporal dimension, so I'm calculating a median across all the seasons and all the years: a long-term median of the Landsat red band. To do it I just need to execute, and I will see the time. Remember, this takes a while because it executes five times, and in the output we will see the average of the computation time and the standard deviation over the five executions. Is it parallel? No, because NumPy doesn't run this in parallel automatically; this is the regular implementation of NumPy.
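The baseline cell looks roughly like this; %%timeit's -r flag sets the number of repetitions:

```python
%%timeit -r 5
# long-term median of the red band: reduce the temporal (last) axis,
# ignoring the NaN no-data values
np.nanmedian(data, axis=-1)
```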
We can see it here, and that's a good point: if you watch the monitor and execute again, you can see that it's just one core, and they have reasons for that. Depending on how you split the process, you create new processes, you need to send the data, process it, and aggregate it again, so at some point it's not worth it. Other libraries did implement it, for example one of the libraries you'll see here, but this one uses just one core; again, this is just the regular implementation of NumPy. Now we will do the same thing using Bottleneck, and Bottleneck is a good example of a drop-in replacement: it's the same signature, the same method, and I'm sending the same data and the same parameters. I will do the same operation, I just need to change the module name.
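The Bottleneck drop-in, sketched:

```python
import bottleneck as bn

# identical call shape to np.nanmedian: only the module changes
%timeit -r 5 bn.nanmedian(data, axis=-1)
```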
If I execute it, let's see: yeah, it's about four times faster, and in this case it also used just one core. Maybe they use more optimized functions for the same operation; maybe they did some algorithmic optimization inside each of these functions. Bottleneck is not fully compatible with NumPy, but, for example, maybe they found a way to implement the median calculation using some heuristics that work better. One thing that is nice to share: we use float16 a lot in the project, and float16 has the same properties as float32, so you can still have NaN no-data, but of course you are using 16 bits, so it's two times smaller, you use two times less memory; you can see here it's 160 megabytes. But if you perform the same operation — I just changed it, and read_rasters, our function, handled it, so now it's float16 — and execute it, the computation time is bigger. So it depends on the data format you are dealing with, and float16 is not the most optimized format; I frankly don't know the reasons behind it, but you can see this function is much slower, and we discovered it during the project: some of the calculations I was doing inside eumap used float16, and it's six seconds now. If you do GPU processing, for example, you will have the same problem, so to really optimize your processing workflow you need to manage the data format and understand that there are some limitations. In general float16 is a limited format, but of course it uses less memory: if you want to process your data faster, you need to use more memory, which is what we are seeing here in the context of these libraries. I will execute it again.
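The float16 experiment, sketched. As a likely explanation (mine, not from the session): most CPUs have no native float16 arithmetic and libraries often lack optimized float16 code paths, so the values end up converted and processed on a slow path:

```python
# half the memory of float32 (~160 MB vs ~320 MB for this cube) ...
data16 = data.astype('float16')

# ... but the same reduction runs noticeably slower
%timeit -r 5 bn.nanmedian(data16, axis=-1)
```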
It's the same code, I'm just changing the data format here. What is nice: this is not parallelized, but here I'm doing a median calculation that is four times faster, and if you are processing, say, continental Europe, as we are doing, that's a remarkable improvement. But what's really important: you need to check whether the data is the same, because maybe, I don't know, they could have reduced some precision in the calculation. If you check with array_equal you will see that they are exactly the same: we used two different functions that produce exactly the same result. I'm using the right function for that, np.array_equal, and you can see it's exactly the same value for all the pixels we are generating here, and it's four times faster. You can use other functions too, and we will do that now with this package. But the example we have so far is a drop-in replacement: it's just a matter of changing the module you are importing, and you get a faster implementation.
With Numba it doesn't work like that: you really need to do some work. It's not hard work, but you need to implement part of your workflow, and as I said, it's a more complex library. Here we will use this @jit decorator, a just-in-time compiler, and you have different options: you put the decorator on your function, you define that it's a nopython implementation, and you can enable parallelization as well, as we will do. As I said, there are multiple ways to optimize your processing workflow with Numba, but Numba works in a different way and doesn't provide this axis argument. You can check it here: all the functions compatible with NumPy are listed, and they try to use the same signatures, but you can see that here it's only the first argument. We are using the same function as NumPy, but for their specific reasons they didn't implement the axis reducer that tells it to calculate the median across the temporal domain. You need to manage that in your implementation, and here I'm just showing how to deal with it.
We have a three-dimensional array, 1000 by 1000 by 84 dates, but since Numba's function only works without the axis argument, it's just a matter of changing the structure of your data. What I'm doing here is a reshape, keeping the last dimension and putting together all the pixels of the space. I think this is pretty common in R, because that's how some of the packages work, but with NumPy we usually work with the data in its three-dimensional structure; here it's just a matter of doing this reshape, keeping the temporal dimension intact and putting together all the pixels in space. Yeah, that's a good point: you could process each image individually, that would be an option, but not as one single chunk of data. You are right, in this operation you have this limitation because of the way some of the NumPy operations were implemented in Numba.
So, to use Numba, you actually need to apply this @jit just-in-time compiler decorator, and you can see that I'm creating a reducer function; I have the last dimension here and the output dimension. I just showed the shape that we will put into Numba, but here I'm actually doing the reshape, changing the data from a three-dimensional to a two-dimensional structure, keeping the last dimension intact, and sending it to Numba. Inside the Numba function I need to create a new array with the same size as the input data, and I need to use this prange: in Python, range is the common function to iterate over data, and prange is an optimized range that parallelizes the workflow for you. Here we are creating a prange across this one million pixels, and this i will be each of these lines; now you can just call nanmedian, do the same operation per pixel, store it in the output array, and return the result. Up to here, the data is sent to Numba, processed, and we receive the result; but of course we would like to have it in a three-dimensional form, as with the other libraries, so I just reshape the output again. Let's bring up the terminal, and if you execute it you can see it's now using all the cores, and now it takes about half a second.
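A minimal sketch of this Numba pattern; variable names are illustrative, and the reshape bookkeeping follows the description above:

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def nanmedian_reduce(pixels):
    # pixels: (n_pixels, n_times); Numba's np.nanmedian has no axis
    # argument, so we reduce the temporal axis one pixel at a time
    out = np.empty(pixels.shape[0], dtype=pixels.dtype)
    for i in prange(pixels.shape[0]):  # prange parallelizes the loop
        out[i] = np.nanmedian(pixels[i, :])
    return out

# (1000, 1000, 84) -> (1000000, 84) -> reduce -> back to (1000, 1000)
flat = data.reshape(-1, data.shape[-1])
result = nanmedian_reduce(flat).reshape(data.shape[0], data.shape[1])
```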
It's not a lot of coding work, but we needed to implement our approach and make some changes; there is no free lunch. Either we do that to improve the result, or we just use a drop-in replacement, as we did before, and improve only four times; here it's eight times faster than the original implementation. And again, you need to compare: it's exactly the same result, I'm not losing anything in this process, I'm really just improving the performance. Here I'm just using the plot function to see the result; of course, I already compared all the values, and visually it's the same thing. This is the final summary, and just to explain, it's related to %timeit: each time I executed it I used the -o flag, so the output of the benchmark is saved in an object, and at the end I can generate a visualization putting all the results together. Here I'm getting the average and the standard deviation of the five executions for all three libraries.
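The summary is built from the TimeitResult objects that %timeit returns when given -o; roughly:

```python
# -o makes %timeit return its result so it can be stored and plotted
res_np = %timeit -o -r 5 np.nanmedian(data, axis=-1)
res_bn = %timeit -o -r 5 bn.nanmedian(data, axis=-1)

# each TimeitResult exposes the average and standard deviation in seconds
print(res_np.average, res_np.stdev)
```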
Definitely Numba is way better, and it produced exactly the same result. This was an array reduction operation, and just to emphasize: here we are using the median function, but these implementations support a range of reducers. Bottleneck doesn't have all of them, so you need to check each library separately, but you can perform several operations, and for some operations it will be faster; a maximum is probably simpler to optimize than a median, and a percentile is more difficult to optimize. So again, this is not a golden rule that it will always be faster, but for sure, if you use Numba or Bottleneck, you should see some improvement over the original NumPy implementation. I will now show the numeric operation, and the numeric operation is a bit different. In the array reduction we have the input array and we reduce one of its dimensions, so it's mandatory that the output is smaller than the input, because we are reducing the data; here we reduced the whole temporal dimension of this exercise to one single layer, removing the time dimension by calculating the median. A numeric operation is some type of calculation that receives an input and generates an output with the same shape as the input, not the same values.
It performs some operation element-wise, so it's perfect for calculating, for example, a spectral index: we have two images, one image for red and the other image for near-infrared, and we will generate the NDVI. We are receiving two inputs, but each input has the same structure, the same shape and dimensions as the output. Here is just an example: we have two inputs, two-dimensional arrays in NumPy, and I'm just multiplying the values; you can see the result, two input shapes with the same dimensions, and the output shape has the same dimensions as well.
Now we need to read the data, and since we will calculate an NDVI across this data, we need two bands: the red and the near-infrared. What I will do is find the files to read the red and the NIR, but I will read just 40 images, the first 10 years, mainly because I want to keep roughly the same runtime as the last exercise; but you can play with that and process the whole time series. You can see the shape here; again it's the same concept: the files are returned sorted, and we use that to read the data in temporal order, because it's the same order as in the file names, and here we are reading two datasets, so now we have the red and the NIR data. It's a pretty simple NDVI calculation; whoever works with satellite images and environmental monitoring knows this index. Here I'm performing regular operations, because with NumPy I can do that: the code is written as if it works on a single variable, and here it's a three-dimensional array, and it is performed the same way. So I'm just calculating the NDVI here.
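In plain NumPy the NDVI expression broadcasts over the whole cube; the array names are illustrative:

```python
# NDVI = (NIR - red) / (NIR + red), element-wise across the whole
# (1000, 1000, 40) cube in one expression
ndvi_data = (nir_data - red_data) / (nir_data + red_data)
```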
Let's see how our RAM is doing; okay, it took a while, because it's five repetitions, and you can see here the standard deviation across the five repetitions. For this numerical operation I didn't see any implementation or function in Bottleneck to help with it, so in this case only Numba provides one, and with Numba we'll use this @vectorize decorator. Mainly, if you think about what we are doing here, we are calculating a subtraction and a division over our arrays, but the way I write the code is the same as if it were a single value. This type of flexibility is provided by the NumPy universal functions (ufuncs): it's a paradigm that implements different ways to access these array structures and provides high-level usage, so you can easily perform operations across multiple multidimensional arrays. What Numba does here, when you use this @vectorize decorator — you can check the documentation for more — is implement the same structure, the same NumPy universal function paradigm, in C. What is nice is that you can keep writing the code the same way and it manages everything for you: it's a just-in-time compilation and it optimizes the process. But to use @vectorize, you need to define the input and output types.
That's it. Here we are defining float32 because, in the end, this function will be converted to a C function, and these data types are really important when you do it in C. You can see the return here; it's exactly the same code, so in this case it's almost a drop-in replacement. You can execute it and see that it takes about half the time: it's an improvement, of course, but not five times faster like the other one; still, it's pretty straightforward, so you can use it to optimize. And here you can perform any kind of operation: different mathematical operations and calculations are supported with this @vectorize decorator.
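A minimal sketch of the @vectorize version; the signature list is Numba's API, and the body mirrors the scalar NDVI formula:

```python
from numba import vectorize, float32

@vectorize([float32(float32, float32)])
def ndvi(red, nir):
    # written as scalar code; Numba compiles it into a C-level ufunc
    # that broadcasts over arrays of any shape
    return (nir - red) / (nir + red)

ndvi_data = ndvi(red_data, nir_data)  # called like any NumPy ufunc
```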
And of course it's important to compare whether the data is exactly the same and check that we are not losing anything. And yes, it's exactly the same, so we are really optimizing without changing the output.
In this context, the last library is NumExpr. It's difficult to describe, but it's essentially a NumPy expression calculator; it's a nice library and it's really simple. The main method is this evaluate method, and all these functions, like cosine and arctan2 and things like that, are supported, so you can write your calculation as a text expression. You can see this is actually a string; here it's the calculation for the NDVI, and I'm defining the parameters in a second object: 'nir' is mapped to the NIR data and 'red' is mapped to the red data. So you just define the expression and send the parameters; you could also send a constant here, for example if you were calculating EVI2, which has some constants, you could define them here or inside the expression, it doesn't matter. And here you are sending the expression, the parameters, and this optimization argument; it has a different value here, 'aggressive', and I just mention it because you have three options, none, moderate and aggressive. I'm not sure, but I think it's related to memory usage. If you execute it, it also runs in parallel, and you can see it's a bit less, or here almost the same time as Numba.
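Sketched below; local_dict and optimization are real numexpr.evaluate arguments, while the array names are illustrative:

```python
import numexpr as ne

# the expression is a plain string; names inside it are resolved
# through local_dict, and evaluation runs multi-threaded in C
ndvi_data = ne.evaluate('(nir - red) / (nir + red)',
                        local_dict={'red': red_data, 'nir': nir_data},
                        optimization='aggressive')
```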
And you need to check if we are producing exactly the same date and you can actually compare the results now and see the performance. And let me check the chat.
So, yeah, we have one question: are there general rules for deciding when to use Numba, or just maximum speed while reducing memory usage? I think that is a good question. It depends on your workflow. The difficult part here is really integrating these libraries into your processing workflow; that is a certain amount of work. With Bottleneck, for example, it is just a matter of changing the module import, but maybe some functions you need are not available in Bottleneck. So the first point is really the implementation cost, and it depends on each application. In general, I assume Numba is probably the most robust framework here, and of course it will require more implementation work. For me it is a trade-off, a balance between the effort to implement part of your workflow and the speed you gain. Maybe you can just use Bottleneck, changing only the module import, and improve a lot without having to change anything else; then that works for you. I think that is kind of the general rule here. And of course, the learning curve for Numba will be bigger for sure, because NumExpr, this expression evaluator, is just one method, so it is really easy; and with Bottleneck it is just a matter of changing the module import.
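For comparison, the Bottleneck swap really can be a one-line change; a minimal sketch, assuming a NaN-aware reduction over a large array:

```python
import numpy as np
import bottleneck as bn

data = np.random.rand(2000, 2000).astype('float32')
data[data < 0.1] = np.nan   # simulate masked pixels

m_np = np.nanmean(data)   # plain NumPy version
m_bn = bn.nanmean(data)   # drop-in Bottleneck equivalent, usually faster
```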
So Numba is the best option, I assume, but it will require more implementation effort and the learning curve will be steeper. And okay, so now what can we do? Yeah, we processed four NDVI images, right?
And you can check them here. This is just a function that we already used in yesterday's training session, so you can see the NDVI values for one year, the four seasons. But I think it is a good idea to save the images, right? We processed them, we did that exercise, and we are generating new data. And I like to show this save_rasters function, because it is really handy. If you look here, I will just open it: this is our NDVI output, a NumPy array. Mainly I will send this NumPy array to save_rasters together with a list of all the files that I want to save. If you think about it, we will generate 40 files here, and each file needs a name, because we are putting all these temporal bands in separate files; that way it is easy to access one specific date if you want. But to save it, I need to save it as a raster, and if you think of a raster format, there are different parameters: projection, pixel size, and things like that. To deal with that, we send one base raster; it is just one example file, and I will just get it here. I am generating a list of file paths where the data will go, and now I can send this list. Of course, the list needs to have the same number of files as the size of the last dimension of my NumPy array. And the base raster is a file that actually exists; it needs to exist, because inside the function we read all the parameters, like the projection system and everything, and save them. This function is fully optimized; here it is using four workers, which is a kind of default value, but you could use 10 or 20. If you use 40, it may not make much sense, because you will hit a bottleneck with your disk.
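A rough sketch of this save_rasters call follows; the argument names and order are my reading of the eumap API and may differ slightly, and ndvi_data stands for the NumPy array from the session:

```python
from eumap.raster import save_rasters

# One output path per band in the last dimension of the array
fn_files = [f'ndvi_{i}.tif' for i in range(ndvi_data.shape[-1])]

save_rasters(
    'red_band_example.tif',  # existing base raster: provides projection, pixel size, extent
    fn_files,                # list of output files, same length as the last dimension
    ndvi_data,               # the NumPy array to write
    dtype='float32',         # avoid truncating NDVI values to byte
    nodata=0,                # nodata value for the output
    n_jobs=4,                # parallel workers (the default mentioned above)
)
```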
But here we saved all 40 files. And now let's open them in QGIS. And here you can see the files.
And one thing that is important: if you remember, this file is in byte format, because we optimized it, so the values are between 0 and 255. But here we calculated the NDVI and did not rescale the values. So I am sending some additional parameters: I tell save_rasters that I want to write a different format, float32. Otherwise I would truncate my data, because all the values would collapse to a single constant; you cannot represent floating points in byte. So if you open it here, you can see the NDVI that I calculated. And probably I used the same nodata value here, so all the values equal to zero were classified as nodata. That is a problem, but we can define the nodata value for the output in save_rasters. So this is mainly the output of the calculation we defined, and you can change the nodata value here if you want. So, until now we did everything for just one tile. For me, that is a good way to start, because you have a pilot area, a small dataset, and you can really tune things and find the best approach. This is part of the optimization process.
The other part is really production work. Now, for example, we can take all these implementations, or actually just one, the best one, and run it in a production way, processing several tiles at once, in parallel, and splitting the work, for example, across different servers. The point is that we optimized for one single tile, but of course, depending on your study area, you have many more tiles; in our case, in the Open Data Science Europe context, we have more than 7,000 tiles. What I will show here: one single tile looks like this, but if I wanted to load all the tiles in the context of the project, it would take 26 gigabytes for just one date. And imagine processing the whole year, all the pixels, 84 dates, 20 years, all these seasons we are processing: that is a lot of RAM, more than half a terabyte. Some computers have that, but the problem is that all this data is compressed and saved on a storage unit, so it would take a long time just to load everything into memory and later write it back to disk. You have a data-access problem here that is actually unavoidable. The best way to deal with it is really to split the processing. And of course, if you use more servers, you need to be aware that it is important to provide a fast way to access this data through the network, because multiple servers will all access one single place. And the best option is to use a cloud object service: there is the S3 service, Google has one as well, and you can use them for processing. But of course, if the data is in their infrastructure, you need to have the servers there too and run your jobs there; in some cases that is okay. But if you have your own infrastructure, with your servers running in-house, you can use an open-source implementation of the S3 API called MinIO. So, what I will do now, I have just 15 minutes, let me see.
So yeah, first I would like to show this: our current processing workflow. To process Europe, we have more than 7,000 tiles, and we send just the bounding box. Each of these tiles has one bounding box, like a window object in rasterio, and we send these bounding boxes to different servers. In this example, we are using four servers, and each server has Python and Docker, the same environment. Most importantly, these servers access the data storage through S3. Actually, here we are accessing all the files directly over an HTTP connection, so it is not strictly S3, because you do not need any permission or credential to access them. But of course, you need to provide a good and reliable network connection here: all these servers are on the same network switch, and there are options for that, like optical fiber, so for in-house processing there are multiple possibilities. So we send just the bounding box, each server reads all the data that is necessary, does the calculation, and saves the result back to the SSD NAS; to save, we actually need to use the S3 protocol. And later we use GDAL, our good friend: first we generate a virtual raster, so it does not matter how many tiles you have, GDAL will manage it. We have generated files from more than 100,000 tiles, so it is not a problem. GDAL creates this with gdalbuildvrt, a virtual mosaic, as I call it, that points to the data, and using this virtual GDAL VRT we basically do a gdal_translate and generate all the cloud-optimized GeoTIFFs. And this is how we generated, for example, the NDVI data, the red band, that we are accessing now. And I will do a short demonstration here, accessing the data through the cloud-optimized GeoTIFF and our S3 service.
So you can see here, I have a URL, and this URL points to the NDVI that we processed for the whole of Europe, for a specific date. And this is nice: I can just pass this URL and rasterio will open it. I am using rasterio here just to inspect the parameters, so I can see the whole image, everything.
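Opening the remote file with rasterio looks roughly like this (the URL is a placeholder, not the real one):

```python
import rasterio

# rasterio/GDAL read HTTP-hosted COGs directly, fetching only the
# byte ranges that are actually needed
url = 'http://example.com/ndvi_europe_2020_spring.tif'

with rasterio.open(url) as src:
    print(src.crs, src.width, src.height, src.dtypes)
```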
And what I will do now: in eumap, we developed this TilingProcessing, and all this mechanism of sending one specific window to multiple CPU cores is managed inside it. We mainly have three functions: one to process a single tile, one to process all the tiles, and one to process just a range of tiles. And here, I am just getting one specific tile. What is important is that I am sending an HTTP URL.
It will be received here by this function, and you can see that in this function I receive the window, which is linked to this tile ID, just a sequential ID. And since I did not send any parameter here, although I could send a different tiling system, it automatically uses the tiling system from Open Data Science Europe, with its more than 7,000 tiles. So here, I am getting the tiles from the cloud; the method will just pick up the tile with this sequential ID. And with this tile, I call read_rasters, the same function that we used before: I send the URL and read with just one job (I could read in parallel, but here I am using one), and, more importantly, I use the spatial window. So here, I am reading only a specific part of my data from a cloud-optimized GeoTIFF, and the result I want to receive is in byte format. And I am using the plot function, so I can see the result here.
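Put together, reading one tile's window might look roughly like this sketch; the eumap class, function names, and callback signature follow my reading of the library and may differ:

```python
from eumap.parallel import TilingProcessing
from eumap.raster import read_rasters

url = 'http://example.com/ndvi_europe_2020_spring.tif'  # placeholder URL

def read_tile(idx, tile, window):
    # read only the pixels inside this tile's spatial window
    data = read_rasters(raster_files=[url], spatial_win=window, n_jobs=1)
    return data

# With no arguments it defaults to the ODS-Europe tiling system (7,000+ tiles)
tiling = TilingProcessing()
result = tiling.process_one(0, read_tile)   # tile 0, by sequential ID
```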
And of course, you can play with this parameter; you could try to view the whole region of Europe, and it will do the same thing. Here we can see some stripes because of the Landsat data, and you can check all the regions. This works entirely over the network. And you can process multiple tiles: here it is just the same function, but now I am passing a range. I will pass a range of three, so I can see three different tiles, and this is the location of each window.
But what I will do now: considering this URL, where I already read the NIR band, I will take this URL and build the raster file names, just replacing part of the URL; let me show you an example. So this is a remote URL, and I am accessing it; now I will calculate the NDVI, reading directly from the cloud-optimized GeoTIFF, but I will do it for three tiles. And you can see it is the same approach: with read_rasters I am reading specifically for a window; to optimize the processing we need float32; and here I use NumExpr, so I just call the evaluate function and I have the result here. [Audience question about how many of these raster requests the remote service can handle.] No, I never tested.
There is some limitation, but for example, here I am using 16 workers, because there are two levels of parallelization: the first one, as you can see here, is that the three tiles are processed in parallel, and then each of these tiles is read using four cores. But I think the limitation on their side would be specific to one IP; I do not know, because if you have multiple IPs accessing it, they need to provide some type of scalability.
So here we did the same thing, but now with real data, accessing a specific part of the world; we processed it and generated the output here. And I will just open it to show, as we processed it. [Audience: when you have done it, you can see how much data you downloaded, right? It should be some tens of megabytes.] Here we downloaded the full resolution, so it is the same size that we estimated earlier. [Audience: you downloaded almost 50 megabytes.] Yes, yes, 50 megabytes, exactly. So, and here I will just open the data that we produced. And this is another thing that is nice: save_rasters will create all the folders for you, so you just need to define the output, and I defined it here, like an NDVI production folder. So I am just saving all the data, and here we have the three tiles; you can open them in QGIS. Yes. And you can now use a GDAL VRT to put these tiles together and generate one single file; this is actually our current workflow.
And the last thing: these are the commands to generate it, and yes, you can see that I changed the format. Okay, I would just like to explain this structure. You can see that you can play with this file structure here, and of course you can send it later to the S3 server. What I did was just keep the file name the way I want to see it: here is the NDVI, I am accessing the red and the NIR bands, and I am generating the NDVI specifically for this date. So when I enter here, it is a nice structure, with all these tiles that I will generate, more than 7,000 files in a single folder. So I can just use the gdalbuildvrt command, pointing at all the files inside the folder. It does not matter whether I have three files or 7,000; it will work the same way, and you can replicate this command.
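The same commands can also be driven from Python through the GDAL bindings; a minimal sketch (the 'COG' output driver needs GDAL >= 3.1, and the paths are illustrative):

```python
import glob
from osgeo import gdal

# Virtual mosaic over every tile in the folder: works the same for
# 3 files or 7,000
tiles = glob.glob('ndvi_production/*.tif')
vrt = gdal.BuildVRT('ndvi_mosaic.vrt', tiles)
vrt = None   # close the dataset to flush the VRT to disk

# Translate the virtual mosaic into a single cloud-optimized GeoTIFF
gdal.Translate('ndvi_mosaic.tif', 'ndvi_mosaic.vrt',
               format='COG', creationOptions=['COMPRESS=DEFLATE'])
```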
And yeah, that is all I have to show you. Let's see if I have questions here in the chat. No. Okay, we have three more minutes for discussion. Yep.
[Audience question, partly inaudible.] There are two more: Bottleneck, and the other is NumExpr. Yeah. Yeah. Yeah, I think it is because, for example, you can use the regular functions in R and Python and just change the MKL, so there is no change to the code; but the frameworks that I presented here, especially Numba, are really robust. They are investing a lot of money to provide this easy way to convert Python code into C++, or C, sorry. So I think this is what makes the library robust. And I think in R, the way R manages parallelization is sometimes different from the way Python manages it. But it is nice to compare.
And one last thing I want to show, which I did not include because I really liked the idea of presenting these libraries: here in eumap, in the parallel module, we have one important function, apply_along_axis. You can see it is actually based on one of the best answers to a Stack Overflow question, and the code is really nice; we just changed and adapted it to put it here. This function processes an operation over a NumPy array in parallel: each pixel is sent to a different core. It is kind of the same thing you would do in R, but this function is better suited to high-level functions or applications. For example, if you want to calculate a linear model over a time series, this function is probably better, because I do not know if that is possible in Numba; Numba is more for generic processing and statistical operations. For more serious time-series analysis, break-point analysis, or things like that, you probably need high-level Python functions, and this is a good starting point: a regular Python implementation that processes each pixel in parallel, using all the cores available.
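As a generic sketch of this idea (my own minimal version, not the eumap implementation): split the array across processes and run np.apply_along_axis on each chunk.

```python
import numpy as np
from multiprocessing import Pool

def trend_slope(ts):
    # any high-level per-pixel function, e.g. a linear trend over time
    x = np.arange(ts.size)
    return np.polyfit(x, ts, 1)[0]

def _apply_chunk(chunk):
    return np.apply_along_axis(trend_slope, -1, chunk)

def parallel_apply_along_axis(data, n_jobs=4):
    # split rows across workers, apply, then stitch the results back together
    chunks = np.array_split(data, n_jobs, axis=0)
    with Pool(n_jobs) as pool:
        return np.concatenate(pool.map(_apply_chunk, chunks))

if __name__ == '__main__':
    data = np.random.rand(100, 100, 84).astype('float32')  # rows, cols, dates
    slopes = parallel_apply_along_axis(data)
```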
The question in the Zoom was: when you use a standard Python library, does it use MKL by default, or only when you use Anaconda? It depends on how you set up your environment, actually. Here in the VirtualBox, we did the setup using our Docker image and Conda. So, for example, you can see here, in the eumap GitLab repository, how we created this Docker image, and with Conda you can actually inform the BLAS implementation. And BLAS, as I explained, is this C library with the mathematical operations to manage arrays. So here you can choose MKL or OpenBLAS: if you are working on an AMD CPU, it is probably, no, for sure, better to use OpenBLAS; but if you have an Intel CPU, you can just install MKL directly with Conda. So that is the easiest way to set it up, and specifically for the setup of this training session, we already did it.
We are using MKL in all these libraries that I presented.