
Tools and packages to query and process Sentinel-1 and Sentinel-2 data with R and Python


Formal Metadata

Title: Tools and packages to query and process Sentinel-1 and Sentinel-2 data with R and Python
Number of Parts: 17
License: CC Attribution 3.0 Germany. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Production Place: Wageningen

Content Metadata

Abstract: This session focuses on the products from the ESA Copernicus program: Sentinel-1 and Sentinel-2. These products can be freely accessed in several manners and through different portals. We took a look at packages to query the data you need for your analyses using Python and R, switching between platforms when relevant. An introduction on how to process Sentinel data with both platforms was also covered, focusing on the particularities of the sensors. For Sentinel-1 data access, we used the ASF Data Service to query data. There is no need for credentials for querying the data, but if you want to try the downloading steps, you will need Earthdata Login credentials.
Transcript (English, auto-generated)
Hello, my name is Lorena Abad. I'm a PhD researcher at the University of Salzburg, at the Department of Geoinformatics. Today and tomorrow I'm going to talk about tools and packages to query and process Sentinel-1 and Sentinel-2 data with R and Python, and this is part of my contribution to the OpenGeoHub Summer School 2023. Awesome. So that's done. Let's get started. Just a quick overview of the sessions. Initially this was going to be one single session, but now it's split in two. So today I will just go through some quick intro slides, which are very quick. Then we'll have a bit of a hands-on exploring Sentinel-1 data with Python, and we'll ask some questions and do exercises. And tomorrow, if you're still interested and you want to come again, we'll work on exploring Sentinel-2 data with R and Python. So today we'll only do Python, and tomorrow we'll do this crossover kind of thing with Sentinel-2. Yeah. So for that reason, I more or less changed the title of today's session to tools and packages to query and process Sentinel-1 data with Python. We're going to leave the other two for tomorrow.
Great. Just a quick overview of the Sentinel missions, if someone's not familiar. They're from the European Space Agency, part of the Copernicus program. They have a free and open data policy. They have six missions, more or less: we have Sentinel-1 and Sentinel-2, but you also have Sentinel-3; Sentinel-5P has become quite popular for air pollution; and myriad others. I work with Sentinel-1 and Sentinel-2. I know a lot of people also work with the others, but that's why my sessions are focusing on these. Sentinel-1, quick facts. Well, Sentinel-1 is a synthetic aperture radar, which is a form of radar that is used to create 2D images and 3D reconstructions of objects like landscapes. Sentinel-1 itself has four operational acquisition modes and four types of data products. I will show this a little bit more later. And we have right now one satellite in orbit, because one is defective, and two more should be operational in 2024. So I want to ask the people in the audience: who is familiar with or who's using Sentinel-1 for their work? Very few. Okay. A little bit? Okay. Cool. I will start with my first disclaimer: I am not a SAR expert, so everything I'm going to talk about I've also learned from my colleagues, who are really the SAR experts. I hope I'm not making any mistakes, and if I am, excuse me. Okay. That was disclaimer number one. And just to do a quick overview of this missing satellite that I mentioned: you know, when we have this type of operation, and if you
have Sentinel-2 or Sentinel-1, these tandem satellites going around the Earth are giving us a lot of information. So Sentinel-1 passes every 12 days, each satellite, but when you have both of them, you have imagery every six days. And that's great. But the fact that the satellite had a malfunction was quite a blow. So this was before the malfunction: this is Sentinel-1A and Sentinel-1B together, and this is the coverage that we had for all the Sentinel-1 data. It was massive. Yeah. And this is what happened when we lost Sentinel-1B. Okay, don't get completely freaked out by it, because I guess you will notice that, wow, that's a really wide gap. Russia is a political decision; that's why there is no imagery on top of it. But for the tropics and Africa, Australia, we do have a lack of imagery, so this is impacting somehow the analyses that we do. That's just for you to have an overview, but we're optimistic. The satellite was supposed to be launched this year; it's postponed to 2024, but there will be more data available. So hopefully that will be solved, let's say. Yeah. So that's basically the quick intro I wanted to do, just really quick. So now we're going to get to the practical part of the session.
Good. I'll close that up. And we're going to work in Jupyter. First question, well, not first, another question: are people following with the Docker image that I put around? Besides you, who I know have problems, did it work? That's great. So if you could share that on the Mattermost, because I think there was one other person with a problem like yours. Okay, so maybe this fix would work. They were only working with the Python image, so that should be doable. If you have not started running the image, the instructions are in the README, but basically what you need to do is, well, first go to the directory where you cloned the repository. Then you run this docker run command, and this whole myriad of text will pop up on your terminal. And then at the end, wait, not so much at the end, sorry, okay, I sort of lost it. Ah, here: you will have this HTTP link. If you copy that into your browser, you'll be able to see the JupyterLab session. Okay. So let me know if someone has issues, or if someone else can assist with this, that'd be great. So I hope this is visible enough. I think so, for everyone. Should I make it bigger? Is it okay?
Some interaction. Yeah. A little bigger. It's too big now. Or is it okay? Okay. Fine. Yeah. I think it should be good. Okay. So let's get to it. As I was saying,
Sentinel-1 has several ways of providing its products. As this figure shows, you have stripmap, you have IW, you have EW, you have wave; these are all modes. And you'll also have different levels. You have level zero, which is the raw data. You have level one, where you have what are called single look complex (SLC) and ground range detected (GRD). And then you have level two, which is the highest level of processing, where things are more or less just products that you can directly use. We'll focus on level one; that's in general what is used for any type of Sentinel-1 analysis. And as I said, you have single look complex and ground range detected. We will take a look at, more or less, both, but the main difference is that single look complex still has all the information on the amplitude, the phase, all this information that comes with radar data, while ground range detected is a bit more processed, so you can directly work with the data. Single look complex is data on repositories that you still, more often than not, need to download to be able to process. This is what is used to do interferometric SAR analysis, so to calculate topographies or do subsidence analysis. I don't know if you've seen some of the applications of Sentinel-1: you can do, as I said, subsidence analysis, you can study tectonics, you can do some stuff in volcanology. It has many applications, and normally that's done with single look complex. Ground range detected, as I said, looks more at the amplitude itself, but it's already a bit more processed, and this is data that you can find more on repositories that are already open. For example, you can find it on Google Earth Engine, you can find it on the Planetary Computer, and I think OpenEO also has an API to fetch this data. So those are the two ways of doing it. And then the modes. I have my little cheat here. You have stripmap, which was designed to support the European Remote Sensing (ERS) and Envisat missions. Then we have the interferometric wide swath (IW) mode, which is the default mode over land. The extra wide swath (EW) mode is the default mode over the ocean, maritime, ice, polar zones, so really at the north. And wave (WV) is more over the open ocean.
So we are going to be focusing mostly on IW. So as I said, very few cloud computing capabilities are available to do these complex workflows on InSAR. So we still need to download the data. And to download the data, you already need to have a bit of an understanding of what's happening,
why you need to download so much data, how the processes work, and everything. So as I said, I am not an extreme expert, but I have worked with people who need to download hundreds of scenes of Sentinel-1 to do some differential InSAR analysis, which means they're
going to look at a time series of movement, of deformation, of landslides, or this type of application. What I was mainly involved in, as I mentioned also yesterday, is DEM generation, which is also complex, but it's a little bit similar in the sense that
you only need to think about having two images. You need to have a pair of images that will help you determine the topography of the ground. So the idea is that if you have one scene
taking a point in time one, and then you have a second scene taken more or less at the same time, not at the same time, in the same area, so it's looking at the same place at that point in time two, you can use these two images to estimate the topography. So that's the idea, and that's a little bit what the application of how to query the data is targeted to, of getting two images
that are paired to each other, where you can determine topographic information. So this is a bit how it looks like. So this will be like the satellite number one, this will be the satellite number two. These are happening in two points in time, and what we need to know is, for example,
they're not exactly on the same location, so they're having a different path. So what we need to know in one of the cases is what is the distance between satellite one and satellite two. That's what's called perpendicular baseline, and we also need to
know the difference in time, so whether six days or 12 days or 36 days passed in between. So this is also something that, if you're doing the analysis, you will need: to find pairs that have a specific perpendicular baseline and a specific temporal baseline, so that you can work with them.
Okay, any questions so far? I hope not because, oh yeah, okay, good. For the people who work with Sentinel-1, does someone know what M and S is? What does it stand for? So this is satellite one and satellite two. Yeah, exactly. This stands for master and slave, and this is just
basically saying which one is the reference and which one is, yeah, the primary and secondary satellite. On terminology, this is a whole thing, so if you're interested in that, there is like
here a link to a blog post on why we should not use these terms. So I normally use secondary primary for this, but there's still a lot of literature that uses the other ones, so don't get confused by that if you start looking for this. This is slowly changing, okay? Good. Okay, so as I said, I'm going to give you this example of if you were going to be
selecting data to generate a DEM out of a pair of images. Just the selection, not the DEM generation, for now, okay? So as I said, we would need a specific perpendicular baseline and a specific temporal baseline. After talking to my colleagues, they told me that the perpendicular baseline for DEM generation is optimal between 150 and 300 meters, so the distance between both satellites should be like this, and the minimum temporal baseline is optimal, meaning the least time between both satellite passes is best, because if you let more time pass in between, there are probably going to be some topographical changes that, when you want to generate the DEM, are just going to give you
bias, right? So that's not optimal. It would be optimal if we actually had two satellites passing more or less at the same time, but from slightly different areas. That's a bit what Envisat and other satellites did, because that's what they were designed for; that mission is over, but that's where you got quite a few of the DEMs that are now popularly available, based on InSAR analysis used to do this type of path. So basically, Sentinel-1 is not the best for this, but it's one of the applications that we can get out of it, okay? So what we're going to
do to query the data in this case is make use of the Alaska Satellite Facility. Why do I use this one and not other services? It was just way easier to work with. The API works very well, and you don't need to give any user login or anything just to query the data; you only need credentials when you want to download the data, and it's free to make an account. But querying is completely fine without one, so that's great, and that's why I use it. So we're going to import the libraries. We have the Alaska Satellite Facility search module, asf_search. I have GeoPandas, matplotlib, just to give you an idea of how things look, though I think that's only used later, numpy and pandas. So we're going to run this, and I hope this goes okay, otherwise I'd have a problem, because I don't have this run anywhere.
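For reference, the imports for this part look roughly like this (a minimal sketch; the `asf` alias for `asf_search` is my own choice):

```python
import asf_search as asf          # Alaska Satellite Facility search module
import geopandas as gpd           # read and explore the AOI
import matplotlib.pyplot as plt   # plotting
import numpy as np
import pandas as pd
```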
Okay. Taking a bit longer than usual. Let's hope it works. So we have the import of the libraries, and then we will define an AOI and start and end dates for our queries. So yeah, we're in Poznań, so I just have a GeoJSON of Poznań, a square on top of Poznań. You will see it as soon as this runs. We use GeoPandas to read that in, and we use the function explore to explore it. And we are going to define timestamps, so we have a date start and a date end. This is randomly selected, just 2022, May until October, just to have an idea. And what I'm doing here is translating the information from the AOI to WKT so that the query understands what is going on, because otherwise the information comes in a format that is not understood.
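A minimal sketch of this step (the GeoJSON file name is hypothetical; the WKT conversion itself shows up inside the query below):

```python
# Read the AOI and look at it on an interactive map.
footprint = gpd.read_file("poznan.geojson")
footprint.explore()

# Time frame for the query: May to October 2022.
date_start = "2022-05-01"
date_end = "2022-10-31"
```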
I think this works mainly with WKT, so if you can translate your KML to WKT, it will work. You can also give it a shapefile; as long as in the end you have a WKT, it will function completely fine. So whatever initial format you have will work, especially if you read it with GeoPandas: any format that GeoPandas understands you can then pass through this WKT conversion, and it will just understand what you need. Let me take a look at what's happening. This is a bit odd. That's not so nice.
Someone else having problems or are you already like completely running everything without me? Like if someone is more advanced than me, I might borrow the laptop. Cool. So that was a nice break. Oh, sorry. Nothing happened.
As I was saying, we import the libraries, we get our file, and with GeoPandas you can see very nicely what it looks like. I don't know if people here have more Python or R. How many Python users? Okay, the rest is R. Cool. Second disclaimer of the day: I am not so proficient in Python. I have a lot of cheat code there that I would sometimes normally have to look up. So yeah, bear with me, and if you have any "that's how you do it", do let me know, okay? Good. So as I said, we have the footprint, this footprint here of Poznań. We can transform it into a WKT, and that is directly understood by the query. So what does this look like now? Oops. It's just a geometry object, okay, in WKT format. WKT is when you have a polygon and then, in parentheses, all the coordinate values, right? Okay.
So what we're first going to do is the geographic search. We defined the extent in space and time, so we're now going to say: in this space and time frame, find me all the Sentinel-1 images, scenes, that you can find. For that we use geo_search, and we have to establish a platform, so that's PLATFORM.SENTINEL1; that's what we want. We could also say we want Sentinel-1A or Sentinel-1B, or in the future C or D, but we just want everything right now, right? Then we say it has to intersect with the footprint, so that's the footprint that we just printed before. As you see, I added dot geometry and then zero, because basically, let's do that here so that you see, we need to extract it from this format; the zero gets the first index, and then you just get this string in WKT. Then we can define the processing level. So again, remember long ago when I was talking about SLC and GRD? Yeah. So now we can define the product type as SLC, that's what we want, and we can define the start date and the end date. Here I've just restricted my results to 10, just for this demo; normally when I work with this, I just let it find anything, because I don't know which of the scenes will be the most valuable for me. Okay. So we run this to get the products, and that goes fairly fast.
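Put together, the query looks roughly like this (a sketch against the asf_search API; the variable names come from the snippets above):

```python
# Geographic search: all Sentinel-1 SLC scenes intersecting the AOI.
products = asf.geo_search(
    platform=asf.PLATFORM.SENTINEL1,           # 1A and 1B together
    intersectsWith=footprint.geometry[0].wkt,  # AOI as a WKT string
    processingLevel=asf.PRODUCT_TYPE.SLC,
    start=date_start,
    end=date_end,
    maxResults=10,                             # just for the demo
)
```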
Then we can look at the results. If you just display the products as they come, it's a little bit not so nice a way of seeing things. So what I did here is convert them to a pandas DataFrame so that I can look at it better, and this is what we got. We have 10, because that's the maximum number of scenes that I asked for, and then we have several sorts of information out of here. We have the beam mode, we have the browse links, the center lat, yeah, several types of stuff that I would normally be able to move around, but I think this is just a little bit too big, so let me make this a bit smaller and take a look at it. So yeah, we have the center longitude, the Faraday rotation, a bunch of information about this, right? One of these is the flight direction, so whether it's ascending or descending. We also have the processing time, the processing level, the scene name, the sensor. So Sentinel-1 has a C-SAR sensor; there are other SAR satellites that have an L band or an X band. This is mainly the way they're taking the data. For example, L band is better at penetrating through vegetation, and yeah, all types of other things that I'm not an expert at, but Sentinel-1 has C band, which sometimes gives us problems. Anyway, you also have a URL, which is where you can fetch the data from, and a frame number, which is a bit like the orbit, like which direction it's going through. And remember I was talking about the modes: IW, EW, and the other ones, the waves. Here you'll see all of them are IW. You can also define that you just want IW, or you can say you want EW and IW. So we can say, okay, I want this product with IW, and that should just give us the same thing, hopefully. Yes. But you can also say I want EW, and this gives zero; it gave me no results. Does someone know why I don't get results with EW? Aha, the tests are coming. Now, as I explained before, EW is mostly for polar regions, so there's nothing here, because we are working on land, we're working in Poznań, and Poznań is not in the polar regions. As far as I know, we're not in the Arctic, so that's why we don't get any results when we say EW. So we have IW, and that should just work. Good.
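As a sketch, the DataFrame conversion and the IW-only query could look like this (each result exposes its metadata as a `.properties` dict; the column selection is my own illustration):

```python
# Browse the results as a pandas DataFrame.
df = pd.DataFrame([p.properties for p in products])
df[["beamModeType", "flightDirection", "processingLevel", "sceneName", "url"]]

# Restrict the query to IW scenes (EW would return nothing over Poznań).
products_iw = asf.geo_search(
    platform=asf.PLATFORM.SENTINEL1,
    intersectsWith=footprint.geometry[0].wkt,
    processingLevel=asf.PRODUCT_TYPE.SLC,
    beamMode=asf.BEAMMODE.IW,
    start=date_start,
    end=date_end,
    maxResults=10,
)
```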
So as I said also before, we have this ascending and descending. So what does that mean? It's the flight direction, and in general here we're looking at how the satellite is taking the information. So the orbit is going like this, and then to the south and back. So sometimes it's
going downwards, and then it goes on the other side of the globe, and then it goes upwards, right? So that's what it means by ascending, so it's taking the information in an ascending form or in a descending form, okay? That's it, and the best way for me to understand it is to look at
this. Well, this didn't render so nicely, but ascending will take all of them like this, and then descending all of them like this, okay? So that's the main difference. So yeah. For example, if you want to generate the DEM, you should take the same movement? Exactly, exactly. So that's one of the main things. If you want to generate a DEM and you have one image that is like this and the other one like this, that's not good, because you're going to have weird artifacts. So you always want them to be in the same flight direction. That's why this is important, okay? And you also want them to be on the same orbit,
so these numbers that I showed you at the end of this should be similar as well. So when you're finding a pair, these should both match, and usually they do, but it's good to do a double check in case something is weird. We are thinking, with my group, about ways of merging ascending and descending, maybe not at this stage of doing these first parts of the interferograms, but maybe at a later stage. But that's another story; normally, you should work with both going together well. Yeah. So now we have 10 scenes that intersect with our space and time for Poznań, right? Now, as I told you, we want to find pairs for these scenes. We want to find another scene that can be paired with each of them, but we also want something that matches our requirements. As I told you, I want a perpendicular baseline between 150 and 300 meters, and I want to find the images that have the least time interval between them, okay? So for that, we can start working with the baseline search. Just for a demo, I will find all the pairs for one single scene that we got from the previous query, okay? So I'm going to take products, I'm going to take product zero, so the first one, this first image. And what this function, this query, does is: find me any scene in the Sentinel-1 archive on the Alaska Satellite Facility that has more or less an overlap; it's usually not even a complete overlap, it just has to be in the region so that it can be used as a pair, okay? So it will search the whole archive.
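In asf_search this baseline search is the stack method on a product (a minimal sketch, reusing the objects from above):

```python
# Build the baseline stack for the first scene from the previous query.
reference = products[0]
stack = reference.stack()   # every archived scene that can be paired with it
print(len(stack))           # hundreds of candidate pairs
```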
Here, I cannot restrict it by time. I cannot tell it to look only at 2020 or 2022; it's going to find everything, and it's going to do a computation of this perpendicular baseline on the fly. Why is this useful? Because if you've ever worked with Sentinel-1, the people who raised their hands before, to compute the perpendicular baseline you normally have to download both images, open what is called SNAP, the Sentinel Application Platform, then load them both and go through a whole process just to learn that the perpendicular baseline is 90, and you don't need 90, you need 150. That means you then need to delete the file because you don't need it, right? That's how it was working before, but now the Alaska Satellite Facility has this option to do this stack and perpendicular baseline on the fly. It's not exactly the same as when you compute it afterwards, but it gives you a pretty good estimation, and therefore saves you a lot of time and downloaded data, because each of these files is four gigabytes, so you don't want that. Trust me, you don't want that. Okay, so we're going to run this stack, and it's going to look through many, many, many things, so I should have run it while I was talking. That would have been smart. Okay, that's done. That was not so lengthy.
So, as you see here, we got 446 pairs for one single image. We have 446 scenes that are paired to this image, okay? That's a lot. So we're going to take a look at that again as a DataFrame, because that's just easier, and here you'll see at the end that we now have two extra columns: temporal baseline and perpendicular baseline. So that's basically the thing that was computed on the fly. We have the temporal baseline, this is in days, so here it's 2,892 days between the primary and secondary image, with a difference between them of seven meters. I now have my cheat here of sort_values, because I didn't know how to do that before. So now we can sort by the perpendicular baseline so that we can see a bit of what we're getting. This is the minimum perpendicular baseline, 184 negative, and then we have 191 positive for this specific image. The sign doesn't matter, because it's just basically which one is the primary and which is the secondary; if you then say, okay, now this one is the primary, the sign just flips. So you can take absolute values here, that's okay. Same with the temporal baseline: we're just interested in absolute values, we don't care so much about negative or positive, which one is first or which one is second. Okay. So now I know that I have 446 files, but I don't want them all; I want just the ones that fit my requirements. So what we can do is make a little plot of the stack to take a look at what we're getting.
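A sketch of those steps (the baseline column names follow the ASF stack metadata; the scatter plot is plain pandas/matplotlib):

```python
# Inspect the stack as a DataFrame, with the on-the-fly baselines.
stack_df = pd.DataFrame([s.properties for s in stack])

# Sign only encodes which scene is primary, so absolute values are fine.
stack_df.sort_values("perpendicularBaseline").head()

# Temporal vs. perpendicular baseline for all candidate pairs.
stack_df.plot.scatter(x="temporalBaseline", y="perpendicularBaseline")
plt.show()
```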
So we have these temporal baselines; we have a lot of scenes that are really old, as you see, like 3,000 days before the image that we want to find a pair for, and the perpendicular baselines are usually closer to zero, closer to 50. This is common for Sentinel-1; for most uses, that's more or less the perpendicular baseline that you need. But for this particular application, we want something higher, right? We want just the imagery that is here, or probably this one looks actually pretty good, but it's really close to zero. So we're basically going to look just at this very small area here, or this very small area here, to get what we want. So what I do is a pretty small search for what I want: a temporal baseline less than 30 days, and a perpendicular baseline higher than 115 and lower than 300. And I get one pair that fits the requirements that I need.
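That filter, as a sketch on the DataFrame (thresholds as in the narration; absolute values because the sign doesn't matter):

```python
# Keep only pairs that satisfy the DEM-generation requirements.
pairs = stack_df[
    (stack_df["temporalBaseline"].abs() < 30)
    & (stack_df["perpendicularBaseline"].abs() > 115)
    & (stack_df["perpendicularBaseline"].abs() < 300)
]
pairs  # one candidate pair in this case
```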
We have this here, and again, we have the temporal baseline: it's 24 days. That's the minimum I could get, but that's probably also because we don't have enough data for this area or something like that, so it's something we can test. So we can look at the properties of this image; that's basically what I was just showing you before, but in a less pretty way, and as you see, you have the URL. As we were saying, we have, where is that, the flight direction somewhere. I think it was ascending, but I just don't see it anymore. Ah, here, yeah. It's ascending, right? So now, remember that we are looking for a pair for product zero on my list from before, right? As you may see here, this one is also ascending, and we have a frame number and this orbit, 167; this is also 167. So we seem to be on the same page. This is going to work; we can use these two images to get our stuff, right? So this is a bit of the workflow
that I was automating. Now I just showed it to you with one image, but the advantage of using these query tools is that you can basically combine them in a for loop and start querying for every single pair that you want. You start doing the filtering, and you end up with a super nice list of image pairs. Maybe sometimes only 30 are going to match your expectations, but still, you can query for a whole year or the whole season that you want. It will take some time, but it will save you so much more time. So that's a really nice thing that I enjoyed finding out is possible, and at least my colleague who is the SAR expert is always super excited about it and telling me how much her work is simplified by this. So if you're looking for some sort of way of doing that, I will put a link, because I think I didn't put it, to this project where I did this. The code is not so nice because, as I said, I'm not a Python expert, so maybe there are better ways to write it, but at least it does the job, right? So you can also take a look at the script where you can do the querying of this data and just get anything you want. So that's pretty neat.
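The loop could look roughly like this; this is entirely my own illustration of the idea, not the project's code:

```python
# For every scene from the geographic search, look for a DEM-suitable pair.
dem_pairs = []
for product in products:
    stack_df = pd.DataFrame([s.properties for s in product.stack()])
    match = stack_df[
        (stack_df["temporalBaseline"].abs() < 30)
        & (stack_df["perpendicularBaseline"].abs().between(115, 300))
    ]
    if not match.empty:
        dem_pairs.append((product.properties["sceneName"],
                          match.iloc[0]["sceneName"]))
```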
Okay, and I added one section here on downloading the data. I'm not such a fan of downloading data that you don't need, but I anyway put it because in this case, you do need it when you have both.
So this is just an example of how you could do it. Because you have the property URL here for both, it's a zip file as you see. You can just say, okay, these are the URLs that I want from products, my first product zero and the stack 416, which was the one that won the lottery, right? Yeah,
that's the row ID. So then we just put them together, and then we just use this function, download_urls. We give it the URLs in a list format (that's why it's like this), we tell it where to store them, and we give it our credentials; in this case, Earthdata credentials are needed for the Alaska Satellite Facility services. You can also define how many parallel downloads you want to run.
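Roughly, the download step can look like this; the credentials are placeholders, and products and stack are assumed to come from the queries above:

```python
import asf_search as asf

# Placeholders: use your own Earthdata Login credentials here.
session = asf.ASFSession().auth_with_creds("my_username", "my_password")

# `products` and `stack` are assumed to come from the queries above.
urls = [
    products[0].properties["url"],  # the reference scene
    stack[416].properties["url"],   # the pair that won the lottery
]

# `processes` is the number of parallel downloads (each SLC is ~4 GB).
asf.download_urls(urls=urls, path="./data", session=session, processes=2)
```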
You can put five, whatever; it means how many parallel downloads you want to do. Because, as I said, this is four gigabytes per scene, so depending on your internet connection, you don't want to be downloading 30 scenes one at a time, or maybe you have to because you don't have a good connection, or you can do more. So processes just means how many you download at a time. Okay, so that's the first part: how to query and how to download the SLC data. What you can do then with this data,
normally you would go to the SNAP software. As I said before, and if someone was at the xcube session yesterday, this was briefly mentioned: it's ESA's software for doing analysis of Sentinel data, Sentinel-1 and Sentinel-2. But yeah, it's interesting. Normally, you can also interface
this with Python, even though it's GUI software. The reason why I didn't put a practical example here is that the existing Python bridge is based on Python 2, and yeah, that's not super; it has not been given much maintenance. So I decided to skip it, but they are working on a new SNAP version, version 10, with a new SNAP Python API. It's now going to be called esa-snappy, where before it was called only snappy, but another package had that name (Python naming can be a mess). So this is going to be the one. The reason why I didn't put it in is that it was supposed to be out in June, but I recently looked at the blog
that they wrote, not the blog, the forum that they wrote about. And is it here? Yeah. So initially they said June. So I was like, oh, June, that sounds cool. Then they said tentatively mid August. So I was like, okay, maybe it is on time for this. And now it's
two or four more weeks, and that was like two days ago. So, okay, this is a bit of an intro of how to do it; you can definitely automate a lot of stuff with this SNAP Python API once it's out. So if you are interested in that and then you run into questions or problems, feel free to just let me know via email or whatever, because I'm going to be doing the migration from the Python 2 version to Python 3 myself, so I'll also be trying to get a handle on it. So that was the whole SLC thing. Are there any questions so far?
Okay, good. So as you remember from the talk, we have the single look complex and we have the ground range detected products. As I told you, ground range detected is normally what's on repositories like Earth Engine, OpenEO,
you name it. And it's a little bit easier to work with. I don't know if you've seen it, but flood mapping with Sentinel-1 is one of the most common applications, and it can be done with GRD. It relies on the fact that water gives very low backscatter, so you can easily detect it with a simple threshold, let's say on the VV band, and then you can get that. So that's why it's a very popular application. And of course, the advantage of Sentinel-1 and of SAR in general is that it's weather independent. So it can penetrate clouds
when it's not raining like crazy, right? If it's raining like crazy, you're going to have a lot of artifacts and a lot of noises and everything. But if it's just cloudy, it will penetrate through the clouds. And of course, it's also light independent. So it can get data in the day and in the night. That doesn't matter because it is an active sensor. So it's
emitting a signal and then receiving the signal back, while Sentinel-2 or other optical sensors rely only on sunlight for their signal, okay? I guess I don't need to explain too much about this because you're all remote sensing experts. But yeah, just in case. Okay, so we have this here.
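As a minimal sketch of that simple-threshold flood mapping idea (the file path and the -18 dB cutoff are purely illustrative, not a recommendation):

```python
import numpy as np
import rioxarray

# Hypothetical local GRD/RTC GeoTIFF with VV backscatter in linear power units.
vv = rioxarray.open_rasterio("s1_vv_backscatter.tif", chunks=True).squeeze()

vv_db = 10 * np.log10(vv)  # decibel scale separates water more cleanly
water = vv_db < -18        # illustrative threshold; tune it per scene
```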
So for this one, we're going to do just a bit of an exploration. And as you see here, the title is Exploring RTC Data. So I will explain in a second what that is. But again, we'll need
these libraries, and let me run this just in case it breaks again. Okay, so yeah, we have the ground range detected product, which is already a bit more processed. As you see here, it's Sentinel-1 data that has been processed from the SLC data. So it has been
multi-looked and projected to ground range; that's what it says. Don't ask me the details of what exactly that involves, because I'm not exactly sure, but it's part of the processing, okay? So the SLC products that we just had before preserve the phase information, while the GRD products keep the amplitude and are multi-looked, so they don't have so much noise or speckle, okay? And an extra processing step that you can perform, in SNAP for example, is a radiometric terrain correction, yeah? So it's basically a terrain correction of this whole
thing. And some data providers have already computed this radiometric terrain correction for people to use. In particular, Microsoft Planetary Computer has a big catalog with RTC data all over the world, and you can explore that yourselves. I did not include it in this one because I did not want anyone to need credentials for something. If you have Microsoft Planetary Computer access, there are many good tutorials on how to get this, so you could, again, do it for Poznań. I think Earth Engine only has GRD data, so not RTC.
So yeah, you could also do the radiometric terrain correction. If I'm not mistaken, there's a way to do it with Earth Engine, but that just needs an extra step. But planetary computer has it. So is everyone familiar with planetary computer? Okay, it's a bit like Google Earth Engine,
but Microsoft was like, we also want to do that. So they started this whole thing, and it's basically the same: they have a bunch of datasets, you can apply for an account, you don't need to pay, but you will get credentials and so on, and then you can get the data. They have a good Python API for doing queries and such, which works quite well. I've not used it that much, but I think it's even a little bit easier to work with than Earth Engine, though don't quote me on this because I'm not completely certain. But it's a bit of an Earth Engine from Microsoft instead of Google.
Yeah, I think you're less locked in, because they build a lot more open source tools, so you're not tied to the Google Earth Engine API; on the Planetary Computer you can do whatever you want with this stuff. Exactly. So everybody got that? Less locked in, more open. You're not locked into the API that they built; you can just install anything.
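For reference, connecting to it with open tools can look roughly like this; it uses the planetary-computer and pystac-client packages, and STAC itself comes up again in part two:

```python
import planetary_computer
import pystac_client

# The Planetary Computer exposes a STAC API (more on STAC in part two);
# the modifier signs item URLs on the fly so you can read them directly.
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)
print([c.id for c in catalog.get_collections()][:5])
```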
So it's probably worth checking out. I mean, it's still proprietary, but you can check it out. Cool. So, because I don't want you to get credentials for anything, I found (and it took me a while) this Sentinel-1 RTC data set for the contiguous United States. They have some
data available. It's freely accessible, and yeah, you can do a bit of analysis. So here I'm going to show you how to query this data from Amazon Web Services directly through the CLI (the tool is called the AWS CLI), and we can explore what data is available.
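If you prefer to stay in Python, a boto3 equivalent of that unsigned listing can look roughly like this; the bucket and prefix reflect the Sentinel-1 RTC CONUS open dataset, but treat the exact layout as illustrative and browse it yourself first:

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# An anonymous S3 client: the boto3 equivalent of --no-sign-request.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

resp = s3.list_objects_v2(
    Bucket="sentinel-s1-rtc-indigo",
    Prefix="tiles/RTC/1/IW/",
    Delimiter="/",
)
for prefix in resp.get("CommonPrefixes", []):
    print(prefix["Prefix"])
```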
So this is a bucket. So Amazon Web Services have buckets where you can put data in. They can be restricted so that you will need to give credentials, or they can be not restricted, so you don't need to sign in. But when you're querying something that is not restricted,
you have to tell the query, I don't need to sign in, because otherwise it will insist that you do; you have to say, no, I do not, and that's what this line does. So we're just going to take a look at this bucket to see what's there. So yeah, there are many folders, but as you see, we only have IW data, so we're going to look at that. And then we have
a bunch of other folders, so just for fun, take a look one by one. Then we have other things; this is a bit like how everything is organized into paths, so you can see how it's structured.
So here we go. We'll get there at some point. And then we have different years, so let's keep with the first one. And then you have, for example, for this specific one, two scenes, Sentinel-1A and Sentinel-1B.
The name is pretty self-explanatory, right? You have Sentinel-1A, acquired on the 29th of July, 2016, then a bit of the path information, and this is an ascending pass, so like this, and then we can see what is in there.
And then you have this. So you have this information, this gamma naught VV and this local incidence angle, as TIFFs that you can directly read. So I will come in a second to what this is, but if you want to download it, for example, like if you directly want to download this,
this is another example, not the same one that I did above, but you can say, okay, Amazon Web Services, from this scene here, I want to download to this location, and include the no-sign request. Don't run this if you're working locally and you don't want megabytes of data coming in. I hope no one did. I mean, you will just get a nice picture, and then you can delete it. So the available files that you saw have this prefix gamma naught, and that's the result of the RTC algorithm; that's what you get out of it. And there's
a whole thing of backscatter types and what exactly this means. Because I'm not an expert, I'm not going to get completely into it. But if you're interested, please do. What you mainly need to know is that that's just the terminology. There's other type of terminology out there,
sigma naught, beta naught, all the letters. But for now, we're working with gamma naught. And the data has this suffix, in this case only VV. So I chose a poor example, but you also normally have VH. So what does that mean? That is the polarization of the data, so how the data is obtained. So I
found this nice figure where you see the way that it works. So this is the satellite, and this is the terrain. And this is basically how the wave is sent and received back. So when you have VV, it means that the wave is sent on a vertical way, and it comes back on a vertical way back to the
sensor. If you have VH, it means that it was sent in a vertical way, and it comes back in a horizontal way, so like this. If you have HH, everything is horizontal. And if you have HV, it means that it was sent horizontal, and it returns vertical. This is a bit to find subtle
differences in the information that you get. Sometimes it can help. It's a bit like having a red band and a green band, so you get a slightly different spectrum, let's say, with these different ways of receiving the data. Normally for this IW mode, and for data that is on land, you have the
VV and VH polarizations. HV and HH are for this EW thing, and here you get a nice graph of what that looks like. So all these at the top are HH, and then all the rest are either VV or VH.
So normally you will not, if you're not working on the Arctic or on poles, you will not get this HH. There must be a reason why HH is better on the polar regions. I do not know the answer to that, but for sure there is a reason why this is collected like this. But I put some links,
if you're interested: three links, you can read here, here, or here, and you can get a bit more information about that. All right. So we can now explore the data. I showed you how to look at the data on the Amazon Web Services bucket; you can take a look at that. You can,
as you saw, have fun looking through the directories, because that was so much fun. And then you can download it. But we can also directly open it in Python without the need to download it, and for that I'm going to use rioxarray. So you can directly give it the location of the data, and it will just retrieve it. For that, again, we have to tell Amazon Web Services that we don't need to sign in, and this time we'll do it in a different way: we'll set it as an environment variable that says, hey, I don't need to sign in, so don't bother me with this. And we have these two URLs. So this is from another area.
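In code form, that lazy opening can look roughly like this; the URLs are placeholders for the real gamma naught COGs:

```python
import os
import rioxarray

# Environment-variable route to skip request signing (GDAL picks this up).
os.environ["AWS_NO_SIGN_REQUEST"] = "YES"

# Placeholder URLs; the real ones point at the gamma naught VV/VH COGs.
vv_url = "s3://sentinel-s1-rtc-indigo/tiles/RTC/1/IW/.../Gamma0_VV.tif"
vh_url = "s3://sentinel-s1-rtc-indigo/tiles/RTC/1/IW/.../Gamma0_VH.tif"

# chunks=True wires the arrays to Dask: nothing is fetched until needed.
s1_vv = rioxarray.open_rasterio(vv_url, chunks=True)
s1_vh = rioxarray.open_rasterio(vh_url, chunks=True)
```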
And then, as I told you, you have the gamma naught VV and the VH; we're going to read both of them. And then, with open_rasterio, we give it the VV URL. And this chunks parameter, what it does is leverage the integration of xarray and Dask, in the sense that you can lazily read the data. That means that you're not really getting anything into memory or downloading anything; you're just requesting the actual data when you need it, and for the rest, you're just lining up operations before you
actually have to say, OK, now I really need everything from here. So that makes it a little bit faster. So this is how it looks. If you're not familiar with rioxarray, it's basically a way of working with raster data directly with xarray, so we're basically working with xarray. People familiar with xarray? I only recently got to know xarray, so I hope you're more familiar with it than me. But yeah, it looks cool, because the printing method is nice, as you see. So we have this little cube; this is quite useful for working with, for example, data cubes, where you have a bunch of information. I'll show
you some examples of this again tomorrow. But it's also helpful for doing all this chunking and Dask stuff. So this is what it looks like: this scene is 114 megabytes, the array has this shape, we have some information here, like the band, and we have the attributes.
And these are the attributes of the Sentinel-1 itself. So we have the orbit. We have the dates. It's the Sentinel-1B mission in this case. So the one that does not work anymore, that's the one that we have. We have ascending orbit and a bunch of other metadata
around, right? So that looks good. This is the tile ID; remember in the path we had 10, and then U, and then CU, that's the tile ID. And the name of the thing is the gamma naught VV polarization. OK, so let's take a look at the data. And here, what I'm doing is
selecting just a slice, because in this case I'm really going to pull the data from the server, right? So I have this S1VV that I had already opened with rioxarray, and what I'm doing here is a pre-selection of an area, because I don't want to read the whole thing, otherwise we're going to be here until tomorrow. I was just looking around, and this window looks nice. And this last part is this compute; this is from Dask. And what it does is basically
read the data you're calling temporarily into memory so that you can visualize it quickly, or if you need to do any computation, it's just there, so you don't need to wait so long. When you have a lot of data, this compute will take long, but now we just took a little part
and then we can get to know it. And then we can plot it. So this is what it looks like. And it's beautiful, isn't it? It's a bit dark. What we have here is power data, an amplitude kind of thing, and it's not so visual. So what we can do for better visualization is convert it to decibel scale. That's just a logarithmic scale that allows for easier visualization, but you should not do any computations on these decibels, because the data gets distorted; this is really only for visualization. So there's a little function that
lets you do that. And then we can see it much nicer. So here, it's a bit more evident. This is an area in Alaska. And this is the ocean. And then you have here land. So that's clearer to see
than in this one. This is just very dark. But with this decibel representation, you can see it a little bit better. So that's a bit how you can get this information. Then you can do a bunch of
other things. I've added more resources: I took a lot of this from an RTC processing workflow with xarray by Emma Marshall, where she's doing a comparison of data from the Planetary Computer with data from somewhere else. She has a lot of exercises and stuff.
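That little conversion function can be as simple as this sketch, where subset is assumed to be the computed slice from above:

```python
import numpy as np

def power2db(x):
    # Decibels are for display only: the log transform distorts the values,
    # so do any real computation on the original power data.
    return 10 * np.log10(x)

# `subset` is the computed slice from above.
power2db(subset.squeeze()).plot(cmap="Greys_r", robust=True)
```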
But this is a bit of what you can do: you can already just visualize it, and then you could also, for example, apply a simple threshold and find water, or different kinds of things. So this is something that you can do already now if you want. I have a first exercise for you, if you're following along and you're happy to do something. Happy? Yes? You're all so eager to go to the city game, right? Why is this not ending? Okay.
So I just wanted to challenge you: if you know xarray, this is probably like a two-second thing; if you don't, you will need to Google a bit how to do it, but it's not that hard. Combine the Sentinel-1 VV polarization and Sentinel-1 VH polarization objects that we have, put them together, and compute a new band which is the result of dividing VH by VV. Why? Because having these three together is a very popular way of creating an RGB composite of Sentinel-1 data that allows you to see a little bit of the changes between the VH and VV polarizations. And when it works well (I'm not sure how well it will work with this one), it lets you see some other patterns in the data. So if you've been following along, this should be easy.
You can work on the subsets because that will be faster. Yeah. And I will also have this little code ready when you're ready so that we can also see how it works. So shall we say 10 minutes for this to work? Sounds good. And if someone has any questions, please let me know.
So, back to the solution: here I give names to the bands, because otherwise it just says band one, two, or three, or something like this, which is a little bit non-intuitive. So this is used to give names to the bands, and I give this dimension the name band. You'll see there is one single argument that says on which dimension you want to join this data: I want to join it on a new dimension called band, and the data that I'm giving it is VH, VV, and the VH/VV ratio. Okay. So we run this,
and then you get an xarray with three bands and, yeah, the other variables that you had. And you will see here that you have the bands vh, vv, and vh/vv, right? So that's a way to combine it, I hope a fairly good one. And then I computed a subset of this whole thing, again with compute, to actually get something that doesn't take forever to load. Let's do this here. So this is the subset, the Sentinel-1 subset. And then I use my function,
this power2db, to convert to decibels, because otherwise it doesn't look so nice, right? And I do this band selection; I figured out that if I don't do it, the order the bands come in is a little bit off, so that's why I did this, but maybe there are better ways to do it, I'm not exactly certain. Okay, so if you were going to do an RGB representation of a Sentinel-2 image, you would have here red, green, and blue, because that's the order that you want to
give it; if you use the original order, you probably have it as blue, green, and red, and then it will look funny. But I'm not so used to these RGB representations, so I'm not exactly certain how the colors should look, but I think it should be fine. So this is computing, and it may take some time, because as you will see here, we no longer have 100 megabytes but three times that amount of data; I'm basically fetching all of it to be able to do this plot, so that's why the compute takes a lot longer. There we go, and then I have my
plot. I know it doesn't look amazing; I was really hoping to see some cool patterns and so on, but at least you have quite a nice representation of the water differences, and here are some differences in the area. I'm sure a SAR expert would be able to tell me, you're doing this completely wrong, what is this, or would be able to see really cool changes. But as you already see, because this is radar, you see quite some topographic kinds of features in these areas. Yeah, so that's my approach to it.
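For reference, a hedged sketch of that solution, reusing the lazily opened arrays and the power2db helper from earlier; the variable names are illustrative:

```python
import xarray as xr

# Assumes the lazily opened s1_vv / s1_vh arrays and the power2db helper
# from earlier; variable names are illustrative.
vv = s1_vv.squeeze()
vh = s1_vh.squeeze()

# Join VH, VV and the VH/VV ratio along a new dimension called "band".
rgb = xr.concat([vh, vv, vh / vv], dim="band")
rgb = rgb.assign_coords(band=["vh", "vv", "vh/vv"])

# Compute only a window so we don't fetch the full scenes.
sub = rgb.isel(x=slice(0, 2000), y=slice(0, 2000)).compute()
power2db(sub).plot.imshow(rgb="band", robust=True)
```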
We'll include this notebook with the solution of this exercise on the GitHub, so if you need to go back to it, you can, with everything already run, so you can just look at it without running it yourself. Yeah, so I think, yes? Can you put the result of the first exercise on the Mattermost? Yeah, sure, I will copy this code as well. Any other question? Yes? Yeah, it's maybe a naive question, but I was wondering, what's the difference between using Sentinel data and Landsat data? Well, this applies more to optical for sure, because
Landsat data is only optical, so you don't have radar data from Landsat; Landsat only works with optical multispectral information. The main difference between Sentinel-2, in this case, and any of the Landsat missions is the resolution. So Sentinel-2 is at 10 meter
resolution where Landsat is 30 meter. So Sentinel-2 goes from 10 meters, 20 meters, and 60 meters depending on which band and Landsat goes 30 meters and 100 meters depending on which band.
Sentinel-2 mainly has multispectral and Landsat also has thermal, so that's one extra let's say advantage. Of course, the archive information for Landsat goes way back, so you have data from 1984, while for Sentinel-2 you have data only from 2015 because it was just launched
by ESA. So yeah, those are the basic differences. Of course, you can use both because they're both optical; there are just minimal differences between the multispectral channels. So red in Sentinel-2 is not exactly the same as red in Landsat because the wavelength range is slightly shifted. But there is now a product called Harmonized Landsat Sentinel-2 (HLS) or something like this, where you basically have a data set with Sentinel data filling the gaps where Landsat doesn't have information; I don't know, maybe clouds or something like
this, filled with Sentinel data, everything at a 30 meter resolution, because that's the coarser resolution of the two. So you can also work seamlessly with both of them. But that's something for tomorrow; we will talk about optical as well. I will just show Sentinel-2, but everything that I show for Sentinel-2 you can also do
with Landsat. So if you're still interested, you can come tomorrow. Otherwise, you can check the recording. Yeah, sure. Any other questions? Okay, so I think with that, we can wrap it up. And thank you for all the attention and help and patience.
Hello, everyone. Again, I'm Lorena Abad, PhD researcher at the Department of Geoinformatics at the University of Salzburg. I'm going to present tools and packages to query and process Sentinel-1 and Sentinel-2 data with R and Python. This is part two, and this is my contribution to the OpenGeoHub 2023 summer school. So as I said, this is part two, and this is just a recap of the plan for the sessions. We're in part two, and today we're going to
mainly focus on Sentinel-2 data with R and Python. So let's get to that. And from yesterday, we saw Sentinel-1, today we will check Sentinel-2 and we will work with R and Python. Wish me luck. I'm going to try to run two Docker containers at the same time with the Zoom call. So let's hope that everything goes well. Oh, and thank you for the people who are here.
You're so hardcore after the nice evening we had. So it's really cool to see you all. Good. Let's just start with some quick Sentinel-2, quick facts. Yesterday, there was already a question on what's the difference between Sentinel-2 and Landsat. So I started
explaining a bit about all this stuff already, but just as a recap: Sentinel-2 is high spatial resolution optical imagery, so you have 10 meters, 20 meters, 60 meters, systematically acquired over land and coastal waters. We have two satellites in orbit right now, Sentinel-2A and Sentinel-2B, and there will be a third one, Sentinel-2C, in 2024, which just gives us way more data. So we can also get data roughly every five days, which is great. Sentinel-2A was launched in 2015 and Sentinel-2B in 2017,
more or less. It's multispectral data, so completely different from what we were looking at yesterday, but probably something that you're more familiar with from remote sensing; if anything, you'd usually start with this. It has 13 bands, in the visible, near infrared, and shortwave infrared,
and yeah, it has no thermal bands, but there are some people trying to also emulate some thermal stuff, but that's a bit more advanced. Yeah, so that's basically it from the quick intro. So now we're going to just get to it. Yeah, so today is going to be a bit
back and forth. I don't know if you managed to run both Docker containers at the same time. What I do is basically have, I open two PowerShells in my case, because I'm using Windows, and in one, I run one container, and then on the other one, I run the other container. It's just that they both
keep running; it's not like you can type anything there afterwards. And then you'll have these two: the JupyterLab, as I showed yesterday, where you just copy the link at the top and you get here; and to open the RStudio server, you just type localhost:8786, and you should have your RStudio open. If you're not working with this Docker container but just with your own RStudio, then you'll probably have fewer problems than me. So that's good. Yesterday I didn't go through it, but as for how you find these notebooks, I think people are smart and they found it out,
but this is the main directory that is mounted on the Docker, right? So that's why you have to put this, you have to navigate to the place where you downloaded all this stuff. And then that is inside this Docker container, so you get all this information. So you have this data, Docker files, which is where you can do the stuff, some figures, and then you have the notebooks.
So in the notebooks, you have two directories, Jupyter and Quarto. This is Jupyter, and this is the Sentinel-2 notebook. And then for RStudio, you have the same setting. So you have notebooks, you go to Quarto, and then this is the Sentinel-2
Quarto document, which is also more or less a notebook. So I hope I don't confuse you, and if it gets confusing, let me know. But I will start with Python and then show you how to do the same in R, okay? So we're going to do a bit of a comparison, okay? I can do this with Python,
how does it work in R, and back and forth, okay? Good. Let's see if that works. Okay. And this should be good. I hope this is good and visible. All right, so I'm jumping right into
accessing data. So probably if you have ever worked with Sentinel-2 or any other type of optical data, have you? Can I get a raise of hands if you work with optical data? Yeah, sure,
amazing. I don't know if, when you first started, you had to download all the data, because that was a common thing, right? I remember searching for so much data on the USGS portal and then getting so much information onto my laptop. That's quite intense, let's say.
But nowadays, in modern times, we can fetch the data in a whole new way, right? So I'm going to talk to you today about STAC APIs. First, I just included all the libraries that we need; there's always a hint of what each library does, but let me know if something's not clear, and it will also become clearer as we go further. But I'm going to start with this STAC thing. STAC stands for SpatioTemporal Asset Catalog. It's a common language to describe geospatial information, so you can work with it more easily. It's indexed, and it's easily discoverable,
okay? So basically, imagine this: you have these repositories of data, like Sentinel or something like that, or the Copernicus data services or whatever. What they did is just agree on a common metadata standard that helps you find things more easily, right? So then you have
this huge dictionary of, okay, this is the data you have here, and this is the URL where you can find it, right? But it's all in one single thing so that you can basically search. It's a search catalog, okay? So yeah, I don't know an analogy, but it's a search catalog. So it looks like this,
and you will have here all the different repositories where you can get the data. So you have CREODIAS, if you're familiar with it. Digital Earth Africa also has something. If people remember Edzer's talk, OpenEO also has stuff. Yeah, for example, they have this
connection of the Google Earth Engine OpenEO thing. So with OpenEO, you can get Google Earth Engine data. So that's one. You have the Microsoft Planetary Computer that I was talking about yesterday. Yeah, you name it, you have a lot. Here's the OpenEO platform. Some of them are
public and some of them are protected, and you will see it here. Some of them are a catalog, some of them are an API. An API usually means you can query and also get the data; a catalog means you can more or less just check what is there, but getting the data is a little bit different, so you will need credentials to access it and so on, or another way to access it. But it is there on STAC so that you know which data is available where. So today we're going to work with Earth Search, which is a STAC API of public
data sets on Amazon Web Services, meaning that you don't need to get any credentials to get this. So for Sentinel-2, in this case, you can just access the data without the need to have an account. For example, on Sentinel Hub, you usually need to have some credits or something like this
to fetch the data, so that's not super nice; it's just nicer if you can do it without. And these are the collections that you get in this catalog. So let's look at that over here. As I said, we're going to use the Earth Search catalog, and for this whole thing in Python,
we're going to work with pystac-client, which is basically a client to access STAC catalogs with Python. Very self-explanatory. So we first have to get this STAC API URL, and where did I get that from? It's from here: on the STAC catalog page, in the second part, you have the URL that you're going to query. So I copy that here, and we start fetching data. We are going to create what they call a client, and we tell the client to open the URL for us. So Client.open from pystac-client, and then you have the API URL.
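In code, that is roughly the following; the endpoint is the one copied from the catalog page:

```python
from pystac_client import Client

# Earth Search STAC API endpoint, copied from the catalog page.
api_url = "https://earth-search.aws.element84.com/v1"
client = Client.open(api_url)

# Print the collections available behind this endpoint.
for collection in client.get_collections():
    print(collection.id)
```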
And you already have this client that is going to give you the connection to all the data that is right there, right? So let's run this. And then the next line, what it's asking is to print all the collections. So here you will see the collections that are available on this specific
endpoint. So you have the Copernicus DEM, NAIP, Sentinel-2 Level-2A, Level-1C, Landsat; then we have the Copernicus DEM at 90 meters resolution and Sentinel-1 GRD. As I was saying yesterday, you sometimes have this as well; I think for the Sentinel-1 data you do have to have credentials, but for all the others it should be fine. And as you will see, this is exactly what you have here. So I'm basically asking STAC, what is there? It's in collections, so I call get_collections, and then I get all the collections that are in this particular STAC
repo. So now we're going to focus on Sentinel-2 Level-2A. And if you don't know what Level-2A is, we're going to take a look. So these are the different levels. Is it visible? Yeah, that's more visible, right? So you have Level-0, again not something that you want to work with directly. Level-0 consolidated is just stuff that the people at ESA are working with, doing all this pre-processing for us. In Level-1A, some
radiometric corrections happen. In Level-1B, there's some resampling and conversion to reflectance, so you don't have digital numbers but reflectances. And at Level-1C, sorry, there's no atmospheric correction yet: you get TOA reflectance at this point, that is, top-of-atmosphere reflectance. That was one of the most common ways of working with Sentinel-2 until a few years ago, because we didn't have the next one. But now more and more catalogs allow you to also fetch Level-2A, which is surface reflectance. So basically, from the TOA data,
you're going to do an extra processing step where you're going to get directly surface reflectance values. That means without any atmospheric influence. Probably if you have been working with optical data for some time, you must have done this step locally. If you do
it with QGIS, there is this plug-in that lets you do this. So that's basically what they now do for us so we don't have to do an extra thing. So more towards this analysis-ready data era that we're living in. Yeah. So that's Level 2A. And you can look at all the other levels if you want
to take a look at that. So then we'll say, OK, this is the collection that I'm going to get, this Sentinel 2 Level 2A. And from yesterday, remember we had this definition of where we want to get our data from. We do exactly the same here. We're going to give this spatial and temporal
extent. So we have the coordinates and we have a date range, and this is the way to define the date range: in one single string, you have the start date and the end date, and you join them with a forward slash. And for the longitude and latitude, now it's a point in Poznań, because we're in Poznań. And we
put that into a point object, which is a Shapely point. Shapely is in charge of all the very low-level geometries in Python. In R, sf has way more capabilities for working directly with the points, and then you can treat them as a data frame and everything; Shapely here is more just in charge of the geometry part. So we have this point, and we can then pass all these arguments to our search. Very similar to yesterday: we have our client, and we have the search command.
And then we're going to say, OK, I want you to find this collection. So the collection we defined up here. We can also ask for more collections. That's why you can pass it as a list. So if you would like to fetch not only Sentinel 2 but also Landsat, you can do that. You can do it in one single query. We're going to tell it that it has to intersect with this point.
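Put together, the search can look roughly like this; the point and dates are illustrative placeholders:

```python
from shapely.geometry import Point

# Illustrative extent: a point in Poznań and a slash-joined date range.
point = Point(16.93, 52.41)           # longitude, latitude
date_range = "2023-01-01/2023-09-01"  # placeholder dates

search = client.search(
    collections=["sentinel-2-l2a"],
    intersects=point,
    datetime=date_range,
)
items = search.item_collection()
print(len(items))  # e.g. 61 items
```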
So this point has to be the Shapely point, and we have the datetime already defined. So we run our search, and then we can see how many items are in my collection: I found 61 items. What you can also do is some extra queries based on the data that you have
available on the items. So for example, again, it's not nice to look at the items just like that. So I created a Geopandas data frame so that we can read it easier. Yesterday I did a pandas or maybe it was a Geopandas. But you have the geometry right here. So the polygon of what is covered.
And we have the platform, the constellation, and what instrument it is; this is the MultiSpectral Instrument. And then you have some useful stuff: for example, you have the cloud cover, and that's something that you can use to query, right? Something that has
less than 10% clouds or something like this. You have the projection the data is in, the UTM zone, blah blah, a couple of things, and the granule ID. Yeah, you name it, you have a lot of information from this one. So what we can do, if we go back to our search, is use the query part to say, okay, I want this eo:cloud_cover to be lower than 10. And it has to have the same name as here, and this is here; that's why it's eo, colon, cloud underscore cover. And then, as you see, these are percentages, so we can directly just say,
I want this to be below 10. And then we'll get a somewhat more filtered collection of our data. So remember, before we had 61; now we have eight items that correspond to our query. Okay, so that's way less. So we can take a look at them.
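The refined search is roughly this, reusing the client, point, and date range from above:

```python
# Same search, refined with a property filter on the scene metadata;
# the field name must match the STAC property, hence "eo:cloud_cover".
search = client.search(
    collections=["sentinel-2-l2a"],
    intersects=point,
    datetime=date_range,
    query=["eo:cloud_cover<10"],
)
items = search.item_collection()
print(len(items))  # e.g. 8 items instead of 61
```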
Again, we have quite a few, and here you see which satellite each one is, 2A or 2B, and so on. So yeah, we can get this. And what we can do, for example, is just take a look at
the cloud cover over time. So this is how it looks. Since we queried with the filter, we can also do it without the query, so that we can see a bit more of the cloud cover itself.
What I'm doing here is just plotting the date against the cloud cover, and you will see that indeed, for our area in Poznań, we had quite some days with cloud cover. That's usual; if you've been working with this, you know that this is always the case, but we have some scenes that are salvageable. Be a little bit cautious with this eo:cloud_cover from Sentinel-2 that is in the metadata: for some time it's been noticed that it's not really
accurate, especially in the tropics. Sometimes it's stated that it's a lot of cloud cover, but actually there's not so much cloud cover. So there's something on the atmosphere that is being detected by the algorithm as cloud cover. And for this reason, there's a lot of information
that has been lost. People just query things; like, I'm from Ecuador, and I've always been querying stuff there, and I was losing a bunch of scenes that I didn't know were useful because this was a little bit flawed. They're definitely working on this, and I think they already have it corrected. But if you have older scenes, this might be a problem. There's an interesting paper, which I will put on the Mattermost for people to take a look at, and I'll also put it here, that looks at this bias, at what happened with all this loss of information and how much more we could have done with it. I'll definitely put the
paper on. So be a little bit careful with this; for now we are going to trust the data that they give us, and we're going to query it like this. So let's go back to our query, with cloud cover under 10. And then, if we do our plot again (I already showed this), we have a not so comprehensive time series, but at least we don't have so many clouds. Of course, you can work with this and filter it further and further. So now we can look at the properties of one single item, and I'm going to select one that has a low cloud cover. So for that,
what I did was just, again, some filtering on the data frame. So I'm asking for a cloud cover under two. And there's also another property of these items called nodata pixel percentage; sometimes a scene is at the edge of the acquisition and you have this whole strip with a bunch of nodata. That's not so useful, and you maybe don't want to pull all that in, so you can also filter by that. And we get three that correspond to our query. So you'll see here that the cloud cover is very low, and then this nodata should be
somewhere here. Otherwise, I cannot process it. But OK, let's trust that it's doing it good. Good. So we can take a look at the first one that comes from our result. So this is the item,
and this geometry is basically what is covered; the scene is covering this square, let's say. And we can fetch all these properties, right, just with this dot notation. So you have the datetime, which exact date this scene corresponds to. And we can get all
the other properties. So we have when it was created, which constellation platform, cloud cover, tiles. You also have now with level 2a some more processing on how much water there is, what is the vegetation percentage, what is the cloud shadow percentage.
So if you're really looking into these kinds of things, they can also be something good to query on, because then you'll get slightly filtered information. But again, always check that what you're getting is really what you want. Because you can say, yeah, OK, I want very low cloud cover, but your study area is very small in comparison to the whole scene, and it could be that there were no clouds over that very small area while the rest of the scene was cloudy, right? So then you're losing that scene just because clouds were somewhere else in it, right? So there
are also ways to get a per-pixel kind of representation of the clouds, and I will show you later. That's also what you can do with Earth Engine, right? You can ask per pixel, so not for the whole scene but pixel by pixel, where do I not have any clouds? That's also what my colleagues do with the Sen2Cube implementation, where they have this semantic layer where you can really say per pixel, OK, here the probability of cloud is high, so I don't use this pixel, but here it's low; so the querying gets a little bit more localized to your real study area. So that's also something you can do, and I'll show it in the R part of this. OK, and we can also take a look at the
bands that are available; we can do that by accessing the assets and the keys of the assets. If you don't ask for the keys, I think you get this whole thing, each of the bands; so I'm just asking this dictionary to give me the names, because otherwise I just get all this stuff. So these are all the bands that are available, right? You have blue, coastal, green, but you also have things like a thumbnail, and you have a visual representation. In this case, we also have this JP2 format of getting the data; as you see, this is just basically repeated,
but this is in JP2 format. So it's not TIFF, but JP2, that's it. But you can fetch all of this. So we can take a look, for example, at the thumbnail to see what we are dealing with. So for that, we go to assets, we ask for the thumbnail, which is here, and then we get
the URL that was shown before. So that's how we're going to get it, and then I'm using this IPython Image just to display the URL. So this is the thumbnail: the scene with very low cloud cover and not so many nodata values. That's what I'm looking at. So you can have a
quick overview. As you see, this is running live. So this was fast. And then you see that you can easily fetch this and take a look at what scene you're working with without the need to download it or anything that's working great. We can also take a look at the single bands. This might take
a little bit longer, I think, but you can also do that. So let me get this running, because this one takes a while. For that, you can again call the red band, and you can also see what extra fields we have. So we have, for example, here for this particular band of this particular
image, you have all the descriptions, but you also have the projection, the spatial resolution, and a bunch of other things. So again, as yesterday, if you remember, we can directly read this data in with rioxarray without the need to download anything; we can just use this URL and open it. For that, we use rioxarray and open_rasterio. We give it the URL, and this decode coordinates argument is there so that nothing goes crazy; I think I had an error when I didn't have it, that's why I added it, though I can't completely remember why, but use it. So that's
why it's there. And now we can also plot the data. Again, I'm using this isel, so I'm slicing it so that I don't get the whole scene now, because when I plot, I'm really fetching the data; as yesterday, I'm really asking, okay, please bring the data in, because I want to see how it looks. So this is going to take a little bit longer, and that's why we have this selection. So here you'll see the red band. Okay, not in super red colors (this is viridis, one of the famous palettes), but still, this is the red band of my little collection. You'll see this is Poznań; people who are from Poznań will probably have an idea of where exactly this is. And here you have the surface reflectance values of your area for the red band. So we can also do an
RGB representation. And remember we had this visual thing at the top. So that's also something that we can take a look at. Again, if you see here, we have this visual. That means that it's
just combining the red, green, and blue together, so that we can directly get an RGB. We don't need to call each of the bands; we can just directly fetch it. And again, I do a selection, and here you have it: this is how the RGB representation of that specific part looks.
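Pulled together, that exploration can look roughly like this, assuming items from the filtered search and the Earth Search asset names:

```python
import rioxarray
from IPython.display import Image

item = items[0]                   # one low-cloud item from the search
print(list(item.assets.keys()))   # band names plus thumbnail, visual, etc.

# Quick look without downloading the scene.
Image(url=item.assets["thumbnail"].href)

# Lazily open a single band straight from its URL and plot a slice.
red = rioxarray.open_rasterio(item.assets["red"].href, chunks=True)
red.isel(x=slice(0, 2000), y=slice(0, 2000)).squeeze().plot(cmap="viridis")
```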
So this is an interesting way to easily check the data you have, make selections, and so on, just working with the STAC API. So at this point, I am going to show you how to do all this with the R implementation as well. So I hope you remember every single detail that I talked about, because I might ask a couple of questions about what the differences are. So we go to our RStudio.
What I like to do to make it easier for me is have the console here and the environment over here, because it's easier for me to work that way, but don't get confused. And to make it even less confusing, I'm just going to open this completely outside, so then it looks like a Jupyter notebook. So don't get confused: this doesn't look like R, but it is; I'm just opening a Quarto document in its own window. So again, we have all the libraries in R. You'll see some different libraries here, again with explanations of what each of them does, and you will see some of them have their counterparts in Python. So we'll get to that
as we go through. So in this part, we're going to go and define an extent one more time. To define an extent, again, we need an AOI and we need a time period. So for the people who also wanted to go to the OpenStreetMap session but thought, maybe we go here: I'm going to show one function that probably featured in the OpenStreetMap session. We're going to fetch data with osmdata, which is a package to just fetch information out of OpenStreetMap. And the last time I tested it, which was like a week ago, it was not working,
but I think it's working now. And so what you have here is a very nice function that I like, where you can get the bounding box of a city or an area in the world. Mainly I use it for cities, because that's way easier. So this getbb is from osmdata, and then you tell it, I want Poznań in Poland. I think I had to add Poland because there was another Poznań or something; with the geocoding, there were some overlaps and it was a little bit weird. You can tell it to give you the output format in sf. So sf is... oh wait, I forgot to ask,
how many people here are more into Python? Can you raise your hands? OK. And more people into R? Yeah, that's really a majority. OK, so I know you'll follow along, but sf is the counterpart of GeoPandas, let's say; though sf was there first, I guess, or at least I would like to think so. So OK, you have this, and you can get it out as an sf polygon. And here I added limit equals one, because for some reason I was getting a duplicated geometry; this just says, give me the first result that you fetch. But of course, it may not always be correct, so we can make sure that the
AOI is correct. And for that I will use tmap. tmap is a really great package for visualization in R of anything that is raster or vector. You can use it for static maps or for interactive maps. To make it interactive, you can say tmap_mode view, and then you'll get an interactive map right there. And then you have this quick tmap function, qtm, where you can just pass your data; you get a fill, I put an alpha, and you should be able to get a really nice
plot of your area. So this is Poznań, and this is the bounding box that was fetched with osmdata. OK, so that was the quick OpenStreetMap interlude; you didn't miss everything. Maybe osmdata is something that you might also want to look at as an
option to fetch some OpenStreetMap data. OK, so we have now our AOI, and we have our time extent. So here I added it as a vector of two dates, so the start date and
the end date, and we have that in our time extent object. And what I need to pass to the search query is a bounding box. So I have here my AOI, and I'm going to compute the bounding box with sf. I call it bb4326 because this is going to be in EPSG:4326, or WGS84, because that's how OSM gives me the information. So we can compute that, and then we can also very quickly get a transformation of the same bounding box into EPSG:3857, the Web Mercator projection, so that we can later pass it to plotting or other kinds of things that need a different projection. So there's a reason why this is computed; I will show you later. OK, so we have our bounding boxes, and now we have a defined extent. So now we can start
querying the data. So the counterpart of pystac-client in R is rstac. rstac, I think, is developed by the Brazil Data Cube team, and it's a really, really nice package. I mean, the Brazil Data Cube people are doing a lot of things to make this data accessible, and they made this counterpart for getting STAC information. So that's quite a nice contribution, and now we're going to look at how to get started with the API.
So again, as before, we're going to work with Earth Search, and we just give the URL, and we can get the collections. This looks very similar to what we had with Python, right? You say, OK, I want this from this client, I want the collections. It's a few less lines of code, probably, because that's usually the case when you're working with R: in Python we had to define the API URL, then first the client, and then the collections. Well, it's quite similar. So we have the stac function, which creates what is called a client in Python, and then we can get the collections out of that with this get_request, OK? Every time you want to really fetch something, you do this get_request, and you will get the response from the STAC API.
So here we have our collections: Sentinel-2 Level-2A, Level-1C, Sentinel-2 L2A COGs. That means Cloud Optimized GeoTIFF, which gives faster access; it's a great format now, used a lot for storing data on the cloud. We have Landsat 8, and yeah, does someone notice any difference between these collections and the collections I showed before in Python? I told you there would be tricky questions.
Yeah, Sentinel-1 is missing, right? The reason is that in this case I'm using version 0 of this Earth Search STAC catalog and not version 1, because for some reason version 1 was not working so properly with R. I guess it eventually will, but this is the reason why the collections look different. We also didn't have these Sentinel-2 COGs before, because in version 1 everything is Cloud Optimized GeoTIFF, so you don't have any other option; and of course, they included Sentinel-1. So that's the reason why things are different,
but it still works, and you still get the data. So you already passed the quiz. But yeah, that's the reason why this looks like this. So okay, now we can do a search, exactly as we did with Python. First, we have to give it the s object, which is the STAC client in this case. So we have s, and we're going to do a stac_search. We give it the collection (I'm going to ask for the Cloud Optimized GeoTIFF Sentinel-2 Level-2A) and I'm going to give it the bounding box, here in WGS84. The reason why I pass it like this is that it really needs an exact order: x minimum, y minimum, x max, y max. Then I have to give the datetime. Remember, in Python I showed you that you join the dates with a forward slash; here I'm doing the same, so I just paste them together in code. And I asked for a limit of 500 so that I don't get too much data, though I don't
think we will reach that many with our query, so this should be OK. So we got 144 features matching here. And then, well, the printing method is a little bit different: you get one of the first images that you get, and you get the assets. Remember the assets that we could see before when we did assets keys: basically the bands that we have for this particular collection. So we can explore this again as an sf object. Before, I converted it
into a GeoPandas data frame; now we can convert it into an sf object, which prints a little bit wonky in this Quarto document, but you can still get an idea. You have the geometry, you have projections, platforms. This is a little bit harder to read, but we can just take a look at the column names with this glimpse, which just lets you see a bit how the data looks. So you have the product ID, you have the projection, you have what instrument it is (it's always going to be MSI), and you again have this eo:cloud_cover, for example, where you can again do the querying based on cloud cover. This is also something that you can do. So we can now take a look at this cloud cover
again, as we did before, and we're going to get more or less the same pattern as we got with Python. So that's what I can do. Here I'm using ggplot: on the x-axis I have the datetime, on the y-axis the cloud cover; I plot points, I plot lines, I put them in one single group because then it knows it's the same series, and I use a date scale with breaks every two months so that I don't get a lot of dates down here. That's something that I don't know how to do in Python, and that's why I think the earlier plot looked really wonky when we had all the dates, because I don't know how to specify that; I'm not as fluent in Python plotting as I am in R plotting. But again,
probably you can do it. I just don't know how. Good. So we can again look at the properties. So this is a little bit more readable for one single item. So I'm just getting items, the features, and feature number one. Remember in R, you don't index from zero,
but you index from one. So if you ask for feature zero, it would be like, what are you doing? This is not Python. So it's number one. And we get the properties: the datetime, the platform, the constellation, everything that I was showing before. But you also have this valid cloud cover property, and the data coverage is the thing that I was mentioning before,
about when you have a nodata edge in the scene or not. So we can take a look at that. The valid cloud cover property also tries to enhance the cloud cover from this eo:cloud_cover field, to get you something more accurate, because as I said, there were some problems with it. So that's what this one is for. So we can filter this object to get something more precise, as we did before. So we do this filtering. I just give it a row number, because in this case,
I don't have an ID per row. I do the filter based on the valid Cloud Cover. That means if the Cloud Cover that is given by the EO Cloud Cover parameter is valid or not. So it can be either one, true, or zero, false. So if I have this EO Cloud Cover is zero, but actually it's
not true, there was a mistake in the algorithm, then this valid Cloud Cover will be false. This is not valid, so just ignore it. So that's why I'm using this valid Cloud Cover to get everything
whose cloud estimate is really correct, let's say. I also want everything with data coverage over 80%, and then I want to get just one item, the one with the minimum cloud cover. It turns out there are two items with zero cloud cover, which is why I have to break the tie and ask for just the first one instead of keeping both. So I save this as item, and then I can look at the URL of its thumbnail and see what the scene looks like. So far, so good.
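In dplyr terms, that filtering chain might look roughly like this (a sketch: the sentinel:* property names follow this collection and may differ elsewhere):

```r
# Sketch of the item filtering described above.
library(dplyr)

best_item <- items_sf |>
  mutate(row = row_number()) |>              # no per-row ID, so add one
  filter(`sentinel:valid_cloud_cover` == 1,  # cloud estimate can be trusted
         `sentinel:data_coverage` > 80) |>   # scenes with > 80% data coverage
  slice_min(`eo:cloud_cover`, n = 1,         # the least cloudy scene...
            with_ties = FALSE)               # ...one row even if two are tied at 0
```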
That's basically what you can do with Python and what you can do with R in terms of searching and querying this STAC catalog for whatever you need. So, are there any questions so far? Am I going okay, or too fast? It's okay? Cool. So yeah, you can just choose your weapon and see what suits you better, whether you want to work with Python or with R; at least up to this point you can do it in both. And that's great, because I used to find that a lot of APIs worked mostly with Python, but more and more you get R counterparts that do this as well. Okay, so now we're going to go back to Python. I hope the switching is okay too, and you're not sitting there thinking: what is she doing?
That's why I kept it in two environments, Jupyter and then RStudio, so it's clear which one I'm searching in. I was thinking of doing everything in one single notebook, but I thought that would be even more confusing. You'll tell me later in your feedback which was better, okay? Good. Also, if you want to try fetching only the red band and making a small plot or something like that, feel free to experiment in R: how would you do the extra things I show in Python? There will be a way to do it, for sure.
Okay, so now we're going to look at how to get not just one scene, because one is nice to look at, but we don't really want to work with only one: we're in the era of big data, so we want to work with a lot of data. And the usual way to do that nowadays is with data cubes. If you've been to the xcube talk, the OpenEO talk, and probably any other talk at this OpenGeoHub, you'll have noticed that this is really the trend. We can also create data cubes from this STAC API, in this case what you might call a data cube on demand. You can pick basically any place in the world you want to work on and create a data cube there, fetching any type of information: not only Sentinel-2, you can also put lots of data in the same cube, Sentinel-1, a DEM on top of it, and then do whatever crazy thing you want with it. At some point you will need to define the resolution you want; we're going to go through that. It means everything is going to get resampled, so you have to be a little careful and think about your variables:
does it make sense to resample them at this scale, with this resampling method, and so on? Okay. Now let's look at how to make a data cube with Python. For that we're going to use the package stackstac; they were very creative with the naming, since it basically means we're going to stack STAC items. We're going to limit our data cube to an area given by a bounding box of Poznań. This is how it works: there's a file in the data folder, the Poznań GeoJSON, and these are its bounds, the bounding box of that file. I think I showed it yesterday how it looks; it's this one, the same Poznań GeoJSON we were looking at yesterday, and we're going to work with it again. For this we're going to
rely on our previous search. So let's go back a bit to what this items object is, so we don't get completely lost, because I also don't remember completely. Initially, with pystac, we did the search: we told it the collection, that it has to intersect this point, this datetime range, and the eo:cloud_cover I want. Then from the search object I ask for the item collection, all the data that matches my search, and that item collection is my items object. Remember, we had eight items here. So what we're going to do is create a data cube with these eight items stacked on top of each other; that's why items is the input to the function stackstac.stack. This is going to be fun,
like a tongue twister: stackstac.stack. Here you can start building your cube. We pass the items, and we can give a resolution. As I said before, what resolution is Sentinel-2? Yes, between 10 and 60 meters, depending on which bands you use: the visible bands are normally 10 meters, and the further you go along the spectrum, the coarser they get. But here you can specify a different resolution at which to fetch the data. Why did I choose 200? Just so things go a little faster: if you ask for it at 10 meters it takes a very long while, and we don't want that. Actually, let me start running some things while I'm talking, because I think this is the most expensive computation we have today. Okay. So we select this 200-meter resolution so things don't go crazy. Then we give the bounds. We can give them in latitude and longitude; that's what's here, the footprint and its total bounds, as an array, which stackstac.stack will understand. You can also give bounds in another CRS, but then you have to tell it which CRS, in which projection you're giving the bounds; it's easier to keep them in lat/long, which it just understands. And you have to give the resampling method; here we use bilinear interpolation. This is needed because the bands are natively 10 meters or so, but I'm asking for 200, so a bunch of pixels end up inside one single pixel, and you have to say how they should be combined, which value the new pixel takes. Bilinear computes a weighted combination of the surrounding pixels, while nearest neighbor simply takes the value of the nearest pixel. That's what the resampling technique does.
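Put together, the call looks roughly like this (a sketch: items is the pystac item collection from the search above, and footprint is assumed to be a GeoDataFrame of the Poznań GeoJSON):

```python
# Sketch of the on-demand cube construction.
import stackstac
from rasterio.enums import Resampling

cube = stackstac.stack(
    items,
    resolution=200,                        # coarser than native 10 m, for speed
    bounds_latlon=footprint.total_bounds,  # (minx, miny, maxx, maxy) in lon/lat
    resampling=Resampling.bilinear,        # how 10 m pixels merge into 200 m ones
)
```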
So here we have our cube, already printed, and it is an xarray DataArray; they also love making me say a word twice all the time. As you see here, and I think I showed this yesterday as well, you have this whole chunked block of data. This is our data cube, and I like xarray's print method because you can really see it's a cube: you have x and y,
what area is covered in x and y, and then all the rest of your data. So let's look at what exactly this is. We have four dimensions for this data cube: not only time, x, and y, but also band. In this case the representation chooses to show us the bands, 32 of them, and the extra dimension tells you there are eight time stamps in the cube. Because it's hard to display four-dimensional things, that's the way they decided to do it. I've never used five dimensions, so I don't know what that representation looks like, but I guess they figured it out.
So you have time, which is the timestamp of each of the scenes. Then you have band, all the bands we saw before with all their naming, but also the JP2 variants; that's why it's such a big dataset. And you have x and y. Then you have the CRS: here it's EPSG:32633, because that's the CRS the data came in, this UTM zone. And as you see, the resolution of our data cube is 200 meters.
That's what we're working with. Again, as I mentioned yesterday, this is doing some chunking in the back. Even though our data array is 39 MB, the way it works and fetches data is lazy and parallelized in the background: it's not going to get me any data until I really ask for it. You can already do a lot of operations on this array, but in the end the real computation only starts when you really want the data, when you plot it or extract point information or something like that; only then does it really execute the whole thing. If you were at the OpenEO session, it's a bit like when Edzer showed how he can write all his code and only log in at the last step, because that's when the actual computations need to happen; a little bit like that. So, what we can do with this cube is select only the RGB bands,
for example, because as you saw, we have a little too much redundant data: all these bands, but then again in JP2 format, and then the thumbnail and the visual and blah, blah, blah. We don't need all this, and we're not going to work with all of it. You could keep all the bands, but in this case I'm only going to take RGB, and for that you do a selection: you take cube, our cube object, and select red, green, and blue, and that's also the order in which they'll come out. That's easy. And we can also create monthly composites; in this case it's going to be a bit weird because
we only have eight scenes, but you can still do it. What we do is what's called, again, a resampling, this time in time. But first, let me look at the cube I created; I hope this doesn't break. Yeah: this RGB cube is, as you see, just a selection of three bands, so instead of 32 bands I now have three. That's much more manageable, and you see the size of the data cube was reduced about ten times: it was 39 MB, and now I have only 3 MB right here. That just helps with the amount of data we're handling. So now we have this RGB cube and we can do the temporal grouping of our data. What I'm doing here is saying: take my new RGB cube and resample it, reduce it, based on time; and to say I want months, I use "MS". There's a whole set of time-frequency formats you can use here: yearly, quarterly, hourly, and then crazy combinations like yearly but starting in January, yearly but starting in August, or quarterly anchored on May. This xarray time notation can be customized to whatever you want, I discovered; it's quite interesting. But okay, in this case we use it per month. And we compute the median, because anything that falls together has to be combined: all the scenes for June, all the scenes for July, they get combined, and we have to tell it how to combine the values. In this case we compute the median of the values along each of the bands, the median of the red band, of the green band, of the blue band, combined over the time dimension. And we keep the attributes, meaning I keep all the metadata with me, because otherwise I'd be losing some information. So that's what we're doing with this monthly cube.
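Those two steps, the band selection and the monthly median, are short (band names as in this collection):

```python
# Keep just the visible bands, in this order.
rgb = cube.sel(band=["red", "green", "blue"])

# "MS" = month start; the median combines all scenes that fall in each month.
monthly = rgb.resample(time="MS").median("time", keep_attrs=True)
```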
And that's what's printed here. As you see, we still have three bands, the same as before since we're still with RGB, but now only three points in time, because as I said, we only have three months to group into. Again, the array is getting smaller and smaller: from 3 MB it's now only one. And in the indexes you now have the bands, again RGB, and your time, which starts on the first of the month. That's why it's "MS" and not just "M": MS means month start. So it's the first of June, first of July, first of August. That's what we have now. And next we can ask for this data in memory; that's why I started running this earlier, because that's what takes time. And I hope it's,
but to the start of the month. So that's why it's like this. First of June, first of July, first of all. So that's what we have now. And what we can do is then ask this data in memory. And that's why I run this earlier because that's what takes time. And I hope it's,
I think it's worked. So we use this, I have now this cube that is, I'm like looking at it, I'm thinking, okay, it has a nice size, only one mega. It only has three, three, and then the other dimensions. So it should be okay and doable. So then I can ask this compute function to really
fetch me the data so that I can really work with it. And that's this Dask function that reads the objects in memory because before that we were not doing that. So I do this because I want to plot. I can also directly try to plot without doing compute, but it will just take much, much longer because it has to like get everything in at the same time as plotting.
Well, if I do the compute before, it will just have it in memory. So if I need to change one parameter of my plot, I don't want it to start fetching the data every time. So that's why I compute this so that it just works. So what I'm doing here now is doing an RGB
representation, with plot.imshow. I can tell it which dimension to facet on as columns and how to wrap them; I want three, because I have three months. The RGB representation is stored in the band dimension, so I pass that too. robust does a percentile stretch of my values so I don't get weird artifacts from outliers; size sets the figure size, and add_labels=False turns off the axis labels.
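So the compute-then-plot step is roughly:

```python
# Pull the (now small) composite into memory once, so re-plotting stays cheap.
monthly = monthly.compute()

monthly.plot.imshow(
    col="time", col_wrap=3,  # one facet per month, three side by side
    rgb="band",              # interpret the band dimension as RGB channels
    robust=True,             # percentile stretch against outliers
    size=4, add_labels=False,
)
```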
And here you see my three months: this is basically a monthly composite of the data, one for June, one for July, and this one for August. You see the resolution is not super, but that's because I asked my cube to be 200 meters, so it makes sense; it's exactly what I'd expect to see. But yeah, this is the way you can get all this information. And you can already see changes: this area, in June probably crops and vegetated areas, they differentiate in color over the months. You can already see some changes, and it looks pretty neat because the amount of cloud cover seems quite low. So now we can take a look at how to do this with R.
So I'm going to confuse you again. Good. Any questions so far? Yes: you have been using a bounding box when making the cube; could you use a small one, so that you only focus on your area of interest? Of course, you can definitely do that; that's actually what I'll show you later. For example, here is a smaller one: I thought, like you, that I'd want to look just at the crops, for example. Here I create a new cube, but you could also just say: extract this smaller bounding box from the cube I have. You can do it with isel, like I did before, where you say: I just want the selection from this pixel to this pixel; remember, I was doing that. But that's hard, because your head doesn't compute in pixel indices; that took me a while too. So in the same way as reducing in time, you can chunk the cube down to an area of interest. But can you do it at the moment you define the cube? Yes, it's exactly that:
in the previous case I gave the Poznań bounds, and in this case I give a smaller one, so we can look at that already. It will just create a new cube with the smaller area. And in this case, for example, I called it with 10-meter resolution, because I thought: okay, for this small area it will just work. So that's exactly what I'm doing here: I take a smaller area from this GeoJSON, I recreate the cube from the same items, and I say I want it at 10-meter resolution. That's the reason I didn't simply crop the previous cube: I had that one at 200 meters, and now I want a new cube at 10 meters. Then you give it the smaller bounding area, do the selection of red, green, blue directly, and you get the RGB representation of a smaller chunk. Question: it's using resampling because all the bands need to be resampled to 10 meters,
or can a 60-meter band still stay at 60 meters? Basically, every band that is natively 20 or 60 meters will get resampled to 10 meters, and that's where the resampling choice becomes even more important, because you're essentially downscaling, splitting pixels up. You still need to think about whether that makes sense for your variables, for sure. So that's basically what we're looking at: I just wanted to visualize a smaller area. And since we already started talking about it, I can go on a little. We have this other new cube, now at 10-meter resolution, so you'll see it here.
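Rebuilding at 10 m with a tighter bounding box might look like this (the AOI file name is hypothetical):

```python
# Sketch: re-stack a smaller AOI at native 10 m resolution, rather than
# cropping the coarse 200 m cube.
import geopandas as gpd
import stackstac
from rasterio.enums import Resampling

aoi = gpd.read_file("data/small_area.geojson")  # hypothetical smaller AOI

cube10 = stackstac.stack(
    items,
    resolution=10,
    bounds_latlon=aoi.total_bounds,
    resampling=Resampling.bilinear,
)
# Alternatively, crop an existing cube by pixel index:
# cube.isel(x=slice(0, 500), y=slice(0, 500))
```

From there you can select red, green, and blue again, exactly as before, if you only want the RGB view.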
Then we can take a look at what we got with this. We have, again, the eight timestamps, and what I made here is, not a crop, a gif, so you can see how the data changes over time: these are my eight scenes just looping. Question: you just make sure all of this is in memory, right? Yes, this is in memory now because I called compute.
Question: how would you export this, say as NetCDF? That's probably in the documentation in some form, but you can definitely save all this information once you have it. You can export it as NetCDF; I think even GeoTIFF now lets you carry a bit more of this multidimensional data, or you can just export every scene as its own GeoTIFF, that's also one way to do it. But yeah, NetCDF could be a good way. I'm usually not exporting the data, because you can do a lot in the cloud, and normally, when I really have to export a result, it's the last step of everything. So I wouldn't, in my opinion, export a bunch of scenes to then work locally, when you can do so much remotely. For example, if in the end I just want a composite of NDVI or something like that, I'll just get the final GeoTIFF and continue working with that; or if I'm only interested in time series, I'll just export a CSV, for example, and not all this data. For me that's the main advantage of working with STAC: you don't need to get all of this onto your own laptop. You can just work and process things and end up with the final product. But I do see the need to sometimes have these individual files too. Question: I was thinking about xcube; so there's already a way to do this kind of thing in the cloud?
Yeah, I would hope so. I mean, I didn't know about xcube before this summer school, but I have the feeling it's quite similar to what you can do with stackstac, except it's really their own solution. I was talking to Alicia before, and I asked whether they're also planning a STAC implementation, so that you can have all the data on STAC and work with it there, and she said they're working towards that; it's quite a priority for them to have their whole data catalog on STAC as well. So that's also going to work: you'll be able to fetch the data that way too, and there will for sure be ways of integrating the two. Definitely. So yeah, this is the nice gif of crops, and you can see them changing. And okay, we can go on in this direction,
where you can do calculations, for example NDVI. So you can say: from this nice cube that I have now, I want to compute the NDVI. For that, you select the band near-infrared and you select the band red. I find this Python syntax very funny, where you can create two variables in one single line, but okay. So the near-infrared is this one, the red is this one, and then I compute the NDVI, which, as we already know, is near-infrared minus red, divided by near-infrared plus red.
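As code, that's just the following (assuming the small cube still carries the near-infrared band, named "nir" in this collection):

```python
# Two variables in one line, then the usual NDVI formula, per pixel and time step.
nir, red = cube10.sel(band="nir"), cube10.sel(band="red")
ndvi = (nir - red) / (nir + red)
```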
And then I have this new cube with NDVI. As you see here, I still have my eight timestamps and my x and y, but I lost all the bands; you see there's no extra band entry shown anymore. That's because I basically combined all the bands into the NDVI, so it's not there any longer: I only have time, x, and y, no RGB anymore. Remember from the exercise we did yesterday with Sentinel-1: you can definitely concatenate this back into your cube and keep the NDVI as an extra band, so that if you then do other types of processing, it's just one more band in the same cube. But in this case we just created a smaller cube with only NDVI. Now I can ask different questions of this cube. For example: I want a composite with the maximum NDVI value,
looking over time: what is the maximum over time for each pixel? So if I have eight dates, I want a composite of the maximum NDVI each pixel reached across those dates, in the sense of when there was the most vegetation at each pixel. Yeah, I see you looking at it; I don't know if I'm getting through, but I'll show you. In this case I'm reducing the whole thing only by time, so I no longer have all the time values: I have one single raster in which each pixel holds its maximum NDVI value over time; there's no time dimension anymore. Then I can compute this again and plot it. I actually don't know how long this is taking, because I've been using this technique of letting things run while I'm talking, so let's see. The idea here is just to have this composite. And you don't have to do the max, you can also do the mean, the mean NDVI over all my time steps, and then do some analysis. In this case, I want the max.
So you'll see here a single value per pixel: where you have the least NDVI is probably built-up area, and the maximum NDVI in my time series is most likely the forest, really, really green; and the crops you basically catch when they were at their maximum. Yeah, that's basically it. And what we can also do is compute what's called an anomaly: let's see how much the NDVI deviates from the mean of the collection. You have the max, you can compute the mean, and then you ask: what is the difference between the two? Then you can see where the bigger changes are. That's just one of the applications. So for that we take the max composite minus the mean, and I call that the anomaly; a sketch follows below. And then we can look at it.
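A sketch of those composites and the anomaly:

```python
ndvi_max  = ndvi.max("time")      # greenest value each pixel ever reached
ndvi_mean = ndvi.mean("time")     # average NDVI over the collection
anomaly   = ndvi_max - ndvi_mean  # how far the max deviates from the mean

anomaly.compute().plot.imshow(robust=True)
```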
We can have a look at which areas changed most. You see here: this is a positive anomaly, because there was an increase; this is a negative anomaly, a decrease; and elsewhere it stays fairly constant. These are just some of the things you can do. The idea I want to show you is that you can already do a lot just with STAC, without the need to download. For example, if this were the aim of my analysis, I would end up downloading only this one TIFF instead of my whole dataset, because this is my target product. If I download this, I can already use it for whatever I need to communicate to partners, as the final product. So this can be exported, most likely as a GeoTIFF, and then you can download it. Okay. How am I doing with time? We'll manage.
There's also the download part of this. You can of course download each of the items you queried from STAC. I didn't put this in executable code because I don't want you all to run it, but you can do it: you definitely can say, okay, I want to download each item in the collection; that's where there's a for loop, and then you basically start retrieving all this data locally. I don't know what you'd want to do with it, but anyway, I put the code snippet there if you want to use it.
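For reference, such a loop could look roughly like this (a sketch, deliberately not something to run casually over a whole collection; asset names and URLs come straight from the items):

```python
# Sketch: download every asset of every item into the working directory.
import os
import urllib.request

for it in items:
    for key, asset in it.assets.items():
        fname = os.path.basename(asset.href.split("?")[0])  # drop any query string
        urllib.request.urlretrieve(asset.href, fname)
```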
Okay, good. So now let's go to R and see how to do this in R. Here they decided to come up with a package name that is not a tongue twister: we're going to make STAC data cubes with a package called gdalcubes. Actually, every time I have Roger around, I think I have to call it "goo-dal cubes". So gdalcubes is the package we can use to make these data cubes on demand. It's incredibly powerful and you can do a lot with it. Marius Appel already presented it two years ago at this summer school, so I'm sure you can get a lot more from him; there's a link to his presentation at the end. But yes, I thought it would make a nice comparison between the two. So we'll use this one. And in this case I'm going to start a little cleaner, in the sense that I'll already do some pre-selection so that things don't get as big as they did in Python; you can also do this with the Python stackstac, of course. So again, first, in this line, I'm just selecting which bands I'm going to ask for.
In this case, as you see, the bands still go by numbers, and that's because we're using the version-0 Sentinel-2 COG collection, where the assets aren't named by color. You remember how the items look: it's from this query at the top. Here we query the items, and each scene sits in features; that's why we use the dollar sign, items$features. And then we say: with all these features, well, not all of them, because this set is a bit more filtered, please create a data cube collection for me. Here we give which assets, which bands, we want in our data cube. And in this case we do the filtering on the items we got from STAC while creating the cube: in Python we filtered when we searched the collection; here we filter at cube-creation time. So again you give it a function where you say: I want the eo:cloud_cover to be below 10%, and I want everything that has a valid cloud cover for this particular collection. Then we can create it like this, roughly as sketched below,
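A sketch of that collection-building step (the asset and property names follow the v0 Sentinel-2 COG collection used here, so treat them as assumptions):

```r
# Sketch: build a gdalcubes image collection from the STAC items,
# filtering on cloud-cover properties per item.
library(gdalcubes)

assets <- c("B02", "B03", "B04", "B05", "B06", "B07",
            "B08", "B8A", "B09", "B11", "B12",
            "SCL")  # SCL included so it can drive the cloud mask later

col <- stac_image_collection(
  items$features,
  asset_names = assets,
  property_filter = function(x) {
    x[["eo:cloud_cover"]] < 10 && isTRUE(x[["sentinel:valid_cloud_cover"]])
  }
)
```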
and we get the information on what our cube looks like. The print method is not amazing in Quarto, but let's go through everything printed here. We have 28 images with 12 bands: these are the images, and these are the bands, B01, B02, B03, B04, and so on up to B11. And yeah, that's the data you have; that's what your cube is made of. It's not as nice a cube representation as the Python print method, but it's more or less the same; you can picture the cube in your head now. Next, we can do some direct visualization of the data.
For that, we use the function cube_view. With cube_view we take an AOI that we already defined, but in this case we need it in EPSG:3857, because the plotting will be done on Web Mercator; that's why I did this conversion of my bounding box to 3857 beforehand. So for viewing this cube we create this cube-view object, which I call v. You have the reference system, in this case EPSG:3857. Then we give the extent: this includes the time extent, all those values we fetched before; I kept the dates as a vector rather than glued together, so that I can fetch each of them. And there's the bounding box I was using before. Then I can again give a resolution. It looks a little different from the Python version, but it's the same idea: how fine I want the resolution in x and in y, which means you can even give different resolutions per axis; you can say in x I want 100 and in y I want 200. That also works, if you want to do that: you're not bound to square pixels, you can work with rectangles. And you have to give the temporal resolution, which here is "P1D". It's a little cryptic, but it means one day.
You can give it other temporal resolutions too; Chris asked here how you would do biweekly, so let's see how to express other periods. We can always go to the documentation, and that's one thing I love about R: you can always open the docs and look at what's going on and what you should do. The dt argument here, for example, is the size of the pixels you want in the time dimension, expressed as an ISO 8601 period string, and you can say every 16 days, for example, or every whatever. If you look this up, you'll find all the conventions and how to do it properly, how to choose the different variants; you'll find it for sure, but this is the way you define the time step in this format. Then we say how we want to aggregate the whole thing: again, we want the median. And in this case I'm using the resampling method average, not bilinear anymore, just to show you there are many options; if you read the documentation you'll find the one suited to you. Good. So this is how you create this cube view, roughly as sketched below, and we'll come back to it in a minute
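The cube view might be defined roughly like this (the dates and bounding-box values are placeholders standing in for the objects fetched earlier):

```r
# Sketch of the cube_view() call; bbox_3857 is assumed to be the bounding
# box already transformed to EPSG:3857.
v <- cube_view(
  srs = "EPSG:3857",                         # Web Mercator, for plotting
  extent = list(
    t0 = "2023-06-01", t1 = "2023-08-31",    # placeholder time extent
    left   = bbox_3857["xmin"], right = bbox_3857["xmax"],
    bottom = bbox_3857["ymin"], top   = bbox_3857["ymax"]
  ),
  dx = 200, dy = 200,                        # spatial resolution in x and y
  dt = "P1D",                                # ISO 8601 period: one day
  aggregation = "median",                    # combine same-period values
  resampling  = "average"                    # spatial resampling method
)
```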
to actually see it. We're basically setting the framework of what we want: this time extent, this resolution, this temporal step, this type of aggregation, this resampling. Now, what we can also do is some cloud masking, based on the SCL values. The SCL values come from the Level-2A algorithm, which is basically a scene classification: there's a whole workflow starting from top-of-atmosphere data, they classify the scene, with cloud and snow detection, brightness, a bunch of things, a lot of processing, and you end up with one extra layer in your data that tells you which pixels are no-data, saturated or defective, shadows or cloud shadows, vegetation, water, whatever. The important thing is that you have it per pixel. That means you can really create composites taking out the cloudy pixels individually, rather than judging the whole scene, and that helps you a lot. So let's look now at how this looks, how
one of these scenes looks. For this I just took one of the items, I just pulled the first one, I think I did this before, and read it with rasterio so I can take a look at it, asking only for the scene classification layer, just to show you how it looks. I have a CSV with the class names here, just to make my plot pretty. And this is how one scene classification looks: you have all the information about what is a cloud shadow, this brownish color here; you have vegetation, non-vegetated, water, anything that is unclassified, the cloud probability classes, and cirrus. So basically, with this you can decide per pixel: I want this one in my cube,
and that one I don't. So you can do extra filtering on your data. What you do here is define a mask: in this case we say, I want a mask over these values. When I saw this in the documentation it said values 3, 8, 9, and I thought: what is this? That's why I made the plot: 3 is cloud shadows, 8 is cloud medium probability, and 9 is cloud high probability. So I'm basically saying: everything that has these values in the scene classification, please keep out of my median composite, because it would just mess up my data when the median is calculated; I don't want those pixels there. That's what the masking does, and you can specify it when you're building the cube; the mask definition itself is shown below.
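The mask definition is a one-liner, using the class values just identified:

```r
# Drop cloud shadows (3) and medium/high cloud probability (8, 9), per pixel.
s2_mask <- image_mask("SCL", values = c(3, 8, 9))
```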
And yeah, I forgot to run v, so let me do that. What I want to do now is the actual viewing, this RGB representation. I can pass the mask: I have my v object where I defined the extent and everything, I have my collection, and I tell it to mask with the mask values I just defined. Then we do an RGB reduction with the median. We select the bands, so you need to know them: blue, green, red in this case, B02, B03, B04. And we reduce by time: for each of these bands, give me the median. Then I have my cube with a median value per band over all my time steps, and I can plot it; the whole pipeline is sketched below. I don't know how long this will take.
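Putting the pieces together, the pipeline might look like this (a sketch; B02/B03/B04 are blue/green/red in this collection, so rgb = 3:1 puts B04 on the red channel):

```r
# Sketch: build the masked cube, keep the visible bands, reduce each band
# to its median over time, and plot the result as an RGB image.
raster_cube(col, v, mask = s2_mask) |>
  select_bands(c("B02", "B03", "B04")) |>
  reduce_time(c("median(B02)", "median(B03)", "median(B04)")) |>
  plot(rgb = 3:1)
```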
So again, this is the moment when all the data actually gets fetched: the moment when I tell the STAC API, okay, now I've defined everything I wanted, now I want you to do a visualization for me. In this case we don't have compute or Dask or anything; or maybe there is some sort of parallelization happening in the background, I'm not completely familiar with that. But this is the only point where you really get everything. I won't run it. Well, maybe I could try; it's probably going to take a while. Ah, shoot. Okay, I'm not going to run it, because we're not completely sure of the time, but I copied the result beforehand so you can see it. And yeah, this is it: this is basically the representation you get once you fetch all the data, a median composite of the data you queried from the STAC API.
So that's what you can do with gdalcubes, and it's pretty powerful as well. I've added some exercises for you. You don't need to do them now, you're probably tired, but if you want to play a little: the NDVI computation and the anomaly that I showed in Python, how could you do those with gdalcubes? Marius has an r-spatial blog post where he looks a little into this, so you can follow along there and see how it's done. And you can also compute time series, taking
values out of this data in another way: if you reduce everything in space, reducing over x and y instead of over time, you get information just for your area as a series, and with that you can do, for example, time series analysis of one single crop parcel, things like that. This is also something you can do as an exercise, with all that free time you have. For that, you can easily create a GeoJSON with geojson.io, and just play along a little. You can add some temporal grouping so your analysis isn't so wild; and if you're really going to use this for your work, you can, for example, do yearly analyses, or get a mean NDVI value per month, whatever suits you. This is quite a powerful tool: you can really play with everything you have and use it more in production. As I mentioned before, this part is really inspired by Marius' talk from OpenGeoHub 2021, so do check his repo and his video. He gives a lot of info
over there as well. But in this session, as I said, I wanted to make this comparison between what you can do with each. So I think I'll stop talking and ask if you have any questions so far, or any doubts, or if you found it interesting. Will you be working more with the R implementation or the Python implementation? Julia? Right, I completely forgot to add a Julia segment to this; I should have asked Martin, like, let's do it. I'm pretty sure if I ask him, he'll come up with a notebook to do this in Julia as well. I don't know if it's advanced enough with all the packages, but why not? You could be the first one to do it, too. No? All right. Question: it happens that I use this kind of data, but I actually use derived products,
not the raw images. Is there any way to include those products? I mean, are the products in STAC? It depends which catalog you look at. Right now we're looking at this Earthdata one, but there are many other catalogs you can query. These derived products are probably more in catalogs where you have to give credentials, but you can definitely query them: you just give your credentials to the STAC API through the STAC package, and it will fetch you anything anyway. You can also work with the Microsoft Planetary Computer with gdalcubes, or you can work with Earth Engine; basically with anything you have credentials for and that is on STAC. The STAC catalog world is quite big: it's not only Sentinel-1 or Sentinel-2, you have Landsat, as I said, but you also have the Copernicus data services, and climate data too, I believe. It's quite big, and it's getting more and more conventional how this raster data is shared, how the metadata is established and so on. So for sure you'll find a lot, and there will always be good instructions on how to fetch the data as well. You just need to find the URL: the STAC catalog link I gave will list all the collections you have, so you can browse through what's available there. They are very comprehensive. Question: and can you find old NDVI data, from the 90s or something like that?
NDVI from the 90s? Probably, yes. That would presumably be computed from Landsat, but there are also some derived products; I'm not sure I'll say this correctly, but maybe MODIS has some data as well. Sorry? MODIS didn't exist in the 90s, though. I don't know, I thought so; all right, thank you for the correction. From Landsat, then, you can for sure get quite good data: it's 30-meter resolution, which is good enough, and you can definitely compute NDVI from that, and do this sort of analysis through the STAC API; I think their data is available there too. And they've been working a lot on harmonizing all of this, Landsat 5 through Landsat 8, or is there a Landsat 9 already? Yes, Landsat 9. So everything is more harmonized, and there's even a product, I think I mentioned it yesterday, the Harmonized Landsat Sentinel-2, where all of them are combined together. So for any data holes, or for extending the time series you have, you can also use that, and with that product you could probably compute a really long-term NDVI record. That would be interesting to see. I think it's already on STAC in one of the collections; I don't remember which one, but I can take a look.
Any other questions? Yes. I don't know if this was clear: sometimes when I query many scenes, not all of them cover the same bounding area; sometimes one scene is just a little piece. Do you use something to integrate that scene with the others? Otherwise you only have data for a small part. Yeah, this definitely happens, and with Sentinel-2 the reason is simply that there is no data in that part of the scene. Sentinel-2 collects data very systematically, so every scene really is the same footprint on every repeat pass; these differing bounding boxes shouldn't really occur because the sensor passed slightly to the side. It's just that something happened there and you have no data. So remember that I showed you how to filter out the scenes with missing data: you can require, say, 80% data coverage of the scene; that's one way to deal with it. And the no-data pixels themselves you can also just ignore when you compute your medians; you don't want them in there, of course, but you may still want the valid part of that scene. So that's one way to do it, definitely.