The new Forest Inventory Estimation and Analysis system
Formal Metadata
Number of Parts | 295
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/43526 (DOI)
Transcript: English (auto-generated)
00:07
So, I'm Dan. I'm an associate professor at the Faculty of Forestry in Brasov, and I'm going to chair this wonderful session. We have 20 minutes, after that brief questions, and I present
00:26
Jiří, is it okay? Yes, Jiří, okay, with his very nice presentation. We'll see. Good luck. You have the floor for 20 minutes.
Hello, good morning. My name is Jiří Fejfar and I'm going to show you something about
00:43
the system which we are developing, which is called nFIESTA. It means new Forest Inventory ESTimation and Analysis system. Maybe at the beginning I would like to ask whether there are some foresters here
01:00
and statisticians. Okay, and the rest are probably software developers, yes? So I will try to focus. So, the outline of what I'm going to speak about: first, why develop a new system for forest inventory.
01:23
I will show some existing systems and maybe try to make some comparison. I will try to say what makes our system special and what is new in our approach. After that, I would like to focus on the implementation, because it is a PostgreSQL extension.
01:42
So maybe I can show you how we are developing this extension, and maybe you can do the same with your projects, to connect them to PostgreSQL that way. After that I would like to talk about how we are implementing the matrix
02:01
calculations, and if you have some comments, I would like you to tell me your ideas about how to do it and how we can maybe make it better. After that, we have performed some case studies, some of them published, some not, so I will speak about that.
02:25
So, existing systems. The name nFIESTA that we are using comes from the FIESTA system, which was created in Switzerland in 1997, and
02:43
it is maybe a little bit unfortunate that the US Forest Service also created a system called FIESTA, so the names are very similar. And
03:02
also in Switzerland and Brazil, a forest inventory package was created in R. This software is also in R. There also exists the E-Forest system, which was created by the ENFIN group, and this is a PostgreSQL extension. In fact, we are somehow extending what
03:25
has already been implemented in E-Forest. So what is new in our system? We are trying to tightly implement something called the continuous Horvitz-Thompson theorem, which is a statistical approach to
03:44
how you can make estimates for continuous populations. It depends on how the grid of plots is designed in the NFI, the plots where you are measuring
04:00
the properties of the forest. Because you are selecting plots from a continuous space, the probability that you will select one particular plot is zero. So there is something called the probability density of the selection of the plots, and
04:21
this is what we are using. So I can say that our software is based on a well-known statistical approach. The result of that is that we can estimate totals, for example total biomass or total wood volume,
04:44
without knowing an exact map of the forest, because forest has quite a complicated definition. So if you are thinking that you can map it, it can be really tricky or problematic. So we are not
05:03
trying to say that we know the exact map of the forest; we are only estimating, for example, the area of forest. This is from the perspective of statistics. What is also a novelty of our approach is that it is not an R package or something like that, but it is a
05:26
database extension, so you can load the data, and there are some constraints and checks. Here is a diagram of what you can do with our system. You can of course
05:43
configure some estimates. It means telling what the model will be and what the auxiliary data will be, because in our system you can also use auxiliary data from remote sensing, for example some airborne images or something like that.
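As a rough illustration of the continuous Horvitz-Thompson idea described a few moments earlier, the sketch below estimates a total by dividing each observed local density by the inclusion density at the sample point. It is a minimal sketch, not nFIESTA code: the function name `ht_total`, the uniform sampling design and all numbers are assumptions made for this example.

```python
def ht_total(local_densities, inclusion_densities):
    """Continuous Horvitz-Thompson estimate of a total.

    local_densities:     value of the target variable per unit area observed
                         at each sample point (e.g. m3 of wood per hectare)
    inclusion_densities: inclusion density at each sample point
                         (expected number of sample points per hectare)
    """
    return sum(y / pi for y, pi in zip(local_densities, inclusion_densities))


# Hypothetical design: n points placed uniformly at random in a region of known area.
area_ha = 1000.0
n = 4
pi = n / area_ha  # uniform inclusion density, points per hectare

# Hypothetical observations; 0.0 marks points that fell outside forest.
wood_volume_m3_per_ha = [250.0, 0.0, 310.0, 0.0]
forest_indicator = [1.0, 0.0, 1.0, 0.0]

volume_total = ht_total(wood_volume_m3_per_ha, [pi] * n)
forest_area = ht_total(forest_indicator, [pi] * n)
print(volume_total, forest_area)
```

Note that the forest area is itself an estimate from the same sample points, so no forest map is needed, which matches the point made in the talk.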
06:04
This is the estimation configuration part. After that you can perform the calculations, and after that you can store all the results and all these things. We have some limited
06:20
documentation of our system in one PDF; the links are included here. This PDF describes how to use the system. So now about the software implementation. I would like to say something about what is implemented and what is not implemented, and
06:44
basically, it is just a collection of some tables and functions in a database, so nothing, I think, complicated. Let me check the time, maybe. Okay. So our system consists of 47 tables and
07:05
17 functions, which help you to interact with the system. It has no graphical user interface or anything like that. So you can maybe connect it to your existing PostgreSQL data or something like that.
07:24
You can do some management of field data and auxiliary data. By auxiliary data we mean, for example, data from remote sensing, LiDAR or something like that.
07:40
The models we are using are linear. It means we have no nonlinear models. I think that in the forest inventory packages there are also some more complicated models included. We just focused on the linear model.
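One common way a linear model with auxiliary data enters an estimate is the regression estimator. The sketch below is a generic textbook form with a single auxiliary variable, not the nFIESTA implementation; the function names and all numbers are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def regression_estimate(y_sample, x_sample, x_population_mean):
    """Regression estimate of the population mean of y.

    y_sample:          target variable measured on the field plots
    x_sample:          auxiliary variable observed at the same plots
    x_population_mean: auxiliary variable averaged over the whole area
                       (known wall to wall, e.g. from remote sensing)
    """
    y_bar, x_bar = mean(y_sample), mean(x_sample)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sample, y_sample))
    sxx = sum((x - x_bar) ** 2 for x in x_sample)
    slope = sxy / sxx  # ordinary least squares slope of y on x
    # Correct the plain sample mean by how far the sample's auxiliary mean
    # lies from the known population mean.
    return y_bar + slope * (x_population_mean - x_bar)


# Hypothetical data: wood volume on four plots and canopy height from LiDAR.
volume = [100.0, 200.0, 300.0, 400.0]
canopy_height = [10.0, 20.0, 30.0, 40.0]
canopy_height_population_mean = 30.0

one_phase = mean(volume)
with_auxiliary = regression_estimate(volume, canopy_height, canopy_height_population_mean)
print(one_phase, with_auxiliary)
```

In this toy data the sampled plots have a lower mean canopy height than the whole area, so the regression estimate is shifted upward relative to the plain sample mean.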
08:01
Also, maybe this is important: spatial data, the precise positions of the plots, are not mandatory for the computation. This is important because a lot of countries, for example, do not want to provide the positions of their plots, because it is quite sensitive information,
08:24
let's say. There are many reasons. The project of measuring the data can be very long, so they don't want people to know where the plots are, so as not to somehow
08:41
bias the results or something like that. So our system was designed to be able to compute, for example, estimates on areas covering many countries, like the Alps or the Carpathians, and you can
09:00
fill in the data without the positions, just by telling which areas the plots belong to, and we can make estimates on those areas. It is not necessary for the countries to provide the plot positions to us. So what is not implemented?
09:25
We are not dealing with data upload in any way. If you would like to insert data into our system, you can do it, for example, with some existing ETL software, or use foreign data wrappers in PostgreSQL, or
09:44
do it yourself. So we are not telling you how to upload or insert data, and we have no plans to implement this. I think that's it.
10:00
This is the structure of the database. The red tables are just lookup tables with some codes. Here you have, for example, a table of plots and a table of clusters, because in some countries the plots are usually grouped into clusters, and
10:20
there are also panels. It means that usually in an NFI there are some sets of plots which are visited, for example, in one year, and another set visited in another year, so they are organized into panels. So this is this part. And here we have quite a complicated part of the database, which is
10:46
dealing with the configuration of the estimates: the models, for example, and the variables which are used. The final results are here in this table.
11:03
The tables store the data, and we also have functions, I don't know how many, maybe 70 or something like that, and with those functions you can make the configurations. They are somehow documented with database comments, or if you would like to try our system,
11:22
we will be happy to help you, to describe the functions and tell you how to use them. You can make a configuration, and after that you can make the estimation. Also, it can be quite time consuming
11:41
to compute a large number of estimates. For example, it is quite normal to have several thousand different estimates with different models, and you are comparing them after that. So when it is time consuming, we are usually doing something in Python.
12:04
We have a script which can make, for example, 20 connections if you have 20 CPUs in your server, so it makes the 20 connections and runs them in parallel. We are not using the native PostgreSQL parallelization now, because
12:21
it has some limits. For example, if you are computing something in parallel and inserting the results into some table, that is not possible in PostgreSQL: you can parallelize the query, but not the insert. So we are trying to do that with this hack.
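The multi-connection approach described above could look roughly like the following. This is a sketch under stated assumptions, not the project's actual script: it assumes the `psycopg2` driver is installed, and the SQL function name `fn_compute_estimate` is a hypothetical placeholder rather than the real nFIESTA API.

```python
from multiprocessing import Pool


def split_round_robin(estimate_ids, n_workers):
    """Distribute the configured estimate ids over n_workers batches."""
    return [estimate_ids[i::n_workers] for i in range(n_workers)]


def run_batch(args):
    """Worker process: open its own connection and compute one batch."""
    dsn, batch = args
    import psycopg2  # imported lazily so the helper above works without the driver

    conn = psycopg2.connect(dsn)
    try:
        with conn:  # commits on success, rolls back on error
            with conn.cursor() as cur:
                for estimate_id in batch:
                    # fn_compute_estimate is a hypothetical placeholder name.
                    cur.execute("SELECT fn_compute_estimate(%s)", (estimate_id,))
    finally:
        conn.close()
    return len(batch)


def run_parallel(dsn, estimate_ids, n_workers=20):
    """Run all estimates using one database connection per worker process."""
    batches = [b for b in split_round_robin(estimate_ids, n_workers) if b]
    if not batches:
        return 0
    with Pool(processes=len(batches)) as pool:
        return sum(pool.map(run_batch, [(dsn, b) for b in batches]))
```

Because every worker holds its own session, each one can insert its results independently, which sidesteps the restriction on parallel inserts mentioned in the talk. A call such as `run_parallel("dbname=nfi", list_of_ids)` should sit under an `if __name__ == "__main__":` guard.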
12:41
Okay. Here is something about how to deal with matrices in PostgreSQL. It is the biggest problem of this system. I don't know, has somebody of you tried to implement some matrix calculation in a database, in PostgreSQL?
13:05
Okay, and you used some type for matrices, or...? Maybe we can speak about it later. Yes, I will show our approach. The simplest, I think, is to do it in R, for example, or in Python, and you can create your
13:26
functions which are implemented in R. For example, here is a simple function which computes the inverse of a matrix. So it is quite easy to do that, but the problem can come if
13:45
the matrices are large. In the continuous Horvitz-Thompson approach there is this equation, and you have a matrix which has a size of n by n. It means that, for example, in the Czech Republic
14:01
we have 40,000 plots in the NFI, and it will create an object which has almost 12 gigabytes. That is not so bad, but we were thinking that maybe we will use the system for some European project, where there can be
14:21
many more plots than 40,000. For example in France, I think they have 300,000, and you can make estimates for the whole of Europe, where there can be half a million plots or something like that. Then the amount of memory will be too big. So we were
14:42
fighting somehow with this, and finally we have found a solution. So this is it for now.
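The memory figure quoted above can be checked with a few lines of arithmetic. The sketch assumes a dense n by n matrix of 8-byte double-precision values; the storage format is an assumption, and the plot counts are the ones mentioned in the talk.

```python
def dense_matrix_gib(n_plots, bytes_per_value=8):
    """Memory needed for a dense n x n matrix, in GiB (8-byte doubles assumed)."""
    return n_plots * n_plots * bytes_per_value / 2**30


for n_plots in (40_000, 300_000, 500_000):
    print(f"{n_plots:>7} plots -> {dense_matrix_gib(n_plots):,.1f} GiB")
```

Because the size grows with the square of the number of plots, going from 40,000 to 500,000 plots multiplies the memory by more than 150.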
15:01
Okay, we also have a reference implementation in R. It means that, to be sure that the equations are implemented properly, we have a parallel implementation in R, which is very simple, and you can compare the results. But the SQL implementation is much faster and
15:26
not limited by memory problems. Here is a link to the R implementation. And here is what we are doing now with the matrices.
15:41
We are not storing them as some type or something like that, but we are storing them in tables where we have a row index, a column index and a value. So it is something which is, I think, similar to the sparse matrix approach which is available in some
16:03
software like the MADlib library or something like that. If you have this representation of the matrices, then, for example, what we are using is matrix inversion, transposition and multiplication. For example, matrix inversion is done in Python now,
16:25
because the matrix to be inverted is quite small. Transposition is just swapping the row and column indexes, and matrix multiplication is done like this.
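The slide with the speaker's query is not part of the transcript, so the following is only a sketch of how the row index, column index, value representation can be transposed and multiplied with plain SQL. It uses Python's built-in `sqlite3` so that it runs without a server; the table and column names are hypothetical, and the same standard SQL should carry over to PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (row_idx INTEGER, col_idx INTEGER, val REAL);
    CREATE TABLE b (row_idx INTEGER, col_idx INTEGER, val REAL);
""")

# A = [[1, 2], [0, 3]] and B = [[4, 0], [5, 6]]; zero cells are simply not stored.
conn.executemany("INSERT INTO a VALUES (?, ?, ?)",
                 [(1, 1, 1.0), (1, 2, 2.0), (2, 2, 3.0)])
conn.executemany("INSERT INTO b VALUES (?, ?, ?)",
                 [(1, 1, 4.0), (2, 1, 5.0), (2, 2, 6.0)])

# Transposition: swap the two index columns.
transpose_sql = """
    SELECT col_idx AS row_idx, row_idx AS col_idx, val
    FROM a
    ORDER BY 1, 2
"""

# Multiplication: join A's columns to B's rows, then sum the products
# for every (row of A, column of B) pair.
multiply_sql = """
    SELECT a.row_idx, b.col_idx, SUM(a.val * b.val) AS val
    FROM a
    JOIN b ON a.col_idx = b.row_idx
    GROUP BY a.row_idx, b.col_idx
    ORDER BY a.row_idx, b.col_idx
"""

a_transposed = conn.execute(transpose_sql).fetchall()
product = conn.execute(multiply_sql).fetchall()
print(a_transposed)
print(product)
```

The database never has to hold the whole matrix in memory as one object, which is the advantage claimed for the SQL implementation; the cost is that every product becomes a join and an aggregation.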
16:40
I think this approach is not ideal, and PostgreSQL people usually say that it is not good to do such computations. So we are thinking that in the future we will develop a custom type for matrices, something like PostGIS
17:01
has for rasters, where you can store the tiles of rasters in the rows of tables. So we would like to do something similar. Here is how to do parallelization in Postgres. You can just create
17:21
Python code which uses the multiprocessing library, and after that, just with this call, you can make it run in parallel. So it is quite easy. About installation: do you know PostgreSQL extensions like PostGIS? Have you ever installed one? It means
17:48
there are two ways. One is that you have a Linux package, like for PostGIS, and you install this package, and after that you just run this CREATE EXTENSION in SQL. Our
18:06
software is not packaged in Debian, so if you are on Linux, you have to clone the repository, go inside the directory and run make and make install, and it will do everything for you.
18:21
So it is not complicated to do the installation. About the license: we were choosing from these possibilities, and finally we selected the European Union Public Licence, which is translated into all the languages, so it is valid in the Czech Republic, for example.
18:44
So, as a PostgreSQL extension, we have versions. For example, if we find some bug, we can make the fix and deploy it to our partners who are using this software. So it has versioning inside. It has regression testing.
19:03
We have quite a nice set of data from the Czech NFI, which is included in the test suite, so we are running the tests on it. That was quite a complicated step, but we are very happy that we have set this up, and
19:22
we also have some virtualization. We have a Docker image with the database, and in GitLab it is possible to set up CI, continuous integration. So when I commit something in Git, it will spin up the Docker image and run the tests
19:42
on that data on the servers. So here are links to the CI configuration, and we can see, for example, here in the pipelines that for a branch,
20:00
or a merge of these branches, the tests are okay. So I would also recommend, if you are doing something with a PostGIS or PostgreSQL extension, connecting it to a CI/CD platform.
20:22
So, next steps, what we are planning to do. We would like to create a custom data type for matrices. I have some prototype here. Maybe you know the C++ library called Eigen. It is created in Germany, and it has a lot of functions for
20:47
matrix calculations. So maybe we will not do our own implementation; maybe we will try to make an API to this library, and
21:01
also, if we need to solve problems where the data will be really big, we are thinking about making a custom data storage, which is also possible. We have done some case studies. In the test data
21:23
there is data included from the Czech NFI. We have a micro case study which was performed on synthetic data. We have a case study which was published, with real NFI data from France, Germany, Switzerland and the Czech Republic. I will show just some examples of how the results look in the micro case study on synthetic data.
21:47
So we, for example, performed estimates on such a grid, and it is a comparison of the one-phase estimator without any auxiliary data and the regression estimator with auxiliary data. With the auxiliary data, the performance is
22:04
better. Here is a link to the presentation of our colleagues Adrian Lanz and Radim Adolt, with real data from those countries, the case study done with our software. So that's all from my side.
22:27
So if you have some questions
22:51
Yes, yes. It can be. Uh-huh. I
23:01
think that the Eigen library is probably doing something like that, in that some calculations on matrices can be done on a GPU. So maybe, if we use Eigen, we will use that. But yeah, thank you for this note.
23:28
So, related to the forest inventories in Europe: we need to speak out loud to get them open, because now we don't have access to the data of the national forest inventories,
23:44
and yeah, I think you developed a very nice platform, and we should advocate for that, because there are a lot of us users of national forest inventory data. And not only that type: there are private companies that can share data as well, and maybe we can discuss
24:04
creating a small group to discuss getting the data to the people.