How can machine learning help to predict changes in size of Atlantic herring?
Video in TIB AV-Portal:
How can machine learning help to predict changes in size of Atlantic herring?
Formal Metadata
Title 
How can machine learning help to predict changes in size of Atlantic herring?

Title of Series  
Part Number 
160

Number of Parts 
169

Author 

License 
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
Identifiers 

Publisher 

Release Date 
2016

Language 
English

Content Metadata
Subject Area  
Abstract 
Olga Lyashevska - How can machine learning help to predict changes in size of Atlantic herring? This talk is a case study of how Python (Pandas, NumPy, scikit-learn) can be used to identify the influence of the potential drivers of a decline in size of Atlantic herring populations using Gradient Boosting Regression Trees. A decline in size and weight of Atlantic herring in the Celtic Sea has been observed since the mid-1980s. The cause of the decline remains largely unexplained but is likely to be driven by the interactive effect of various endogenous and exogenous factors. The goal of this study is to interrogate a long time series of biological data obtained from commercial fisheries from 1959 to 2012. We use gradient boosting regression trees to identify important variables underlying changes in growth from various potential drivers, such as: Atlantic multidecadal oscillation; sea surface temperature; salinity; wind; zooplankton abundance; fishing pressure. This learning algorithm makes it possible to quantify the influence of the potential drivers of change with a lower test error than other supervised learning techniques. The predictor variable importance spectrum (feature importance) helps to identify the underlying patterns and potential tipping points while resolving the external mechanisms underlying observed changes in size and weight of herring. This analysis is a useful case study of how Python can be used in academia. The outputs of the analysis are of relevance to conservation efforts and sustainable fisheries management which promotes species resistance and resilience.

00:01
Welcome, everyone. My name is Olga Lyashevska and I work as a postdoc in
00:05
Ireland. I am going to show you how machine learning can be applied in the sciences. The previous talk, if you were here, was a nice introduction to all kinds of ensemble methods, so here I am going to show you one specific case, gradient boosting. First, the background of the problem: over the past 60 years we have observed a decline in the size of the fish by about 4
00:28
centimeters on average. Think about a herring, which is about 20 centimeters long: 4 centimeters is a lot of reduction. So we would like to find out what the problem is and why it is happening, and
00:38
we are going to use machine learning to answer this question. Why is it a problem? Because herring is a very important species for consumption, and we know that if its size decreases this has consequences for stock production: there will be less fish in the future for us to consume. We don't know what
00:56
caused the decline, but we presume there is an interactive effect of various factors: sea surface
01:03
temperature may change,
01:06
zooplankton abundance may change,
01:11
fish abundance may change, or fishing pressure. To answer
01:19
this question I am going to use data for the
01:23
past 60 years, from 1959 to
01:26
2012, and the data are spread throughout the year.
01:45
So I am going to use this data. The way the data has been
01:49
collected is that it has
01:52
been taken from commercial vessels, 50 to 100 samples at a time, and the total sample size is about 50 thousand individual fish, so imagine a dataset of 50 thousand rows. The study area, which is where the data comes
02:12
from, is called the Celtic Sea. It is just to the south of Ireland and is bounded by St George's Channel and the English Channel, so you can imagine where it is. That is about the study
02:22
area. The objective is to identify the important factors which underlie this problem, and to answer this
02:30
question I am going to use gradient boosting regression trees, one of the ensemble algorithms available in scikit-learn. Why ensemble? Because we don't have
02:43
one tree but a collection of trees, and the final model is improved because we have a collection of interlinked trees. As opposed to other methods such as bagging or random forest, where the trees are independent, here all trees are dependent, in the way that
03:06
the residuals, the unexplained part of the model, enter as the input into the next tree. So we have a sequence of interconnected trees, which is a nice feature: it allows us to reduce variance and to reduce bias. The only
03:19
problem is that, because the trees are inherently sequential, we can't parallelize the algorithm: they all depend on each other.
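The idea that the unexplained part of one tree becomes the input of the next can be sketched in a few lines. This is only an illustration on synthetic data, not the speaker's code: the data, the tree depth of 2, the 100 trees and the learning rate of 0.1 are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration, not the herring dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=500)

learning_rate = 0.1   # hypothetical value
n_trees = 100         # hypothetical value

# Start from a constant prediction, then let every new tree fit the
# residuals (the part the current ensemble has not explained yet).
prediction = np.full_like(y, y.mean())
trees = []
errors = []
for _ in range(n_trees):
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)
    errors.append(float(np.mean((y - prediction) ** 2)))

print(f"training MSE after 1 tree: {errors[0]:.4f}")
print(f"training MSE after {n_trees} trees: {errors[-1]:.4f}")
```

Because each tree needs the residuals left by all previous trees, the loop cannot be run in parallel, which is the limitation mentioned in the talk.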
03:29
The advantages of gradient boosting regression trees are more or less the same as those of all ensemble methods. Just to mention a few: we can detect nonlinear feature interactions, because of the underlying feature selection which goes on in the algorithm. It is
03:49
resistant to the inclusion of irrelevant features, which means we can include as many variables as we like, and if they are irrelevant they won't be selected, so we don't care, which is nice. It is good at dealing
04:01
with data on different scales, so you don't have to standardize the data. You may wish to standardize, but you don't have to, because the method is robust to that, whereas if you, for instance,
04:11
use a normal linear regression, the model will explode. So in this case it is a real advantage. It is
04:18
also robust to outliers: if there are data points which do not fit the rest of the data, maybe because of a mistake or because they are somehow special, we don't care at all. It is more accurate, and we can use different loss functions, for instance least squares or others which
04:32
are implemented for gradient boosting regression trees, which is nice. The disadvantages: it requires careful tuning and it takes a lot of time to get there with the model. It is slow to train but fast to predict. After I finish the first part of my talk I will show you the implementation in the notebook. Now a little bit of equations. As a formal specification, the model is an additive model, so we have a sequence
05:00
of trees, and each tree is weighted. We have an ensemble of trees, all combined through this gamma weight you can see here, and each
05:12
individual tree is shown as the h part of the equation. We then build the additive model in a forward stagewise fashion, adding one tree at each stage sequentially. There is a parameter epsilon which is known as the learning rate; we will talk about the learning rate later. The learning rate allows us to control how fast we descend along the gradient. Finally,
05:37
at each stage the weak learner is chosen to minimize some
05:40
loss function. In my case I took least squares because it is a natural choice, but it can be any other function which you can differentiate. This part of the model is evaluated by negative gradient descent; I won't go into that. The parameters which I finally selected: in my case I needed about 500 iterations and
06:06
a learning rate of about 0.05. These parameters are referred to as the regularization parameters. They affect the degree of fit, and they also affect each other, which is a bit complicated: if I increase the number of iterations by, let's say, a factor of 10, it doesn't mean that the learning rate has to drop by a factor of 10. It is not proportional. That is the difficulty: you may increase the iterations, but the learning rate might have to decrease by a different proportion, and that is why it gets tricky. The next
06:37
parameter is the maximum tree depth, which in my case is
06:41
6. For this particular algorithm it is known from theory and from different simulation studies that tree stumps, meaning trees with only one split, often perform best, which is nice, so you don't need deep trees. In some cases, though, you may need to go from 4 to 6, up to a maximum of 8. The tree depth
07:05
in my case is 6, which means that my model can accommodate up to five-way interactions. Next is
07:14
subsample, in my case 75 percent. It is optional; if you specify anything less than one, you get a stochastic model, so we introduce some randomness. That can be nice because it allows us to reduce
07:28
variance and bias, and in practice I found that it gave a better result, therefore I introduced it. So basically my model is stochastic gradient boosted regression trees.
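The parameter values mentioned in the talk (about 500 iterations, a learning rate of 0.05, a maximum depth of 6 and a subsample of 75 percent) map onto scikit-learn's `GradientBoostingRegressor` roughly as sketched below. The data here is synthetic and the random seed is an assumption; this is not the speaker's notebook.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data (an assumption; the real data has ~50,000 rows).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = 25.0 + X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=1000)

model = GradientBoostingRegressor(
    n_estimators=500,     # number of boosting iterations (trees)
    learning_rate=0.05,   # shrinkage applied to each tree's contribution
    max_depth=6,          # maximum depth of each individual tree
    subsample=0.75,       # < 1.0 makes this stochastic gradient boosting
    random_state=42,      # fixed seed so the run is reproducible
    # loss is left at its default, squared error (least squares)
)
model.fit(X, y)

print("trees fitted:", model.n_estimators_)
print("training R^2:", round(model.score(X, y), 3))
```

The number of iterations and the learning rate interact, so in practice they are tuned together rather than one at a time.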
07:40
The loss function is least squares. As I mentioned, it is a natural choice, nice to start with and easy to interpret, but it can be any
07:46
other loss function; they are nicely implemented in scikit-learn and it is very easy to change. To validate the model, in this case I split my data into 3 parts. If I have enough time I will show you how I did it with a split into 2 parts; I have those results as well and they are very similar, which is nice because it shows the robustness of my model. In this case I split the data 50 percent for training, 25 for testing and 25 for validation. There is no particular reason why; because I have 50 thousand rows I can
08:17
afford it. If you have less data you may choose leave-one-out or cross-validation, or some other method which is more suited to smaller datasets, but I have a big dataset. You can see here the mean squared error, which is,
08:34
let's say, good enough; I am happy enough with my model. I can see that after some iterations my model
08:42
flattens out, so there is no more change in the error, which means that I have enough iterations. R squared tells me the proportion of variance which is explained by the model. For the training set it is slightly higher, which may indicate a bit of overfitting, but there is not a big gap between them, so I am satisfied. The two curves follow each other very closely, which means my model is doing a good job. If I reduce the variability in the data I see that R squared goes up, so this is
09:18
basically an effect of that. Now a little bit of results. If I plot the length of
09:25
the fish on the x axis, you can see that it ranges from maybe 20 to 30 centimeters, while my model predicts fish from 22 to 28. So basically what it says is that for an average fish it predicts a correct value, but extremes, smaller or bigger fish, won't be predicted correctly. That is the 50 percent R
09:49
squared, which is what this graph reflects. If you want to find out which variables play
09:55
a role in the model, which is what I wanted to find out, this is the way it is done: each variable is used to split the trees, and a more important variable is used to split more often. If we count the times a variable is used, we can say that it is more important. In this case I have color
10:15
coding here. The first one is the trend, which is basically months.
10:19
So we know that there is something in it: you can see that in 100 % of
10:25
the cases it has been used. After that we have sea surface temperature, and
10:31
I will show you in the next graph how it has an effect, but basically there is some relationship. Among other things I have
10:38
food availability, so what is available to eat, and then abundance of fish, so how large the population is, etc. The most important message to remember here is that trend is the most important one, and after that we have sea surface temperature and food.
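A relative importance ranking like the one described can be sketched as follows. The feature names and the synthetic data are hypothetical, chosen only to mirror the talk's trend, temperature and food variables. Note that scikit-learn's `feature_importances_` is impurity-based (it weights each split by how much it reduces the loss) rather than a plain count of splits, so it is a close relative of the counting explanation given in the talk, not the same thing.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical feature names and synthetic data (assumptions for illustration).
rng = np.random.default_rng(0)
feature_names = ["trend", "sst", "food", "noise"]
X = rng.normal(size=(1000, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(scale=0.1, size=1000)

model = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
model.fit(X, y)

# Rescale so that the most important feature scores 100.
relative = 100.0 * model.feature_importances_ / model.feature_importances_.max()
ranking = sorted(zip(feature_names, relative), key=lambda pair: pair[1], reverse=True)

for name, score in ranking:
    print(f"{name:>6s} {score:6.1f}")
```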
10:56
We can further visualize the variables in partial dependence plots. The first row here is the one-way partial dependence plots, where we basically plot each feature against our dependent variable, the length of the fish. For trend we can't really see any particular relationship here. There is a relationship, but it shows a high degree of interaction in the way
11:23
it depends. So we
11:27
don't really pick up a dependence here, but we do pick one up here. I highlighted these two areas with circles. You can see something at about 14 degrees: if sea surface temperature is below 14 degrees there is a positive relationship, so the fish gets larger, it likes the temperature going up, and
11:50
in this case, if it gets too warm, there is a
11:54
negative relationship. So it definitely
11:57
shows some kind of dependence between the length of the fish and the temperature. I don't want to talk about climate change here because it is a very debatable issue, but you can imagine that with global warming, if temperature goes up, that may have an effect on the fish and therefore on us eventually, because we consume fish.
12:17
This is an interesting message. The final
12:20
plot here is for one of the food sources, in this particular case phytoplankton, which is what the fish eats. Focus on this area. Why focus here and not here? Because most of my data is concentrated over here, as you can see from these little ticks, the deciles. That is where the data is
12:39
concentrated. The curve goes up here just because I
12:44
have some outliers, but I don't care because I know my model is robust, so I just ignore the upper part. If I look at this part I don't see any dependence. I think that is just
12:54
because in this case food is not a limiting factor. Obviously if the fish had less food it would have an effect, but in the Celtic Sea there is a lot of phytoplankton, so the fish is not dependent on it. In the second row here we have two-way interaction plots, where I plot features against each
13:12
other, just to see if I can pick up any interaction between them.
13:17
We can see here that it is basically the same story: sea surface temperature at about 14 degrees, here you see that something is happening. So what it
13:25
says, what this analysis tells me, is: I know which features are important, but I can't really say why. The fact that trend is important tells me that I might need to go and use maybe time series modeling to find out how the dependence works. So it can't answer all questions. What machine learning can do is pick up features out of a bunch of other features in big datasets, and that is as far as it goes. So there are limitations to how you can apply it. To conclude the
13:56
session: there are 3 important features, which are time trend, sea
14:01
surface temperature and food availability. Something is going on with temperature, clearly at about 14 degrees, and there is a high degree of interaction between these features. Remember that with this method we can't find a cause-effect relationship, but we do have the relative importance of the variables. So from a bunch of variables I picked out the ones which are more important, and I can take them away for the next type of analysis.
14:26
So this is the first part of my talk. I am not
14:30
sure how much time I have, but I would like to show you a little bit of how it has been implemented. I think I have 5
14:39
minutes. The first part is basically what I have shown you in my presentation: it is the 3-way split of my data set.
14:49
I will go a bit... is it large enough?
14:54
I am sure this is all familiar to you. I import the libraries, and I need it to be reproducible because I work in science, so I set a seed because I want to run it again and get the same results. I read in the data, and I see I have about 50 thousand rows and about 15 features. I haven't discussed this, but I do check for multicollinearity, which means that you have 2 features which are related, which are dependent. For a normal
15:21
regression, like a linear model, multicollinearity will for sure blow up your model and you can't allow it. For ensemble methods, for this particular algorithm, it doesn't matter, but you can still detect multicollinearity and exclude variables which are, you know,
15:39
related. This is basically how I did it here: I construct a matrix of Pearson product-moment correlation coefficients, that is just what it is called, and the idea is that I plot it here and can find out which variables are related. The higher the multicollinearity, the more intense the color. There is no fixed rule, but a coefficient above about 0.8 may indicate multicollinearity. I see here which ones are red, so those variables I just drop from my model, and this one as well. So I drop those terms and I do the 3-way split, 50 percent, 25, 25 for each part of the data.
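The multicollinearity check described here, a Pearson correlation matrix with a cutoff of about 0.8, can be sketched with NumPy alone. The column names and the synthetic data are hypothetical; the rule of keeping the first variable of each highly correlated pair is an assumption, since the talk does not say how the speaker chose which variable to drop.

```python
import numpy as np

# Hypothetical columns; "sst_copy" is built to be almost identical to "sst".
rng = np.random.default_rng(0)
n = 1000
names = ["sst", "salinity", "wind", "sst_copy"]
sst = rng.normal(size=n)
data = np.column_stack([
    sst,
    rng.normal(size=n),
    rng.normal(size=n),
    sst + rng.normal(scale=0.1, size=n),
])

# Pearson product-moment correlation matrix (columns are variables).
corr = np.corrcoef(data, rowvar=False)

threshold = 0.8  # rule of thumb mentioned in the talk, not a strict rule
to_drop = set()
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        # Keep the first variable of a correlated pair, drop the second.
        if abs(corr[i, j]) > threshold and names[i] not in to_drop:
            to_drop.add(names[j])

kept = [name for name in names if name not in to_drop]
print("dropped:", sorted(to_drop))
print("kept:", kept)
```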
16:24
Then I fit my model. These are the final parameters, but it took me a few iterations, for sure, to be satisfied. How did I find out how many estimators I need here? The usual rule
16:40
is to set the learning rate as low as possible and the number of estimators, the number of trees, as high as possible, and if you do
16:47
that your model will run forever, so you should end up with something feasible, and you can start playing around by reducing it. The way I found out that
16:56
500 is enough is that I
16:58
applied a method which is called early stopping, which is what the validation set is for. It comes a little bit
17:06
later on.
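One way to use a validation set for early stopping is to fit a generous number of trees once and then look for the iteration at which the validation error is lowest. The sketch below does this with `staged_predict` on a 50/25/25 split. It runs on synthetic data, and it is an assumption that this resembles what the speaker's notebook does; the talk does not show the exact code.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = X[:, 0] + np.sin(2.0 * X[:, 1]) + rng.normal(scale=0.5, size=2000)

# 3-way split: 50 % training, 25 % test, 25 % validation.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

model = GradientBoostingRegressor(
    n_estimators=500, learning_rate=0.05, max_depth=3,
    subsample=0.75, random_state=0)
model.fit(X_train, y_train)

# Validation error after each boosting iteration.
val_errors = [mean_squared_error(y_val, pred)
              for pred in model.staged_predict(X_val)]
best_n = int(np.argmin(val_errors)) + 1

print("best number of trees on the validation set:", best_n)
print("test MSE with all trees:",
      round(mean_squared_error(y_test, model.predict(X_test)), 3))
```

scikit-learn can also stop during fitting via the `validation_fraction` and `n_iter_no_change` parameters of `GradientBoostingRegressor`, which hold out part of the training data internally.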
17:11
So, train and predict. I just touched a button. And here is the same graph again; you have seen it before.
17:38
What is interesting is the other part, which I will show quickly. So this was the early stopping which I mentioned earlier. In the other part I do a
17:46
2-way split, because that is something which, in my opinion, is done more often. Here,
17:53
in the 2-way split, I only have train and test sets, I don't have a validation set, and to identify the
18:00
parameters for this part I used a grid search. I specify a range of parameters; you can specify it for all of them if you like, but I did it only for the regularization parameters because those are the most difficult ones. I specify here max depth, where I know I had 6, so I want to try values to the
18:17
left of that, and I know from theory that it shouldn't be higher than 8, so I don't go there. For the learning rate I had 0.05, and I want to include other values and see how
18:28
it works. What happens is that you get a matrix of different combinations of parameters, and each time the model is fitted, run after run, and eventually the combination which gives the highest accuracy is chosen, and it tells me which parameters are the best. You
18:46
can see that the output is the best hyperparameters: it says that the learning rate should be about 10 %, so 0.1 instead of 0.05, and that the trees can be a bit more shallow, but it is very
18:57
close. If I fill in those parameters and keep all other parameters
19:04
the same, what I
19:07
get is very similar results. Again, here you can see
19:10
it is about 50 percent for the training and test data, which is good. And again we see the same graph, which is good. It means
19:19
I have the same algorithm but I applied it to different types of data partitioning: one time I did a 3-way split with early stopping to find the number of iterations, and the other time I split into 2 parts and used grid search to find the best parameters. I changed the parameters and still my model does, I think I have to finish, my model does give similar
19:43
results, which is good. I think I will stop here because it is not doing well. Thank you very much for your attention. If you have questions...
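The grid search over the two regularization parameters described in the talk can be sketched with `GridSearchCV` on a 2-way train/test split. The grid values, the 3-fold cross-validation, the reduced number of trees and the synthetic data are all assumptions made to keep the example small; the talk only says that depths around 6 (not above 8) and learning rates around 0.05 were tried.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=600)

# 2-way split: train and test only, no separate validation set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Hypothetical grid around the values mentioned in the talk.
param_grid = {
    "max_depth": [4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1],
}

search = GridSearchCV(
    GradientBoostingRegressor(n_estimators=100, subsample=0.75, random_state=0),
    param_grid,
    cv=3,           # cross-validation on the training part
    scoring="r2",   # pick the combination with the highest R^2
)
search.fit(X_train, y_train)

print("best hyperparameters:", search.best_params_)
print("test R^2 of the refitted best model:",
      round(search.score(X_test, y_test), 3))
```

Every combination in the grid is fitted once per fold, so the cost grows quickly with the number of parameters and values searched.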
20:01
Yes. Did you compare
20:09
the results with random forest
20:11
trees? Yes, I
20:13
ran random forest. The result is that the mean squared error is slightly higher, but I didn't show it to you because I was a bit stressed with all this. I also did an ordinary least squares regression, and again it shows that that model does less well. So yes, I did. And do you
20:35
have the data on the local global yes it's some ideas that you can see it you
20:40
put my ears on them right so
20:42
it's a this thing to person number of
20:51
So when you said there is a link between the temperature and the fish, does that help you then to go on and do more research? Yes, and basically this was
21:03
kind of the idea. We were very interested to find out, given that this is time series data for 60 years, which is very unique in science as a long-term collection, which variables are more important, so that we can reduce our dataset to the more relevant ones, and then I could take those into some kind of automatic time series analysis.
21:27
object and you have some sort of 4 quality of research efforts and those you that was blamed machines