How can machine learning help to predict changes in size of Atlantic herring?


Formal Metadata

Title
How can machine learning help to predict changes in size of Atlantic herring?
Title of Series
Part Number
160
Number of Parts
169
Author
Lyashevska, Olga
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Olga Lyashevska - How can machine learning help to predict changes in size of Atlantic herring? This talk is a case study of how Python (pandas, NumPy, scikit-learn) can be used to identify the influence of the potential drivers of a decline in size of Atlantic herring populations using Gradient Boosting Regression Trees. ----- A decline in size and weight of Atlantic herring in the Celtic Sea has been observed since the mid-1980s. The cause of the decline remains largely unexplained but is likely to be driven by the interactive effect of various endogenous and exogenous factors. The goal of this study is to interrogate a long time series of biological data obtained from commercial fisheries from 1959 to 2012. We use gradient boosting regression trees to identify important variables underlying changes in growth from various potential drivers, such as: Atlantic multidecadal oscillation; sea surface temperature; salinity; wind; zooplankton abundance; fishing pressure. This learning algorithm makes it possible to quantify the influence of the potential drivers of change with a lower test error than other supervised learning techniques. The predictor variable importance spectrum (feature importance) helps to identify the underlying patterns and potential tipping points while resolving the external mechanisms underlying observed changes in size and weight of herring. This analysis is a useful case study of how Python can be used in academia. The outputs of the analysis are of relevance to conservation efforts and sustainable fisheries management, which promote species resistance and resilience.
Transcript: English (auto-generated)
Welcome everyone. My name is Olga Lyashevska and I work as a postdoc in Ireland. I am going to show you how machine learning can be applied in the sciences. In the previous talk, if you were here, we had a nice introduction to all kinds of ensemble methods, so here I am going to show you one specific case using gradient boosting.
Here is the background of the problem: over the past 60 years we have observed a decline in the size of the fish by about four centimeters on average. Herring is about 20 centimeters long, so four centimeters is a lot of reduction. We would like to find out what the problem is and why it is happening, and we are going to use machine learning to answer this question.
Why is it a problem? Because herring is a very important species for consumption, and we know that if it decreases in size there are consequences for further stock production. It means there will be less fish in the future, so we can consume less. We don't know what is causing the decline, but we presume there is an interactive effect of various factors: sea surface temperature may change, plankton abundance may change, fish abundance may change, or fishing pressure.
Okay, so to answer this question I am going to use data from the past 60 years, from 1959 to 2012, and the data is spread throughout the year. The data has been collected from commercial vessels, with 50 to 100 fish sampled at random at a time, so imagine a data set of 50,000 rows. The study area, where the data comes from, is the Celtic Sea, just south of Ireland, bounded by St George's Channel and the English Channel, so you can imagine where we are now.
The objective is to identify the important factors which underlie this problem, and to answer this question I am going to use gradient boosting regression trees, which is one of the ensemble algorithms available these days. Why ensemble? Because we don't have one tree, we have a collection of trees, and the accuracy of the final model is improved because we have a collection of interlinked trees. In this case, as opposed to other methods such as bagging or random forests, where the trees are independent, in this method all the trees are dependent, in the sense that the residuals of one tree, the unexplained part of the model, enter as an input into the next tree. So we have a sequence of interconnected trees, which is a nice feature: it allows us to reduce variance and it allows us to reduce bias. The only problem is that, because the trees are interlinked and sequential, we can't parallelize the algorithm; they all depend on each other.
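To make the residual-fitting idea concrete, here is a minimal sketch (not the speaker's code; the function and variable names are my own) of how boosting chains shallow trees together, each one fitted to the residuals left by the ensemble so far:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def toy_boost(X, y, n_trees=100, learning_rate=0.1, max_depth=1):
    """Fit shallow trees sequentially, each on the residuals of the running prediction.

    Assumes X and y are NumPy arrays."""
    pred = np.full(len(y), y.mean())             # start from a constant model
    trees = []
    for _ in range(n_trees):
        residuals = y - pred                     # the unexplained part of the model so far
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        pred += learning_rate * tree.predict(X)  # shrink each tree's contribution
        trees.append(tree)
    return trees, pred
```

Because each tree needs the residuals of all the previous trees, the loop is inherently sequential, which is exactly why fitting cannot be parallelized across trees.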
Okay, so the advantages of gradient boosting regression trees are basically more or less the same as those of other ensemble methods. To mention just a few: we can detect nonlinear feature interactions, thanks to the feature selection which goes on inside the algorithm. It is resistant to the inclusion of irrelevant features, which means we can include as many variables as we like, and if they are irrelevant they won't be selected, so we don't care, which is nice. It is also good if we deal with data on different scales: we don't have to standardize the data. We may wish to standardize, but we don't have to, because the model is tree-based, whereas if we used, for instance, a normal linear regression, our model would explode. So in this case this is a really good advantage. It is also robust to outliers: if there are any data points which don't fit the data, maybe because of a mistake or maybe some special event, we don't care at all. It is more accurate, and we can use different loss functions, for instance least squares or others, which are implemented for gradient boosting regression trees, which is nice. Okay, the disadvantage is that it requires careful tuning; it takes a lot of time to get a good model.
It is slow to train but very fast to predict, and after I finish this part of my talk I'll show you the implementation in an IPython notebook of how I did it. Okay, so a little bit of equations here. The formal specification of the model: it is an additive model, so we have a sequence of trees, and each tree is weighted; the trees are combined through this gamma weight, as you can see here. Each individual tree is shown in this part of the equation, and then we build the model additively: as I said, we add each tree sequentially with this parameter epsilon, the shrinkage, also known as the learning rate. We all talk about learning rates; this is the learning rate, and it allows us to control how fast we descend along the gradient. Finally, at each stage the weak learner is chosen to minimize some loss function. In my case I took least squares because it is a natural choice, but it can be any other function which you can differentiate. This part of the model is evaluated by negative gradient descent. Okay, I won't go into the details of that; that's all the formality in my talk.
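For reference, in the usual GBRT notation (reconstructed here from the description, not copied from the slides), the additive model and its sequential update are:

$$F_M(x) = \sum_{m=1}^{M} \gamma_m h_m(x), \qquad F_m(x) = F_{m-1}(x) + \varepsilon\,\gamma_m h_m(x),$$

$$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L\big(y_i,\; F_{m-1}(x_i) + \gamma\, h_m(x_i)\big),$$

where the $h_m$ are the individual regression trees, $\gamma_m$ their weights, $\varepsilon$ the learning rate (shrinkage), and $L$ the loss, here least squares $L(y, F) = (y - F)^2$; each new tree is fitted to the negative gradient of $L$.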
Okay, so the parameters I finally selected in my case: I needed about 500 iterations and a learning rate of about 0.05. These two parameters I refer to as the regularization parameters, and they affect the degree of fit and also each other's value, which is a bit complicated: if I increase the number of iterations by, let's say, a factor of 10, it doesn't mean the learning rate should decrease by a factor of 10. It's not proportional; you may increase the iterations, but the learning rate might decrease by a different proportion, and that's why it gets tricky. Okay, the next parameter is the maximum tree depth, which in my case is six. For this particular algorithm it is known from theory and from various simulation models that tree stumps, meaning trees with only one split, perform best, which is nice: we don't need any deep trees. But in some cases you may need four to six, maximum eight splits. In my case it is six, which means my model can accommodate up to five interactions; this is what it means. Okay, the next parameter is the subsample, in my case 75%. It's optional: if you specify anything less than one, you get a stochastic model, so we introduce some randomness. That can be nice because it allows us to reduce variance and reduce bias, and in practice I found that this gave a better result, so I introduced it; my model is, to be precise, a stochastic gradient boosting regression tree. And the loss function is least squares, as I mentioned.
It's a natural choice, nice to start with and easy to interpret, but it can be any other loss function; they are nicely implemented in scikit-learn and it's very easy to change.
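Putting the parameters the speaker quotes (500 iterations, learning rate 0.05, max depth 6, subsample 0.75, least-squares loss) into scikit-learn would look roughly like this; note the loss is named `"squared_error"` in recent scikit-learn releases, while older versions called it `"ls"`:

```python
from sklearn.ensemble import GradientBoostingRegressor

model = GradientBoostingRegressor(
    n_estimators=500,      # number of boosting iterations (trees)
    learning_rate=0.05,    # shrinkage applied to each tree's contribution
    max_depth=6,           # maximum tree depth, as quoted in the talk
    subsample=0.75,        # < 1.0 makes it *stochastic* gradient boosting
    loss="squared_error",  # least squares; "ls" in older scikit-learn
    random_state=42,       # hypothetical seed, for reproducibility
)
```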
Okay, so to estimate the model, in this case I split my data into three parts. If I have enough time I'll also show you how I did it with a two-way split; I have those results too, and they are very similar, which is nice and shows a sort of robustness of my model. But in this case I split the data 50% for training, 25% for testing and 25% for validation. There is no particular reason why; because I have 50,000 rows, I just can. If you have less data, you may choose leave-one-out or cross-validation or some other method more specific to smaller data sets, but I have a big data set. You can see I have the MSE, the mean squared error, which is the degree of accuracy; it is rather low, so I am happy enough with my model. I can also see that after some iterations my model flattens out: there is no big change in the MSE, which means I have enough iterations. The R-squared tells me the proportion of variance explained by the model, and for the training set it is slightly higher, which may indicate a bit of overfitting, but it's not a big gap between them, so I am satisfied. The two curves follow each other very closely, so on average my model is doing a good job. Okay, and if I reduce the variability in the data, I see that the R-squared goes up, so there is an effect of that.
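A minimal sketch of that 50/25/25 split and the two accuracy metrics, assuming the features are in `X` and the fish length in `y` (names of my own choosing):

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# 50% train, then split the remaining half into 25% test / 25% validation.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=42)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)

model.fit(X_train, y_train)
for name, Xs, ys in [("train", X_train, y_train), ("test", X_test, y_test)]:
    pred = model.predict(Xs)
    print(name, "MSE:", mean_squared_error(ys, pred), "R^2:", r2_score(ys, pred))
```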
So, a little bit of the results. I plot the length of the fish on the x-axis; you can see it runs from maybe around 20 to 30 centimeters, and my model predicts fish from 22 to 28. So basically, on average we give a correct value, but extremes that are too small or too big won't be predicted correctly. That R-squared of about 50% is what's reflected in this graph. Okay, and I wanted to find out which variables play a role in my model. The way it works is that the most important variables are used to split a tree more often; if we count the times a variable is used for a split, we can say it is more important. I have colour coding here. The first one is the trend, which is basically a month variable, so we know there is some trend in the data, and I could see it has been used in a hundred percent of cases. After that we have sea surface temperature; I'll show you in the next graph how it acts, but there is basically some relationship. The other things are food availability, so whether there is food in the sea, and the abundance of fish, how large the population is, etc. The most important message to remember here is that the trend is the important one, and after that we have sea surface temperature and food.
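Counting splits per feature is what scikit-learn exposes as `feature_importances_`; a sketch of the ranking plot, with a hypothetical `feature_names` list:

```python
import numpy as np
import matplotlib.pyplot as plt

# Impurity-based importances: features used more often for splits score higher.
importances = model.feature_importances_
order = np.argsort(importances)
plt.barh(np.array(feature_names)[order], importances[order])
plt.xlabel("relative importance")
plt.show()
```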
Okay, so we further visualize those three variables in partial dependence plots. The first row here shows one-way partial dependence plots, where I plot each feature against the dependent variable, which is the length of the fish. For the first feature we can't really see any particular relationship; strictly speaking these plots don't show the relationship itself, they show the degree of dependence, the way the response depends on the feature. So we don't really pick up any dependence there, but we do pick it up here: I have highlighted these two areas with circles. It means that, if you can see here, at about 14 degrees something changes. If the sea surface temperature is below 14 degrees there is a positive relationship, so the fish get larger; fish like temperatures up to 14 degrees in this case. If it gets too warm, there is a negative relationship. So it definitely shows some kind of dependence between the length of the fish and the temperature. Well, I don't want to talk about climate change here, because it's a very debatable issue, but you can imagine that if temperature goes up with global warming, it may have an effect on the fish, and on us eventually, because we can't consume the fish we like. So this is an interesting message. And finally, here is one of the food sources, in this particular case phytoplankton, which is what the fish eat. Why do I focus on this area and not over there? Because most of my data is concentrated over here, as you can see from the little tick marks, which are the deciles; that's where the data is concentrated. The curve may go up over there just because I have some outliers, but I don't care, because I know my model is robust, so I just don't interpret that part. If I look at this part, I don't see any dependence. I think that's simply because phytoplankton is not a limiting factor in this case: obviously with less food there would be an effect, but in the case of the Celtic Sea there is a lot of phytoplankton, so the fish don't depend on it. Okay, and then in the second row here we have two-way interaction plots, where I plot the features against each other, just to see if I can pick up any interaction between them. We can see it's basically the same story: we see sea surface temperature at about 14 degrees here, where something is happening.
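Plots like these can be produced with scikit-learn's inspection module (in modern versions; older releases used `plot_partial_dependence` instead). The feature names here are hypothetical and assume `X_train` is a pandas DataFrame with those columns; a tuple of two names requests a two-way interaction plot:

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

# One-way plots for the three most important features, plus a two-way
# plot for the temperature/food interaction.
PartialDependenceDisplay.from_estimator(
    model, X_train,
    features=["trend", "sst", "phytoplankton", ("sst", "phytoplankton")],
)
plt.show()
```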
So what does this analysis tell me? Well, I know which features are important, but I can't really say why that is so. The fact that the trend is important tells me that I might need to go on and use, say, time-series modelling to find out how the dependence works. I can't answer that question with machine learning; all I can do is pick these features out of a bunch of other features in a big data set, and that is as far as it goes. So there are limitations to how you can apply it. To conclude: we see that there are three important features, which in this case are the trend, a time trend, sea surface temperature and food availability; something is going on with temperature at clearly about 14 degrees; and there is a high degree of interaction between these features. Remember that with this method we can't find a cause-effect relationship, but we get the relative importance of the variables. So from a bunch of variables I picked out the ones which are more important, and I can take them with me for the next type of analysis.
Okay, so that was the first part of my talk, and I'm not sure how much time I have, but I would like to show you a little bit of how it has been implemented. Okay, I have five minutes. The first part of this is what I've shown you in my presentation; it's the three-way split of my data set, so I'll go a bit quicker here. Okay, I'm sure this is all familiar to you: I import all the libraries. To be reproducible, because I work in science, I set the seed, because I want to run it again and get the same results.
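The preamble presumably looks something like this (a sketch, not the actual notebook):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(42)  # fix the seed so reruns reproduce the same results
```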
Okay, I read in the data, and I see about 50,000 rows and about 15 features in my case. I haven't discussed this yet, but I also check for multicollinearity, which means checking whether I have two features which are really dependent on each other. For a normal regression it may well blow up your model; you can't allow that there. For ensemble methods, for this particular algorithm, it doesn't matter, but if you can detect multicollinearity it is still better to take out the variables which are multicollinear. This is basically how I do it here: I construct the matrix of Pearson product-moment correlation coefficients, as it's called, and from it I can find out which variables are affected; the higher the multicollinearity, the more intense the colour. There is no hard rule, but everything above 80%, or 0.8, may indicate multicollinearity. So I look at what is red or dark here, and those are the variables I just took out of my model, and this one as well.
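A sketch of that check, assuming the features sit in a pandas DataFrame `df` (my name, not necessarily the notebook's):

```python
import matplotlib.pyplot as plt

corr = df.corr(method="pearson")   # Pearson product-moment correlation matrix

# Visualize: darker / more intense cells mean stronger correlation.
plt.matshow(corr.abs().to_numpy(), cmap="Reds")
plt.colorbar()
plt.show()

# List the pairs above the informal 0.8 threshold as drop candidates.
pairs = [(a, b, corr.loc[a, b])
         for i, a in enumerate(corr.columns)
         for b in corr.columns[i + 1:]
         if abs(corr.loc[a, b]) > 0.8]
print(pairs)
```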
Okay, so I removed them, I do the three-way split, 50% / 25% / 25% for each part, and I fit my model. These are the final parameters, but it took a few iterations, for sure, before I was satisfied with what I have. How did I find out how many estimators I need? The usual rule is to set the learning rate as low as possible and the number of estimators, the number of trees, as high as possible. If you do that, your model will run forever, but you will for sure end up with something feasible, and then you can start playing around by reducing it. How I found this 500 is that I applied an algorithm called early stopping, which is available in scikit-learn; it comes a little bit later on. Okay, I'm sorry guys, the laptop is not cooperating. So, it's the same graph again, you've seen it before. What I think is interesting to show quickly is this part: this is the early stopping which I mentioned earlier.
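Early stopping can be done here with `staged_predict`, which replays the model's prediction after each boosting iteration; picking the iteration with the lowest validation error gives the number of trees actually needed. A sketch, using the validation set from the three-way split:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Validation error after each boosting iteration.
val_errors = [mean_squared_error(y_val, pred)
              for pred in model.staged_predict(X_val)]
best_n_estimators = int(np.argmin(val_errors)) + 1   # ~500 in the talk
print("iterations needed:", best_n_estimators)
```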
When you do a two-way split, which in my opinion is done more often than a three-way split, you only have train and test; you don't have a validation set. To identify the parameters in that setting I used grid search, where I specify a range of parameters. You can specify ranges for all the parameters you like, but I only took the regularization parameters, because those are the most difficult ones. So I specify here the maximum depth: I know I had six, so I took one to the right and one to the left, and I know from theory it shouldn't be higher than eight, so I don't go there. And the learning rate: I have 0.05 now, and I want to increase or decrease it and see how it works. What happens is that we get a matrix of different combinations of parameters, and each time we fit the model, round and round, and eventually the combination which gives the highest accuracy is chosen. It tells me which parameters I should take. You can see here the output, the best hyperparameters: it says the learning rate should be 10 percent, or 0.1, instead of 0.05, and the max depth can be a bit shallower, but it's very close.
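That search maps onto scikit-learn's `GridSearchCV`; a sketch with the ranges the speaker describes (one step either side of depth 6, learning rate around 0.05):

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {
    "max_depth": [5, 6, 7],               # one left/right of 6, staying below 8
    "learning_rate": [0.01, 0.05, 0.1],   # around the current 0.05
}
grid = GridSearchCV(
    GradientBoostingRegressor(n_estimators=500, subsample=0.75),
    param_grid, cv=5,
)
grid.fit(X_train, y_train)
print("best hyperparameters:", grid.best_params_)
```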
Okay, and if I fit those parameters and keep all the other parameters the same, what I get is very similar results again: you can see it's about 50 or 52 percent for the train and test data, which is good. So again we see the same graph, which is good. It means I have the same algorithm but applied two different types of data partitioning: one time I did a three-way split with early stopping to find the number of iterations, and the other time I split in two parts and used grid search to find the best parameters. I changed the parameters and my model still gives similar results, which is good; it means my model is robust. Okay, I think I will stop here, because the laptop is not doing very well. So thank you very much for your attention. If you have questions, yeah?
Okay. Did you compare the results with any random forests?
I ran some results with just a normal random forest, and the mean squared error there is slightly higher. I didn't show it here because I was a bit stressed with all these things, but I also did a normal least-squares regression, and again it shows that the model is less accurate. So yes, I did compare.
And do you have the data or the notebook available?
Yes, it's on my GitHub. You can find it if you search for my surname; it's all there. Yes, thank you.
Another question? No? Okay.
So you said there was a link between the temperature and the fish. Does that help you then get another grant to do more research? Is that kind of the aim of this?
Yes, basically this was a kind of pilot study. We are interested in this because it's a time series of data over 60 years, which is very unique in science, you know, a long-term collection. So we wanted to find out which variables are more important, so that we can reduce our data set to the most relevant ones, and then I could take it and do some kind of multivariate time-series analysis. Yes.
No one else? Okay, thank you. And I'm sorry for all the trouble; I know, it's the computer. Always blame the machines.