Machine Learning: Power of Ensembles

Speech Transcript
[Introduction, inaudible.] OK, so this talk is about ensemble models. Some of you might have participated in a Kaggle competition; almost invariably, the winning solution is an ensemble model. So we'll learn what ensemble models are and break down how to build them. I'm a data scientist, and I've been doing machine learning for about 14 years.

Before I start the talk, a quick puzzle from one of my slides. [Puzzle largely inaudible: a flock of birds in a tree, actually 150 of them; a hunter comes along and fires three shots, hitting a bird perhaps 20 per cent of the time.] The moral: you can use complicated models, simple models, deep learning models, but never lose the big picture; that's extremely important. This is an example where, in practice, the models you build may be so complicated that you no longer understand how they relate to the domain. So what is the
machine learning process? We have input data; it gets fed into a learning algorithm, which creates a machine learning model; and that model is used to make predictions. That's what is taught in schools and in the courses.

But in reality the input data arrives in a different form: you have to transform it, modify it, clean it up, do a whole lot of stuff before you create features from it. Those features get fed into the model, and you use the final model to predict. In reality it's even more complex, because there are so many ways in which you can create features, and so many kinds of model you can fit, not just one: linear regression models, tree-based models such as decision trees and their ensemble cousins, random forests and gradient boosting machines, deep learning models, support vector machines, and so on. How would you handle something like this? It's extremely difficult. What am I really going to select? There are two
important steps in the process: you have to create features from the input dataset, and you also need to select models in some way. The challenge here is that the model is only as good as its features, because you are the person creating the features. This is where a lot of time is spent in traditional machine learning: you spend a lot of time identifying features, creating them, generating transformations, all kinds of work. This was made famous by the New York Times article which says that this is the janitorial work of the data science process, and that it takes about 80 per cent of the time. The challenge comes after you
create the features: even with the same features, different models give different predictions. Why is that? Because the solution space is huge. It's defined by your features, and the models search different regions of that space, so different algorithms reach different solutions. It's kind of hard to figure out which one you should take so that you get better generalization.

OK, given that you know how to create features, let's talk about how to improve model performance. This is where you have different models to select from; each model has different parameters that you can tune, called hyperparameters; and you can try different sets of features for each of the models. How can you create the final model from all of that? You cannot try all possible combinations; the search space is exponential in nature, so exhaustive search is definitely out. This is where ensemble models help us come up with a strategy for solving this. So let's talk about
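To make that exponential blow-up concrete, here is a back-of-the-envelope count. The specific numbers (20 candidate features, 5 model families, 1000 configurations each) are illustrative assumptions, not figures from the talk:

```python
# Illustrative size of the search space the speaker describes.
# All three counts below are made-up examples, not numbers from the talk.
n_feature_sets = 2 ** 20        # each of 20 candidate features is in or out
n_model_families = 5            # e.g. linear, tree, boosting, SVM, neural net
n_hyperparam_configs = 1000     # configurations tried per model family

total = n_feature_sets * n_model_families * n_hyperparam_configs
print(f"{total:,} candidate models")  # roughly 5.2 billion combinations
```

Even at a thousand model evaluations per second, exhausting that space would take around two months, which is why a search strategy is needed.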
what ensemble models are, starting with an example. I created a dummy dataset and three kinds of models: a linear model, logistic regression, since it's a classification problem; a random-forest model; and a gradient-boosting model. It's very easy to build all of them directly with scikit-learn. Then you have the predictions on the test dataset. The assumption in this toy example is that the true values in the test dataset are all 1. The three models make different predictions on the same data points, yet they all have the same accuracy, say 70 per cent. So how do we come up with a final prediction when all the models look equally good?
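A minimal sketch of the three base models described here, using scikit-learn. `make_classification` stands in for the speaker's dummy dataset, so the exact predictions and accuracies will differ from the talk:

```python
# Three base models of different families, fit on a synthetic stand-in dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)   # per-model test predictions
    print(name, model.score(X_test, y_test))    # per-model accuracy
```

Each model family searches a different region of the solution space, which is exactly the diversity the ensemble will exploit.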
If you use cross-validation as your metric, you'd get 70 per cent for all three of them. But look at what majority voting does here. It's an easy way to combine these three models: for each row, take the label that occurs the most, the max count. In my first row, 1 occurs the most, so I set my output to 1; in the second row, 0 occurs the most, so I assign 0. Do that for every row, and miraculously you end up with 90 per cent accuracy. This is the simplest ensemble, majority voting, which is also a kind of averaging in some sense.
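The majority-voting step can be sketched in a few lines of NumPy. The prediction rows below are made-up stand-ins for the three models' outputs, not the talk's actual table:

```python
import numpy as np

# Stand-in binary predictions from three models on five test points.
preds = np.array([
    [1, 0, 1, 1, 0],   # logistic regression
    [1, 1, 1, 0, 0],   # random forest
    [0, 0, 1, 1, 1],   # gradient boosting
])
# With binary labels, the majority label is 1 whenever at least 2 of 3 vote 1.
majority = (preds.sum(axis=0) >= 2).astype(int)
print(majority)  # [1 0 1 1 0]
```

scikit-learn packages the same idea as `VotingClassifier` if you prefer not to do the counting by hand.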
Essentially you are using the CPU as a proxy for creating more features: instead of hand-crafting more features, you spend compute on building this kind of combined model. So let's talk about strategies, clever techniques you can use to search the solution space.
Having said this, I should note that ensemble models are not new; they have been around for a very long time. Techniques like random forests and gradient boosting machines are themselves ensemble models, but they use the idea at a fairly simple level. Advances in computing power have led to far more powerful techniques that really sharpen ensemble models. They were used predominantly in academia for a long time, until the Netflix Prize competition around 2007, whose winning entry was an ensemble model. That's when industry realized that you can use ensemble models too, and now you see a lot of ensemble models in production.
In essence, this is what it looks like: you have input data; you create different kinds of models (it's very important that you create genuinely different models, not the same one repeated); you combine them using some logic; and then you use that to make your final prediction. That is what the architecture of an ensemble model looks like.
What are the advantages of doing this? Well, it improves accuracy, not always but most of the time. The model becomes more robust: the variance of the output reduces. And because you can do things in parallel, it lends itself nicely to parallelization, so you can run things much faster. Two things are important here: you have to select different models, so you need model diversity, and there must be some way to combine the models. Those two things are essential for an ensemble.
There are four important techniques you can use to create this diversity. You can use different training datasets, that is, sample the rows. You can sample the features. You can use different algorithms: linear regression, random forests, neural networks, and so on. And each of those algorithms can have different hyperparameters. Together these let you build many different models.
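The first two diversity techniques, sampling rows and sampling features, are what scikit-learn's `BaggingClassifier` automates; a sketch on synthetic stand-in data:

```python
# Bagging: every base tree sees a random subset of rows and of columns,
# which is one way to manufacture the model diversity described above.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

bagger = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=0.8,    # each tree trains on a random 80% of the rows
    max_features=0.5,   # ... and sees a random 50% of the features
    random_state=0,
)
bagger.fit(X, y)
print(bagger.score(X, y))
```

Swapping the base estimator or the hyperparameters per member covers the other two diversity techniques from the list.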
Once you have the models, the combination logic can be majority voting, which is what we used. It could also be averaging, if you want a probability as your output rather than just one predicted class. Then there are weighted voting, blending, and stacking. Stacking looks something like this: you feed your training data through two pipes; you use the first part to create the base models' outputs, and those outputs are used as the input for your second-stage model. So the second stage is itself a model whose input is the output of the previous models.
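A minimal stacking sketch along these lines, assuming scikit-learn. It uses out-of-fold predictions via `cross_val_predict` rather than the talk's simple two-way split, which is a common refinement of the same idea:

```python
# Stacking: base-model predictions become the features of a second-stage model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    GradientBoostingClassifier(random_state=0),
]
# Out-of-fold probabilities keep each base model from leaking its own
# training labels into the second stage.
meta_features = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])
meta_model = LogisticRegression().fit(meta_features, y)
print(meta_model.score(meta_features, y))
```

The meta-model only ever sees the base models' outputs, exactly the architecture described above.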
So how would you do this in Python? The base models can be built using pipelines; scikit-learn has pipelines, and you can plug in different libraries, like XGBoost, or Keras for deep learning. Pipelines make it really easy to build a model and set everything up, and you can use randomized search with cross-validation (RandomizedSearchCV in scikit-learn). Once you have your base models: if you want a weighted average, you can use the library hyperopt, a hyperparameter-optimization library, to find the weights; if you want stacking, you again create another model, for example XGBoost or logistic regression, to produce the final prediction.
Randomized search, by the way, is much faster than what you would do with grid search. And hyperopt, as I said, is for hyperparameter optimization; it searches the space efficiently and is a widely used library for that.
For parallelization: ensemble models lend themselves nicely to it, because you can run each model in parallel. Generally, when you run one model, you run your cross-validation inside it; instead, you can scale out and run each model in parallel, and joblib handles this task extremely nicely. You can also add logging so that you know which model takes a lot of time and which one you should really optimize; you have enough flexibility to do all of that.
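Running one cross-validation per model in parallel with joblib, as suggested, might look like this sketch:

```python
# Scale out across models: one cross-validation run per base model,
# spread over the available cores with joblib.
from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=100, random_state=0),
    GradientBoostingClassifier(random_state=0),
]
scores = Parallel(n_jobs=-1)(
    delayed(cross_val_score)(m, X, y, cv=3) for m in models
)
for model, s in zip(models, scores):
    print(type(model).__name__, s.mean())
```

Because the base models are independent, this parallelism is embarrassingly easy, which is the robustness-plus-speed advantage mentioned earlier.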
OK, if you want to know more about ensembles, I'd really suggest you look into Kaggle competitions; the winning models these days are invariably ensemble models. Here's an example of how this was used by the winner of the CrowdFlower search-results relevance competition, where the relevance of search results had to be classified. You can see that once you get past the feature-extraction stage, different models vote, and the final model is a combination of all of these models.
Having said all this, not everything is rosy with ensemble models. If you want an interpretable model, this is clearly not what you would use; interpretability goes for a toss. Sometimes it's also true that, to get the last two per cent or one per cent of accuracy, the time it takes may not make sense in real-world practice. That's a big disadvantage: it takes a really long time to improve accuracy, so you would probably build a full-blown ensemble only when accuracy is the most important metric.
One more cool thing: we actually created a package for building ensemble models about two years back. [Package name inaudible.] You can find it online and try the whole spectrum of techniques with it. That's all I have. Thanks.
Moderator: OK, we'll have questions now.

Audience: [Inaudible; a question about whether the slides are available.]

Speaker: [Partly inaudible.] The documentation has notebooks which talk about how we use the package.

Moderator: No other questions? Thank you so much.

Metadata

Formal Metadata

Title: Machine Learning: Power of Ensembles
Series: EuroPython 2016
Part: 167
Number of parts: 169
Author: Subramanian, Bargava
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, modify, and reproduce the work or its contents in unmodified or modified form for any legal and non-commercial purpose, and distribute and make it publicly available, provided you credit the author/rights holder in the manner they specify and pass on the work or its contents, including in modified form, only under the terms of this license.
DOI: 10.5446/21111
Publisher: EuroPython
Publication year: 2016
Language: English

Content Metadata

Subject area: Computer Science
Abstract

Bargava Subramanian - Machine Learning: Power of Ensembles

In machine learning, the power of combining many models has proven to provide better results than single models. The primary goal of the talk is to answer the following questions:

1) Why and how do ensembles produce better output?
2) When data scales, what's the impact? What are the trade-offs to consider?
3) Can ensemble models eliminate expert domain knowledge?

-----

It is relatively easy to build a first-cut machine learning model. But what does it take to build a reasonably good model, or even a state-of-the-art model? Ensemble models. They are our best friends. They help us exploit the power of computing. Ensemble methods aren't new. They form the basis for some extremely powerful machine learning algorithms like random forests and gradient boosting machines. The key point about ensembles is that consensus from diverse models is more reliable than a single source. This talk will cover how we can combine model outputs from various base models (logistic regression, support vector machines, decision trees, neural networks, etc.) to create a stronger/better model output.

This talk will cover various strategies to create ensemble models. Using third-party Python libraries along with scikit-learn, it will demonstrate the following ensemble methodologies: 1) Bagging, 2) Boosting, 3) Stacking.

Real-life examples from the enterprise world will be showcased where ensemble models consistently produced better results than single best-performing models. There will also be emphasis on the following: feature engineering, model selection, the importance of bias-variance, and generalization. Creating better models is the critical component of building a good data science product.
