Bargava Subramanian - Machine Learning: Power of Ensembles
In machine learning, combining many models has repeatedly been shown to
produce better results than any single model.
The primary goal of the talk is to answer the following questions:
1) Why and how do ensembles produce better output?
2) When data scales, what is the impact, and what trade-offs should be considered?
3) Can ensemble models eliminate the need for expert domain knowledge?
-----
It is relatively easy to build a first-cut machine learning model. But
what does it take to build a reasonably good model, or even a
state-of-the-art one?
Ensemble models. They are our best friends. They help us exploit the
power of computing. Ensemble methods aren't new: they form the basis
of some extremely powerful machine learning algorithms, such as random
forests and gradient boosting machines. The key idea behind ensembles
is that a consensus from diverse models is more reliable than any
single source. This talk will cover how to combine the outputs of
various base models (logistic regression, support vector machines,
decision trees, neural networks, etc.) to create a stronger, better
model.
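As a concrete illustration, here is a minimal sketch of that idea using
scikit-learn alone; the synthetic dataset and the particular estimator
settings are illustrative assumptions, not material from the talk:

    # Minimal sketch: a majority-vote ensemble over diverse base models.
    # Dataset and hyperparameters are illustrative, not from the talk.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    base_models = [
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(probability=True, random_state=0)),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ]

    # Soft voting averages the class probabilities of the base models,
    # so the consensus smooths out each model's individual mistakes.
    ensemble = VotingClassifier(estimators=base_models, voting="soft")

    for name, model in base_models + [("ensemble", ensemble)]:
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f}")

On many datasets the ensemble's cross-validated accuracy matches or
exceeds that of its best individual member, which is the effect the
talk sets out to explain.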
This talk will cover various strategies for creating ensemble models.
Using third-party Python libraries along with scikit-learn, it will
demonstrate the following ensemble methodologies (a brief sketch of all
three follows the list):
1) Bagging
2) Boosting
3) Stacking
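As a rough sketch of these three methodologies, the following uses only
scikit-learn (the third-party libraries the talk covers are not shown
here); again, the dataset and settings are illustrative assumptions:

    # Minimal sketch of bagging, boosting, and stacking in scikit-learn.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import (BaggingClassifier,
                                  GradientBoostingClassifier,
                                  StackingClassifier)
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    models = {
        # Bagging: many trees fit on bootstrap samples, votes averaged.
        "bagging": BaggingClassifier(DecisionTreeClassifier(),
                                     n_estimators=100, random_state=0),
        # Boosting: trees added sequentially, each one correcting the
        # errors of the ensemble built so far.
        "boosting": GradientBoostingClassifier(random_state=0),
        # Stacking: a meta-learner trained on base models' predictions.
        "stacking": StackingClassifier(
            estimators=[("lr", LogisticRegression(max_iter=1000)),
                        ("tree", DecisionTreeClassifier(random_state=0))],
            final_estimator=LogisticRegression(max_iter=1000),
        ),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f}")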
Real-life examples from the enterprise world will be showcased in which
ensemble models consistently produced better results than the single
best-performing model.
There will also be emphasis on feature engineering, model selection,
the bias-variance trade-off, and generalization.
Creating better models is a critical component of building a good
data science product.