Deep learning is a form of machine learning for nonlinear high-dimensional pattern matching and prediction. By adopting a Bayesian probabilistic perspective, we provide a number of insights, including more efficient algorithms for optimization and hyper-parameter tuning, and an explanation of why deep learning finds good predictors. Traditional high-dimensional data reduction techniques, such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR), and projection pursuit regression (PPR), are all shown to be shallow learners. Their deep learning counterparts exploit multiple layers of data reduction, which provide performance gains. We discuss stochastic gradient descent (SGD) training optimization and Dropout (DO) regularization, which provide estimation and variable selection, as well as Bayesian regularization, which is central to finding the weights and connections in a network that optimize the bias-variance trade-off. To illustrate our methodology, we provide an analysis of spatio-temporal data. Finally, we conclude with directions for future research.
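
To make the two central training ingredients named above concrete, here is a minimal sketch, not the paper's implementation, of SGD training with Dropout on a one-hidden-layer network. The synthetic data, network width, learning rate, keep probability, and batch size are all illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's code): SGD + inverted
# Dropout for a one-hidden-layer ReLU network on synthetic regression data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic nonlinear regression data (illustrative assumption).
X = rng.normal(size=(500, 10))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=500)

# One hidden layer of 32 ReLU units (width chosen arbitrarily).
W1 = rng.normal(scale=0.1, size=(10, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 1))
b2 = np.zeros(1)

lr, p_keep, batch = 0.01, 0.8, 32
for epoch in range(200):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch):
        b_idx = idx[start:start + batch]
        xb, yb = X[b_idx], y[b_idx, None]

        # Forward pass; inverted Dropout rescales kept hidden units by 1/p.
        h = np.maximum(xb @ W1 + b1, 0.0)
        mask = (rng.random(h.shape) < p_keep) / p_keep
        h_do = h * mask
        pred = h_do @ W2 + b2

        # Backward pass for the squared-error loss.
        g_pred = 2.0 * (pred - yb) / len(xb)
        g_W2 = h_do.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = (g_pred @ W2.T) * mask * (h > 0)
        g_W1 = xb.T @ g_h
        g_b1 = g_h.sum(axis=0)

        # Plain SGD update on each mini-batch.
        W1 -= lr * g_W1; b1 -= lr * g_b1
        W2 -= lr * g_W2; b2 -= lr * g_b2

# At prediction time Dropout is disabled; inverted Dropout already
# rescaled activations during training, so no extra correction is needed.
mse = np.mean((np.maximum(X @ W1 + b1, 0.0) @ W2 + b2 - y[:, None]) ** 2)
print(f"training MSE: {mse:.4f}")
```

Randomly zeroing hidden units in each mini-batch is what gives Dropout its variable-selection flavor: the network cannot rely on any single unit, which acts as a regularizer on the learned weights.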