Generalization properties of multiple passes stochastic gradient method

Institut des Hautes Études Scientifiques (IHÉS)

Villa, Silvia

Formal Metadata

Title

Generalization properties of multiple passes stochastic gradient method

Title of Series

Computational and statistical trade-offs in learning

Part Number

Number of Parts

Author

Villa, Silvia

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/20848 (DOI)

Publisher

Institut des Hautes Études Scientifiques (IHÉS)

Release Date

2016

Language

English

Content Metadata

Subject Area

Computer Science, Mathematics

Genre

Lecture

Abstract

The stochastic gradient method (SGM) has become an algorithm of choice in machine learning because of its simplicity and low computational cost, especially when dealing with large data sets. Despite its widespread use, the generalization properties of the SGM variants used in practice are still relatively poorly understood. Most previous work considers the generalization properties of SGM with only one pass over the data, while in practice multiple passes are usually made. The effect of multiple passes has been studied extensively for the optimization of an empirical objective, but its role in generalization is less clear. In this talk, we start filling this gap by studying the generalization properties of the stochastic gradient method with multiple passes for least squares regression in an abstract nonparametric setting. We show that, if all other parameters are fixed a priori, the number of passes over the data indeed acts as a regularization parameter. The obtained bounds are sharp and match those obtained with other regularization techniques such as ridge regression.
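
As a rough illustration of the idea that the number of passes can play the role of a regularization parameter, the following Python sketch runs a fixed-step stochastic gradient method on a synthetic least squares problem and records the training and held-out error after each pass. It is not taken from the talk: the finite-dimensional linear model, the synthetic data, the reshuffling of the sample order at every pass, the step-size rule, and all names (`multipass_sgm`, `mean_squared_error`, `n_passes`, `step_size`) are assumptions made for this example.

```python
import numpy as np


def multipass_sgm(X, y, n_passes, step_size, rng):
    """Run SGM on the least squares loss and return the iterate after each pass.

    Assumption for this sketch: the sample order is reshuffled at every pass.
    """
    n, d = X.shape
    w = np.zeros(d)
    iterates = []
    for _ in range(n_passes):
        for i in rng.permutation(n):
            residual = X[i] @ w - y[i]
            # Gradient of 0.5 * (x_i . w - y_i)^2 with respect to w.
            w = w - step_size * residual * X[i]
        iterates.append(w.copy())
    return iterates


def mean_squared_error(X, y, w):
    return float(np.mean((X @ w - y) ** 2))


# Synthetic data (hypothetical): noisy linear responses.
rng = np.random.default_rng(0)
n_train, n_test, d = 100, 2000, 50
w_true = rng.normal(size=d) / np.sqrt(d)
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)
y_test = X_test @ w_true + 0.5 * rng.normal(size=n_test)

# The step size is fixed a priori; only the number of passes is varied.
step_size = 1.0 / np.max(np.sum(X_train ** 2, axis=1))
n_passes = 30
iterates = multipass_sgm(X_train, y_train, n_passes, step_size, rng)

train_errors = [mean_squared_error(X_train, y_train, w) for w in iterates]
test_errors = [mean_squared_error(X_test, y_test, w) for w in iterates]

# Choosing the pass with the smallest held-out error is a form of early stopping.
best_pass = int(np.argmin(test_errors)) + 1
print("pass with smallest held-out error:", best_pass)
```

Here the only quantity tuned after the fact is the pass at which the iteration is stopped, which mirrors the role that the penalty parameter plays in ridge regression. The sketch says nothing about the rates or the sharpness of the bounds discussed in the talk.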