
7th HLF – Lecture: Automatic Step-Size Control for Minimization Iterations

Formal Metadata

Title
7th HLF – Lecture: Automatic Step-Size Control for Minimization Iterations
Title of Series
Number of Parts
24
Author
License
No Open Access License:
German copyright law applies. This film may be used for your own use but it may not be distributed via the internet or passed on to external parties.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The "Training" of "Deep Learning" for "Artificial Intelligence" is a process that minimizes a "Loss Function" ƒ(w) subject to memory constraints that allow the computation of ƒ(w) and its Gradients G(w) := dƒ(w)/dw` but not the Hessian d2ƒ(w)/dw2 nor estimates of it from many stored pairs {G(w), w}. Therefore the process is iterative using "Gradient Descent" or an accelerated modification of it like "Gradient Descent Plus Momentum". These iterations require choices of one or two scalar "Hyper-Parameters" which cause divergence if chosen badly. Fastest convergence requires choices derived from the Hessian's two attributes, its "Norm" and "Condition Number", that can almost never be known in advance. This retards Training, severely if the Condition Number is big. A new scheme chooses Gradient Descent's Hyper-Parameter, a step-size called "the Learning Rate", automatically without any prior information about the Hessian; and yet that scheme has been observed always to converge ultimately almost as fast as could any acceleration of Gradient Descent with optimally chosen Hyper-Parameters. Alas, a mathematical proof of that scheme's efficacy has not been found yet. The opinions expressed in this video do not necessarily reflect the views of the Heidelberg Laureate Forum Foundation or any other person or associated institution involved in the making and distribution of the video.