
Gradient descent for wide two-layer neural networks

Formal Metadata

Title
Gradient descent for wide two-layer neural networks
Title of Series
Number of Parts
5
Author
License
CC Attribution - NonCommercial - NoDerivatives 2.0 Generic:
You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
2020
Language
English

Content Metadata

Subject Area
Genre
Abstract
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks. Towards understanding this phenomenon, we analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations. We show that the limit of the gradient flow on exponentially tailed losses can be fully characterized as a max-margin classifier in a certain non-Hilbertian space of functions.
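
As a brief sketch of the setting the abstract describes (the notation below is ours, not taken from the talk): a two-layer network of width m with a positively homogeneous activation \sigma, such as the ReLU \sigma(u) = \max(u, 0), computes

\[ f(x;\theta) = \frac{1}{m} \sum_{j=1}^{m} a_j \, \sigma(b_j^\top x), \qquad \theta = (a_j, b_j)_{j=1}^{m}, \]

and is trained on labeled data (x_i, y_i) with y_i \in \{-1, +1\} by gradient flow on the logistic loss

\[ L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \log\bigl(1 + e^{-y_i f(x_i;\theta)}\bigr). \]

In the infinite-width limit, the characterization stated in the abstract takes the form of a max-margin problem; one plausible formalization, consistent with the abstract and the related literature, is

\[ \max_{\|f\|_{\mathcal{F}_1} \le 1} \; \min_{1 \le i \le n} y_i f(x_i), \]

where \|\cdot\|_{\mathcal{F}_1} is the variation norm on functions representable as integrals over neurons. This space is a Banach space rather than a Hilbert space, which is what "non-Hilbertian" refers to, in contrast with the reproducing-kernel Hilbert space norms that arise in kernel-regime (NTK) analyses.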
Keywords