Learning with differentiable perturbed optimizers

Centre International de Rencontres Mathématiques (CIRM)

Berthet, Quentin

Formal Metadata

Title

Learning with differentiable perturbed optimizers

Series Title

Optimization for Machine Learning

Number of Parts

Author

Berthet, Quentin

License

CC Attribution - NonCommercial - NoDerivatives 2.0 Generic:
You may use, copy, distribute, and make publicly available the work or its content in unchanged form for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify.

Identifiers

10.5446/54182 (DOI)

Publisher

Centre International de Rencontres Mathématiques (CIRM)

Release Year

2020

Language

English

Content Metadata

Subject Area

Computer Science, Mathematics

Genre

Conference/Talk

Abstract

Machine learning pipelines often rely on optimization procedures to make discrete decisions (e.g. sorting, picking closest neighbors, finding shortest paths or optimal matchings). Although these discrete decisions are easily computed in a forward manner, they cannot be used to modify model parameters using first-order optimization techniques because they break the back-propagation of computational graphs. In order to expand the scope of learning problems that can be solved in an end-to-end fashion, we propose a systematic method to transform a block that outputs an optimal discrete decision into a differentiable operation. Our approach relies on stochastic perturbations of these parameters, and can be used readily within existing solvers without the need for ad hoc regularization or smoothing. These perturbed optimizers yield solutions that are differentiable and never locally constant. The amount of smoothness can be tuned via the chosen noise amplitude, whose impact we analyze. The derivatives of these perturbed solvers can be evaluated efficiently. We also show how this framework can be connected to a family of losses developed in structured prediction, and describe how these can be used in unsupervised and supervised learning, with theoretical guarantees. We demonstrate the performance of our approach on several machine learning tasks in experiments on synthetic and real data.
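As a rough illustration of the idea in the abstract, the following minimal sketch smooths a hard argmax by averaging its output over random perturbations of the input. It is not the speaker's implementation: the function names (`one_hot_argmax`, `perturbed_argmax`), the choice of Gaussian noise, the sample count, and the example input are all assumptions made for illustration. The derivative estimate uses the standard Gaussian-smoothing identity, in which the Jacobian is the expectation of the outer product of the solver output and the noise, divided by the noise amplitude `eps`.

```python
import numpy as np


def one_hot_argmax(theta):
    """Hard argmax over the last axis, returned as one-hot vectors."""
    idx = np.argmax(theta, axis=-1)
    return np.eye(theta.shape[-1])[idx]


def perturbed_argmax(theta, eps=0.5, n_samples=10000, seed=0):
    """Monte Carlo estimate of a Gaussian-perturbed argmax and its Jacobian.

    Hypothetical helper: names and default values are illustrative only.
    """
    rng = np.random.default_rng(seed)
    # Draw noise Z ~ N(0, I), one row per Monte Carlo sample.
    z = rng.standard_normal((n_samples, theta.shape[0]))
    # Run the unchanged discrete solver on each perturbed input.
    y = one_hot_argmax(theta + eps * z)  # shape (n_samples, d)
    # Smoothed output: the average of the discrete decisions.
    y_mean = y.mean(axis=0)
    # Jacobian estimate for Gaussian noise: E[y Z^T] / eps.
    jac = (y[:, :, None] * z[:, None, :]).mean(axis=0) / eps
    return y_mean, jac


theta = np.array([1.0, 0.8, -0.5])
y_eps, jac = perturbed_argmax(theta)
print(y_eps)  # a point in the simplex rather than a single one-hot vector
print(jac)    # a nonzero 3 x 3 sensitivity estimate
```

Increasing `eps` spreads the output across more entries, and decreasing it moves the output back toward the hard one-hot decision, which matches the abstract's remark that smoothness is tuned through the noise amplitude.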

Keywords

perturbation methods

structured learning