Deep Neural Networks motivated by PDEs

Formal Metadata

Title
Deep Neural Networks motivated by PDEs
Title of Series
Number of Parts
22
Author
License
CC Attribution - NonCommercial - NoDerivatives 4.0 International:
You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
One of the most promising areas in artificial intelligence is deep learning, a form of machine learning that uses neural networks containing many hidden layers. Recent success has led to breakthroughs in applications such as speech and image recognition. However, more theoretical insight is needed to create a rigorous scientific basis for designing and training deep neural networks, increasing their scalability, and providing insight into their reasoning. This talk bridges the gap between partial differential equations (PDEs) and neural networks and presents a new mathematical paradigm that simplifies designing, training, and analyzing deep neural networks. It shows that training deep neural networks can be cast as a dynamic optimal control problem similar to path-planning and optimal mass transport. The talk outlines how this interpretation can improve the effectiveness of deep neural networks. First, the talk introduces new types of neural networks inspired by parabolic, hyperbolic, and reaction-diffusion PDEs. Second, the talk outlines how to accelerate training by exploiting multi-scale structures or reversibility properties of the underlying PDEs. Finally, recent advances on efficient parametrizations and derivative-free training algorithms will be presented.
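The PDE view of deep networks rests on the observation that a residual layer, y_{j+1} = y_j + h sigma(K_j y_j + b_j), is a forward Euler step of the ODE dy/dt = sigma(K(t) y + b(t)); the choice of K (e.g. symmetric negative semi-definite vs. antisymmetric) then yields parabolic- or hyperbolic-type dynamics. The following minimal sketch illustrates this discretization; the layer width, step size h, and tanh activation are illustrative assumptions, not the specific architectures presented in the talk.

```python
import numpy as np

def resnet_layer(y, K, b, h=0.1):
    """One residual layer, read as a forward Euler step of
    dy/dt = tanh(K(t) y + b(t)):  y_{j+1} = y_j + h * tanh(K_j y_j + b_j)."""
    return y + h * np.tanh(K @ y + b)

# Propagate a feature vector through a deep stack of such layers.
rng = np.random.default_rng(0)
d, depth = 4, 20                      # feature dimension and number of layers
y = rng.standard_normal(d)
for _ in range(depth):
    W = rng.standard_normal((d, d))
    K = -(W @ W.T) / d                # symmetric negative semi-definite:
    b = np.zeros(d)                   # a parabolic (diffusion-like) choice
    y = resnet_layer(y, K, b)
```

Shrinking h while increasing the depth keeps the continuous-time dynamics fixed, which is what allows multi-scale (coarse-to-fine in depth) training strategies of the kind the abstract mentions.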