Stochastic gradient-based optimization algorithms play perhaps the most important role in modern machine learning, in particular deep learning. Nesterov accelerated gradient (NAG) is a celebrated technique for accelerating gradient descent; however, NAG fails in stochastic gradient descent (SGD), where gradient noise destroys the acceleration.
In this talk, I will discuss some recent progress in leveraging NAG and restart techniques to accelerate SGD. I will also discuss how momentum can be leveraged to design deep neural nets in a mathematically mechanistic manner. This is joint work with Tan Nguyen, Richard Baraniuk, Andrea Bertozzi, and Stan Osher.
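To make the NAG-with-restart idea concrete, here is a minimal sketch in the deterministic setting: classical Nesterov acceleration combined with a standard function-value restart, which resets the momentum whenever the objective increases. This is a generic illustration of the restart heuristic, not the specific scheme presented in the talk; the objective, step size, and restart criterion below are illustrative assumptions.

```python
import numpy as np

def nag_restart(f, grad, x0, lr, iters=200):
    """Nesterov accelerated gradient with function-value restart.

    A generic sketch of the restart heuristic (not the talk's method):
    momentum is reset to zero whenever the objective increases.
    """
    x = np.asarray(x0, dtype=float)
    y, t = x.copy(), 1.0          # y: lookahead iterate, t: momentum schedule
    for _ in range(iters):
        f_prev = f(x)
        x_new = y - lr * grad(y)  # gradient step at the lookahead point
        t_new = (1 + (1 + 4 * t * t) ** 0.5) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum extrapolation
        if f(x_new) > f_prev:     # restart: objective went up, kill momentum
            y, t_new = x_new.copy(), 1.0
        x, t = x_new, t_new
    return x

# Usage: an ill-conditioned quadratic f(x) = 0.5 x^T A x (assumed test problem)
A = np.diag([1.0, 100.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
x_star = nag_restart(f, grad, np.array([1.0, 1.0]), lr=0.009)
```

On such ill-conditioned problems, the restart prevents the oscillations that plain NAG momentum can induce, recovering a fast linear rate.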