Gradient descent, stochastic gradient descent, and acceleration

Adam Oberman
McGill University
Mathematics and Statistics

For large-scale machine learning problems, second-order methods are not practical, so first-order (gradient-based) optimization methods are used instead. These include gradient descent, stochastic gradient descent, and their accelerated variants. In this tutorial we review the basic ideas behind the algorithms, their convergence results, and the connections with ordinary differential equations.
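
As a brief illustration (not part of the original abstract), the update rules of the three methods can be sketched as follows; the objective f, step size h, and index sampling are generic placeholders, and the notation is assumed rather than taken from the talk.

% Illustrative update rules: objective f, step size h, iterate x_k (assumed notation)
\begin{align*}
  &\text{Gradient descent:} && x_{k+1} = x_k - h\,\nabla f(x_k),\\
  &\text{Stochastic gradient descent:} && x_{k+1} = x_k - h\,\nabla f_{i_k}(x_k),
      \quad i_k \text{ sampled at random},\\
  &\text{Accelerated (Nesterov-type):} && y_k = x_k + \beta_k (x_k - x_{k-1}), \qquad
      x_{k+1} = y_k - h\,\nabla f(y_k).
\end{align*}

In the small step-size limit, gradient descent follows the gradient flow $\dot{x}(t) = -\nabla f(x(t))$, while accelerated methods correspond to second-order ODEs with a damping term, of the form $\ddot{x} + a(t)\,\dot{x} + \nabla f(x) = 0$; this is the type of ODE connection the tutorial refers to.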

Presentation (PDF File)
