Exploiting the 3D Geometry and the Structure of the World in Deep Learning

Ersin Yumer
Uber ATG

In this talk, we will focus on tasks where data is either scarce or impossible to collect, which poses a challenge for learning algorithms whose success depends on data. More specifically, we will look at this challenge from a neural network’s perspective and propose two ways to address it: building our knowledge about the geometry and structure of the world into differentiable network layers to introduce task-specific model bias, and using physically based simulations to generate synthetic or augmented data. We will explore these ideas through several applications: end-to-end neural face editing in images, structured prediction for 3D body tracking from videos, and helping traditional 2D scene understanding by implicitly lifting the problem to 3D. These applications will demonstrate the power of incorporating geometric intermediate representations into network design, which both constrains the latent space to the specific tasks we care about and provides opportunities for intermediate weak supervision at training time.


