An Optimal Transportation (OT) View of Generative Adversarial Networks (GANs) - Part 1

David Xianfeng Gu
SUNY Stony Brook

Generative Adversarial Networks (GANs) are powerful machine learning models that have become extremely successful in recent years. The generator and the discriminator in a GAN model compete with each other until they reach a Nash equilibrium. A GAN can generate samples automatically, thereby reducing the need for large amounts of training data, and it can model distributions directly from data samples. In spite of its popularity, the GAN model lacks a theoretical foundation. In this talk, we give a geometric interpretation of optimal mass transportation theory, explain its relation with the Monge-Ampère equation, and apply the theory to the GAN model. In more detail, we will discuss the following problems:
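For reference, the relation mentioned above is the standard one from optimal transportation theory (it is not spelled out in this abstract): under the quadratic transportation cost, the Brenier potential u is a convex function solving the Monge-Ampère equation

\[
\det D^2 u(x) \;=\; \frac{f(x)}{g(\nabla u(x))},
\]

where f and g are the densities of the source and target measures, and the optimal transportation map is T = \nabla u.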



1. Real data satisfy the manifold distribution hypothesis: their distribution is concentrated close to a low-dimensional manifold embedded in the high-dimensional image space. Deep learning therefore has two major tasks: a) learning the manifold structure; b) probability measure transformation. The second task can be explained and carried out using optimal transportation theory, which makes half of the black box transparent (see the toy sketch below).
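As a toy illustration of these two tasks (my own sketch, not code from the talk), the following numpy snippet uses a one-dimensional manifold, an arc in the plane, so that the encoder/decoder have closed forms and the OT map reduces to the classical one-dimensional monotone rearrangement:

import numpy as np

rng = np.random.default_rng(0)

# "Real" data: points on a circular arc with a non-uniform angle distribution.
theta = rng.beta(2.0, 5.0, size=2000) * np.pi            # latent codes in [0, pi]
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # points on the manifold

# Task a): manifold structure. For this toy manifold the encoder/decoder
# are known in closed form; in practice an autoencoder learns them.
encode = lambda x: np.arctan2(x[:, 1], x[:, 0])
decode = lambda t: np.stack([np.cos(t), np.sin(t)], axis=1)

# Task b): probability measure transformation by optimal transportation.
# In one dimension the quadratic-cost OT map is the monotone rearrangement,
# estimated here by matching empirical quantiles (sorted samples).
noise = np.sort(rng.uniform(0.0, np.pi, size=2000))      # simple source measure
codes = np.sort(encode(data))                            # target latent measure
ot_map = lambda z: np.interp(z, noise, codes)            # empirical OT map

samples = decode(ot_map(rng.uniform(0.0, np.pi, size=500)))  # generated points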



2. In GANs, the generator G computes an optimal transportation map, which is equivalent to computing the Brenier potential; the discriminator D calculates the Wasserstein distance between two probability distributions, which is equivalent to computing the Kantorovich potential. By Brenier's theorem, the optimal Brenier and Kantorovich potentials are related by a closed formula (stated below). Therefore G and D should collaborate, not compete, with each other. This reduces the complexity of the DNNs and greatly improves computational efficiency.
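To spell out the closed formula (this is Brenier's theorem for the quadratic cost, a standard fact rather than a claim specific to this talk): with cost c(x, y) = |x - y|^2 / 2, the optimal map is the gradient of the convex Brenier potential u, which is obtained from the optimal Kantorovich potential \varphi by

\[
T = \nabla u, \qquad u(x) = \frac{|x|^2}{2} - \varphi(x).
\]

Hence once the discriminator has found the optimal \varphi, the generator's optimal map can be read off directly, with no further adversarial training.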



3. According to the regularity theory of the Monge-Ampère equation, transportation maps are in general discontinuous. Deep neural networks, however, can only represent continuous maps; this conflict induces mode collapse in GANs. To avoid mode collapse, the DNN should learn the continuous Brenier potential instead of the transportation map itself. This gives a rigorous way to avoid mode collapse (see the sketch below).
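The following numpy sketch (my reading of the standard semi-discrete, Alexandrov-style construction; it is an assumption that the talk's implementation follows these lines) makes the point concrete: the learned object, the Brenier potential u_h(x) = max_i (<x, y_i> + h_i), is continuous and convex, while its gradient, the actual transportation map, jumps between targets and is discontinuous:

import numpy as np

rng = np.random.default_rng(1)

Y = rng.normal(size=(8, 2))          # discrete target support {y_i} in R^2
nu = np.full(len(Y), 1.0 / len(Y))   # target weights nu_i
h = np.zeros(len(Y))                 # heights of the supporting planes

X = rng.uniform(-1.0, 1.0, size=(20000, 2))  # Monte Carlo samples of the
                                             # uniform source measure

for step in range(500):
    # Assign each sample to its maximizing plane: i*(x) = argmax <x, y_i> + h_i.
    idx = np.argmax(X @ Y.T + h, axis=1)
    # Empirical mass of each cell under the source measure.
    mass = np.bincount(idx, minlength=len(Y)) / len(X)
    # Gradient ascent on the concave energy E(h) = sum_i nu_i h_i - int u_h dmu,
    # whose gradient is dE/dh_i = nu_i - mass_i: raise planes of underfilled cells.
    h += 0.5 * (nu - mass)
    h -= h.mean()                    # fix the additive gauge freedom

# The OT map T(x) = grad u_h(x) = y_{i*(x)} is piecewise constant (discontinuous),
# while the learned potential u_h itself is continuous and convex.
T = lambda x: Y[np.argmax(x @ Y.T + h, axis=1)]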



Based on this theoretical interpretation, we propose an Autoencoder-Optimal Transportation map (AE-OT) framework, which is partially transparent and outperforms the state of the art.
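Schematically, and with hypothetical function names (decoder and ot_map stand for the two components sketched above; this is my summary, not the authors' code), generation in such a framework looks like:

import numpy as np

def ae_ot_generate(decoder, ot_map, n_samples, latent_dim, rng):
    """AE-OT generation: no adversarial training is involved.
    decoder: the autoencoder's decoder, parametrizing the data manifold.
    ot_map:  the (extended) semi-discrete OT map in latent space."""
    z = rng.uniform(-1.0, 1.0, size=(n_samples, latent_dim))  # noise measure
    latent_codes = ot_map(z)        # measure transformation via OT, task b)
    return decoder(latent_codes)    # manifold parametrization via the AE, task a)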

