Research at the boundary of vision and language has been receiving increasing attention in our community. This talk will present the core challenges in vision + language, survey existing methods, and touch on available data sets. We will emphasize methods for automatically translating visual data (images and videos) into natural language, including both low- and high-level methods, but we will also touch on other aspects of the problem space, such as language as context and image synthesis from language.