From Network Dissection to Policy Dissection: Discovering Emergent Concepts in Deep Representations

Bolei Zhou
University of California, Los Angeles (UCLA)

When deep neural networks are trained to solve a primary task, many interpretable concepts emerge inside their representations. I will talk about our series of works on discovering the interpretable concepts that emerge in deep representations trained for a range of tasks, from image classification and image generation to reinforcement learning. Discovering these concepts not only reveals the inner workings of deep neural networks, but also facilitates meaningful human-AI interaction and enables new applications such as human-guided AI content creation and human-AI shared control for robotics.
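The core idea behind Network Dissection can be sketched as follows: each unit's activation map is thresholded into a binary mask, and the unit is labeled with the concept whose annotation mask it overlaps best, scored by intersection-over-union. The snippet below is a minimal toy illustration of that scoring step; the function names, the flat-list masks, and the fixed threshold are simplifying assumptions for exposition, not the paper's actual pipeline (which thresholds each unit at a top-quantile of its activations over a densely annotated dataset).

```python
# Toy sketch of Network Dissection-style concept matching (hypothetical data).
# A unit is labeled with the concept whose binary annotation mask best
# overlaps its thresholded activation map, measured by IoU.

def iou(mask_a, mask_b):
    """Intersection-over-union of two flat binary masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union if union else 0.0

def dissect_unit(activation, threshold, concept_masks):
    """Threshold one unit's activation map, then return the
    (concept, IoU) pair of the best-matching concept."""
    unit_mask = [1 if a > threshold else 0 for a in activation]
    return max(
        ((name, iou(unit_mask, mask)) for name, mask in concept_masks.items()),
        key=lambda pair: pair[1],
    )

# Toy example: a 2x3 activation map flattened to a list, two concept masks.
activation = [0.9, 0.8, 0.1, 0.7, 0.2, 0.0]
concepts = {"sky": [1, 1, 0, 1, 0, 0], "grass": [0, 0, 1, 0, 1, 1]}
print(dissect_unit(activation, threshold=0.5, concept_masks=concepts))
# → ('sky', 1.0)
```

In the toy example, the thresholded unit mask coincides exactly with the "sky" mask, so the unit is labeled "sky" with IoU 1.0.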

Explainable AI for the Sciences: Towards Novel Insights