Understanding History through Topic Modeling

David Blei
Columbia University

Significant events are characterized by interactions between entities (e.g., countries, organizations, individuals) that deviate from typical interaction patterns. Investigators, such as historians, commonly read large quantities of text to construct an accurate picture of the who, what, when, and where of an event. In this work, we present the Capsule model for analyzing documents. Capsule identifies and characterizes events of potential significance.
Specifically, we develop a model based on topic modeling to distinguish between topics that describe "business-as-usual" and topics that deviate from these patterns. To demonstrate this model, we analyze real-world datasets, including a corpus of over 2 million US State Department cables from the 1970s.

Back to Workshop IV: Mathematical Analysis of Cultural Expressive Forms: Text Data