Word Vectors in the Eighteenth Century

Ryan Heuser
Stanford University

This talk explores how word embedding models such as word2vec might raise interesting questions for eighteenth-century literary studies. These new models of vector space semantics have recently generated excitement in NLP and machine learning circles because they can represent and predict, mathematically, semantic relationships as complex as analogy. "Man is to woman as king is to queen" becomes V(Queen) ≈ V(King) - V(Man) + V(Woman). Or, "riches are to virtue as learning is to genius," as Edward Young argued in his 1759 treatise against classical imitation, becomes V(Genius) ≈ V(Learning) - V(Riches) + V(Virtue). Such mathematical-semantic operationalizations carry immense and still relatively unexplored possibilities as analytical tools for literary-conceptual history. At the same time, their mathematical operationality makes visible how deeply eighteenth-century literature was already invested in the distinctly mathematical ways in which concepts act upon and against each other. Hume's "Of Simplicity and Refinement" (1742), for instance, describes the moral and aesthetic dangers of writing that is too simple or too refined by, effectively, "subtracting" the two concepts to produce a third, emergent one: the concept of an axis of difference between them, along which any author's writing can, even must, move. What would it mean, then, pragmatically and theoretically, to represent this Humean axis of difference as the vector V(V(Refinement) - V(Simplicity))? Inspired by Hume's subsequent correlation of this axis of difference with several others (the Beautiful - the Dangerous, Passion - Wit, the Affections - Imagination), my talk hopes to demonstrate how word vectors might allow us to model and explore such networks of subtractive and analogical relationships at a larger cultural scale, yielding a vector-based, and ultimately mathematical, distant reading of eighteenth-century literature.
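The analogy and axis-of-difference operations described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method used in the talk: the two-dimensional vectors in TOY are invented by hand (real word2vec vectors are learned from a corpus and have hundreds of dimensions), and the helper names cosine, analogy, and axis_score are hypothetical.

```python
import numpy as np

# Hypothetical toy vectors, invented for illustration only.
# Dimensions: [royalty, gender]. Real embeddings are learned from a corpus.
TOY = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "court": np.array([1.0,  0.0]),
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' via the target V(b) - V(a) + V(c)."""
    target = vectors[b] - vectors[a] + vectors[c]
    # Exclude the three query words, as analogy evaluations commonly do.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

def axis_score(vectors, word, pole_a, pole_b):
    """Position of a word along the axis of difference V(pole_b) - V(pole_a).

    Positive values lean toward pole_b, negative values toward pole_a.
    """
    axis = vectors[pole_b] - vectors[pole_a]
    return cosine(vectors[word], axis)
```

With these toy vectors, analogy(TOY, "man", "king", "woman") returns "queen", and axis_score(TOY, w, "woman", "man") is positive for "king", negative for "queen", and zero for "court". A Humean axis such as Refinement - Simplicity could be scored in the same way, assuming vectors for those words were available from a trained model.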

Workshop IV: Mathematical Analysis of Cultural Expressive Forms: Text Data