Despite large industry investments and an expanding body of academic research on "big data," insight into collective behavior online has remained elusive due to the considerable computational challenges posed by the scale, complexity, and inevitable noisiness of the associated data sets. This talk addresses this fundamental and pressing problem and offers a gestalt computing approach to studying complex social phenomena in large data sets. The approach involves extracting macro structures from aggregated user actions, identifying their possible meanings, and arranging the data in layers so that it can be explored iteratively.
Applying this methodology to four years of data from a political social news site reveals a nuanced panorama studded with distinct political preferences and themes, and vividly captures the transformative effect of a contentious political event. The derived themes are not only statistically significant but also meaningful to human readers. The approach is scalable, provides a compelling summary of the corpus, and, being rooted in sociological concepts, has the potential to apply to other contexts.