Massive Internet-scale data, comprising content and associated user action logs, have raised hopes that data-driven models in the social sciences will proliferate. Given the scale, complexity, and noisiness of these data sets, however, the content and the associated social interactions cannot be explored manually, and automated methods are needed to provide useful summaries. The extent to which the promise of so-called big data in the social sciences is fulfilled will depend on our ability to identify summative macroscopic observables (analogs of quantities such as temperature or pressure) and then to design the equivalents of measurement devices that estimate such observables reliably and robustly. Given such macroscopic descriptions of the data, experts can build predictive models and, when necessary, delve into the finer microscopic patterns and details. This talk will use an example social-media data set, generated online around a major political movement, to demonstrate how some of the underlying dynamics and major events can be estimated and summarized in a completely unsupervised manner. The talk will also highlight several unresolved issues and open problems, especially those that arise when one has to build a composite and consistent model of events from many correlated data sets, e.g., data generated on Twitter, Facebook, and various activist forums and social news sites around the same political uprising.