Massive data sets are often generated by large networks of users. Such networks are a fundamental medium for the spread and diffusion of information: the actions of some users increase the susceptibility of other users to the same actions, so information spreads successively from a small set of initial users to a much larger set. Examples include the spread of rumors in online social networks and power outages in smart grids. Modeling these massive data sets as huge graphs, how can we provide cyber-security against rumor spreading, or timely quarantine against failure propagation in networks? We introduce the idea of network centrality as statistical inference in large networks and apply it to two statistical optimization problems: rooting out rumor sources and averting cascading failures. We conclude the talk with insights on putting the theory into practice in graph-analytics software development.
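As one concrete illustration of "network centrality as statistical inference" for rumor-source detection (a sketch, not necessarily the exact method presented in the talk), the well-known rumor-centrality estimator scores each node v of an infected tree by R(v) = n! / ∏ (subtree sizes when the tree is rooted at v), and declares the maximizer the most likely source. The graph layout and node names below are made up for the example:

```python
from math import factorial

def rumor_centrality(adj, v):
    """Score node v as a candidate rumor source in a tree.

    R(v) = n! / prod of subtree sizes with the tree rooted at v;
    larger scores correspond to more likely sources.
    adj: dict mapping each node to a list of its neighbors.
    """
    n = len(adj)
    size = {}

    def dfs(u, parent):
        # Compute the size of the subtree rooted at u (with v as root).
        s = 1
        for w in adj[u]:
            if w != parent:
                s += dfs(w, u)
        size[u] = s
        return s

    dfs(v, None)
    prod = 1
    for s in size.values():
        prod *= s
    return factorial(n) // prod

# Example: a path 0-1-2-3-4. By symmetry the middle node 2
# should receive the highest rumor-centrality score.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = {v: rumor_centrality(adj, v) for v in adj}
```

On this path the scores are 1, 4, 6, 4, 1, so node 2 is the estimated source; on general graphs the estimator is typically applied to a spanning tree of the infected subgraph.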