Task-driven Network Discovery via Deep Reinforcement Learning on Embedded Spaces

Tina Eliassi-Rad
Northeastern University
Computer Science & Network Science

Most network analysis is conducted on existing, incomplete samples of much larger, fully observed networks. For example, many researchers obtain their networks from online data repositories without knowing how these networks were collected. Thus, these networks can be poor representations of the fully observed networks. More complete data would lead to more accurate analyses, but data acquisition is at best costly and at worst error-prone. The network discovery problem is defined as follows: Given a query budget for identifying additional nodes and edges, how can one improve the observed network so that it becomes a more accurate representation of the fully observed network? The key phrase in this problem definition is “a more accurate representation of the fully observed network”. In active exploration, a more accurate representation means growing the network by adding nodes and edges. In active learning, however, it means learning the best-performing function on the network for a downstream task such as selective harvesting (namely, the optimal collection of nodes with a particular attribute). This problem is related to, but distinct from, topics such as network sampling and crawling.

In this talk, I will formulate the task-specific network discovery problem as a sequential decision-making problem and present a framework, called Network Actor Critic (NAC), which learns a policy and a notion of future reward in an offline setting via deep reinforcement learning. The NAC paradigm utilizes a task-specific network embedding to reduce the state-space complexity. Our empirical study shows that offline models of reward and network-discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, I will outline learning regimes where planning is critical in addressing sparse and changing reward signals. This is joint work with Rajmonda Caceres (MIT Lincoln Laboratory) and Peter Morales (Microsoft; formerly at MIT Lincoln Laboratory).
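To make the sequential decision-making framing concrete, the following is a minimal, hypothetical sketch of network discovery as a budgeted querying process driven by a one-step actor-critic over embedded states. It is not the NAC implementation: the toy Barabási–Albert graph, the hand-crafted embed_state features, the selective-harvesting reward, the linear actor and critic, and all hyperparameters are placeholder assumptions chosen only for illustration.

```python
# Hypothetical sketch: task-driven network discovery as sequential decision-making,
# trained offline with a simple one-step actor-critic. Not the authors' NAC code.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# Stand-in for the fully observed network, with a binary node attribute (hidden in practice).
G_full = nx.barabasi_albert_graph(200, 2, seed=0)
attr = {v: int(rng.random() < 0.3) for v in G_full}            # 1 = node has the target attribute

def embed_state(observed, boundary, node):
    """Hypothetical task-specific embedding of a candidate boundary node."""
    deg = sum(1 for u in G_full[node] if u in observed)         # observed degree
    hits = sum(attr[u] for u in G_full[node] if u in observed)  # observed neighbours with the attribute
    return np.array([1.0, deg, hits, len(boundary) / 50.0])

theta = np.zeros(4)                 # actor: linear softmax policy over candidate embeddings
w = np.zeros(4)                     # critic: linear state-value function
alpha_pi, alpha_v, gamma = 0.05, 0.05, 0.95

for episode in range(200):          # offline training against the stand-in network
    observed = {0}
    boundary = set(G_full[0]) - observed
    cands = list(boundary)
    feats = np.stack([embed_state(observed, boundary, v) for v in cands])
    for t in range(25):             # query budget per episode
        logits = feats @ theta
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        i = rng.choice(len(cands), p=probs)
        chosen = cands[i]
        reward = float(attr[chosen])                            # selective-harvesting reward

        observed.add(chosen)                                    # query: reveal the node and its edges
        boundary |= set(G_full[chosen]) - observed
        boundary.discard(chosen)

        cands_next = list(boundary)
        feats_next = np.stack([embed_state(observed, boundary, u) for u in cands_next])

        # One-step actor-critic update; the state value is the critic applied to the
        # mean candidate embedding (a crude aggregate used only to keep this short).
        v_s = feats.mean(axis=0) @ w
        v_next = feats_next.mean(axis=0) @ w
        td_error = reward + gamma * v_next - v_s
        w += alpha_v * td_error * feats.mean(axis=0)
        theta += alpha_pi * td_error * (feats[i] - probs @ feats)

        cands, feats = cands_next, feats_next
```

Where this sketch uses hand-crafted features and linear actor and critic models, the talk describes learning a task-specific network embedding and deep models of the policy and future reward.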

Presentation (PDF File)
