White Paper: Understanding Many-Particle Systems with Machine Learning

Posted on 3/1/17 in Reports and White Papers

This white paper was prepared by the participants of the fall 2016 long program Understanding Many-Particle Systems with Machine Learning.

Interactions among many constituent particles, such as quarks, electrons, atoms, molecules, or materials, generally give rise to collective or emergent phenomena in matter. Even when the interactions between the particles are well defined and the governing equations of the system are understood, the collective behavior of the system as a whole does not trivially emerge from these equations. Despite many decades of prominent work on interacting many-particle (MP) systems, the problem of N interacting particles is not exactly soluble in general; in fact, the computational cost of treating it typically increases exponentially with N. Although many attempts have been made to define collective (emergent) variables in numerous fields of the natural sciences, progress is often painfully slow. This situation stems from the lack of easily identifiable symmetries in complex dynamical systems such as materials, chemicals, and proteins. Indeed, identifying and understanding descriptive collective variables is among the most time-consuming and most rewarding processes in a multitude of sciences. In this context, the main goal of the program was to develop and apply novel machine learning (ML) methods to significantly accelerate the discovery of descriptive variables in complex MP systems at the microscopic scale. Collective behavior is abundant in nature and manifests itself at all scales of matter, from atoms to galaxies. Examples include the spontaneous assembly of organic and inorganic crystalline structures on surfaces and in the bulk, the self-assembly of proteins and DNA in cells, the behavior of human and animal crowds, the dynamics of sand dunes, and the formation of clouds and galaxies.

Machine learning methods have been used extensively in a wide variety of fields, ranging from neuroscience, genetics, and multimedia search to drug discovery. ML models can be thought of as universal approximators that learn a (possibly very complex) nonlinear mapping between input data (the descriptor) and an output signal (the observation). ML approaches are frequently applied as “black box” approximators but have rarely been used to learn new physical models for MP systems. The aim of this IPAM long program was therefore to develop the “black box” scientifically by bringing together experts in MP problems in condensed-matter physics, materials, chemistry, and protein folding with experts in mathematics and computer science. This helped address the problem of tackling emergent behavior and understanding the underlying collective variables in MP systems. Only collaboration among researchers in these areas, during and after the program, can lead to breakthroughs in our understanding of complex emergent behavior in MP systems.
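To make the idea of learning a nonlinear mapping from a descriptor to an observation concrete, here is a minimal sketch in Python. It is an illustration only and not a method taken from the white paper: the one-dimensional toy target function, the choice of kernel ridge regression with a Gaussian kernel, and the names gaussian_kernel, fit_krr, and predict_krr, along with the length_scale and ridge values, are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, length_scale):
    """Gaussian (RBF) kernel matrix between two sets of descriptors."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))


def fit_krr(x_train, y_train, length_scale=0.5, ridge=1e-6):
    """Solve (K + ridge * I) alpha = y for the regression weights alpha."""
    k = gaussian_kernel(x_train, x_train, length_scale)
    return np.linalg.solve(k + ridge * np.eye(len(x_train)), y_train)


def predict_krr(x_new, x_train, alpha, length_scale=0.5):
    """Predict observations as a kernel-weighted sum over training points."""
    return gaussian_kernel(x_new, x_train, length_scale) @ alpha


def toy_observation(x):
    """Hypothetical nonlinear 'observation' used only for illustration."""
    return np.sin(2.0 * x[:, 0]) * np.exp(-0.1 * x[:, 0] ** 2)


# Descriptors are single numbers here; in practice they would encode a
# many-particle configuration (an assumption, not specified in the text).
x_train = np.linspace(-3.0, 3.0, 40).reshape(-1, 1)
y_train = toy_observation(x_train)

alpha = fit_krr(x_train, y_train)

# Evaluate on points that were not part of the training set.
x_test = np.linspace(-2.9, 2.9, 25).reshape(-1, 1)
y_pred = predict_krr(x_test, x_train, alpha)
max_error = float(np.max(np.abs(y_pred - toy_observation(x_test))))
print(f"maximum absolute test error: {max_error:.2e}")
```

The model reproduces the toy function without being told its functional form, which is the sense in which such approximators act as a “black box”: the fitted weights alone say little about the underlying physics.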

The combination of ML with atomistic simulations is an emerging and quickly growing field that brings many challenges as well as potential opportunities. This document therefore cannot cover every topic of interest discussed during the long program; instead, we have organized it around the important areas of research that featured prominently in those discussions. Read the full report.