The application of machine learning (ML) to the computer simulation of materials has features that are somewhat uncommon in ML: the data is often free of noise, in principle unlimited amounts of data are available at known unit cost, and there is often considerable freedom in choosing data locations. This calls for a close examination of which ML strategies are best and what their ultimate limitations are in practice. Can we create ML models of arbitrary accuracy? How can recent advances in online or active learning be utilized? What can more classical statistical interpolation methods contribute?
Traditional, non-data-intensive models in the physical sciences are “extrapolative”: their parameters are determined from limited data in some domain, the models are then tested in extended or even wholly different domains, and their performance is judged by how well they do there. In contrast, high-dimensional ML models are best at interpolation. What are the best criteria for assessing the quality of such models? Do they only give back what they were “taught”? What new discoveries of structures or processes could ever result from an interpolative ML model?
This workshop will broadly address the reach and limitations of ML as applied to the modeling of physical systems and highlight examples where physical models can be successfully combined with, or even derived from, ML algorithms.
This workshop will include a poster session; a request for posters will be sent to registered participants in advance of the workshop.
Gabor Csányi (University of Cambridge)
Marina Meila (University of Washington)
Klaus-Robert Müller (Technische Universität Berlin)
Sadasivan Shankar (Harvard University)