Big Data of Materials Science -- Critical Role of the Descriptor(*)

Matthias Scheffler
Fritz-Haber-Institut der Max-Planck-Gesellschaft

Luca M. Ghiringhelli (1), Jan Vybiral (2), Sergey V. Levchenko (1),
Claudia Draxl (3), and Matthias Scheffler (1)

(1) Fritz-Haber-Institut der Max-Planck-Gesellschaft, Berlin-Dahlem, Germany
(2) Charles University, Department of Mathematical Analysis, Prague, Czech Republic
(3) Humboldt-Universitaet zu Berlin, Institut fuer Physik and IRIS Adlershof, Berlin, Germany

Statistical learning of materials properties or functions has so far started with a largely silent, unchallenged step: the choice of the set of descriptive parameters (termed the descriptor). However, when the scientific connection between the descriptor and the actuating mechanisms is unclear, the causality of the learned descriptor-property relation is uncertain. Thus, trustworthy prediction of new promising materials, identification of anomalies, and scientific advancement are doubtful. We analyze this issue and define requirements for a suitable descriptor. For a classic example, the energy difference between zincblende/wurtzite and rocksalt semiconductors, we demonstrate how a meaningful descriptor can be found systematically.
(*) Submitted for publication in Physical Review Letters. See also

Presentation (PDF File)
