Accelerated discovery of high-refractive-index materials, using molecular modeling and machine learning

Mojtaba Haghighatlari
SUNY Buffalo
Chemical and Biological Engineering

The idea of applying modern data science to chemical and materials research has recently gained considerable attention. In this study, we use a deep artificial neural network to predict the refractive indices (RIs) of 100k small organic molecules. The dataset includes the static polarizability and number density of the molecules, computed using density functional theory (DFT) and molecular dynamics (MD) simulations, respectively. The RIs are then calculated from these two quantities using the Lorentz-Lorenz equation. We demonstrate that cutting-edge data mining methods can reproduce all properties of the molecules in the dataset with high accuracy. The resulting prediction models estimate the RIs with a mean absolute deviation of less than 0.01 (2% of the range of RIs in the dataset). These results were tested across different types of molecular representations (descriptors) and model validation methods. The most reliable model shows an approximately twofold improvement over previous work. This model can further be used in virtual high-throughput screening to accelerate materials discovery and as a guide for rational design.
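As a rough illustration of the Lorentz-Lorenz step mentioned above, the sketch below converts a polarizability and a number density into a refractive index. It assumes the Gaussian-unit form of the equation, (n^2 - 1)/(n^2 + 2) = (4π/3)·N·α, with α given as a polarizability volume in Å^3 and N in molecules per Å^3. The function name and the input values are hypothetical and are not taken from the study's dataset or code.

```python
import math


def lorentz_lorenz_ri(alpha_A3: float, number_density_per_A3: float) -> float:
    """Refractive index from the Lorentz-Lorenz equation (Gaussian-unit form).

    alpha_A3: static polarizability volume in cubic angstroms (assumed unit).
    number_density_per_A3: molecules per cubic angstrom (assumed unit).
    """
    # Right-hand side of the equation: x = (4*pi/3) * N * alpha (dimensionless).
    x = 4.0 * math.pi / 3.0 * number_density_per_A3 * alpha_A3
    if not 0.0 <= x < 1.0:
        raise ValueError("N * alpha is outside the physically valid range")
    # Solving (n^2 - 1) / (n^2 + 2) = x for n gives n = sqrt((1 + 2x) / (1 - x)).
    return math.sqrt((1.0 + 2.0 * x) / (1.0 - x))


if __name__ == "__main__":
    # Illustrative inputs only: alpha = 10 A^3, N = 0.006 molecules per A^3.
    n = lorentz_lorenz_ri(10.0, 0.006)
    print(f"estimated refractive index: {n:.3f}")
```

Because the relation is monotonic in the product N·α, a higher polarizability or a denser packing both raise the estimated RI, which is why the two quantities are computed separately (DFT and MD) before being combined.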

Back to Science at Extreme Scales: Where Big Data Meets Large-Scale Computing