## Statistical approaches to forcefield calibration and prediction uncertainty in molecular simulation

#### Fabien Cailliez
Université d'Orsay

Classical molecular simulations are now commonly used to predict thermophysical properties of fluids, both in academic studies and for industrial purposes. Using them as a predictive tool requires estimating the uncertainty associated with the predicted value of a given property. Among the sources of uncertainty in simulation results, the one arising from the forcefield has long been ignored. The forcefield contains all the information about the potential energy of a molecular system arising from interatomic interactions. This information is encoded in parameters that are commonly calibrated to reproduce some (uncertain) experimental data. Although the results of a simulation depend on the values of the forcefield parameters, it is only very recently that careful investigations of the effect of their uncertainties have been undertaken [1-6].

In recent years, we have explored various calibration strategies and calibration models within the Bayesian framework [7]. We have studied a simple two-parameter Lennard-Jones potential for argon, for which the calibration can be done using cheap analytical expressions, allowing a thorough exploration of the parameter space. We have shown that the prediction uncertainty, albeit very small, is larger than the typical statistical uncertainty of the simulations [4]. For more complex systems, more parameters have to be calibrated and, in the absence of analytical models, the calibration process requires running long molecular simulations. We have used kriging metamodels and optimal infilling strategies to limit the number of molecular simulations to be performed during the calibration process. We have benchmarked this methodology on the TIP4P forcefield for water [8]. Although statistical consistency can be achieved when calibrating forcefield parameters on a single physical property, our work revealed that some physical properties cannot be reproduced simultaneously, even when the uncertainties in the forcefield parameters are taken into account. We therefore recently explored various statistical calibration/prediction models that handle data inconsistency and model inadequacy, in order to evaluate their potential interest for the calibration of forcefield parameters [9].
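As a rough illustration of the two-parameter calibration described above, the following sketch computes a Bayesian posterior for the Lennard-Jones parameters on a grid and propagates it to a prediction. It is a toy under stated assumptions, not the calibration model of the cited work: the Lennard-Jones pair energy itself stands in for the cheap analytical property expressions, the "observations" are synthetic, the prior is flat, the noise is Gaussian with known standard deviation, and every name and numerical value (`lj_energy`, `true_sigma`, `true_eps`, the grid bounds) is hypothetical.

```python
import numpy as np

def lj_energy(r, sigma, epsilon):
    """Lennard-Jones pair energy u(r) = 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)

# Synthetic "experimental" data (assumption: generated here, not measured).
rng = np.random.default_rng(0)
true_sigma, true_eps = 3.4, 120.0   # illustrative values only
noise_sd = 2.0                      # assumed known measurement noise
r = np.linspace(3.4, 7.0, 15)
y_obs = lj_energy(r, true_sigma, true_eps) + rng.normal(0.0, noise_sd, r.size)

# Parameter grid; a flat prior over this box is assumed.
sigmas = np.linspace(3.35, 3.45, 201)
epsilons = np.linspace(100.0, 140.0, 401)
S, E = np.meshgrid(sigmas, epsilons, indexing="ij")

# Gaussian log-likelihood at every grid point: shape (n_sigma, n_eps).
pred = lj_energy(r[None, None, :], S[..., None], E[..., None])
log_like = -0.5 * np.sum(((y_obs - pred) / noise_sd) ** 2, axis=-1)

# Normalized posterior (subtract the max before exponentiating for stability).
post = np.exp(log_like - log_like.max())
post /= post.sum()

# Marginal posterior means and standard deviations of the two parameters.
p_sigma, p_eps = post.sum(axis=1), post.sum(axis=0)
sigma_mean = np.sum(sigmas * p_sigma)
eps_mean = np.sum(epsilons * p_eps)
sigma_sd = np.sqrt(np.sum((sigmas - sigma_mean) ** 2 * p_sigma))
eps_sd = np.sqrt(np.sum((epsilons - eps_mean) ** 2 * p_eps))

# Propagate parameter uncertainty to a prediction at a new distance.
u_new = lj_energy(4.0, S, E)
u_mean = np.sum(u_new * post)
u_sd = np.sqrt(np.sum((u_new - u_mean) ** 2 * post))

print(f"sigma = {sigma_mean:.4f} +/- {sigma_sd:.4f}")
print(f"epsilon = {eps_mean:.2f} +/- {eps_sd:.2f}")
print(f"u(4.0) = {u_mean:.2f} +/- {u_sd:.2f}")
```

A grid evaluation like this is only practical because the model is cheap and has two parameters. When each evaluation is a long molecular simulation, a surrogate such as the kriging metamodels mentioned above would replace the direct call to the model.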

[1] Rizzi F., et al. Multiscale Model. Simul., 10: 1428, 2012.
[2] Rizzi F., et al. Multiscale Model. Simul., 10: 1460, 2012.
[3] Angelikopoulos P., et al. J. Chem. Phys., 137: 144103, 2012.
[4] Cailliez F., and Pernot P. J. Chem. Phys., 134: 054124, 2011.
[5] Hadjidoukas P.E., et al. J. Comput. Phys., 284: 1, 2015.
[6] Wu S., et al. Phil. Trans. R. Soc. A, 374: 20150032, 2016.
[7] Kennedy M., and O'Hagan A. J. Roy. Stat. Soc. B, 63: 425, 2001.
[8] Cailliez F., Bourasseau A., and Pernot P. J. Comput. Chem., 35: 130, 2014.
[9] Pernot P., and Cailliez F. AIChE J., 63: 4642, 2017.

Workshop IV: Uncertainty Quantification for Stochastic Systems and Applications