Machine Learning Force Fields
- PMID: 33705118
- PMCID: PMC8391964
- DOI: 10.1021/acs.chemrev.0c01111
Abstract
In recent years, the use of machine learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim of narrowing the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are, in principle, limited only by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail, and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.
Conflict of interest statement
The authors declare no competing financial interest.
Figures
[...](x) depicts the mean (eq 8) of the conditional probability p([...]|y) (see eq 7), whereas the gray area depicts two standard deviations from its mean (see eq 9). Note that predictions are most confident in regions where training data is present. (B) The model function can be expressed as a linear combination of M kernel functions K(x, x_i) weighted with regression coefficients α_i (see eq 2). In this example, the Gaussian kernel (eq 4) is used (the hyperparameter γ controls its width). (C) Influence of noise on prediction performance. Here, the function f(x) (thick gray line) is learned from M = 25 samples; however, each data point (x_i, y_i) contains observational noise (see eq 6). When the coefficients α_i are determined without regularization, i.e., no noise is assumed to be present, the model function reproduces the training samples faithfully, but undulates wildly between data points (orange line, λ = 0). The regularized solution (blue line, λ = 0.1, see eq 10) is much smoother and stays closer to the true function f(x), but individual data points are not reproduced exactly. When the regularization is too strong (green line, λ = 1.0), the model function becomes unable to fit the data. Note how regularization shrinks the magnitude of the coefficient vector, ∥α∥. (D) For constructing force fields, it is necessary to encode molecular structure with a representation x. The choice of this structural descriptor may strongly influence model performance. Here, the potential energy E of a diatomic molecule (thick gray line) is learned from M = 5 data points by two kernel machines using different structural representations (both models use a Gaussian kernel). When the interatomic distance r is used as descriptor (orange line, x = r), the predicted potential energy oscillates between data points, leading to spurious minima and qualitatively wrong behavior for large r. A model using the descriptor x = e^(-r) (blue line) predicts a physically meaningful potential energy curve that is qualitatively correct even when the model extrapolates.
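The regularization and descriptor effects described in panels (C) and (D) can be tried out with a few lines of NumPy. What follows is a minimal sketch, not the authors' code: the kernel form exp(-γ(x - x')²), the linear system (K + λI)α = y, the sine target, the noise level, the Morse parameters, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, gamma):
    """K(x, x') = exp(-gamma * (x - x')**2) for 1-D arrays (assumed kernel form)."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)


def fit(x_train, y_train, gamma, lam):
    """Regression coefficients alpha from (K + lam * I) alpha = y (assumed form)."""
    K = gaussian_kernel(x_train, x_train, gamma)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)


def predict(x_new, x_train, alpha, gamma):
    """Model function: sum_i alpha_i * K(x, x_i)."""
    return gaussian_kernel(x_new, x_train, gamma) @ alpha


# Panel (C) analogue: M = 25 noisy samples of a toy target (sine is an assumption).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 25)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

gamma = 10.0
coef_norms, train_errors = [], []
for lam in (0.0, 0.1, 1.0):
    alpha = fit(x, y, gamma, lam)
    coef_norms.append(np.linalg.norm(alpha))
    train_errors.append(np.linalg.norm(predict(x, x, alpha, gamma) - y))


# Panel (D) analogue: a toy Morse curve (hypothetical parameters) learned from
# M = 5 points, once with x = r and once with x = exp(-r) as descriptor.
def morse(r, depth=1.0, width=1.0, r_eq=1.0):
    return depth * (1.0 - np.exp(-width * (r - r_eq))) ** 2


r_train = np.linspace(0.5, 3.0, 5)
e_train = morse(r_train)
r_far = np.array([10.0])  # far outside the training range

# Descriptor x = r: a small lam keeps the linear solve well behaved.
alpha_r = fit(r_train, e_train, gamma=1.0, lam=1e-8)
e_far_r = predict(r_far, r_train, alpha_r, gamma=1.0)[0]

# Descriptor x = exp(-r): large r maps close to the training descriptors.
x_train = np.exp(-r_train)
alpha_x = fit(x_train, e_train, gamma=10.0, lam=1e-8)
e_far_x = predict(np.exp(-r_far), x_train, alpha_x, gamma=10.0)[0]
```

In this sketch, `coef_norms` shrinks and `train_errors` grows as λ increases, which is the trade-off panel (C) describes. For panel (D), the Gaussian kernel on x = r vanishes far from the training distances, so `e_far_r` decays toward zero regardless of the true energy, whereas with x = e^(-r) the query point stays close to the training descriptors.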
[...] and atom-wise refinements (see eq 25). The final descriptors x_i^T are used as input for an additional NN predicting the atomic energy contributions (typically, a single NN is shared among all elements).
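The last step of this caption, a single shared network mapping each atom's final descriptor to an atomic energy contribution, can be sketched as follows. This is an illustrative assumption rather than the architecture from the review: the two-layer tanh network, the random weights, the array sizes, and the summation of contributions into a total energy are all hypothetical choices.

```python
import numpy as np


def atomic_energies(descriptors, w1, b1, w2, b2):
    """Apply one shared two-layer network to every atom's final descriptor."""
    hidden = np.tanh(descriptors @ w1 + b1)  # shape (n_atoms, n_hidden)
    return (hidden @ w2 + b2).ravel()        # shape (n_atoms,)


rng = np.random.default_rng(0)
n_atoms, n_features, n_hidden = 4, 8, 16

# Stand-ins for the final descriptors; random values, for illustration only.
descriptors = rng.standard_normal((n_atoms, n_features))

# Untrained, randomly initialised weights shared by all atoms.
w1 = rng.standard_normal((n_features, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.standard_normal((n_hidden, 1))
b2 = np.zeros(1)

e_atoms = atomic_energies(descriptors, w1, b1, w2, b2)
e_total = e_atoms.sum()  # assumption: total energy is the sum of contributions

# Reordering the atoms reorders the contributions but leaves the sum unchanged.
perm = rng.permutation(n_atoms)
e_total_permuted = atomic_energies(descriptors[perm], w1, b1, w2, b2).sum()
```

Because the same weights are applied to every atom and the contributions are summed, the total in this sketch does not depend on the order in which atoms are listed.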
