Abstract
One factor limiting the accuracy of computational predictions of protein–ligand binding free energies is the fixed functional form of the intramolecular component of molecular mechanics force fields. Here, we employ kernel regression, a machine learning technique, to construct an analytical potential, using the Gaussian Approximation Potential software and framework, that reproduces the quantum mechanical potential energy surface of a small, flexible, drug-like molecule, 3-(benzyloxy)pyridin-2-amine. Challenges linked to the high dimensionality of the molecule's configurational space are overcome by developing an iterative training protocol and employing a representation that separates short- and long-range interactions. The analytical model is interfaced with the MCPRO simulation software, which allows us to perform Monte Carlo simulations of the small molecule in water and bound to two proteins, p38 MAP kinase and leukotriene A4 hydrolase. We demonstrate that the accuracy of our machine-learning-based intramolecular model is retained in the condensed phase, and that corrections to absolute protein–ligand binding free energies of up to 2 kcal/mol are obtained.


