Machine learning prediction of accurate atomization energies of organic molecules from low-fidelity quantum chemical calculations

  • Logan Ward (a1) (a2), Ben Blaiszik (a1) (a3), Ian Foster (a1) (a2) (a3), Rajeev S. Assary (a4) (a5), Badri Narayanan (a5) (a6) and Larry Curtiss (a4) (a5)

Abstract

Recent studies illustrate how machine learning (ML) can be used to bypass a core challenge of molecular modeling: the trade-off between accuracy and computational cost. Here, we assess multiple ML approaches for predicting the atomization energy of organic molecules. Our models learn the difference between low-fidelity (B3LYP) and high-accuracy (G4MP2) atomization energies, and predict G4MP2 atomization energies to within 0.005 eV (mean absolute error) for molecules with fewer than nine heavy atoms (training set of 117,232 entries, test set of 13,026) and 0.012 eV for a small set of 66 molecules with between 10 and 14 heavy atoms. Our two best models, which offer different accuracy/speed trade-offs, enable efficient prediction of G4MP2-level energies for large molecules and are available through a simple web interface.
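The Δ-learning strategy described above can be sketched in a few lines: train a regressor on the difference between cheap and expensive energies, then add the predicted correction to a new cheap calculation. This is a minimal illustration only; the descriptors, energies, and model below are synthetic placeholders, not the paper's dataset or architecture.

```python
# Minimal sketch of the delta-learning idea: learn the (small, smooth)
# difference between low-fidelity (B3LYP-like) and high-fidelity
# (G4MP2-like) energies, rather than the full energy itself.
# All data here are synthetic stand-ins for illustration.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

# Hypothetical molecular descriptors and energies (eV) for 200 molecules
X = rng.normal(size=(200, 5))            # placeholder feature vectors
e_low = X @ rng.normal(size=5)           # stand-in low-fidelity energies
delta_true = 0.1 * np.sin(X[:, 0])       # stand-in systematic low-fidelity error
e_high = e_low + delta_true              # stand-in high-fidelity energies

# Fit a model to the correction only
model = KernelRidge(kernel="rbf", alpha=1e-3)
model.fit(X, e_high - e_low)

# Predicted high-fidelity energy = cheap energy + learned correction
e_pred = e_low + model.predict(X)
mae = np.mean(np.abs(e_pred - e_high))
```

Because the correction is smaller and smoother than the total energy, far less training data is typically needed than for learning the high-fidelity energy directly.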


Corresponding author

Address all correspondence to Logan Ward at lward@anl.gov

