The surprising ineffectiveness of molecular dynamics coordinates for predicting bioactivity with machine learning

Emanuele Criscuolo; Rıza Özçelik; Derek van Tilborg; Francesca Grisoni

doi:10.26434/chemrxiv-2024-rp81v

Theoretical and Computational Chemistry

18 December 2024, Version 1

Working Paper

This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Accurate prediction of protein-ligand binding affinity remains a major challenge in drug discovery, despite the rapid progress of machine learning. Interestingly, machine learning approaches based on two-dimensional molecular information (e.g., binary fingerprints) often outperform those using three-dimensional (3D) information, possibly due to the use of minimum-energy conformations. This raises questions about how to incorporate more sophisticated 3D information (e.g., ligand flexibility and binding-induced conformational changes) for bioactivity prediction. To this end, we systematically investigated whether coordinates derived from molecular dynamics (MD) can improve prediction performance over minimum-energy conformations. MD-derived coordinates capture dynamic molecular interactions, which are hypothesized to represent ligand-protein binding events more realistically. Using over 2600 protein-ligand complexes across three macromolecular targets, we compared multiple machine learning approaches using well-established 3D descriptor sets. Surprisingly, our results show that MD-derived coordinates do not consistently outperform ‘static’ 3D structures, despite their ability to capture dynamic molecular interactions. These findings highlight the persistent challenge of effectively leveraging 3D and dynamic information for bioactivity prediction and underscore the need for improved representation approaches to bridge this gap.

Keywords

Molecular machine learning

Molecular dynamics simulation

Bioactivity prediction

Drug discovery

Metrics

Views: 2,336

Downloads: 981

License

The content is available under CC BY NC 4.0

Funding

European Research Council

101077879

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
