
Graph theory approaches for molecular dynamics simulations

Published online by Cambridge University Press:  10 December 2024

Amun C. Patel
Affiliation:
Department of Bioengineering, University of California Riverside, 900 University Avenue, 92521, Riverside, CA, United States
Souvik Sinha
Affiliation:
Department of Bioengineering, University of California Riverside, 900 University Avenue, 92521, Riverside, CA, United States
Giulia Palermo*
Affiliation:
Department of Bioengineering, University of California Riverside, 900 University Avenue, 92521, Riverside, CA, United States; Department of Chemistry, University of California Riverside, 900 University Avenue, 92521, Riverside, CA 52512, United States
* Corresponding author: Giulia Palermo; Email: giulia.palermo@ucr.edu

Abstract

Graph theory, a branch of mathematics that focuses on the study of graphs (networks of nodes and edges), provides a robust framework for analysing the structural and functional properties of biomolecules. By leveraging molecular dynamics (MD) simulations, atoms or groups of atoms can be represented as nodes, while their dynamic interactions are depicted as edges. This network-based approach facilitates the characterization of properties such as connectivity, centrality, and modularity, which are essential for understanding the behaviour of molecular systems. This review details the application and development of graph theory-based models in studying biomolecular systems. We introduce key concepts in graph theory and demonstrate their practical applications, illustrating how innovative graph theory approaches can be employed to design biomolecular systems with enhanced functionality. Specifically, we explore the integration of graph theoretical methods with MD simulations to gain deeper insights into complex biological phenomena, such as allosteric regulation, conformational dynamics, and catalytic functions. Ultimately, graph theory has proven to be a powerful tool in the field of molecular dynamics, offering valuable insights into the structural properties, dynamics, and interactions of molecular systems. This review establishes a foundation for using graph theory in molecular design and engineering, highlighting its potential to transform the field and drive advancements in the understanding and manipulation of biomolecular systems.

Information

Type
Review
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. Biomolecular dynamic network. Overview of a biomolecular complex (a) and its representation as a network of nodes and edges (b) through correlation analysis. In panel a, the CRISPR-Cas9 system (PDB 4UN3) represents a typical protein/nucleic acid complex. Adapted with permission from Palermo et al. (Palermo et al., 2017). Copyright 2017 American Chemical Society. Panel b shows a network map taken from "Pacific RISA Core Network Map Eigenvector FA2 Region 10 K" (https://www.flickr.com/photos/pacificrisa/11345330443; CC BY-NC-ND 2.0).


Figure 2. Correlation analysis of Cas12a. a. Overview of the CRISPR-Cas12a complex. The Cas12a protein is shown as a molecular surface, highlighting the individual domains using different colours (REC1: light grey, REC2: dark grey, PAM-interacting, PI: red, RuvC: blue, Nuc: green). Nucleic acids are shown as ribbons. b-c. Cross-correlation (CC, upper triangles) and generalized correlation (GC, lower triangles) matrices computed for Cas12a in the RNA-bound state (b) and upon DNA binding (c). The strength of the CC and GC coefficients is represented according to the scales on the right. The protein sequence is also displayed. Boxes highlight anticorrelated motions (CC ≤ 0) and highly coupled GC between REC and Nuc, which are also illustrated in the cartoon of Cas12a (a). Adapted with permission from Saha et al. (Saha et al., 2020). Copyright 2020 American Chemical Society.
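As an illustration of how a cross-correlation matrix of the kind shown in panels b-c might be obtained, the sketch below computes the commonly used normalized displacement covariance, CC_ij = ⟨Δr_i · Δr_j⟩ / sqrt(⟨Δr_i²⟩⟨Δr_j²⟩), for a toy trajectory. The function name `cross_correlation` and the three-node trajectory are hypothetical and are not taken from the cited work; the generalized (mutual-information-based) correlation is not covered here.

```python
import math

def cross_correlation(traj):
    """Return the CC matrix for traj[t][i] = (x, y, z) of node i in frame t."""
    n_frames = len(traj)
    n_nodes = len(traj[0])
    # Average position of every node over the trajectory.
    mean = [[sum(frame[i][k] for frame in traj) / n_frames for k in range(3)]
            for i in range(n_nodes)]
    # Displacement of every node from its average position, frame by frame.
    disp = [[[frame[i][k] - mean[i][k] for k in range(3)]
             for i in range(n_nodes)] for frame in traj]

    def avg_dot(i, j):
        # Time average of the dot product of the displacements of nodes i and j.
        return sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp) / n_frames

    return [[avg_dot(i, j) / math.sqrt(avg_dot(i, i) * avg_dot(j, j))
             for j in range(n_nodes)] for i in range(n_nodes)]

# Toy trajectory (assumed data): node 1 moves with node 0, node 2 moves against it.
traj = [[(t, 0.0, 0.0), (5.0 + t, 1.0, 0.0), (10.0 - 2.0 * t, 0.0, 0.0)]
        for t in range(4)]
cc = cross_correlation(traj)
```

In this toy case `cc[0][1]` is close to +1 (correlated motion) and `cc[0][2]` is close to -1 (anticorrelated motion), the two extremes of the CC scale.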


Figure 3. Community Network Analysis (CNA) of CRISPR-Cas9. a. Overview of the CRISPR-Cas9 system, highlighting the individual domains using different colours (α-helical lobe: light grey, PAM-interacting C-terminal: red, RuvC: blue, HNH: green). Nucleic acids are shown as ribbons. A close-up view shows the PAM recognition region, highlighting the PAM sequence in red. b. CNA of CRISPR-Cas9 in the absence of PAM (without PAM, w/oPAM, left) and upon PAM binding (with PAM, wPAM, right), shown in a 2D representation of the community network. Bonds connect communities and measure their intercommunication strength. Adapted with permission from Ricci et al. (Ricci et al., 2019). Copyright 2019 American Chemical Society; and from Palermo et al. (Palermo et al., 2017). Copyright 2017 American Chemical Society.


Figure 4. Shortest path calculation. a. The Dijkstra algorithm is used for shortest path calculation by defining a starting point and a destination (i.e., nodes A and C) and iteratively optimizing the path from the former to the latter. In each iteration, the algorithm designates the closest unvisited node as the current node, updating the distances to the remaining unvisited nodes until the destination is reached. For biomolecular allostery, the algorithm employs correlation coefficients as metrics to identify the closest nodes (i.e., $w_{ij} = -\log \mathrm{GC}_{ij}$), thereby maximizing the correlation between the starting and destination nodes. b. In the context of the HNH domain of CRISPR-Cas9, the Dijkstra algorithm identifies an allosteric pathway connecting the DNA recognition region (REC2) to the RuvC cleavage site. The signaling route identified through this algorithm (illustrated by the pink line) overlaps with the slow dynamic residues found through solution NMR (represented by purple spheres). Adapted with permission from East et al. (East et al., 2020a). Copyright 2020 American Chemical Society.
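The procedure described in panel a can be sketched in a few lines. The example below is a minimal illustration, assuming a small symmetric matrix of generalized correlations with values in (0, 1]; the helper names `correlation_to_weights` and `dijkstra`, the `cutoff` parameter, and the four-node matrix are hypothetical. Because the edge weight is the negative logarithm of the correlation, minimizing the summed weights along a path is equivalent to maximizing the product of the correlations along it.

```python
import heapq
import math

def correlation_to_weights(gc, cutoff=0.0):
    """Build a weighted graph with w_ij = -log(GC_ij) for pairs above the cutoff."""
    n = len(gc)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(n):
            if i != j and gc[i][j] > cutoff:
                graph[i][j] = -math.log(gc[i][j])
    return graph

def dijkstra(graph, source, target):
    """Return (path, cost) of the shortest path, or (None, inf) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)      # closest unvisited node becomes current
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph[u].items():   # update distances to the neighbours
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]

# Toy generalized-correlation matrix (assumed values, not simulation data).
gc = [[1.0, 0.9, 0.1, 0.0],
      [0.9, 1.0, 0.8, 0.2],
      [0.1, 0.8, 1.0, 0.9],
      [0.0, 0.2, 0.9, 1.0]]
path, cost = dijkstra(correlation_to_weights(gc), source=0, target=3)
```

Here the route 0, 1, 2, 3 is selected because its correlation product (0.9 × 0.8 × 0.9 = 0.648) exceeds that of the shorter routes through a weakly correlated pair; `math.exp(-cost)` recovers that product.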


Figure 5. Circular Networks. a. Allosteric pathway of information transfer (pink) spanning HNH (green) from the DNA recognition lobe (REC) to the RuvC core. The specificity-enhancing mutations K810A, K855A, and K848A are indicated. b. Circular networks of mutation-induced edge betweenness change (ΔEB) noting gain (blue) or loss (red) in allosteric crosstalk between MD-derived communities (shown for the K855A and K810A mutants). HNH communities are plotted on a circle and connected by links whose thickness is proportional to ΔEB. c. Networks integrating the MD-derived communities (circles) with the experimental dynamic exchange among them (bonds with thickness proportional to CPMG relaxation dispersion NMR upon normalizing the number of flexible residues in each community). Adapted with permission from Nierzwicki et al. (Nierzwicki et al., 2021). 2021 eLife.
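To make the ΔEB quantity in panel b concrete, the following sketch computes edge betweenness (the number of shortest paths crossing each edge) for two small unweighted graphs and takes their difference. It is a simplified stand-in under stated assumptions: the graphs are toy examples rather than MD-derived community networks, edges are unweighted, the sign convention (mutant minus wild type, so positive values mean a gain) is assumed, and the function names are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Unnormalized edge betweenness of an undirected, unweighted graph (Brandes)."""
    eb = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma = {s: 0}, {s: 1.0}
        preds = {v: [] for v in adj}
        order, queue = [], deque([s])
        while queue:                       # breadth-first search from s
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0.0
                    queue.append(v)
                if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):          # accumulate path counts back to s
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1.0 + delta[w])
                eb[frozenset((u, w))] += c
                delta[u] += c
    # Every node pair was counted from both endpoints, so halve the totals.
    return {edge: value / 2.0 for edge, value in eb.items()}

def delta_edge_betweenness(adj_wt, adj_mut):
    """Edge betweenness change, mutant minus wild type (assumed sign convention)."""
    eb_wt, eb_mut = edge_betweenness(adj_wt), edge_betweenness(adj_mut)
    return {e: eb_mut.get(e, 0.0) - eb_wt.get(e, 0.0) for e in set(eb_wt) | set(eb_mut)}

# Toy graphs (assumed): the "mutant" gains a direct link between nodes 0 and 2.
wild_type = {0: [1], 1: [0, 2], 2: [1]}
mutant = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
delta_eb = delta_edge_betweenness(wild_type, mutant)
```

In this toy case the new 0-2 link gains betweenness while the 0-1 and 1-2 links lose it, mirroring the gain/loss colouring described in the caption.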


Figure 6. Signal-to-Noise Ratio of communication efficiency. a. Schematic of the Signal-to-Noise Ratio (SNR) of communication efficiency on the 3D structure of Cas13a. Two black arrows indicate the signal and grey arrows indicate the noise; a high SNR means that the signal stands out over the noise. b. Distribution of the signals from the crRNA “seed” (green) and “switch” (blue) regions to the catalytic core residues, plotted on the background of noise (grey) in the crRNA- (top) and target RNA-bound (bottom) Cas13a. c. Sites of increased specificity in Cas13a identified through computational analysis and tested through mutagenesis and DNA cleavage experiments. Adapted with permission from Sinha et al. (Sinha et al., 2024). Copyright 2023, Published by Oxford University Press on behalf of Nucleic Acids Research.


Figure 7. Eigenvector centrality analysis of Cas9 variants. a. Close-up view of the PAM binding domain in three variants of Cas9 (viz., VQR, VRER and EQR) with mutations in the PAM binding region (Kleinstiver et al., 2015). The DNA target is shown as ribbons, highlighting the PAM sequence in magenta. The Cas9 residues mutated to alter the protein’s selectivity (green) and the residues preserved (pink) are shown as sticks. b. Eigenvector centrality (EC) distribution plotted on the 3D structure of the Cas9 variants, colour-coded from red (lowest EC) to blue (highest EC). c. Community structures for the EQR, VQR and VRER Cas9 variants. d. EC distribution shown for the WT Cas9, coloured according to the colour scale on the right. e. Highest eigenvalues of the generalized correlation matrix for the WT Cas9 and its variants.
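The eigenvector centrality and highest eigenvalue referred to in panels b, d and e can be illustrated with a simple power iteration. This is a minimal sketch, assuming a small symmetric, non-negative correlation matrix with ones on the diagonal; the function name `eigenvector_centrality`, its parameters, and the three-node matrix are hypothetical and do not reproduce the data in the figure.

```python
import math

def eigenvector_centrality(matrix, n_iter=1000, tol=1e-12):
    """Return (leading eigenvector, leading eigenvalue) by power iteration."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n          # uniform starting guess
    eigval = 0.0
    for _ in range(n_iter):
        # Multiply by the matrix and renormalize to unit length.
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        new = [x / norm for x in new]
        # Rayleigh quotient estimates the eigenvalue for the current vector.
        prod = [sum(matrix[i][j] * new[j] for j in range(n)) for i in range(n)]
        eigval = sum(new[i] * prod[i] for i in range(n))
        change = max(abs(a - b) for a, b in zip(new, vec))
        vec = new
        if change < tol:
            break
    return vec, eigval

# Toy correlation matrix (assumed values): node 1 is strongly coupled to both others.
gc = [[1.0, 0.8, 0.1],
      [0.8, 1.0, 0.8],
      [0.1, 0.8, 1.0]]
ec, top_eigenvalue = eigenvector_centrality(gc)
```

The entries of `ec` play the role of per-node centralities: node 1, which is strongly correlated with both of the others, receives the largest value, while `top_eigenvalue` corresponds to the kind of quantity compared across variants in panel e.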