
Energy mapping of the genetic code and genomic domains: implications for code evolution and molecular Darwinism

Published online by Cambridge University Press:  04 November 2020

Horst H. Klump
Affiliation:
Department of Molecular and Cell Biology, University of Cape Town, Private Bag, Rondebosch 7800, South Africa
Jens Völker
Affiliation:
Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Rd, Piscataway, NJ 08854, USA
Kenneth J. Breslauer*
Affiliation:
Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Rd, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
*Author for correspondence: Kenneth J. Breslauer, E-mail: kjbdna@rutgers.edu

Abstract

When the iconic DNA genetic code is expressed in terms of energy differentials, one observes that information embedded in chemical sequences, including some biological outcomes, correlates with distinctive free energy profiles. Specifically, we find correlations between codon usage and codon free energy, suggestive of a thermodynamic selection for codon usage. We also find correlations between what are considered ancient amino acids and high codon free energy values. Such correlations may be reflective of the sequence-based genetic code fundamentally mapping as an energy code. In such a perspective, one can envision the genetic code as composed of interlocking thermodynamic cycles that allow codons to ‘evolve’ from each other through a series of sequential transitions and transversions, which are influenced by an energy landscape modulated by both thermodynamic and kinetic factors. As such, early evolution of the genetic code may have been driven, in part, by differential energetics, as opposed to exclusively by the functionality of any gene product. In such a scenario, evolutionary pressures can, in part, derive from the optimization of biophysical properties (e.g. relative stabilities and relative rates), in addition to the classic perspective of being driven by a phenotypical adaptive advantage (natural selection). Such differential energy mapping of the genetic code, as well as larger genomic domains, may reflect an energetically resolved and evolved genomic landscape, consistent with a type of differential, energy-driven ‘molecular Darwinism’. It should not be surprising that evolution of the code was influenced by differential energetics, as thermodynamics is the most general and universal branch of science that operates over all time and length scales.

Information

Type
Review
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2020. Published by Cambridge University Press

Table 1. Genetic code matrix annotated with trimeric duplex stabilities formed between codons and their corresponding, antiparallel complementary codons

Scheme 1. Free energy distribution spectrum for the 32 trimeric duplexes formed by all 64 complementary codons. The stability distribution is color coded as a ‘heat map’, with the GC-rich most stable family (highest free energy of trimeric duplex formation) highlighted toward the top of the scheme in light green; the next most stable family is highlighted in light purple; and the less stable duplexes relative to the mean are highlighted in light red. The energy spectrum is formatted within four columns that reflect the purine (R)/pyrimidine (Y) sequence patterns designated at the bottom of the scheme.
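The count of 32 duplexes follows from pairing each of the 64 codons with its antiparallel complement. The following is a minimal Python sketch, assuming standard Watson-Crick pairing; the function and variable names are illustrative and not taken from the paper, and no free energies are computed. It enumerates the duplexes and groups them by the purine (R)/pyrimidine (Y) patterns of their two strands, which under this assumption yields four groups, in line with the four-column layout described in the caption.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}

def reverse_complement(codon):
    """Antiparallel complementary codon, read 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(codon))

def ry_pattern(codon):
    """Purine (R) / pyrimidine (Y) pattern of a codon, e.g. 'GCA' -> 'RYR'."""
    return "".join("R" if b in PURINES else "Y" for b in codon)

codons = ["".join(p) for p in product("ACGT", repeat=3)]

# A duplex is the unordered pair {codon, antiparallel complement}.
# No trimer is its own reverse complement, because the middle base
# would have to pair with itself.
duplexes = {frozenset((c, reverse_complement(c))) for c in codons}

# Group duplexes by the pair of R/Y patterns carried by their two strands.
columns = {}
for duplex in duplexes:
    key = tuple(sorted({ry_pattern(c) for c in duplex}))
    columns.setdefault(key, []).append(duplex)

print(len(codons), "codons form", len(duplexes), "duplexes")
for key, members in sorted(columns.items()):
    print("/".join(key), len(members))
```

Using a set of frozensets makes each duplex appear once regardless of which strand is treated as the codon.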

Fig. 1. Empirical correlations between whole genome codon usage frequencies in S. cerevisiae taken from the work of Futcher and coworkers (Gardin et al., 2014) and the corresponding codon/complementary codon free energies of this study. Each red line represents a best fit to the equation for a straight line of these two independently derived data sets. The results shown here are for two of the three amino acids encoded by six codons, and for four of the five amino acids encoded by four codons. This selection corresponds to that subset of the amino acids judged most ancient, based on a meta-analysis reported by Trifonov (2000, 2004). Apart from isoleucine, and from methionine and tryptophan, for which the data density is insufficient, all of the amino acids encoded by only two codons also show a preference for higher codon usage frequency that correlates with lower codon free energy. For a thermodynamic argument, one strictly should use a log scale plot for the usage frequency. However, over the small data range assessed here, we have confirmed that one cannot distinguish between linear and log-linear fits, with the log plot simply compressing the data.
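The remark about linear versus log scale can be made explicit with a first-order expansion. What follows is only a sketch, assuming a Boltzmann-type relation in which the logarithm of the usage frequency f is linear in the codon free energy ΔG. This relation, and the symbols f, f_0 (a reference frequency near the middle of the data range), a, and b, are introduced here for illustration and are not stated in the caption.

```latex
\ln f = a + b\,\Delta G,
\qquad
\ln f = \ln f_0 + \ln\!\left(1 + \frac{f - f_0}{f_0}\right)
      \approx \ln f_0 + \frac{f - f_0}{f_0}
\quad \text{for } |f - f_0| \ll f_0 ,
\]
\[
\Rightarrow\quad
f \approx f_0\,\bigl(1 + a - \ln f_0\bigr) + f_0\, b\,\Delta G .
```

When the frequencies span only a narrow range around f_0, f itself is therefore approximately linear in ΔG, with slope f_0 b, so a straight-line fit and a log-linear fit are expected to be hard to tell apart.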

Fig. 2. The hypercube of all eight cube octet sequence classes (shown in red) located at each apex of the hypercube, illustrating the interconnectedness of the cycles associated with the full cascade of codon interconversions via sequential site changes. Transition mutations occur within a cube, while transversion mutations link one cube to another.
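The within-cube versus between-cube distinction can be checked with a short Python sketch. It assumes that the eight octet sequence classes correspond to the eight purine (R)/pyrimidine (Y) patterns of a trimer, which is an interpretation adopted here rather than a definition quoted from the caption; all names are illustrative.

```python
from itertools import product

PURINES = {"A", "G"}
BASES = "ACGT"

def ry_pattern(codon):
    """Purine (R) / pyrimidine (Y) pattern of a codon."""
    return "".join("R" if b in PURINES else "Y" for b in codon)

def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return a != b and (a in PURINES) == (b in PURINES)

def single_site_neighbors(codon):
    """Yield (neighbor, kind) for every one-base substitution of a codon."""
    for i, old in enumerate(codon):
        for new in BASES:
            if new != old:
                kind = "transition" if is_transition(old, new) else "transversion"
                yield codon[:i] + new + codon[i + 1:], kind

codons = ["".join(p) for p in product(BASES, repeat=3)]

# Group the 64 codons by R/Y pattern (the assumed 'cubes').
classes = {}
for c in codons:
    classes.setdefault(ry_pattern(c), []).append(c)

# Transitions keep the R/Y pattern; transversions always change it.
for codon in codons:
    for neighbor, kind in single_site_neighbors(codon):
        same_class = ry_pattern(neighbor) == ry_pattern(codon)
        assert same_class == (kind == "transition")

print(len(classes), "classes, sizes:", sorted({len(v) for v in classes.values()}))
```

Under this assumption each codon has three transition neighbors, one per position, which matches the three edges meeting at each corner of a cube; its six transversion neighbors all lie in other classes.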