Divalent sulfur (S) forms a chalcogen bond (Ch-bond) via its σ-holes and a hydrogen bond (H-bond) via its lone pairs. The relevance of these interactions and their interplay for protein structure and function is unclear. Based on analyses of the crystal structures of small organic/organometallic molecules and proteins and their molecular electrostatic surface potential, we show that the reciprocity of the substituent-dependent strength of the σ-holes and lone pairs correlates with the formation of either a Ch-bond or an H-bond. In proteins, cystines preferentially form Ch-bonds, metal-chelated cysteines form H-bonds, while methionines form either of them with comparable frequencies. This has implications for the positioning of these residues and their role in protein structure and function. Computational analyses reveal that the S-mediated interactions stabilise protein secondary structures by mechanisms such as helix capping and protecting free β-sheet edges by negative design. The study highlights the importance of S-mediated Ch-bonds and H-bonds for understanding protein folding and function, for developing improved strategies for protein/peptide structure prediction and design, and for structure-based drug discovery.
We previously presented a computational protocol to predict the enzymatic (enantio)selectivity of an ω-transaminase towards a set of ligands (Ramírez-Palacios et al. (2021) Journal of Chemical Information and Modeling 61(11), 5569–5580) by counting the number of binding poses present in molecular dynamics (MD) simulations that met a defined set of geometric criteria. The geometric criteria consisted of a hand-crafted set of distances, angles and dihedrals deemed to be important for the enzymatic reaction to take place. In this work, the MD trajectories are reanalysed using a deep-learning approach to predict the enantiopreference of the enzyme without the need for hand-crafted criteria. We show that a convolutional neural network is capable of classifying the trajectories as belonging to the ‘reactive’ or ‘non-reactive’ enantiomer (binary classification) with good accuracy (>0.90). The new method reduces the computational cost of the methodology, because it does not require the sampling approach from the previous work. We also show that analysing how neural networks reach specific decisions can aid hand-crafted approaches (e.g. definition of near-attack conformations, or binding poses).
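As a rough illustration of the pose-counting idea described above, the following sketch counts the trajectory frames that satisfy a set of geometric criteria. It is not the published protocol: the function name `count_reactive_poses`, the single distance and single angle, and the cutoff values are all hypothetical placeholders chosen for the example.

```python
import numpy as np

def count_reactive_poses(distances, angles,
                         max_distance=4.0, angle_range=(90.0, 120.0)):
    """Count frames whose geometry meets every criterion.

    distances, angles: one value per MD frame (hypothetical descriptors,
    e.g. in angstroms and degrees). The cutoffs are illustrative only.
    Returns (number of matching frames, fraction of matching frames).
    """
    distances = np.asarray(distances, dtype=float)
    angles = np.asarray(angles, dtype=float)
    # A frame counts only if it satisfies all criteria simultaneously.
    ok = ((distances <= max_distance)
          & (angles >= angle_range[0])
          & (angles <= angle_range[1]))
    return int(ok.sum()), float(ok.mean())

# Four made-up frames: only the first meets both criteria.
n_ok, frac_ok = count_reactive_poses([3.5, 4.5, 3.9, 5.0],
                                     [100.0, 100.0, 130.0, 110.0])
```

Comparing such counts between the two enantiomers of a ligand is one way a preference could be scored; the deep-learning approach in the abstract replaces the hand-picked cutoffs with features learned from the trajectories.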
Improvement of super-resolution microscopy in the last decade has led to the development of methods such as stimulated emission depletion (STED) microscopy and structured illumination microscopy (SIM), which modulate the excitation light to break the diffraction limit, or methods such as photo-activated localization microscopy (PALM) or stochastic optical reconstruction microscopy (STORM), which utilize the photoswitching properties of fluorescent molecules to enable precise localization of single molecules (Patterson, 2009). In this chapter, we will focus on PALM/STORM methods, which rely on similar principles and instrumentation. Namely, these methods acquire a series of images of fields of single molecules and then analyze the images to localize the molecules with much higher precision than their diffraction-limited signals would otherwise allow. Importantly, thousands or millions of molecules are identified and used to reconstruct the super-resolution images (Betzig et al., 2006; Hess et al., 2006; Rust et al., 2006; Heilemann et al., 2008; van de Linde et al., 2009; Kamiyama and Huang, 2012).
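The localization step can be illustrated with a minimal sketch. It simulates a noise-free, diffraction-limited spot as a 2D Gaussian and estimates its sub-pixel position from the intensity-weighted centroid. This is a simplified stand-in: PALM/STORM analysis software typically fits a point-spread-function model to each spot, and the function names, spot width, and image size here are assumptions made for the example.

```python
import numpy as np

def gaussian_spot(shape, x0, y0, sigma=1.5, amplitude=1000.0):
    """Simulate an idealised single-molecule image (no noise, no background)."""
    y, x = np.indices(shape)
    return amplitude * np.exp(-((x - x0) ** 2 + (y - y0) ** 2)
                              / (2.0 * sigma ** 2))

def localize_centroid(image, background=0.0):
    """Estimate the (x, y) position as the intensity-weighted centroid."""
    signal = np.clip(np.asarray(image, dtype=float) - background, 0.0, None)
    total = signal.sum()
    if total == 0:
        raise ValueError("no signal above background")
    y, x = np.indices(signal.shape)
    return (x * signal).sum() / total, (y * signal).sum() / total

# The spot is several pixels wide, yet its centre is recovered
# to a small fraction of a pixel.
spot = gaussian_spot((15, 15), x0=7.3, y0=6.6)
x_est, y_est = localize_centroid(spot)
```

With real data, photon noise and background limit how precisely the centre can be found, which is why the number of photons collected per molecule matters for the final resolution.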
Classical chemistry and biochemistry experiments in solution measure the properties of many molecules and/or interrogate them simultaneously; these are called ensemble measurements and tend to mask the underlying molecular dynamics. Studies at the single-molecule level reveal the stochastic dynamics of individual molecules and give access to a wealth of molecular information. Most importantly, previously “unanswerable” questions in the physical, chemical, and biological sciences can now be answered. The field of single-molecule science (SMS) can be roughly divided into two general areas: (1) improvements in single-molecule methodologies (technology development); and (2) use of these methodologies to address important scientific questions in fundamental biological research (applied research). Over the past decades, single-molecule research has fostered excellent collaboration and interdisciplinary research with input from biology, chemistry, and physics.
A single live cell of E. coli can be estimated to contain around 3 million active protein molecules at any given moment. For larger and more complex human cells, that number goes up to between 200 and 300 million (Milo and Phillips, 2015). In E. coli, this represents around 4,000 different types of proteins and in humans around 20,000 (Wang et al., 2015). Each of these proteins performs a different and highly specialized role within the living cell, determined by its three-dimensional structure, composition, mechanics, and dynamics. Very few experimental techniques are able to access information about the structure and dynamics of the individual elements and substructures of protein molecules, which is needed to understand aspects of their function. One such technique is single-molecule force spectroscopy by optical trapping, a method for which Arthur Ashkin won the Nobel Prize in Physics in 2018. Using the principle that highly focused laser beams can be used to trap micron-scale objects, experimental methods have been developed where micron-sized glass beads are functionalized with protein constructs, establishing geometries that enable forces to be applied to the individual protein molecules (Figure 5.1a).
The term “biomarker,” a portmanteau of “biological marker,” has been defined by Hulka and colleagues (Hulka, 1990) as “cellular, biochemical or molecular alterations that are measurable in biological media such as human tissues, cells, or fluids.” In 1998, the definition was broadened when the National Institutes of Health Biomarkers Definitions Working Group defined a biomarker as “a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention” (Biomarkers Definitions Working Group, 2001). In practice, the discovery and quantification of biomarkers require tools and technologies that help us predict and diagnose diseases; understand the cause, progression, and regression of diseases; and understand the outcomes of disease treatments. Different types of biomarkers have been used by generations of epidemiologists, physicians, and scientists to study all sorts of diseases. The importance of biomarkers in the diagnosis and management of cardiovascular diseases, infections, immunological and genetic disorders, and cancers is well known (Hulka, 1990; Perera and Weinstein, 2000).
Determining rules for gene expression regulation is an important step toward predicting how cells decode the genome sequence to create a wide variety of phenotypes. Recent advances in imaging technologies revealed the stochastic nature of gene expression, in which different numbers of mRNA and protein molecules can be created in cells that have the same genome sequence (Elowitz, 2002; Kaufmann and van Oudenaarden, 2007). Early research revealed that this stochasticity arises from two factors: intrinsic and extrinsic noise. While the former is due to the instantaneous random chemical reactions of the gene expression process, the latter is caused by cell-specific molecular states emerging from the integration of gene expression over a longer time (Elowitz, 2002; Kaufmann and van Oudenaarden, 2007). This finding inspired studies to investigate how cells deterministically produce robust phenotypes despite such stochasticity. Conversely, it also motivated investigations into how cells utilize this stochasticity to generate different kinds of phenotypes for processes such as neural development (Johnson et al., 2015), emergence of bacterial resistance (Sánchez-Romero and Casadesús, 2014), or cancer development (Marusyk et al., 2012; Junttila and de Sauvage, 2013).
More than 150 years after the discovery of DNA (Dahm, 2005) and 50 years after the determination of a DNA duplex structure (Choudhuri, 2003), our understanding of genomics is still lacking. Despite a massive effort in genetic studies, there are still missing pieces in the puzzle: many traits have no corresponding genetic features, the reasons for variations among different individuals within a species are still a mystery, and the lack of single-molecule resolution hinders many types of analysis. These gaps are exemplified by inherent limitations in the currently leading technique for DNA studies, next generation sequencing (NGS). The initial “input” for NGS consists of genomes extracted from a large number of cells, and the final “output” is a single sequence. Thus, the produced data represents an average sequence of the majority of the sequenced cells. This results in oversight of several significant biological aspects, since variations within the population and rare cellular populations are not detected.
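The averaging effect described above can be shown with a toy example. The sketch below builds a per-position majority-vote consensus from a set of already aligned, equal-length sequences and separately lists the minority variants that the consensus hides. The sequences and function names are made up for illustration, and a real NGS pipeline involves read alignment and quality handling that are omitted here.

```python
from collections import Counter

def consensus(sequences):
    """Majority base at each position of equal-length, aligned sequences."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*sequences))

def minor_variants(sequences):
    """Map (position, base) to the frequency of each non-majority base."""
    variants = {}
    for pos, column in enumerate(zip(*sequences)):
        counts = Counter(column)
        major = counts.most_common(1)[0][0]
        for base, n in counts.items():
            if base != major:
                variants[(pos, base)] = n / len(column)
    return variants

# Nine cells carry ACGT; one rare cell carries a G-to-T change at position 2.
population = ["ACGT"] * 9 + ["ACTT"]
averaged = consensus(population)   # the rare variant does not appear here
hidden = minor_variants(population)
```

In this toy population the consensus is identical to the majority genome, so the one-in-ten variant is visible only when the individual sequences are inspected.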
Automated fluorescence microscopy–based screening approaches have become a standard tool in systems biology, usually applied in combination with exogenous regulation of gene expression in order to examine and determine gene function. Gain of function can be created by introducing cDNAs encoding the gene of interest that can be either untagged or tagged for the visualization of the recombinant protein and its subcellular localization (e.g., green fluorescent protein [GFP]-tagged; Temple et al., 2009). After the discovery of RNA interference (RNAi) in the late 1990s and the development of mammalian short interfering RNA (siRNA) and short hairpin RNA (shRNA) libraries in the early 2000s, gene knockdown technologies became mainstream for loss-of-function screens on a large or genome-wide scale (Heintze et al., 2013). To date, genome-wide siRNA libraries are still the main application in genomic high-throughput screening, although key problems of the RNAi technology have become apparent, such as off-target effects and variable levels of knockdown efficiency, which result in low confidence in the hits of screening campaigns. In order to overcome these limitations, alternative methods for manipulation of gene expression have been developed; these predominantly rely on gene excision.
Biomolecules and biopolymers undergo conformational transitions during many biological processes. For example, some proteins are observed to have multiple intermediate states in the folding/unfolding pathways (Stigler et al., 2011; Yu et al., 2012); intrinsically disordered proteins can form diverse metastable structures (Neupane et al., 2014); functional proteins can often be switched between active and inactive states through conformational transitions (Yang et al., 2003; Hanson et al., 2007; Wijeratne et al., 2013); and nucleosomes are able to regulate DNA unwrapping through their conformational transitions (Ngo et al., 2015). These dynamic states of DNA and proteins control their biological functions. Since force plays a fundamental role in many, if not all, biological systems, one way to reveal the dynamics of the molecules is to elucidate their intra- and intermolecular forces, which can be used as markers to capture information about their conformational changes.
Protein secretion studies started in the 1950s with George Palade’s electron microscopy (EM) work (Palade, 1952, 1975). Protein secretion is a highly relevant process because more than 30 percent of synthesized proteins work in organelles or outside the cells (Arora and Tamm, 2001). In eukaryotic cells, the proteins secreted to the exterior are synthesized in the cytoplasm and transported into the endoplasmic reticulum (ER), then pass to the Golgi apparatus and finally to secretory vesicles. In the 1970s, Blobel and Sabatini discovered signal sequences at the N-terminus of secretory proteins that allow them to be recognized by receptors, thus mediating and facilitating their entry into the ER interior (Blobel and Dobberstein, 1975; Sabatini et al., 1982). Proteins enter the ER lumen through a protein-conducting channel formed by a protein complex known as the translocon, which was discovered in yeast in Randy Schekman’s laboratory and is universally conserved (Deshaies et al., 1991). In eukaryotic cells, the translocation of proteins into the ER lumen is carried out by the Sec61 complex (Rapoport, 2007; Zimmermann et al., 2011), whereas the bacterial homologue is the heterotrimeric SecY complex, which allows the secretion of proteins to the exterior (Park and Rapoport, 2012).