INTRODUCTION: FROM EDMAN SEQUENCING TO MASS SPECTROMETRY
Prior to mass spectrometry, Edman degradation was the only technique for obtaining the sequence information of proteins. Edman sequencing was based on the chemical reaction of the N-terminal amine with phenyl isothiocyanate, leading to a phenylthiocarbamoyl derivative; the derivatised terminal residue was cleaved off upon acidification and identified by chromatography or electrophoresis. This was a slow process that identified one amino acid per reaction cycle. In addition, it required that the N-terminus of the protein of interest was not blocked. However, most intact proteins, if they are not processed from a secretory or pro-peptide form, are blocked at the N-terminus, most commonly with an acetyl group. Other forms of amino-terminal blocking include fatty acylation, such as myristoylation or palmitoylation. Cyclisation of glutamine to a pyroglutamyl residue and other post-translational modifications of N-termini also occur. In short, all these modifications leave the N-terminal residue without a free proton on the alpha nitrogen, so Edman chemistry cannot proceed. Nowadays, Edman sequencing plays a minor role in protein analysis and has been surpassed by biological mass spectrometry techniques.
The revolution in biological mass spectrometry was largely enabled by the soft ionisation techniques ESI and MALDI (see Sections 15.2.4 and 15.2.5), which allowed the large-scale analysis of biomolecules (proteins, peptides, oligonucleotides, oligosaccharides and lipids) and thereby transformed the areas of proteomics and metabolomics (Chapter 22). In contrast to electron impact (EI), these soft ionisation techniques produce molecular ions and only insignificant amounts of fragment ions. Therefore, in order to obtain structural sequence information on biomolecules, tandem MS (or MS/MS) was developed. Furthermore, the greater speed and sensitivity of tandem MS soon eclipsed the sequencing turnaround achievable by Edman degradation.
DIGESTION
The identification of proteins by mass spectrometry usually involves protease cleavage, most commonly with trypsin. Trypsin cleaves after lysine and arginine residues, both of which have basic side chains (an amino and a guanidino group, respectively). Owing to this specificity, tryptic peptides usually have basic groups at both termini: the free amine at the N-terminus and the lysine or arginine side chain at the C-terminus. This results in a large proportion of high-energy doubly charged positive ions that are easily fragmented. The digestion of the protein into peptides is followed by identification of the peptides by tandem mass spectrometry (Section 21.3). This approach is commonly referred to as bottom-up or shotgun proteomics.
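The cleavage rule described above can be illustrated with a minimal in silico digest. This is only a sketch under stated assumptions: the function name tryptic_digest and the example sequence are hypothetical, missed cleavages are ignored, and the optional skip_before_proline flag reflects a commonly applied convention (no cleavage when the next residue is proline) that is not stated in the text above.

```python
def tryptic_digest(sequence, skip_before_proline=False):
    """Split a one-letter protein sequence after every K or R.

    skip_before_proline is an assumption, not taken from the text:
    when True, a K or R followed by P is not treated as a cleavage site.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
            if skip_before_proline and next_residue == "P":
                continue  # leave this site uncleaved
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Anything left over is the C-terminal peptide of the protein,
    # which need not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence, chosen only to show the rule.
example = "AGKLMRPEWKTS"
print(tryptic_digest(example))
print(tryptic_digest(example, skip_before_proline=True))
```

Every peptide except the last one ends in lysine or arginine, which is the C-terminal basic residue referred to above. Real digests also contain missed cleavages, which this sketch does not model.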