The structures of prokaryotic and eukaryotic genes are different
Both prokaryotic and eukaryotic genes have promoter sequences with a TATA box 5′- to the sites of initiation of transcription. The TATA box generally starts about 10 nt 5′- to this site in prokaryotes and about 30 nt from this site in eukaryotes, though in yeast it is usually further away, at 40–100 nt. It is not an invariant feature of eukaryotic promoters, but is always present in prokaryotes, where its actual sequence may influence the rate of initiation of transcription. In general, the more a TATA box deviates from the consensus sequence, the fewer mRNA transcripts are made, and the lower the level of the encoded protein.
There is no sequence in eukaryotes corresponding to the prokaryotic –35 box, but eukaryotic genes typically possess a number of other upstream sequences to which trans-acting protein factors bind to regulate transcription.
The untranslated sequences 5′- to the initiation codon tend to be shorter in prokaryotes than in eukaryotes, where their length is very variable, ranging from just a few nucleotides to a thousand or more. In prokaryotes the conserved Shine–Dalgarno sequence immediately 5′- to the initiation codon is used to position the mRNA on the ribosome by base-pairing. In eukaryotes, the methylguanosine cap at the extreme 5′-end of their mRNAs is required for binding to the ribosome. A well-conserved sequence immediately around the initiation codon presumably has a different role.
In many investigations of DNA, fragments have been excised from their natural chromosomal sites and incorporated into larger pieces of DNA that can replicate autonomously in various kinds of cells. These will produce copies of the original fragments in large enough quantities and in a pure enough state for further study.
DNA can be obtained for these purposes in three main ways:
By enzymic synthesis on an RNA template using the enzyme reverse transcriptase (Chapter 3.5) to produce complementary DNA.
By the action of restriction endonucleases (Chapter 3.7) for hydrolytic cleavage at specific sites.
By chemical synthesis that can be automated to produce oligonucleotides 50–100 nt long in a reasonable time. This method only makes fragments smaller than those produced by the other methods, some of which can yield pieces of DNA up to at least 60,000 nt long.
DNA made in this way is ligated to the DNA of an independently replicating vector that will grow when introduced into appropriate cells (Chapter 3.2) to yield clones of cells infected with a particular type of DNA. Hence the technique is often referred to as cloning of DNA. The artificially produced DNA is called recombinant DNA since it will have been combined from two sources for its production.
The polymerase chain reaction is a very elegant method for synthesising relatively large quantities of a particular DNA sequence.
Haemoglobin consists of four polypeptide chains of two different though similar types, which are folded round each other in an orderly and compact fashion. The two types of chain, designated α and β, are present in equal amounts. They show a very substantial degree of homology – in humans 43% of the residues are identical. During synthesis of haemoglobin these polypeptides (known as globins) are made first and then each binds a molecule of haem very firmly.
Several different β-like globins are synthesised at different stages of life (Table 10.1). In the early embryo, the ε-chain is synthesised in the yolk sac. Later, synthesis switches to the foetal liver which makes two forms of the γ-chain, known as Aγ and Gγ, since they contain either an alanine or a glycine residue at one particular position. Finally, just before birth, β-chain synthesis commences in the bone marrow, along with a very small amount of the almost identical δ-chain that differs from the β-chain in only ten residues.
The δ-gene is transcribed at a much lower efficiency than the β-gene so that δ-chain mRNA is produced in much smaller amounts than the β-chain mRNA. The sequences of the two genes differ in the 5′-flanking regions (Fig. 10.1), but it is not possible to pinpoint the reasons for their different rates of transcription.
The classic experiments of Avery in 1944 demonstrated that DNA (deoxyribonucleic acid) passes genetic information from one bacterium to another. Strain-specific properties of related bacteria could be transferred by DNA that was free of proteins and other substances. DNA is a polymeric molecule built up from only four similar but distinct monomers – nucleotides that are the 5′-phosphates of deoxyguanosine (dGMP), deoxyadenosine (dAMP), deoxycytidine (dCMP), and thymidine (TMP) (Fig. 1.1), joined by phosphodiester linkages between the 3′- and 5′-positions of successive deoxyribose moieties. The initial letters of the bases in the nucleotides are used as abbreviations when writing out their sequence in DNA. The symbols N, R and Y denote any nucleotide, a purine nucleotide and a pyrimidine nucleotide respectively.
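The single-letter notation lends itself to simple pattern matching. The following is a minimal Python sketch, not taken from the original text, of how a sequence written 5′ to 3′ might be checked against a pattern containing the symbols N, R and Y. The names `SYMBOLS` and `matches` are illustrative assumptions.

```python
# Bases allowed by each symbol. A, C, G and T stand for themselves;
# R, Y and N follow the definitions given in the text.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine nucleotide
    "Y": "CT",    # pyrimidine nucleotide
    "N": "ACGT",  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if `sequence` fits `pattern` position by position.

    Both strings are read in the conventional 5'->3' direction and
    must have the same length.
    """
    if len(pattern) != len(sequence):
        return False
    return all(base in SYMBOLS[symbol]
               for symbol, base in zip(pattern, sequence))


if __name__ == "__main__":
    print(matches("GRNY", "GACT"))  # A is a purine, T a pyrimidine
    print(matches("GRNY", "GCCT"))  # C is not a purine
```

Because the comparison is positional, the direction in which a sequence is written matters, which is why the 5′ to 3′ convention is kept throughout.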
DNA is a polar helical molecule
One end of a DNA molecule has a phosphoryl radical on the C-5′ of its terminal nucleotide, while the other end possesses a free -OH on the C-3′ of its nucleotide. Thus a polynucleotide exhibits polarity in an analogous way to that of proteins with free -NH2 and -COOH groups at their ends. The tetranucleotides TCGA and AGCT are different chemical entities with distinct properties, even though they behave very similarly in many respects (Fig. 1.2). By convention, sequences of DNA are written with the nucleotide containing the free phosphoryl radical at the left. Sequences to the left of a given nucleotide are said to be on the 5′-side (often called upstream), and those to the right are said to be on the 3′-side (often called downstream).
The genomes of retroviruses consist of fairly short (5–9 kbp) strands of RNA containing a very limited number of genes. They are encapsulated in a protein coat that is encoded by two of these genes. The gag gene directs the synthesis of its core protein, and the env gene codes for the glycoprotein spike on the surface of the envelope. They may also carry oncogenes in their genome.
Replication of these viruses is initiated by a reverse transcriptase, encoded by their pol gene. In infected cells the host's RNA polymerase transcribes this gene, producing an mRNA that directs the synthesis of reverse transcriptase on the host's own ribosomes. This enzyme transcribes the viral RNA into a single-stranded DNA molecule to direct the synthesis of a complementary strand of DNA making use of the host's DNA polymerase. This duplex DNA is integrated into the host's genome as a provirus at many sites with duplication of 4–6 bp of the host DNA sequence at each end of the insertion. The provirus can be replicated or transcribed in the usual way, resulting in either the spread of the viral sequence among the host's cells or the production of new virus particles.
The 5′-end of retrovirus RNA contains a cap structure (Chapter 8.2), followed by a short sequence of up to 80 nt (R) that is also found at the 3′-end.
Prokaryotes have a single chromosome consisting of circular double-stranded DNA. The size may vary considerably, e.g. the E. coli chromosome contains about 3.8×10⁶ bp, while that of Bacillus subtilis has 2×10⁶ bp and that of Salmonella typhimurium has 10.5×10⁶ bp. If the E. coli chromosome were in a linear extended form it would be about 1 mm long, but it is a fairly compact structure due to supercoiling of the DNA. This must be opened up to allow access for enzymes involved in replication or transcription in complex processes requiring several proteins, including helicases that relax supercoiled DNA, and single-strand binding protein (SSBP) (Chapter 1.3). Other proteins that are involved in these processes were originally identified by genetic means and are named after the genes that encode them (e.g. dnaB, dnaC, for the genes; DnaB, DnaC for the proteins).
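The figure of about 1 mm for the extended E. coli chromosome can be checked with a short calculation. The sketch below assumes a rise of 0.34 nm per base pair, the usual value for B-form DNA; that value is not stated in the text, and the function name is illustrative.

```python
# Assumption (not given in the text): B-form DNA rises about 0.34 nm
# along the helix axis for every base pair.
RISE_PER_BP_NM = 0.34
NM_PER_MM = 1e6


def contour_length_mm(base_pairs: float) -> float:
    """Length of fully extended double-stranded DNA in millimetres."""
    return base_pairs * RISE_PER_BP_NM / NM_PER_MM


if __name__ == "__main__":
    # Genome size quoted in the text for E. coli.
    print(f"{contour_length_mm(3.8e6):.2f} mm")
```

With the genome size quoted above, the result is a little over 1 mm, in line with the estimate in the text.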
Replication
This has been studied in simplified systems with the DNA from phages like ϕX174 and the plasmid pBR322 that have specific replication origins. A multisubunit complex of primase (Pri) proteins is built up on these sites, known as primosome assembly sites. PriA binds to a single-stranded hairpin structure and acts as a 3′ → 5′ helicase, translocating along double-stranded DNA with the concomitant hydrolysis of ATP, and PriB is bound to this complex.
Immunoglobulins (Igs), which are antibodies, are proteins consisting of four polypeptide chains – two identical light (L) chains of about 220 amino acid residues, and two identical heavy (H) chains containing between 450 and 600 amino acid residues. Each L chain consists of two domains of approximately equal size. The N-terminal domain is variable in amino acid sequence, and is different in each individual L chain that has so far been sequenced. The C-terminal domain is constant in sequence, though there are two types that exhibit a high degree of homology, known as κ and λ. Either one, but not both, may occur in one Ig molecule.
The H chains contain four or five domains, each with about 110 residues. Again, the N-terminal domain is variable in sequence, and together with the N-terminal variable domain of the L chain forms one of the two antigen-binding sites in each molecule (Fig. 11.1). The hinge region, between the second and third domains, generally contains about 20 amino acid residues, and is devoid of secondary structure and therefore very flexible. There are five types of H chain – α, γ, δ, ε, μ – whose constant portions are quite distinct, though homologous. ε and μ have an extra domain. There are subclasses of α- and γ-chains. Intact Ig molecules are identified by suffixes of Latin letters corresponding to the Greek letter of the H chain (A, G, D, E, M).
Expression of the information in DNA is mediated by RNA
Genetic expression always involves the synthesis of another nucleic acid polymer called ribonucleic acid (RNA) that is composed of four nucleotide monomers – the monophosphates of adenosine (AMP), guanosine (GMP), cytidine (CMP), and uridine (UMP) (Fig. 2.1). These all contain the sugar ribose instead of deoxyribose found in DNA, and the individual nucleotides are linked together through 3′-, 5′-phosphodiester bridges, just as in DNA. Uracil, which is present rather than thymine, lacks the methyl group of the latter base. RNA is generally single-stranded, though its bases can pair by hydrogen bonding to give hairpin-like or stem structures (Fig. 2.2). Guanine occasionally pairs with uracil, though this pairing is less stable than the more usual A–U and G–C pairs. RNA molecules are much smaller than DNA molecules, and only comparatively short stretches of DNA are used to direct the synthesis of individual RNAs.
The stability of stem structures can be calculated in terms of the energy required to open them up. To a first approximation, this is the sum of the energies needed to break the individual hydrogen bonds in the base-pairs in the stem, but they have to be considered in adjacent pairs because there is considerable dependence on neighbouring bases. An allowance also has to be made for loops at the end of a stem and for bulges in the stem if there are bases that are not paired with bases on the opposite side.
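As a rough illustration of the pairing rules that underlie such calculations, the Python sketch below checks whether two arms of an RNA hairpin can form a continuous stem, allowing the occasional G–U pair alongside A–U and G–C. It does not compute energies: that would require measured nearest-neighbour values, which are not given here. The function names and the example sequences are hypothetical.

```python
# Base pairs described in the text: A-U and G-C, plus the less stable G-U.
STANDARD = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def stem_pairs(five_prime_arm: str, three_prime_arm: str):
    """Pair the two arms of a hairpin stem.

    Both arms are written 5'->3'. The strands in a stem run antiparallel,
    so the 3' arm is read backwards when it is lined up with the 5' arm.
    """
    return list(zip(five_prime_arm, reversed(three_prime_arm)))


def is_stem(five_prime_arm: str, three_prime_arm: str) -> bool:
    """True if the arms are equal in length and every position pairs."""
    if len(five_prime_arm) != len(three_prime_arm):
        return False
    return all(pair in STANDARD or pair in WOBBLE
               for pair in stem_pairs(five_prime_arm, three_prime_arm))


def wobble_count(five_prime_arm: str, three_prime_arm: str) -> int:
    """Number of G-U pairs in the stem."""
    return sum(pair in WOBBLE
               for pair in stem_pairs(five_prime_arm, three_prime_arm))


if __name__ == "__main__":
    # Hypothetical arms flanking a loop: 5'-GGCAC ... GUGUC-3'
    print(is_stem("GGCAC", "GUGUC"), wobble_count("GGCAC", "GUGUC"))
```

A fuller treatment would score adjacent pairs together and subtract penalties for loops and bulges, as the paragraph above describes.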
Genes for sets of metabolically related enzymes are transcribed as one long message
In bacteria, genes specifying enzymes that are all part of a metabolic pathway are commonly transcribed as single units from adjacent lengths of DNA with only short non-coding stretches between them. Such ‘super-genes’ are known as operons and give rise to polycistronic mRNAs. Since translation immediately follows transcription this results in the rapid production of a set of functionally related enzymes in equivalent amounts – a process known as co-ordinate control. Some of these operons and the enzymes they encode are constitutive, being synthesised more or less constantly; others are subject to precise control, signalled by the presence or absence of metabolites in the cell and are called either inducible or repressible depending on whether they are switched on or off by a particular metabolite.
There are two classes of genes – structural genes encode information for either stable RNAs or mRNAs; regulatory genes regulate the transcription of structural genes but are not themselves transcribed. These regulatory genes are situated immediately upstream from the operon whose activity they control. In addition to the promoter there is another region, known as the operator, which is either directly adjacent to the promoter or overlaps it.
Negatively controlled inducible operons are not normally transcribed because a specific repressor protein is bound to the operator. Induction occurs when an inducer – a small molecule – binds to the repressor, altering its conformation so that it now dissociates from the operator and allows transcription to proceed.
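The logic of negative inducible control can be written as a small truth table. The following Python sketch is a deliberately simplified boolean model of the description above; the function names are illustrative, and real operons respond in a graded rather than all-or-nothing way.

```python
def repressor_on_operator(repressor_present: bool,
                          inducer_present: bool) -> bool:
    """The repressor stays bound to the operator unless the inducer
    has changed its conformation."""
    return repressor_present and not inducer_present


def operon_transcribed(repressor_present: bool,
                       inducer_present: bool) -> bool:
    """Transcription proceeds only when the operator is free."""
    return not repressor_on_operator(repressor_present, inducer_present)


if __name__ == "__main__":
    for repressor in (True, False):
        for inducer in (True, False):
            state = "on" if operon_transcribed(repressor, inducer) else "off"
            print(f"repressor={repressor!s:5} inducer={inducer!s:5} -> {state}")
```

In this model the operon is off only when the repressor is present and the inducer is absent, which is the normal resting state described in the text.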
Eukaryotes produce three different RNA polymerases for transcribing nuclear genes. These are all very large proteins with multiple subunits and molecular masses in the range of 500,000 to 700,000 daltons (Da). Their catalytic subunits with molecular masses of around 200,000 Da are among the largest known single polypeptide chains. Some of the subunits are common to two or even all three of the enzymes. Each of them transcribes a particular set of nuclear genes. Polymerase I (pol I or pol A), with 13 subunits, transcribes the genes for precursors to rRNAs; polymerase II (pol II or pol B) (12 subunits) transcribes genes into mRNAs encoding proteins; polymerase III (pol III or pol C) (15 subunits) transcribes genes for tRNAs and some other small RNAs. An RNA polymerase with a simpler structure that transcribes mitochondrial genes is encoded by a nuclear gene.
Each polymerase requires a number of transcription factors (TFs) in order to bind to the DNA template and initiate and maintain transcription. A protein called TATA binding protein (TBP) binds to the sequence known as the Goldberg–Hogness or TATA box (consensus sequence TATAAA – Table 7.1), usually situated about 30 bp 5′- to the major site of initiation of transcription of mRNAs by pol II. TBP is also a TF for genes transcribed by pol I and pol III even though they rarely contain a TATA box.
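A search for candidate TATA boxes can be sketched as a scan for the window that differs least from the consensus TATAAA. The Python below is an illustration only: the function names, the example sequences and the cut-off of one mismatch are assumptions, and a mismatch count is a crude stand-in for how TBP actually recognises DNA.

```python
CONSENSUS = "TATAAA"  # consensus sequence quoted in the text


def mismatches(site: str) -> int:
    """Number of positions at which `site` differs from the consensus."""
    return sum(a != b for a, b in zip(site, CONSENSUS))


def best_tata_site(upstream: str, max_mismatches: int = 1):
    """Return (position, site, mismatches) for the closest match.

    `upstream` is the sequence 5' to the initiation site, written 5'->3'.
    Positions are 0-based offsets into that string. Returns None when no
    window is within `max_mismatches` of the consensus (a hypothetical
    cut-off chosen for this example).
    """
    width = len(CONSENSUS)
    candidates = [
        (mismatches(upstream[i:i + width]), i)
        for i in range(len(upstream) - width + 1)
    ]
    if not candidates:
        return None
    score, position = min(candidates)  # fewest mismatches, then leftmost
    if score > max_mismatches:
        return None
    return position, upstream[position:position + width], score


if __name__ == "__main__":
    print(best_tata_site("GGCGCTATAAAGGCC"))  # made-up example sequence
    print(best_tata_site("GGCGCTATATAGGCC"))  # one deviation from consensus
```

Since TBP also serves promoters that rarely contain a TATA box, a result of None from such a scan would not by itself rule out transcription.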
Increasingly, medical research is dominated by DNA. The hunt is on for genes that can help to answer the really big questions, such as how does a single cell grow and develop into a complex body, and what really causes cells to grow into tumours sometimes. At the same time new DNA-based drugs and therapies are making their way from the research laboratory into the hospitals and general practitioners' surgeries.
Life, death and the cell
In California, there is a group of people who believe humans are meant to be immortal and that our bodies can last indefinitely if we only have the right psychological outlook! There is absolutely no scientific evidence for this, but the belief that the cells that make up our bodies are immortal was certainly once very widely accepted. Now our view of the life expectancy of cells has changed completely. According to the latest (and still controversial) research, the natural state of our cells is death, and only the constant prodding of genetic signals actually keeps them alive.
The myth of cell immortality originated with Nobel Prize winner Alexis Carrel, a French surgeon who was interested in organ transplantation and tissue culture. In 1912, he started to culture some cells from a chick's heart to see how long they would survive outside the animal's body. When they outgrew their culture vessel, they were divided up and transferred to new vessels – a process known as subculturing.
These words could only come from the mind of the world's most famous seducer: ‘Life is a wench that one loves, to whom we allow any condition in the world, so long as she does not leave us.’
The comment was penned in the last months of the life of Giovanni Casanova, the legendary lover of the 18th century. Fittingly, he met his end after no fewer than 11 bouts with venereal disease, and died because of complications from one of them. But he paid for his dalliances in more than just mortality. Because of these ailments, his amorous career actually stopped 13 years before his life expired. He spent those years eating food in the kitchens of European nobility, causing one biographer to quip: ‘since he could no longer be a god in the gardens, he became a wolf at the table.’
The father of countless illegitimate children was himself born a bastard in the Venice of 1725. Although he never fully established his paternity, he did know his mother, a famous actress of the day named Zanetta Farusi. He grew up in her household, half-brother to a number of other children whose paternity could never be fully established, either.
Casanova's first encounter with the sensuous life occurred at age 11, in the experienced arms of the woman who normally gave him a bath. His teenaged years were full of heterosexual explorations, where he fine-tuned his history-making skills of seduction.
The legend is that Joseph of Arimathea, the man to whose tomb the crucified Christ was brought, was actually Jesus' uncle. And a tin miner. He often liked to visit the mines and sometimes he would bring along his more famous Nephew. He remains the patron saint of undertakers and tin miners for these two reasons.
There is an alleged history of Joseph after the crucifixion of Christ, one that plays a mighty role in English folklore. He traveled with Mary Magdalene to the north of France. From there, he set sail to Britain and established a church at Glastonbury. He brought with him a certain chalice from the Last Supper, which was the Holy Grail of Arthurian legend. Since the Bards place King Arthur's own castle at Glastonbury, the birthing of the association – complete with The Quest – has a certain romantic logic to it. Joseph planted his own staff at Glastonbury, which became a hawthorn tree that was supposedly 1650 years old. It was said to miraculously blossom every year on Christmas Eve. This was the kind of Catholic nonsense that so incensed the Puritans. But, just in case, the Puritans chopped it down before they left for America. Purity and immortality figure greatly in these legends. Sir Galahad was the only knight of sufficient moral strength to find the Holy Grail. He became spiritually immortal (and physically unearthly) as a result.
By all accounts, the movie star Rudolph Valentino was the premier womanizer of the 1920s. H. L. Mencken once described him as ‘catnip to women.’ Many of the women simply called him ‘jerk.’
Intelligent, stunningly handsome, and acutely aware of both characteristics, the Italian-born movie star became one of the most enduring icons from the age of silent movies. He was also one of the silver screen's first male sex symbols. In the space of only five years, he had gone through three wives, countless lovers and a million broken hearts. Even so, the most startling characteristic about his career wasn't its physical intensity, but its amazingly short duration. Valentino starred in his first film at the age of 26. Five years later he was dead. As a result, this sexual supernova left to posterity only images of youth and vigor, producing an eerie timelessness that haunts the minds of many film buffs steeped in the lore of American cinema. This flexibility in our perception of aging, indeed the wobbliness of the very definition, is the focus of this chapter.
How Valentino died
The events that would take Rudolph Valentino's life started in his New York hotel suite. Witnesses relate that Valentino, lounging around his room on a lazy day in August 1926, felt a sudden, incredibly painful stabbing in his side. The pain persisted, but he refused to be hospitalized.
Molecular biology and the DNA revolution are already having an enormous impact on medicine. There is an increasing emphasis on the role of genes in disease, along with powerful new DNA technologies for exploring an individual's genome. At the same time, the molecular approach has given important insights into some of the toughest problems in biology: development, ageing and cancer. We even have the tools to change our genes or control their activity, using gene therapy.
The basics of human genetics and the different DNA testing methods are looked at in this chapter. The new medical research findings that have emerged from molecular biology, along with the prospect of DNA-based therapies, are reviewed in Chapter 8.
Human genetics – a basic guide
The DNA molecule at the heart of each cell in the human body is like a signature, unique to each individual. There is only one exception to this rule. Identical – or monozygotic – twins have identical genomes. The reason is that they develop from a single fertilised egg that splits into two embryos sometime during the first two weeks of a pregnancy. As we shall see, research on twins is important in trying to assess the relative contributions of genes and the environment to someone's physical and mental make-up.
Within the three billion base pair sequence of the human DNA molecule there is obviously plenty of scope for variation in the ordering of the four bases. At the same time, all human DNA molecules have a broad similarity, which lets us all function as members of the same species.