The Old Bailey Proceedings, 1674–1913: Text Mining for Evidence of Court Behavior

  • Tim Hitchcock and William J. Turkel

The shortest trial report in the Old Bailey Proceedings is precisely eight words in length. In February, 1685:

Elizabeth Draper, Indicted for Felony, was found Guilty.

1. See Old Bailey Proceedings Online (hereafter Proceedings) February 1685, trial of Elizabeth Draper (t16850225-31); and May, 1856, trial of William Palmer (t18560514-490), version 6.0 April 17, 2011.

2. The uniquely accurate character of this transcription is the result of the project's use of “double entry rekeying” to capture the original material up to 1834, and a combination of rekeying and optical character recognition (OCR) for the period 1834 to 1913. The resulting text is 99.99% accurate. This, in turn, allowed a complex XML tagging schema to be applied to the transcribed text, reflecting offense, verdict, punishment, and other categories. In contrast, the vast majority of historical resources have used an unchecked OCR methodology, which, when applied to historical materials, results in a significant level of error, making the resulting digital resources more difficult to use in text mining, and largely impossible to tag accurately for structured information. The accuracy rate across the whole of the British Library's Nineteenth Century Newspaper Project, for example, is 78% for characters, and 68.4% for whole words, implying that almost one in three words is mistranscribed. See Tanner, Simon, Muñoz, Trevor, and Ros, Pich Hemy, “Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library's 19th Century Online Newspaper Archive,” D-Lib Magazine, 15 (2009). Accessed July 28, 2016.

3. “Text Mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources. A key element is the linking together of the extracted information together to form new facts or new hypotheses to be explored further by more conventional means of experimentation.” Marti Hearst, “What is Text Mining?”, School of Information Management and Systems, University of California, Berkeley, October 17, 2003

4. M. Dorothy George, London Life in the Eighteenth Century, 2nd ed. (Harmondsworth: Penguin Books, 1966). Since the Proceedings were published on microfilm in 1984, with an extended introductory pamphlet by Michael Harris, their role, particularly for the eighteenth century, has become more significant and they have served as the primary evidential foundation for literatures on plebeian culture, crime and criminal justice, juvenile delinquency, popular and material culture, work patterns and industrialization, homosexuality, and the development of spoken language. See The Old Bailey Proceedings, Parts One and Two, 1714–1834 (Brighton: Harvester Microform 1984); Central Criminal Court Sessions Papers, 1816–1913 (Dobbs Ferry, NY: Trans-World Microforms, 1981); and The Old Bailey Proceedings: A Listing and Guide to the Harvester Microfilm Collection, introduction by Michael Harris (Brighton, Sussex: Harvester Microform, 1984). For the recent use of the Proceedings as the basis for the history of crime see, for example, Anthony Babington, A House in Bow Street: Crime and the Magistracy, London, 1740–1881 (London: Macdonald, 1969); Douglas Hay, Albion's Fatal Tree: Crime and Society in Eighteenth-Century England (London: Allen Lane, 1975); Heather Shore, Artful Dodgers: Youth and Crime in Early Nineteenth-Century London (Woodbridge: Royal Historical Society/Boydell Press, 1999); Frank McLynn, Crime and Punishment in Eighteenth-Century England (Oxford, New York: Oxford University Press, 1991); Hal Gladfelder, Criminality and Narrative in Eighteenth-Century England: Beyond the Law (Baltimore: Johns Hopkins University Press, 2001); and Andrew T. Harris, Policing the City: Crime and Legal Authority in London, 1780–1840 (Columbus: Ohio State University Press, 2004). For industrialization, male homosexuality, and linguistic change, see, for example, Hans-Joachim Voth, Time and Work in England 1750–1830 (Oxford: Clarendon Press; Oxford University Press, 2000); Rictor Norton, Mother Clap's Molly House: The Gay Subculture in England, 1700–1830 (London: GMP, 1992); Randolph Trumbach, Sex and the Gender Revolution, Volume One: Heterosexuality and the Third Gender in Enlightenment London (Chicago: University of Chicago Press, 1998); and Huber, Magnus, “The Old Bailey Proceedings, 1674–1834. Evaluating and Annotating a Corpus of 18th- and 19th-Century Spoken English,” Annotating Variation and Change (Studies in Variation, Contacts and Change in English 1) 10 (2008).

5. See, for example, John M. Beattie, Crime and the Courts in England 1660–1800 (Princeton: Princeton University Press, 1986); John M. Beattie, Policing and Punishment in London, 1660–1750: Urban Crime and the Limits of Terror (Oxford: Oxford University Press, 2001); David Jeffrey Bentley, English Criminal Justice in the Nineteenth Century (London: The Hambledon Press, 1997); Peter King, Crime, Justice and Discretion in England, 1740–1820 (Oxford: Oxford University Press, 2000); Norma Landau, ed., Law, Crime and English Society, 1660–1830 (Cambridge: Cambridge University Press, 2002); Langbein, John H., “The Criminal Trial Before the Lawyers,” University of Chicago Law Review 45 (1978): 263–316; Langbein, John H., “Shaping the Eighteenth-Century Criminal Trial: A View from the Ryder Sources,” University of Chicago Law Review 50 (1983): 1–136; Thomas A. Green, Verdict According to Conscience: Perspectives on the English Criminal Trial Jury, 1200–1800 (Chicago: University of Chicago Press, 1985); Douglas Hay, “The Class Composition of the Palladium of Liberty: Trial Jurors in the Eighteenth Century,” in Twelve Good Men and True: The Criminal Trial Jury in England, 1200–1800, ed. by James S. Cockburn and Thomas A. Green (Princeton: Princeton University Press, 1988), 305–57; Wiener, Martin J., “Judges v. Jurors: Courtroom Tensions in Murder Trials and the Law of Criminal Responsibility in Nineteenth-Century England,” Law and History Review 17 (1999): 467–506; Feeley, Malcolm, “Legal Complexity and the Transformation of the Criminal Process: The Origins of Plea Bargaining,” Israeli Law Review 31 (1997): 183–222; David J. A. Cairns, Advocacy and the Making of the Adversarial Criminal Trial, 1800–1865 (Oxford: Oxford University Press, 1998); Gallanis, Thomas P., “The Rise of Modern Evidence Law,” Iowa Law Review 84 (1999): 499–560; David Lemmings, Professors of the Law: Barristers and English Legal Culture in the Eighteenth Century (Oxford: Oxford University Press, 2000); and Allyson May, The Bar and the Old Bailey, 1750–1850 (Chapel Hill: University of North Carolina Press, 2003).

6. On the role of counsel in particular, see Beattie, John M., “Scales of Justice: Defense Counsel and the English Criminal Trial in the Eighteenth and Nineteenth Centuries,” Law and History Review 9 (1991): 221–67; Landsman, S., “The Rise of the Contentious Spirit: Adversary Procedure in Eighteenth-Century England,” Cornell Law Review 75 (1990): 498–609; John H. Langbein, The Origins of Adversary Criminal Trial (Oxford: Oxford University Press, 2005), ch. 3; Robert B. Shoemaker, “Representing the Adversary Criminal Trial: Lawyers in the Old Bailey Proceedings, 1770–1800,” in Crime, Courtrooms and the Public Sphere in Britain, 1700–1850, ed. D. Lemmings (Farnham: Ashgate, 2012), 71–91; and May, The Bar and the Old Bailey. This literature, although acknowledging the changing nature of the Proceedings as a text, has nevertheless largely relied on a straightforward count of textual references in relatively small samples of trials. See also Tim Hitchcock and Robert Shoemaker, London Lives: Poverty, Crime and the Making of a Modern City (Cambridge: Cambridge University Press, 2015), 180–91, 356–62.

7. Langbein, Origins, 190, reiterating, without revision, his judgement on the Proceedings originally made in 1978.

8. Most historians have accepted Langbein's observation that “if the Sessions Paper report ‘says something happened, it did; if the … report does not say it happened, it still may have.’” Langbein, Origins, 185, again quoting himself circa 1983.

9. The Trial of Elizabeth Canning, Spinster, for Wilful and Corrupt Perjury; at Justice Hall in the Old-Bailey … 1754 (London: John Clarke, 1754), 19–20, 104. Quoted in Huber, section

10. See Proceedings, September 1785, t17850914-163.

11. Devereaux, Simon, “City and the Sessions Paper: ‘Public Justice’ in London, 1770–1800,” Journal of British Studies 35 (1996): 500.

12. Proceedings, March 1895, John Sholto Douglas (t18950325-336); April 1895, Oscar Fingal O'Flahartie Wills Wilde and Alfred Taylor (t18950422-397); May 1895, Oscar Fingal O'Flahartie Wills Wilde and Alfred Waterhouse Somerset Taylor (t18950520-425).

13. For digital history, see Daniel J. Cohen and Roy Rosenzweig, Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web (Philadelphia: University of Pennsylvania, 2005); and Cohen, Daniel J., Frisch, Michael, Gallagher, Patrick, Mintz, Steven, Sword, Kirsten, Taylor, Amy Murrell, Thomas, William G. III, and Turkel, William J., “Interchange: the Promise of Digital History,” Journal of American History 95 (2008): 442–51.

14. Börner, Katy, “Plug-and-Play Macroscopes,” Communications of the ACM 54 (2011): 60–69; and Franco Moretti, Graphs, Maps, Trees: Abstract Models for a Literary History (London: Verso, 2005).

15. All programming for this project was done by Turkel in Mathematica 8 and 9 on Mac OS X. Mathematica is a proprietary development platform from Wolfram Research that is designed for technical computing. It integrates mathematical and scientific computing, visualization, data manipulation, and access to curated data, with the possibility of deploying documents that mix text and data with dynamic elements. See

16. The “tagging” was done in XML, and the files for each trial incorporating the full XML markup are available on the Old Bailey online site.

17. The “mining” of the Proceedings involved stripping out all XML markup, eliminating non-Latin and numeric characters, converting the text to lower case, and removing all punctuation. The resulting edition standardized the texts, facilitating counting words in a consistent manner. It should be noted, however, that there is a significant disagreement among linguists about what should count as a “word.” Is, for example, the formulation “John's” one word, or two (i.e., name + possessive marker)? Larry Trask, “What is a Word?” (2004), Department of Linguistics and English Language, University of Sussex Working Paper LxWP11/04, Accessed July 28, 2016.
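
The normalization steps described in note 17 can be sketched in a few lines. This is only an illustration in Python, not the authors' Mathematica code: the function name `normalize_trial_text` is hypothetical, tags are removed with a simple regular expression rather than an XML parser, and "Latin characters" is assumed here to mean the unaccented letters a to z.

```python
import re

def normalize_trial_text(xml_text: str) -> list[str]:
    """Return a list of lower-case word tokens from one marked-up trial report.

    Assumption: anything between '<' and '>' is markup, and only the
    unaccented letters a-z count as Latin characters.
    """
    # 1. Strip XML markup, leaving a space so words on either side of a tag stay apart.
    text = re.sub(r"<[^>]+>", " ", xml_text)
    # 2. Convert to lower case.
    text = text.lower()
    # 3. Drop digits, punctuation, and non-Latin characters in one pass.
    text = re.sub(r"[^a-z\s]", "", text)
    # 4. Split on whitespace to obtain countable "words".
    return text.split()

# Hypothetical markup around the eight-word report of Elizabeth Draper's trial.
draper = "<p>Elizabeth Draper, Indicted for <rs>Felony</rs>, was found Guilty.</p>"
print(len(normalize_trial_text(draper)))
```

Under these assumptions a form such as “John's” collapses to the single token `johns`, which is one possible answer to the question of what counts as a word, not the only defensible one.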

18. The relationship between the number of “defendants” and “trials” for all complete decades (1720–1910) averages 1.32 defendants per trial. This ratio is slightly more variable prior to the early nineteenth century. For 1720–1810 this ratio ranges between a low of 1.18 defendants per trial in the 1800s, and 1.58 in the 1750s. The ratio of defendants to trials from the 1810s onwards was more consistent, and ranged from 1.20 defendants per trial in the 1830s to 1.32 defendants per trial in the 1880s. The relationship between the number of “offenses” and the number of “trials” was more consistent in all periods, averaging 1.07 offenses per trial, and ranging from 1.02 offenses per trial in the 1810s, to 1.19 in the 1900s.

19. These subperiods were initially identified using an unsupervised algorithmic “clustering” technique; however, the final selection of the boundaries was left to the judgement of a human observer, following multiple iterations of different clustering visualizations. Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Introduction to Information Retrieval (Cambridge: Cambridge University Press, 2008).

20. The Proceedings have not survived or were not published for approximately one third of the sessions between 1674 and 1714. See Clive Emsley, Tim Hitchcock and Robert Shoemaker, “The Proceedings - Publishing History of the Proceedings,” Old Bailey Proceedings Online (version 7.0, 09 August 2016).

21. Magnus Huber, “Old Bailey Proceedings, 1674–1834,” and Shoemaker, Robert B., “The Old Bailey Proceedings and the Representation of Crime and Criminal Justice in Eighteenth-Century London,” Journal of British Studies 47 (2008): 559–80.

22. Langbein, Origins, 188. A measure of the relatively insecure and changeable nature of the role of shorthand reporter for the Proceedings can be found in Edmund Hodgson's subsequent decline into abject poverty, and his eventual death in the workhouse belonging to St Andrews Holborn. See The Monthly Magazine, or, British Register Vol.XXXIV, Part II. For 1812: 506.

23. Devereaux, “City and the Sessions Paper.”

24. Langbein, Origins, 190, 189.

25. For accounts of the changing volume and character of court business in the nineteenth century, see Clive Emsley, Crime and Society in England, 1750–1900, 4th ed. (Harlow: Longman, 2010), ch. 8; David Philips, Crime and Authority in Victorian England: The Black Country 1835–1860 (London: Croom Helm, 1977); Emsley, Clive and Storch, Robert D., “Prosecution and the Police in England since 1700,” Bulletin of the International Association for the History of Crime and Criminal Justice 18 (1993): 45–57. The best account of the changing volume of Old Bailey trials is David Bentley, English Criminal Justice in the Nineteenth Century (London: Hambledon Press, 1998), 55–56.

26. This is not to imply that what was recorded in the Proceedings reflects accurately what was said in court, but merely that the relationship between the two remained the same from 1810 to 1855. The exclusion of sexually explicit evidence from trial reports from 1787 onwards is one measure of the distance between courtroom evidence and trial report (and helps explain historians' relative uninterest in the nineteenth-century Proceedings). Simon Devereaux, “City and the Sessions Paper,” 481.

27. The Criminal Justice Act, 18 & 19 Vic. c.126 (1855) established summary jurisdiction on a clearly defined basis, allowing people charged with minor theft and other offenses to be convicted by two justices. This act was amended only slightly by the Summary Jurisdiction Act, 42 & 43 Vic. c.49 (1879). Emsley, Crime and Society, 216. For a statistical approach to the impact of this legislation, see Williams, Chris, “Counting Crimes or Counting People: Some Implications of Mid-Nineteenth Century British Police Returns,” Crime, Histoire & Sociétés/Crime, History & Societies 4 (2000): 77–93.

28. Shoemaker uses a sample of 271 trials drawn from the January sessions of 1720, 1730, 1740, 1750, 1760, and 1770, and Malcolm Feeley has created a larger sample of 3,500 trials (although Feeley charts a measure of “complexity” rather than trial length per se). Shoemaker, “Representation of Crime,” 559–80; and Feeley, “Legal Complexity,” 183. Both Simon Devereaux and John Langbein appeal to the changing character and length of trial reports, but do so on a more impressionistic basis. Devereaux, “City and the Sessions Paper,” 468; and Langbein, Origins, 188.

29. As the distributions of trial lengths are not normal, mean or average word length is misleading in almost all cases. For more information about the breakdown of the mean under departures from normality, see Rand R. Wilcox, Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy, 2nd ed. (New York: Springer, 2010).
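
A small worked example shows why the mean misleads for skewed data of this kind. The word counts below are invented for illustration and are not drawn from the Proceedings; the helper name `summarize_lengths` is likewise hypothetical.

```python
from statistics import mean, median

def summarize_lengths(word_counts: list[int]) -> dict[str, float]:
    """Return the mean and median of a list of trial lengths (in words)."""
    return {"mean": mean(word_counts), "median": median(word_counts)}

# Hypothetical sample: mostly short reports plus one very long trial.
sample_lengths = [8, 12, 20, 35, 60, 150, 400, 150000]

summary = summarize_lengths(sample_lengths)
print(summary["mean"])    # dominated by the single 150,000-word trial
print(summary["median"])  # close to the length of a typical report
```

In this invented sample the one very long trial pulls the mean above 18,000 words, although seven of the eight reports are 400 words or fewer; the median of 47.5 words describes the typical report far better.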

30. Börner, Katy, “Plug-and-Play Macroscopes,” Communications of the ACM 54 (2011): 60–69.

31. The use of a logarithmic scale in this chart substantially affects how we read the data. It gives, for example, trials between 10 and 100 words in length the same vertical span as trials between 1,000 and 10,000 words. This has the effect of understating the differences in trial length at the upper end of the range while overstating the differences at the lower end, so that the apparent difference between trials of 10 and 60 words is equivalent in this figure to the difference between trials of 10,000 and 60,000 words.
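
The equivalence described in note 31 follows from the fact that a logarithmic axis plots ratios, not absolute differences. A minimal check in Python, assuming a base-10 scale (the note does not state the base, and the function name `log_axis_distance` is hypothetical):

```python
import math

def log_axis_distance(shorter: float, longer: float) -> float:
    """Vertical distance between two trial lengths on a base-10 logarithmic axis."""
    return math.log10(longer) - math.log10(shorter)

# A 50-word gap and a 50,000-word gap occupy the same vertical distance,
# because both pairs differ by the same ratio (a factor of six).
low_end = log_axis_distance(10, 60)
high_end = log_axis_distance(10_000, 60_000)
print(math.isclose(low_end, high_end))
```

The choice of base only rescales the axis; any logarithm would give the same equivalence between the two pairs.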

32. Proceedings, December 3, 1729, 17291203-1; and Shoemaker, “Representation of Crime”: 566.

33. Proceedings, December 7, 1748, f17481207-1.

34. John Langbein mentions the public announcement of this policy change, but does not describe its impact. Langbein, Origins, 186. The policy announcement is published on the title page of all issues of the Proceedings from December 7, 1748 to April 25, 1750; however, the 4 d price continues to be advertised until June 25, 1761, through the proprietorship of five different printers and ten different Lord Mayors. By the October 21, 1761 issue, the advertised price had risen to 6 d.

35. Devereaux, Simon, “The Fall of the Sessions Paper: Criminal Trial and the Popular Press in Late Eighteenth-Century London,” Criminal Justice History, 18 (2002): 58, 71; Shoemaker, “Representation of Crime,” passim; and Langbein, Origins, 183–90.

36. Langbein, Origins, 188.

37. Feeley, “Legal Complexity,” 194.

38. Haller, Mark, “Plea Bargaining: The Nineteenth Century Context,” Law & Society Review 13 (1979): 273–79.

39. These categories of crime are taken from the Old Bailey Online XML markup, and include a variety of subcategories. Their application to specific trials was undertaken as part of the original development of the web site, and reflects the project's retrospective historical judgement. See Emsley et al, “About this project,” Proceedings.

40. Shoemaker, “Representation of Crime,” 578–80.

41. Devereaux, “City and the Sessions Paper,” 468.

42. See Klingenstein, Sara, Hitchcock, Tim, and DeDeo, Simon, “The Civilizing Process in London's Old Bailey,” Proceedings of the National Academy of Sciences 111 (2014): 9419–24.

43. For the period from January 1801 to the end of the Proceedings, there were 1,662 trials for “Rape” and 978 for “Sodomy” out of 145,031 trials in total.

44. 41 George III, c.39. McGowen, Randall, “Managing the Gallows: the Bank of England and the Death Penalty, 1797–1821,” Law and History Review 25 (2007): 241–82; and Deirdre Palk, Gender, Crime and Judicial Discretion, 1780–1830 (Woodbridge: Boydell Press, 2006), 99–101.

45. McGowen provides national returns for prosecutions led by the Bank of England in forgery cases under the new act, which suggest that the bank was the leading agency in the process of developing “guilty pleas” from the mid 1800s. However, it should be noted that at the Old Bailey, the first substantial set of “guilty pleas” for forgery is recorded in 1813, 3 years after the first large batch (27) of “guilty pleas” recorded in theft cases in 1810. McGowen, “Managing the Gallows,” Table 1. For theft cases see, for example, Proceedings, Anne Cotterell, t18100110-7.

46. Thomas Wontner, Old Bailey Experience. Criminal Jurisprudence and the Actual Working of Our Penal Code of Laws. Also, an Essay on Prison Discipline (1833), 56.

47. The history of the criminal justice system has long been dogged by the “dark figure of unrecorded crime,” which ensures that the relationship between levels of prosecution and crime itself is impossible to establish. For a recent survey of the literature on this problem see Peter King, Crime and Law in England, 1750–1840: Remaking Justice from the Margins (Cambridge: Cambridge University Press, 2006).

48. In the period prior to 1734, a total of 731 “guilty pleas” were recorded, of which 431 resulted in the defendant being sentenced to branding, whereas no punishment was recorded in a further 179 cases. The legal difficulties of accepting a guilty plea in this early period were rehearsed in the trial of Mary Aubry for the murder of her husband in 1688, at which the court explicitly advised her not to enter a guilty plea on the grounds that on a charge of murdering her husband, her death by burning would automatically follow. She nevertheless refused, pleaded guilty, and was sentenced to be burned for “petty treason.” Proceedings, February 1688, Mary Aubry, t16880222-24.

49. Sir Matthew Hale, Historia Placitorum Coronae: The History of the Pleas of the Crown, ed. Sollom Emlyn, 2 vols. (London, 1736; repr., Classical English Law Texts, London: Professional Books, Ltd., 1971); quoted in Langbein, John H., “Torture and Plea Bargaining,” University of Chicago Law Review 46 (1978): 4.

50. Mark Haller, “Plea Bargaining,” 273–79. The role of capital punishment changed more gradually than this suggests; its use for property crime in particular, being substantively reduced in 1826–27, and largely abolished in 1837, before being comprehensively reformed in 1841.

51. Wontner, Old Bailey Experience, 60.

This article derives from a “Digging into Data” research project, “Datamining with Criminal Intent,” jointly funded by the Joint Information Systems Committee (JISC), National Endowment for the Humanities (NEH), and Social Sciences and Humanities Research Council (SSHRC). The authors thank all of the funders for their support, and their collaborators for their direct contribution to this work, including Dan Cohen, Frederick Gibbs, Geoffrey Rockwell, Jörg Sander, Robert Shoemaker, Stéfan Sinclair, Sean Takats, Cyril Briquet, Jamie McLaughlin, Milena Radzikowska, John Simpson, and Kirsten C. Uszkalo. The authors also thank the Old Bailey Online Team at the University of Sheffield and the Old Bailey Corpus Team at Giessen University; and in particular Sharon Howard, Magnus Huber and Magnus Nissel.

Law and History Review
  • ISSN: 0738-2480
  • EISSN: 1939-9022
  • URL: /core/journals/law-and-history-review