
THE LAW'S AVERSION TO NAKED STATISTICS AND OTHER MISTAKES

Published online by Cambridge University Press: 26 July 2022

Ronald J. Allen
Affiliation:
Northwestern University Pritzker School of Law, Chicago, Illinois, United States; International Association of Evidence Science, Beijing, China; Forensic Science Institute, China University of Political Science and Law, Beijing, China
Christopher K. Smiciklas
Affiliation:
Northwestern University Pritzker School of Law, Chicago, Illinois, United States

Abstract

A vast literature has developed probing the law's aversion to statistical/probability evidence in general and its rejection of naked statistical evidence in particular. This literature rests on false premises. At least so far as US law is concerned, there is no general aversion to statistical forms of proof and even naked statistics are admissible and sufficient for a verdict when the evidentiary proffer meets the normal standards of admissibility, the most important of which is reliability. The belief to the contrary rests upon a series of mistakes: most importantly, mismodeling of the structure of legal systems and the nature of common law decision making. Contributing to these mistakes is the common methodology in this literature of relying on weird hypotheticals that mismodel the underlying legal relations and contain impossible epistemological demands. Collectively, these phenomena have distracted attention from issues that actually affect real legal systems.

Type
Research Article
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press

Footnotes

*

We are indebted to Marcello di Bello, Andrew Jurs, Jonathan J. Koehler, Michael Pardo, Michael Risinger, and Zhuhao Wang for comments on an earlier draft, and for comments received during presentations to the North Sea Group, the first Michele Taruffo Girona Evidence Week, and the Vanderbilt Evidence Summer Workshop. We are also thankful for the insightful comments of two anonymous reviewers. An earlier draft of this article was submitted by Christopher K. Smiciklas as partial fulfillment of the Senior Research Program at Northwestern University Pritzker School of Law.

References

1. See generally Ronald J. Allen & Brian Leiter, Naturalized Epistemology and the Law of Evidence, 87 Va. L. Rev. 1491 (2001).

2. David Enoch, How to Theorize About Statistical Evidence (and Really, About Everything Else): A Comment on Allen, 2 Quaestio Facti 285, 288 (2021).

3. See, e.g., Ronald J. Allen & Michael S. Pardo, Relative Plausibility and Its Critics, 23 Int'l J. Evidence & Proof 5 (2019); Ronald J. Allen, The Nature of Juridical Proof: Probability as a Tool in Plausible Reasoning, 21 Int'l J. Evidence & Proof 133 (2017).

4. The literature is vast. For a sampling of just the most recent contributions, see, e.g., John Hawthorne, Yoaav Isaacs & Vishnu Sridharan, Statistical Evidence and Incentives in the Law, 31 Phil. Issues 128 (2021); Giada Fratantonio, Evidence, Risk, and Proof Paradoxes: Pessimism About the Epistemic Project, 25 Int'l J. Evidence & Proof 307 (2021); Lewis Ross, Legal Proof and Statistical Conjunctions, 178 Phil. Stud. 2021 (2021); Xiaofei Liu & Ye Liang, What It Means to Respect Individuality, 178 Phil. Stud. 2579 (2021); Christian Dahlman & Amit Pundik, The Problem with Naked Statistical Evidence, in Philosophical Foundations of Evidence Law (Christian Dahlman, Alex Stein & Giovanni Tuzet eds., 2021); Marcello Di Bello, When Statistical Evidence Is Not Specific Enough, 199 Synthese 12251 (2021); David Enoch & Levi Spectre, Statistical Resentment, or: What's Wrong with Acting, Blaming, and Believing on the Basis of Statistics Alone, 199 Synthese 5687 (2021); Elizabeth Jackson, Belief, Credence, and Evidence, 197 Synthese 5073 (2020).

5. David Enoch & Talia Fisher, Sense and “Sensitivity Analysis”: Epistemic and Instrumental Approaches to Statistical Evidence, 67 Stan. L. Rev. 557, 559–560 (2015).

6. Ross, supra note 4, at 2021–2022.

7. Martin Smith, When Does Evidence Suffice for Conviction?, 127 Mind 1193, 1195 (2018).

8. Michael Blome-Tillmann, Sensitivity, Causality, and Statistical Evidence in Courts of Law, 4 Thought 102, 104 (2015).

9. Smith, supra note 7, at 1200 (emphasis in original). Smith acknowledges that his arguments “may have some revisionary implications.” Id.

10. See Christian Dahlman, Naked Statistical Evidence and Incentives for Lawful Conduct, 24 Int'l J. Evidence & Proof 162, 177 (2020); see also Enoch, supra note 2, at 288.

11. Safety, sensitivity, and normalcy are variables that go to the credibility of witnesses and the weight of the evidence. These variables are routinely examined at trial. It is commonplace to examine witnesses as to their perceptual context and abilities at the time in question and the likelihood that they misperceived or misremembered, which involves both sensitivity and safety. When a trial judge instructs a jury to rely on their common sense and general knowledge, the judge is incorporating a normalcy criterion. Those who have never sat through a trial in the United States and are interested will find a complete transcript of a trial as the first chapter of Ronald J. Allen, Eleanor Swift, David S. Schwartz, Michael S. Pardo & Alex Stein, An Analytical Approach to Evidence: Text, Problems, and Cases (6th ed. 2016), at 2–84. One of the participants in the philosophical debates is attuned to the practical significance of the variables being examined. See generally Michael S. Pardo, Safety v. Sensitivity: Possible Worlds and the Law of Evidence, 24 Legal Theory 50 (2018).

12. That liability rules are designed to affect behavior is obvious. So, too, are various evidence rules. In addition, evidence rules and liability rules interrelate. The challenge for a legal system is to optimize the sum of primary behavior and litigation behavior. For a discussion, see Ronald J. Allen, Burdens of Proof, 13 Law, Probability & Risk 195, 204 (2014).

13. Apparently, there is a debate within professional epistemology somewhat analogous to the motivation for this article. See Jennifer Rose Carr, Why Ideal Epistemology?, Mind (Oct. 8, 2021), https://doi.org/10.1093/mind/fzab023.

14. At the risk of going too far into the intricacies of American evidence law, admissibility decisions are based on an extremely low threshold of relevancy. A proffer is relevant if any reasonable person could be influenced by the evidence on a material proposition. Ronald J. Allen, David S. Schwartz, Michael S. Pardo & Alex Stein, An Analytical Approach to Evidence: Text, Problems, and Cases (7th ed. 2021), at 69 (“FRE 401 . . . is a minimal test of logically probable inferences from the offered item to a fact of consequence. The judges will find evidence relevant if a reasonable juror could think that it makes a fact of consequence even slightly more or less likely than it would be were the evidence not known.”). Many probabilistic/statistical proffers meet this standard, and the ultimate argument often is over whether such evidence is sufficient for a verdict. As we demonstrate in this article, the answer to that question is unequivocally yes so long as the evidence is shown to be reliable. Sufficiency means that the evidence allows a reasonable person to conclude by the standard of proof the truth of the necessary elements. See id. at 73. In all cases, the proponent of evidence must establish its conditions of admissibility, in particular its reliability. With “probability” statements, the proponent must demonstrate how the probability judgment was reached, why it is actually probative, and how it maps onto the case at hand, which we demonstrate below in our case discussion. With respect to statistical evidence, again the proponent must show its reliability, which can be done in a variety of ways, such as showing a valid research design, showing that data were reliably collected, and applying appropriate statistical tests to the data (determining significance levels, running power tests, etc.). See, e.g., Michael J. Saks, David L. Faigman, David H. Kaye & Joseph Sanders, Annotated Reference Manual on Scientific Evidence, 83 Reference Guide on Statistics (Westlaw) (“Likewise, since most statistical methods relied on in court are described in textbooks and journal articles and are capable of producing useful results when carefully and appropriately applied, such methods generally satisfy important aspects of the ‘scientific knowledge’ requirement articulated in Daubert . . . Of course, a particular study may use a method that is entirely appropriate, but so poorly executed that it should be inadmissible under Federal Rules of Evidence 403 and 702. Or, the method may be inappropriate for the problem at hand and thus lacks the ‘fit’ spoken of in Daubert. Or, the study may rest on data of the type not reasonably relied on by statisticians or substantive experts, and hence run afoul of Federal Rule of Evidence 703. Often, however, the battle over statistical evidence concerns weight or sufficiency rather than admissibility.”) (internal footnotes and citations omitted).

15. Consider, for example, Enoch and Fisher's sophisticated discussion of various arguments concerning the admissibility and/or sufficiency of naked statistical evidence premised on the belief that “the law” is averse to such evidence. Enoch & Fisher, supra note 5. After canvasing various issues concerning the effect and cost/benefit relationship of such evidence, they conclude that all the considerations are essentially ambiguous, with reasons for and against allowing it. They then articulate a deterrence rationale to justify the aversion to statistical proffers, which as we discuss below is not convincing. A concern for accuracy, however, is; this is the primary reason why there is no general aversion to statistical proffers but instead a commonplace concern for comprehensible and reliable evidence. If one strips out the effort to justify the mistaken view about statistical proffers, they provide a remarkable road map viewed through the normal evidentiary lenses of reliability, cost, and benefit for the various ways in which such evidence can and should be used. See, e.g., Fed. R. Evid. 401, Fed. R. Evid. 403, and various other evidentiary rules.

16. As we discuss infra at text accompanying notes 144–155, this literature is characterized by weird hypotheticals that mismodel the objects of inquiry and often contain impossible epistemological demands. For examples, see articles cited supra note 4. See also Ronald J. Allen, Naturalized Epistemology Revisited, 2 Quaestio Facti 253, 270–275 (2021) (exploring the weird hypothetical methodology). For a hypothetical-free discussion, see Christian Piller, Beware of Safety, 60 Analytic Phil. 307 (2019). Michael Pardo's explanation for this phenomenon is that much of the literature starts with the premise that there is a “paradox” that needs to be explained because it assumes (1) the preponderance standard means 0.5 and (2) the statistic represents the probative value of the evidence. As Pardo points out, if the first assumption is false and/or the second assumption is also false, then there is no “paradox” to be explained in the first place. Rather, the weird hypotheticals are just underdescribed in such a way that not enough is known to say anything intelligible about the cases. Michael S. Pardo, Naturalized Epistemology and the Law of Evidence: Methodological Reflections, 2 Quaestio Facti 299, 305–306 (2021). This may very well be correct. By contrast, what best explains the case law and legislation is the concern for reliable, accuracy-enhancing evidence.

17. See, e.g., I. Maurice Wormser, The Development of the Law, 23 Colum. L. Rev. 701, 713 (1923) (explaining that “our law is a growth, developing slowly from age to age, sloughing off precedents whenever necessary, and making new ones in response to alterations in mature public opinion”).

18. See, e.g., Laurence H. Tribe, Trial by Mathematics: Precision and Ritual in the Legal Process, 84 Harv. L. Rev. 1329, 1349 (1971) (describing the problem as “general statistical evidence” standing alone); L. Jonathan Cohen, Subjective Probability and the Paradox of the Gatecrasher, 1981 Ariz. St. L.J. 627, 627 (1981) (discussing the famous gatecrasher hypothetical, which assumes the only evidence in the case is statistical and makes no further distinctions); David Kaye, Paradoxes, Gedanken Experiments and the Burden of Proof: A Response to Dr. Cohen's Reply, 1981 Ariz. St. L.J. 635 (1981) (defining naked statistical evidence as “background statistics alone”); Craig R. Callen, Adjudication and the Appearance of Statistical Evidence, 65 Tul. L. Rev. 457, 467 (1991) (describing the problem as “only statistical” evidence but lamenting, “‘[n]akedness’ and ‘entirety’ of statistics are so ill defined that it is doubtful one can say precisely whether a statistic is naked, unless perhaps, one is simply saying that the statistic is meaningless”). By 1992, Wells noted that “[n]aked statistical evidence is ill defined in the legal literature but typically refers to probabilities that are not case specific in the sense that the evidence was not created by the event in question but rather existed prior to or independently-of the particular case being tried.” Gary L. Wells, Naked Statistical Evidence of Liability: Is Subjective Probability Enough?, 62 J. Personality & Soc. Psych. 739, 739 (1992).

19. See, e.g., Jonathan J. Koehler & Daniel N. Shaviro, Veridical Verdicts: Increasing Verdict Accuracy Through the Use of Overtly Probabilistic Evidence and Methods, 75 Cornell L. Rev. 246, 264 (1990) (defining “naked statistical evidence” as “a base rate unaccompanied by other evidence”); Richard W. Wright, Causation, Responsibility, Risk, Probability, Naked Statistics, and Proof: Pruning the Bramble Bush by Clarifying the Concepts, 73 Iowa L. Rev. 1001, 1049–1054 (1988) (defining naked statistical evidence as “reports of accidental grouping”).

20. E.g., Enoch & Fisher, supra note 5, at 591, 605, 609. For example, Enoch & Fisher, supra note 5, at 586, explain: “More specifically, we will focus on situations in which the defendant's liability-triggering, or guilty, conduct is inferred based on reference to membership in a particular population or reference class . . . We begin with the two extreme points: DNA evidence, which courts tend to endorse, and propensity-for-crime evidence, which courts tend not to admit at the guilt phase of trial.” (Internal footnote omitted.)

21. The categories may be unclear. For example, to us it seems that Smith's discussion of the TV theft hypothetical involves trace evidence—the TV itself. See Smith, supra note 7, at 1196–1200.

22. Anne Ruth Mackor, Different Ways of Being Naked: A Scenario Approach to the Naked Statistical Evidence Problem, 8 J. Applied Logics 2407, 2416 (2021).

23. David Enoch & Levi Spectre, Sensitivity, Safety, and the Law: A Reply to Pardo, 25 Legal Theory 178, 183–184 (2019) (“When we—following the literature—speak of statistical evidence, we think of examples such as Blue Bus, and the phenomenon it is an example of. This is the phenomenon sometimes called base-rate evidence, sometimes market-share evidence, or naked statistical evidence. . . . How do we, then, define statistical evidence? We don't.”) (emphasis in original).

24. However, the distinction between trace and nontrace evidence is analytically unhelpful. What matters is the reliability of the evidence, which is why courts allow verdicts in either case. See infra at text accompanying notes 127–144 (discussing DNA evidence (trace) and the evolution of the mailing element in the Burks case (nontrace)). At a deeper conceptual level, DNA is said to be trace evidence, but the question in DNA cases is whether the evidence is trace evidence from the defendant. Its evidentiary implications depend entirely upon membership in nontrace base rates. The existence of a “trace” at the scene is entirely trivial from the analytic point of view. More to our point, there is not a hint in American evidence law that this distinction is actually drawn or makes a difference.

25. Edward Hoseah, Corruption in Tanzania: The Case for Circumstantial Evidence (2008), at 133–134.

26. See, e.g., Republic v. 1. Chacha s/o Jeremia Murimi, 2. Mathew s/o Jeremia Daud, 3. Paschal s/o Lugoye Mashiku, 4. Alex Joseph @Bugwema s/o Silola Lyangalo, In The High Court of Tanzania, In The District Registry Of Mwanza, Criminal Sessions Case No.213 of 2014, at 23–25, https://rodra.co.za/images/countries/tanzania/cases/JUDGMENT%20REPUBLIC%20VS%20CHACHA%20JEREMIA%20MURIMI%202014.pdf.

27. See Crim. Proc. L. of the People's Republic of China, ch. V, art. 48 (2018) (“All materials that prove the facts of a case shall be evidence. Evidence shall include: (1) physical evidence; (2) documentary evidence; (3) testimony of witnesses; (4) statements of victims; (5) statements and exculpations of criminal suspects or defendants; (6) expert opinions; (7) records of crime scene investigation, examination, identification and investigative experiments; and (8) audio-visual materials, and electronic data. The authenticity of evidence shall be confirmed before it can be admitted as the basis for making a decision on a verdict.”); see also Civ. Proc. L. of the People's Republic of China, ch. VI, art. 63 (2018) (identical).

28. Professor Allen is a Fellow at the Forensic Science Institute of China University of Political Science and Law, one of the leading forensic science laboratories in China, and has seen this process with his own eyes. See also Li Yuan, Study on Problems in Forensic DNA Identification Standardization in China and Countermeasures for the Same, 4 J. Forensic Sci. & Med. 49 (2018).

29. We are not experts in the evidence law of European countries and would be delighted to learn from those who are more knowledgeable that our quick review is in error. Bearing in mind the difficulty of proving a negative, see, e.g., Martin Oudin, Evidence in Civil Law – France (2015); Núria Mallandrich Miret, Evidence in Civil Law – Spain (2015); Elisabetta Silvestri, Evidence in Civil Law – Italy (2015); Christian Wolf & Nicola Zeibig, Evidence in Civil Law – Germany (2015). Next, consider Michele Taruffo's magisterial review of comparative evidence law, Michele Taruffo, Chapter 7: Evidence, in International Encyclopedia of Comparative Law Vol. XVI: Civil Procedure (2010). In Part I, there is a brief theoretical discussion of probability. Id. §§14–16. In Part III, he has an equally brief reference to “the largely prevailing opinion” that statistics cannot prove individual facts; however, there are no citations to “law” or “courts.” Instead, there are a few cites to scholars who are mostly American. Id. §69.

30. See discussion supra at text accompanying notes 18–24. Others wish to make a distinction between trace and nontrace evidence. Both are allowed as the basis of verdicts in the United States. See infra at text accompanying notes 127–144 (discussing DNA (trace) and the mailing element in the Burks case (nontrace)).

31. Helena Soleto Muñoz & Anna Fiodorova, DNA and Law Enforcement in the European Union: Tools and Human Rights Protection, 10 Utrecht L. Rev. 149, 150 (2014) (internal footnote omitted). Some jurisdictions allow convictions and others do not where DNA is the sole basis of identification—analysis must be jurisdiction-specific. See Lirieka Meintjes-van der Walt & Priviledge Dhliwayo, DNA Evidence as the Basis for Conviction, 24 Potchefstroom Electronic L.J. 1 (2021), https://doi.org/10.17159/1727-3781/2021/v24i0a8537. The Irish Supreme Court recently upheld a conviction where DNA was the only evidence of identity. Wilson v. DPP, 2017 IESC 54.

32. See Ronald J. Allen, Naturalized Epistemology and the Law of Evidence: A Reply to Pardo, Spellman, Muffatto, and Enoch, 3 Quaestio Facti 1, 14 (2021).

33. National Center for State Courts, State Court Caseload Digest: 2018 Data (2020), https://www.courtstatistics.org/__data/assets/pdf_file/0014/40820/2018-Digest.pdf, at 6 (aggregating state caseload data—we cite the caseload figure for the “criminal” category); Federal Judicial Caseload Statistics 2020, U.S. Courts, https://www.uscourts.gov/statistics-reports/federal-judicial-caseload-statistics-2020 (aggregating federal caseload data—again, we cite the figure for criminal cases filed in US district courts).

34. Id. (we cite the state court figures for all categories other than “criminal,” general and limited jurisdiction courts combined, and the figure for the number of federal civil cases filed in US district courts).

35. 58 N.E.2d 754 (Mass. 1945).

36. E.g., Enoch & Fisher, supra note 5, at 559 (“One starting point for the statistical evidence debate is the classic Blue Bus hypothetical, which is a variant of Smith v. Rapid Transit, Inc., a seminal case in modern evidence law.”) (internal footnotes omitted).

37. People v. Bailey, 193 N.W.2d 405 (Mich. Ct. App. 1971) (Levin, J., dissenting); Thomas v. Mallett, 701 N.W.2d 523 (Wis. 2005) (Wilcox, J., dissenting).

38. See Russo v. Material Handling Specialties Co., 1995 WL 1146853, at *6 (1995). On overruling, see the discussion of Joyner and Gomes infra notes 51 and 52.

39. See the discussion of Guenther v. Armstrong Rubber Company, 406 F.2d 1315 (3d Cir. 1969), infra at text accompanying notes 90–96.

40. Carter v. Yardley & Co., 64 N.E.2d 693, 694 (Mass. 1946).

41. Id.

42. Id. at 694–695.

43. Id.

44. Id.

45. Tartas’ Case, 105 N.E.2d 380, 381 (Mass. 1952).

46. Id.

47. Id.

48. Id.

49. Id. at 382.

50. Id.

51. Com. v. Gomes, 526 N.E.2d 1270, 1279–1280 (Mass. 1988).

52. Com. v. Joyner, 4 N.E.3d 282, 291 (Mass. 2014) (explaining that “we have recognized the ‘necessarily probabilistic’ nature of fingerprint identification evidence” and holding that the fingerprint evidence was “sufficient for the jury to have concluded, beyond a reasonable doubt, that the defendant was the perpetrator”).

53. United States v. Jackson, 368 F.3d 59, 61 (2d Cir. 2004).

54. Id.

55. Id. at 64 (emphasis in original).

56. Id. at 64–65.

57. Hart v. Sec'y of Dept. of Health & Hum. Servs., 60 Fed. Cl. 598, 605 (Fed. Cl. 2004).

58. Id. at 607–608.

59. 397 F. Supp. 3d 406 (S.D.N.Y. 2019).

60. Id. at 424.

61. Smith was not the first case to say something disparaging about probability. See Enoch & Fisher, supra note 5, at 561 n.12.

62. “[T]he bus in question could very well have been one operated by someone other than the defendant.” Smith v. Rapid Transit, Inc., 58 N.E.2d 754, 755 (Mass. 1945). We are relying on the facts as reported by the court, as is the convention in the common law. Perhaps there is more to the story than was related, but it was the obligation of the parties to bring that out, not the court.

63. It is important to distinguish between individual discretionary judgments that judges make all the time and rules of law. The naked statistics literature is assuming a general aversion expressed in “the law” for naked statistics rather than critiquing individual applications of normal rules of sufficiency or admissibility of the evidence with which people can disagree.

64. As the Seventh Circuit summarized the situation in United States v. Veysey, 334 F.3d 600, 605 (7th Cir. 2003) (internal citations omitted): “[A] court should not be required to expend any of its scarce resources of time and effort on a case until the plaintiff has conducted a sufficient search to indicate that an expenditure of public resources is reasonably likely to yield a social benefit.” This sentiment was first expressed in Howard v. Wal–Mart Stores, Inc., 160 F.3d 358, 359–360 (7th Cir. 1998).

65. Francis J. Larkin & James W. Smith, Chapter 24: Rule 15: Pretrial Oral Discovery in Massachusetts, in Annual Survey of Massachusetts Law 347, 347–349 (1966).

66. 774 P.2d 60 (Wyo. 1989). For citations to these cases, see, e.g., Enoch & Fisher, supra note 5, at 592 n.127; Jonathan J. Koehler, When Do Courts Think Base Rate Statistics Are Relevant?, 42 Jurimetrics 373, 377 (2002). Enoch and Fisher cite Koehler for the proposition that “[t]o this day, courts continue to exhibit a general preference for individualized evidence and to reject base-rate evidence despite its potential to promote accuracy in legal factfinding.” Enoch & Fisher, supra note 5, at 561. That is not quite right. Koehler's article is the most sophisticated examination of when courts are amenable to statistical proffers. The variables he identifies map onto, even if they are not a perfect fit for, reliability and helpfulness.

67. 350 A.2d 665, 668 (Md. Ct. App. 1976).

68. Stephens, 774 P.2d at 64.

69. Id.

70. Springfield v. State, 860 P.2d 435, 448 (Wyo. 1993). For a similar holding from New Jersey, see State v. Spann, 617 A.2d 247, 260 (N.J. 1993). These cases also demonstrate the evolving understanding of probability reflected in the court opinions. In Spann, the New Jersey Supreme Court considered whether to admit paternity evidence, which is probabilistic, in a criminal sexual assault case. The court concluded that its admissibility should turn on a “Rule 8” and “Rule 4” hearing (at the time, the New Jersey analogs to Federal Rules of Evidence 702 and 403). Spann, 617 A.2d at 260. See also N.J. Stat. Ann. §9:17-52, embedding the holding in a statute.

71. 350 A.2d 665, 668 (Md. 1976).

72. Id.

73. Derr v. Maryland, 73 A.3d 254, 280 (Md. 2013) (holding that from the DNA testimony, “a rational juror could conclude beyond a reasonable doubt, without resorting to speculation or conjecture, that Derr was the victim's attacker”).

74. See Rochkind v. Stevenson, 236 A.3d 630, 652 (Md. 2020) (adopting the Daubert standard under state law).

75. See, e.g., Smith, supra note 7, at 1193 (claiming that “[w]hat is remarkable about . . . [Smith] . . . is that the prevailing legal doctrine – both then and now, both in the United States and elsewhere – suggests that it should have succeeded”) (emphasis in original). That claim is totally baseless. Even were it not, as we have demonstrated, Smith is about as insignificant a case as one can imagine, and furthermore the American legal systems have evolved over the last seventy-five years.

76. Enoch & Fisher, supra note 5, at 561.

77. 438 P.2d 33, 40–41 (Cal. 1968) (en banc). Collins has been cited seventy-nine times by other states' courts, but almost exclusively for the propositions that data must have a reliable foundation and care has to be taken in presenting probabilistic evidence to juries. See, e.g., Davis v. State, 476 N.E.2d 127, 134 (Ind. Ct. App. 1985). The actual controversy in the states is not over the admissibility or sufficiency of statistical proffers but how the evidence should be presented to the jury. See, e.g., State v. Schwartz, 447 N.W.2d 422, 428–430 (Minn. 1989). In this respect, the trend is to allow experts to explain in detail the implications of the evidence. See, e.g., infra note 89 (discussing Bloom).

78. Id. at 39 (citing State v. Sneed, 414 P.2d 858, 862 (N.M. 1966)).

79. 154 F.2d 390 (6th Cir. 1946) (internal citation omitted).

80. Id. at 394 (quoting Pennsylvania R. Co. v. Chamberlain, 288 U.S. 333, 339 (1933)). This is another nonentity case with few citations and none of them supporting the proposition that statistical evidence of any sort is disfavored.

81. 13 F.2d 62 (3d Cir. 1926). Very few cases cite to Evans and none for the propositions at hand.

82. Id. at 64.

83. Id. at 64–65.

84. 160 F. 348, 352 (6th Cir. 1908). Hawk is cited a little over thirty times, but always for the proposition that decision should not be based on speculation, surmise, and guess with an occasional reference to “probabilities,” which from the context is obviously being used as a synonym. See, e.g., Goodall Co. v. Sartin, 141 F.2d 427, 434 (6th Cir. 1944).

85. Day v. Boston & M.R.R., 52 A. 771, 774 (Me. 1902).

86. There is an almost perfect modern analogue to this in the Guenther case, which is discussed infra at text accompanying notes 90–96.

87. State v. Carlson, 267 N.W.2d 170, 176 (Minn. 1978). Carlson is mainly cited for its handling of physical evidence. A few cases cite it for the concern about the presentation of evidence to the jury, but the typical reaction is that “[t]he approach taken in Minnesota . . . has been rejected by an impressive myriad of courts and commentators.” See, e.g., Davis v. State, 476 N.E.2d 127, 135 (Ind. Ct. App. 1985).

88. State v. Boyd, 331 N.W.2d 480, 483 (Minn. 1978) (“It does not follow from . . . the Carlson case . . . that the correct approach is to suppress the evidence entirely, as the trial court did.”).

89. State v. Bloom, 516 N.W.2d 159, 167 (Minn. 1994) (Notwithstanding Carlson, the court held that “any properly qualified prosecution or defense expert may, if evidentiary foundation is sufficient, give an opinion as to random match probability using the NRC's approach to computing that statistic.”). Carlson is an outlier even with respect to its concern about the jury, as explained by the Massachusetts Supreme Judicial Court in Com. v. Gomes, 526 N.E.2d 1270, 1279–1280 (Mass. 1988) (internal citations omitted):

Where courts and commentators have been reluctant to admit statistical evidence, that reluctance has stemmed largely from the fact that the probabilities on which the evidence depended were based on mere speculation or were characterized in such a way as to mislead or confuse the jury. See, e.g., Commonwealth v. Drayton . . . (manner of presentation may make statistical evidence misleading); People v. Collins . . . (prosecution presented probabilities without any underlying factual basis); People v. Harbold . . . (statistical evidence must rest on adequate factual basis and, even then, potential for confusion outweighs minimal probative value) . . . On the other hand, where the statistical evidence is shown to be based on accepted scientific principles, courts, including this one, have admitted such evidence.

90. 406 F.2d 1315 (3d Cir. 1969).

91. Id. at 1315.

92. Id. at 1316.

93. Id.

94. Id. at 1318.

95. Id. at 1317.

96. Tracing the impact of Guenther reveals a perfect analogue to the history of Smith. Of the cases that have cited it, the only case suggesting a resistance to probability is an unreported district court decision. Once again, though, the best explanation of the case is lousy evidence. In Chapin v. Great Southern Wood Preserving Inc., 2016 WL 3135545 (2016), the plaintiff alleged that the defendant had sold him defectively treated wood, but he could not identify the manufacturer of the wood and he was not even sure that the wood he bought was treated wood.

97. See Victor v. Nebraska, 511 U.S. 1, 14 (1994) (quoting In re Winship, 397 U.S. 358, 370 (Harlan, J., concurring) (emphasis in original): “[I]n a judicial proceeding in which there is a dispute about the facts of some earlier event, the factfinder cannot acquire unassailably accurate knowledge of what happened. Instead, all the factfinder can acquire is a belief of what probably happened.”).

98. See Victor, 511 U.S. at 14; see also Turner v. United States, 396 U.S. 398, 415–417 (1970) (explaining that even though some heroin is produced in the United States, the fact that a vast majority of heroin is imported into the United States may properly allow a jury to “infer that heroin possessed in this country is a smuggled drug”); see also Int'l Bhd. of Teamsters v. United States, 431 U.S. 324, 336 (1977) (“Statistics are . . . competent in proving employment discrimination. We caution only that statistics are not irrefutable; they come in infinite variety and, like any other kind of evidence, they may be rebutted. In short, their usefulness depends on all of the surrounding facts and circumstances.”).

99. Ronald J. Allen, Rationality, Mythology, and the “Acceptability of Verdicts” Thesis, 66 B.U.L. Rev. 541, 550 (1986); see also Ronald J. Allen, On the Significance of Batting Averages and Strikeout Totals: A Clarification of the “Naked Statistical Evidence” Debate, the Meaning of “Evidence,” and the Requirement of Proof Beyond Reasonable Doubt, 65 Tul. L. Rev. 1093 (1991). Courts acknowledge this point. In addition to the Supreme Court cases suggesting the point, discussed supra note 98, see, e.g., People v. Rush, 630 N.Y.S.2d 631, 634 (N.Y. Sup. Ct. 1995), stating that “[t]here can be little doubt . . . that the perils of eyewitness identification testimony far exceed those presented by DNA expert testimony.”

100. Allen, supra note 16, at 276–278; see also Pardo, supra note 16, at 305.

101. See, e.g., Watson v. Fort Worth Bank & Trust, 487 U.S. 977, 994 (1988) (“Once the employment practice at issue has been identified, causation must be proved; that is, the plaintiff must offer statistical evidence of a kind and degree sufficient to show that the practice in question has caused the exclusion of applicants for jobs or promotions because of their membership in a protected group.”); see also Davis v. District of Columbia, 925 F.3d 1240, 1254 (D.C. Cir. 2019) (affirming the district court's finding that “plaintiffs failed to meet their burden to identify a race-based statistical disparity potentially caused by the challenged [conduct]”).

102. For a discussion, see United States v. Armstrong, 517 U.S. 456, 469–471 (1996), rejecting the defendants' statistical proffer as unreliable.

103. See, e.g., supra note 70, discussing N.J. Stat. Ann. §9:17-52, which provides for the admission of statistical evidence of paternity.

104. See Fed. R. Evid. 401–403, 702.

105. It is literally impossible. First, there has to be evidence of the underlying cause of action; in criminal cases this is referred to as the corpus delicti. Second, evidence does not come stamped with its own implications; it only has meaning when appraised. See Allen, supra note 99, at 1103; see also Ronald J. Allen, Factual Ambiguity and a Theory of Evidence, 88 Nw. L. Rev. 604, 618 (1994) (“[E]vidence does not come stamped with its implications in any fashion even remotely analogous to U.S.D.A. Inspected Grade A Beef.”).

106. FRE 902 is not an exception. FRE 902 simplifies the authentication of routine material in the absence of objection by the other side. If an opponent purports to be able to show that material being offered is not genuine, such as with a forgery, the proponent will need to establish the conditions of admissibility. However, the data contained within any vehicle that is self-authenticating has to be authenticated itself.

107. We return to other aspects of this problem infra at text accompanying notes 125–136.

108. Remember that commentators do not agree on what “naked statistics” refers to and some distinguish between trace and nontrace evidence, although the distinction is pointless. See supra note 24. As our following discussion of the mail fraud case and other cases demonstrates, US law allows verdicts on the basis of nontrace evidence as well.

109. State v. Toomes, 191 S.W.3d 122, 128 (Tenn. Crim. App. 2005) (emphasis in original). See, e.g., Roberson v. State, 16 S.W.3d 156, 171 (Tex. App. 2000); Springfield v. State, 860 P.2d 435, 453 (Wyo. 1993); State v. Abdelmalik, 273 S.W.3d 61, 66 (Mo. Ct. App. 2008); People v. Rush, 630 N.Y.S.2d 631, 634 (N.Y. Sup. Ct. 1995).

110. There are cases where DNA evidence was found to be insufficient to support a verdict because of other evidence in the case. These cases are instructive on the evolutionary nature of the common law. An example is State v. Pastuer, 697 S.E.2d 381 (2010). A woman who had broken off her relationship with the defendant, with whom she had lived, was found murdered. Id. at 383. There was no substantial evidence in the case tying the defendant to the murder. A drop of the victim's blood was found on one shoe of the defendant but there was no evidence of its source, and the two had lived together for an extended period. Id. at 387. The court found, all things considered, that there was insufficient evidence to sustain a conviction. Id. at 388. Had this case arisen seventy-five years ago in the Smith era, one could see a court referring to the DNA evidence as providing at most a “probability” that the defendant was the perpetrator, which was not enough. Instead, the court said that raising a suspicion or a conjecture is not enough, forgoing the prior association of “probability” with lousy evidence.

111. Enoch and Fisher refer to DNA as receiving “exceptional treatment.” Enoch & Fisher, supra note 5, at 591. However, along with DNA there needs to be added fingerprints, ballistics, a whole host of other forensic techniques, and a number of other categories of evidence. See infra note 136 for an extensive list of types of expertise offered in criminal cases. See also David H. Kaye, The Ultimate Opinion Rule and Forensic Science Identification, 60 Jurimetrics 175, 177 n.8 (2020) (“Overt source attribution is the norm for fingerprint and firearms identification (when the examiner is confident about the positive association). 1 McCormick on Evidence §207, at 1241–1243 (Kenneth S. Broun ed., 7th ed. 2013) (latent fingerprints); David H. Kaye, Firearm-Mark Evidence: Looking Back and Looking Ahead, 68 Case W. Res. L. Rev. 723, 725–726 (2018) (tool-marks on ammunition).”); Keith A. Findley, The Absence or Misuse of Statistics in Forensic Science as a Contributor to Wrongful Convictions: From Pattern Matching to Medical Opinions About Child Abuse, 125 Dickinson L. Rev. 615, 618 (2021) (“All of the pattern-matching or individualization forensic disciplines are at bottom disciplines that can only make probabilistic claims—that is, statistical claims about the probability that trace evidence left at a crime scene might have been left by a particular suspect.”).

112. National Research Council, Strengthening Forensic Science in the United States: A Path Forward (2009), at 127–183 (“Some of the forensic science disciplines are laboratory based (e.g., nuclear and mitochondrial DNA analysis, toxicology, and drug analysis); others are based on expert interpretation of observed patterns (e.g., fingerprints, writing samples, toolmarks, bite marks, and specimens such as fibers, hair, and fire debris). Some methods result in class evidence and some in the identification of a specific individual—with the associated uncertainties.”).

113. For a discussion, see Edward J. Imwinkelried, The Shifting Battleground over the Admissibility of Experientially Based Expert Testimony: How Far May Experts Go in Elaborating on the Personal Experience Supposedly Validating Their Methodology?, 68 Drake L. Rev. 43, 46 (2020).

114. See, e.g., Commonwealth v. Joyner, 4 N.E.3d 282, 291 (Mass. 2014) (explaining that “we have recognized the ‘necessarily probabilistic’ nature of fingerprint identification evidence”); Rotte v. State, 781 S.W.2d 738, 742 (Tex. Ct. App. 1989) (explaining that “the number of points of comparison from fingerprint analysis is a form of probabilistic evidence”); State v. Brown, 470 N.W.2d 30, 33 (Iowa 1991) (explaining “fingerprint evidence . . . is based on the mathematical theory of probabilities [about the] chance of two individuals bearing the same fingerprint”); Branion v. Gramly, 855 F.2d 1256, 1264 (7th Cir. 1988) (“Much of the evidence we think of as most reliable is just a compendium of statistical inferences. Take fingerprints. The first serious analysis of fingerprints was conducted by Sir Francis Galton, one of the pioneering statisticians, and his demonstration that fingerprints are unique depends entirely on statistical methods.”).

115. See, e.g., Com. v. Pettyjohn, 64 A.3d 1072, 1077 (Pa. Super. Ct. 2013); Grice v. State, 151 S.W.2d 211, 221 (Tex. Crim. App. 1941); State v. Quintana, 103 P.3d 168, 169–170 (Utah Ct. App. 2004); State v. Bell, 62 S.W.3d 84, 96 (Mo. Ct. App. 2001); Howard v. State, 695 S.W.2d 375, 376 (Ark. 1985).

116. In some cases, ballistics is the only tangible evidence linking the defendant to the crime. See Evans v. Commonwealth, 19 S.W.2d 1091, 1096–1097 (Ky. 1929). See also State v. Benton, 413 A.2d 104, 112 (R.I. 1980) (holding that the expert's ballistics evidence was a valid basis for the jury to conclude that a weapon was in fact the murder weapon used in the commission of a crime); Purnell v. State, 254 A.3d 1053, 1114 (Del. 2021) (noting the significance of ballistic evidence).

117. United States v. Green, 405 F. Supp. 2d 104, 122 (D. Mass. 2005). The debate in this area is again not admissibility or sufficiency but how the evidence is to be presented to the jury. See, e.g., Williams v. United States, 130 A.3d 343, 347–349 (D.C. 2016). We agree that testimony in terms of certainty should be disallowed. And we would not be surprised if toolmark testimony eventually is restricted because of the lack of validation. These cases suggest that the naked statistics debate sometimes has it backward—the courts may be too lenient rather than too restrictive.

118. United States v. Foust, 989 F.3d 842, 845–847 (10th Cir. 2021) (upholding conviction based on handwriting expert in forgery case); see also United States v. Johnson, 464 F.2d 556 (5th Cir. 1972).

119. An example of legal evolution is playing out in Arkansas over DNA. When first presented with DNA evidence, the Arkansas Supreme Court was at pains to say that it, along with other evidence, is sufficient to sustain a verdict. See Whitfield v. Arkansas, 56 S.W.3d 357, 359–360 (Ark. 2001). A series of cases is moving the court to the position that DNA alone is “substantial” evidence, and thus sufficient. See, e.g., Ellis v. Arkansas, 222 S.W.3d 192, 196 (Ark. 2006) (explaining that the DNA “alone is sufficient evidence identifying Appellant as the attacker”). For the same phenomenon regarding handwriting analysis, compare United States v. Starzecpyzel, 880 F. Supp. 1027 (S.D.N.Y. 1995), with United States v. Brown, 152 Fed. Appx. 59 (2d Cir. 2005).

120. United States v. Burks, 867 F.2d 795, 797 (3d Cir. 1989).

121. United States v. Hannigan, 27 F.3d 890, 892–893, 892 n.3 (3d Cir. 1994) (“Once evidence concerning office custom of mailing is presented, the prosecution need not affirmatively disprove every conceivable alternative theory as to how the specific correspondence was delivered.”).

122. United States v. Cohen, 171 F.3d 796, 800 (3d Cir. 1999).

123. See, e.g., United States v. Delfino, 510 F.3d 468, 471 (4th Cir. 2007) (upholding §1341 conviction when “the Government called a [foundational witness] who testified that he had no direct knowledge of the receipt of [the defendant's] loan application and that he did not have direct knowledge as to whether it was received by mail or commercial carrier . . . [h]owever, [the foundational witness] stated that [the Company's] normal business practice is to send the borrower a return United Parcel Service or Federal Express envelope, which the borrower then typically uses to return the application”); United States v. Kelley, 929 F.2d 582, 584 (10th Cir. 1991) (affirming §1341 conviction when the foundational witness testified only to the general business practice and conceded on cross-examination that business practice may not have been followed 100 percent of the time); United States v. Metallo, 908 F.2d 795, 798 (11th Cir. 1990) (affirming §1341 conviction based solely on airline's business practice); see also United States v. Doherty, 867 F.2d 47, 65 (1st Cir. 1989); United States v. Sumnicht, 823 F.2d 13, 14–15 (2d Cir. 1987); United States v. Bowman, 783 F.2d 1192, 1197 (5th Cir. 1986); United States v. Scott, 668 F.2d 384, 388 (8th Cir. 1981); United States v. Shavin, 287 F.2d 647, 652 (7th Cir. 1961).

124. Another example of evolution in the common law is the transformation underway from excluding expert eyewitness testimony, which involves a naked statistic, to permitting it. For discussions, see Ronald J. Allen, Joseph L. Hoffmann, Debra A. Livingston, Andrew D. Leipold & Tracey L. Meares, Comprehensive Criminal Procedure (5th ed. 2020), at 137–141; see also Tanja Rapus Benton, Stephanie A. McDonnell, Neil Thomas, David F. Ross & Nicholas Honerkamp, On the Admissibility of Expert Testimony on Eyewitness Identification: A Legal and Scientific Evaluation, 2 Tenn. J.L. & Pol'y 392, 432–433 (2014).

125. For still further examples, see Pardo, supra note 11, at 64–65.

126. Fed. R. Evid. 404(a).

127. Fed. R. Evid. 413–415.

128. Motive, opportunity, intent, and identity, to name a few.

129. Fed. R. Evid. 608, 609.

130. Allen et al., supra note 11, at 281.

131. FRE 406 generalizes the progression noted above in the mail fraud cases.

132. See General Electric Co. v. Joiner, 522 U.S. 136, 146 (1997) (“[t]rained experts commonly extrapolate from existing data”). See 8 Bus. & Com. Litig. Fed. Cts. §86:56 (West 4th ed.) (“With respect to medical malpractice, reliability of medical causation testimony is often, but not always, established by reference to studies, peer-reviewed articles or other reliable authorities admissible under Fed. R. Evid. 803(18), evidencing that the underpinnings for the expert's conclusions have acceptance in the medical community.”).

133. David Enoch, Levi Spectre & Talia Fisher, Statistical Evidence, Sensitivity, and the Value of Legal Knowledge, 40 Phil. & Pub. Affs. 197, 221 n.38 (2012) draws a distinction between sensitive and insensitive statistical proffers but that distinction does not work either, as Pardo, supra note 11, has demonstrated. The only distinction that explains the cases is the one we have developed in this article—the difference between reliable and unreliable evidence, coupled with a concern about how the evidence is presented to a fact finder.

134. Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).

135. Daubert v. Merrell Dow Pharmaceuticals, Inc., 43 F.3d 1311, 1321 (9th Cir. 1995) (“For an epidemiological study to show causation under a preponderance standard, ‘the relative risk of limb reduction defects arising from the epidemiological data . . . will, at a minimum, have to exceed ‘2’.’ That is, the study must show that children whose mothers took Bendectin are more than twice as likely to develop limb reduction birth defects as children whose mothers did not. While plaintiffs’ [expert] epidemiologists make vague assertions that there is a statistically significant relationship between Bendectin and birth defects, none states that the relative risk is greater than two. These studies thus would not be helpful, and indeed would only serve to confuse the jury, if offered to prove rather than refute causation. A relative risk of less than two may suggest teratogenicity, but it actually tends to disprove legal causation, as it shows that Bendectin does not double the likelihood of birth defects.”) (emphasis in original) (internal citations and footnotes omitted).
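
The arithmetic behind the “relative risk greater than two” threshold quoted in note 135 can be sketched as follows. This is an illustration only, not the court's own computation: it assumes the standard epidemiological attributable fraction among the exposed, (RR − 1)/RR, and it assumes that this fraction may be read as the probability that a given exposed case was caused by the exposure, which is itself a contestable step. The function names are hypothetical.

```python
def attributable_fraction(relative_risk: float) -> float:
    """Share of cases among the exposed attributable to exposure: (RR - 1) / RR.

    Assumption: the standard attributable-fraction formula; the opinion
    quoted in note 135 does not spell out this computation.
    """
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return (relative_risk - 1.0) / relative_risk


def exceeds_preponderance(relative_risk: float) -> bool:
    """True when the attributable fraction is strictly greater than one half.

    Assumption: "more likely than not" is modeled as a fraction above 0.5.
    """
    return attributable_fraction(relative_risk) > 0.5


# A relative risk of exactly 2 gives a fraction of exactly 0.5, which does
# not exceed one half; only a relative risk above 2 crosses the line.
for rr in (1.5, 2.0, 3.0):
    print(rr, round(attributable_fraction(rr), 3), exceeds_preponderance(rr))
```

On these assumptions, a relative risk of 1.5 attributes about one third of exposed cases to the exposure, which is why the quoted passage treats a relative risk below two as tending to disprove specific causation under a preponderance standard.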

136. Enoch and Fisher assert that in criminal cases “[w]ith the exception of DNA evidence, the use of statistical evidence for conviction purposes is extremely uncommon and very controversial.” Enoch & Fisher, supra note 5, at 596. In fact, the admission of forensic evidence and other expert testimony is quite common; much of it is statistical and can be the basis of a conviction. Moreover, it is unclear why any part of this analysis should be limited to criminal cases. The claim is about naked statistical evidence generally. In civil cases, experts are ubiquitous. See Andrew W. Jurs, Expert Prevalence, Persuasion and Price: What Trial Participants Really Think About Experts, 91 Ind. L.J. 353, 386 (2016) (reporting that experts appeared in 86 percent of the cases in his study—a figure similar to what others have reported). For a study suggesting that the prevalence of experts in criminal cases is similar, see Daniel W. Shuman, Elizabeth Whitaker & Anthony Champagne, An Empirical Examination of the Use of Expert Witnesses in the Courts — Part II: A Three City Study, 34 Jurimetrics 193, 197 (1994). 
Here is a list from a federal district judge of the subjects of expert testimony in criminal cases, many with a statistical foundation: “fingerprint analysis; ballistics toolmark evidence; DNA testing; footprint and tire-track evidence; hair and fiber analysis; bite-mark evidence; and handwriting evidence; mental health; other medical conditions; coded language used by drug dealers; characteristics of gang activity; terrorist activities; characteristics of sex trafficking; reliability (or unreliability) of eyewitness identification; linguistic analytics; Bitcoin and other digital currencies; computer forensics; characteristics and operation of firearms and explosives; counterfeit currency; controlled substance analysis; the difference between personal use and distribution quantities of drugs; vulnerability of sex-trafficking victims; field sobriety testing in drunk-driving cases; and operation of cell towers and other methods of locating individuals through tracking devices.” Paul W. Grimm, Challenges Facing Judges Regarding Expert Evidence in Criminal Cases, 86 Fordham L. Rev. 1601, 1604 (2018). We provide this list just to show the breadth of data being admitted, but not to suggest anything about the merits of any particular type of evidence. For example, bite-mark evidence is under increasing attack based on concerns about its unreliability. See, e.g., Marvin Zalman & James Winde, The Bite Mark Dentists and the Counterattack on Forensic Science Reform, 83 Alb. L. Rev. 749, 750 (2020).

137. An example often cited as showing the resistance to statistical proffers is the saga of Charles O. Shonubi, convicted of drug smuggling. The district court at sentencing used a sampling mechanism to determine the amount of smuggled drugs, which in turn determined the sentence under the Federal Sentencing Guidelines. United States v. Shonubi, 998 F.2d 84, 85–86 (2d Cir. 1993) (“Shonubi I”). The Court of Appeals reversed the district court twice due to this statistical methodology. Shonubi I, 998 F.2d at 89–90; United States v. Shonubi, 103 F.3d 1085, 1090–1092 (2d Cir. 1997) (“Shonubi II”). However, the Second Circuit was not expressing an aversion to statistical evidence. Rather, it was explicitly interpreting the Federal Sentencing Guidelines, which according to the court required “individualized” data on drug amounts. See Shonubi I, 998 F.2d at 89 (beginning “II Drug Quantity” with a recital of the Sentencing Guidelines and interpretation); see also Shonubi II, 103 F.3d at 1088 (beginning “I. Punishment for ‘Unconvicted’ Conduct” with history of the Sentencing Guidelines and interpretation). The literature citing to Shonubi mistakes a statutory interpretation case for a general distaste for statistical proffers. Overlooked again is the evolutionary nature of the common law. A decade later Shonubi was disavowed in the Second Circuit. United States v. Jones, 531 F.3d 163, 175 (2d Cir. 2008) (referring to the discussion in Shonubi concerning specific evidence as “merely dictum” and stating: “We hold that where, as in this case, seized currency appears by a preponderance of the evidence to be the proceeds of narcotics trafficking, a district court may consider the market price for the drugs in which the defendant trafficked in determining the drug quantity represented by that currency”).

138. We did numerous searches (and riffs) on the phrase “mere probability is insufficient,” and we could not find a single case that suggests something generally amiss with probability or statistics. Quite to the contrary, the dominant message was, like in Smith, that a “probability” had to be sufficient for the burden of persuasion. Rather than distinguishing between “probability” evidence and something else, the cases distinguish between lousy and reliable evidence. We will provide our search terms and results upon request.

139. As the debate over naked statistics rages in the theoretical literature, there is an avalanche of new uses of data in practice. Examples range from cell phone tower data, see Carpenter v. United States, 138 S. Ct. 2206 (2018), to data taken from a heart monitor and given a probabilistic analysis, see Marie-Helen Maras & Adam Scott Wandt, State of Ohio v. Ross Compton: Internet-Enabled Medical Device Data Introduced as Evidence of Arson and Insurance Fraud, 24 Int'l J. Evid. & Proof 321, 322–326 (explaining “the timeline of events that Compton provided were deemed ‘highly improbable’ by a cardiologist”). Probabilistic forensic tools continue to evolve and be used in court. See the discussion of static secondary ion mass spectrometry and inductively coupled plasma mass spectrometry in National Institute of Justice, Without a Trace? Advances in Detecting Trace Evidence (2003), https://www.hsdl.org/?view&did=444690, at 3–9.

140. Again, we note that the category is unclear. See the discussion supra at text accompanying notes 18–24. See also Enoch & Spectre, supra note 23, at 184 (“How do we, then, define statistical evidence? We don't. We—again, in a way that's consistent with the theoretical literature on statistical evidence—start with the examples. They clearly capture something intuitively important.”) (emphasis in original).

141. For example, Enoch & Fisher, supra note 5, in their remarkably detailed analysis do not cite to cases decided in the twenty-first century that excluded statistical evidence or found it categorically insufficient for a verdict. This is typical. See, e.g., Hawthorne, Isaacs & Sridharan, supra note 4; Enoch & Spectre, supra note 23; Enoch, supra note 2. We found one twenty-first-century case that replicates somewhat the style of the old cases, but it is obviously a case of lousy evidence. See the discussion of Chapin, supra note 96. Obviously, there are cases excluding unreliable statistics.

142. See Smith v. Rapid Transit, Inc., 58 N.E.2d 754, 754–755 (Mass. 1945). The “market” is not commercial buses, but buses and things that look like buses.

143. Out of an abundance of caution, we should point out that statistical evidence is not necessarily sufficient for a verdict. Such evidence has to be adequate, all things considered, to allow a reasonable person to conclude that the requisite burden of persuasion has been satisfied. That, in turn, is not a probability measure. See brief discussion supra note 16.

144. As in the frequent comparison of the relative frequency of an event to a witness who is said to be reliable to a quantifiable extent—like 70 percent of the buses in a town are blue compared to a witness who is assumed to be 70 percent reliable and testifies that she saw a blue bus. The meaning of a 70 percent reliable witness is completely opaque. Does it mean three out of every ten statements the witness utters are false? Has the witness been tested with a series of bus identifications? For a discussion, see Allen, supra note 16, at 269–270.

145. Craig Callen first analyzed incentives and evidence in Notes on a Grand Illusion: Some Limits on the Use of Bayesian Theory in Evidence Law, 57 Ind. L.J. 1, 22–24 (1982). See also Jason S. Johnston, Bayesian Fact-Finding and Efficiency: Toward an Economic Theory of Liability Under Uncertainty, 61 S. Cal. L. Rev. 137 (1987).

146. Richard A. Posner, An Economic Approach to the Law of Evidence, 51 Stan. L. Rev. 1477, 1525–1526 (1999).

147. Allen & Leiter, supra note 1, at 1526. As Allen and Leiter point out, another perfectly plausible although utterly ridiculous microeconomic deduction is that “if company A is really smart, it will take exactly two buses out of service. If both companies have forty-nine buses in service, the probability of liability would be exactly 0.5, meaning plaintiffs injured by buses could never recover.” Id. at 1494 n.6. Enoch and Fisher repeat Posner's analysis, see Enoch & Fisher, supra note 5, at 583 n.83, but their article also contains a subtle and nuanced discussion of the relationship between deterrence and evidence rules put to the use of explaining the resistance to statistical evidence, with DNA being explained as “exceptional” and “unique.” Id. at 591, 605, 609. However, there is no such resistance, and the DNA cases are not unique. Also consider the subtle probing of the deterrence arguments in Hawthorne, Isaacs & Sridharan, supra note 4, at 130–140 (“One lesson to be drawn from our discussion is that philosophers should not be coming up with stories as to why it is generally bad to use statistical evidence as one's primary basis for conviction.”) (emphasis in original).
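
The arithmetic in the deduction quoted in note 147 can be checked with a short sketch. It assumes, as the quotation implies but this note does not state, that company A starts with fifty-one buses against company B's forty-nine, and it models the hypothetical's own premise that liability follows whenever the share of buses exceeds one half. The function names are hypothetical, and the sketch illustrates the hypothetical only, not how any court applies the burden of persuasion.

```python
from fractions import Fraction


def market_share(own_buses: int, rival_buses: int) -> Fraction:
    """Share of all buses in service that belong to the first company."""
    return Fraction(own_buses, own_buses + rival_buses)


def exceeds_one_half(probability: Fraction) -> bool:
    """The hypothetical's premise: liability requires a probability above 0.5."""
    return probability > Fraction(1, 2)


# Assumed starting fleet: 51 buses for company A, 49 for company B.
before = market_share(51, 49)
# Company A takes exactly two buses out of service.
after = market_share(49, 49)

print(before, exceeds_one_half(before))
print(after, exceeds_one_half(after))
```

On those assumptions the share falls from 51/100 to exactly 1/2, so the hypothetical threshold is no longer met against either company, which is the result the quotation describes.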

148. Dahlman, supra note 10, at 174.

149. That one aspect of the law, including the law of evidence, is to affect primary behavior is clearly correct. See Ronald J. Allen, A Note to My Philosophical Friends About Expertise and Legal Systems, 8 Humana Mente 71, 73–79 (2015) (specifying the various functions of evidence law that go beyond accurate fact finding).

150. Dahlman, supra note 10, at 177 (“It is a moral problem about verdicts that fail to contribute in a positive way to the incentive structure for lawful behaviour. There is no logical contradiction between a high degree of probability and a lack of moral justification. Paradox resolved.”); Enoch & Fisher, supra note 5, at 582 (“For if the statistical evidence is strongly against him—say, because ninety-eight per-cent of those at the stadium are gatecrashers—John already knows that he will be convicted, regardless of whether he buys a ticket.”).

151. This is also the problem with Chris William Sanchirico, Character Evidence and the Object of Trial, 101 Colum. L. Rev. 1227 (2001), which made a similar argument about character evidence. But it is false that admitting character evidence could have no positive incentive effects. A person knowing that they are at risk of allegations of wrongdoing has a reason to be even more careful to avoid such situations. Roger C. Park & Michael J. Saks, Evidence Scholarship Reconsidered: Results of the Interdisciplinary Turn, 47 B.C.L. Rev. 949, 992 n.205 (2006).

152. But there is a perverse incentive to allowing rodeo operators to create liability traps, which would be another reason cases like this would not be allowed to proceed. See Allen, supra note 16, at 259.

153. For a thorough discussion of this point, see Enoch & Fisher, supra note 5, at 584; see also Hawthorne, Isaacs & Sridharan, supra note 4, at 130–140.

154. Cf. Barbara A. Spellman, In Defense of Weird Hypotheticals, 2 Quaestio Facti 325, 333–335 (2021) (arguing for the utility of hypotheticals when “put together wisely”). One exception to this overabstraction is Enoch and Fisher's complex discussion of possible deterrence effects. See discussion supra note 15.

155. Fed. R. Evid. 102 (“These rules should be construed so as to administer every proceeding fairly, eliminate unjustifiable expense and delay, and promote the development of evidence law, to the end of ascertaining the truth and securing a just determination.”). Truth is not, however, the only end, as we discuss infra at text accompanying notes 160–164. One of the limits of abstraction and focus on a single variable in this context is their neglect of the fact that constructing legal systems involves a complex optimization problem involving numerous variables. See Allen, supra note 12, at 213.

156. Smith, supra note 7, at 1197; see also Daniel Shaviro, Statistical-Probability Evidence and the Appearance of Justice, 103 Harv. L. Rev. 530, 536 (1989); G. Alexander Nunn, The Incompatibility of Due Process and Naked Statistical Evidence, 68 Vand. L. Rev. 1407, 1424–1430 (2015). The concern about avoiding wrongful convictions at any cost leads some theorists to attempt to demonstrate that their particular theory is immune to such criticism. See, e.g., Sarah Moss, Probabilistic Knowledge (2018), at 215 (“The thesis that legal proof requires knowledge is consistent with the claim that one should avoid convicting innocent people at any cost.”).

157. Smith builds his argument on the concept of normalcy. One of us has already explained why that is unhelpful in understanding any American legal system of which we are aware. Allen, supra note 16, at 269–276. It amounts to thinking that evidence with a known risk of error, like statistical evidence, is somehow a less justifiable basis for decision than evidence with an even higher but not as well-specified risk of error, like testimonial evidence. The real puzzle here is why anyone would think the more ambiguous evidence should be preferred to the less ambiguous. Not surprisingly, as we have argued at length here, there is vanishingly little support for the view that such a belief animates much law in the United States.

158. Ronald J. Allen & Larry Laudan, Deadly Dilemmas, 41 Tex. Tech. L. Rev. 65, 71 (2008).

159. See, e.g., United States v. Jordan, 945 F.3d 245, 256 (5th Cir. 2019) (“The jury retains the sole authority to weigh any conflicting evidence and to evaluate the credibility of witnesses . . . and [here], [the defendant's] counsel had every opportunity to impeach both [witnesses] for their previous acts of dishonesty and any inconsistencies in their testimony, and the jury independently weighed that testimony and determined that the evidence was sufficient to support a finding of guilt . . . [w]e do not second-guess such findings.”).

160. See Allen, supra note 12, at 213.

161. For an excellent discussion of the complexities of decision optimization in American-style legal systems, see Gustavo Ribeiro, Evidentiary Policies Through Other Means: The Disparate Impact of “Substantive Law” on the Distribution of Errors Among Racial Groups, 2021 Utah L. Rev. 441 (2021).

162. See, e.g., Fed. R. Evid. 412–415.

163. Allen & Laudan, supra note 158, at 81–83.

164. Id. at 74.

165. Allen, supra note 32, at 19–20; Allen, supra note 16, at 267.