Statistical Properties of Single-Marker Tests for Rare Variants

Published online by Cambridge University Press: 17 April 2014

T. Bernard Bigdeli
Affiliation:
Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA; Department of Human and Molecular Genetics, Virginia Commonwealth University, Richmond, VA, USA
Benjamin M. Neale*
Affiliation:
Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Michael C. Neale
Affiliation:
Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA; Department of Human and Molecular Genetics, Virginia Commonwealth University, Richmond, VA, USA; Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
*Address for correspondence: Benjamin M. Neale, Broad Institute of Harvard and MIT, 7 Cambridge Center, Cambridge, MA 02142, USA. E-mail: bneale@broadinstitute.org

Abstract

With the dramatic technological developments of genome-wide association single-nucleotide polymorphism (SNP) chips and next-generation sequencing, human geneticists now have the ability to assay genetic variation at ever-rarer allele frequencies. To fully understand the impact of these rare variants on common, complex diseases, we must be able to accurately assess their statistical significance. However, it is well established that classical association tests are not appropriate for the analysis of low-frequency variation, giving spurious findings when observed counts are too few. To further our understanding of the asymptotic properties of traditional association tests, we conducted a range of simulations of a typical rare variant (~1%) under the null hypothesis and tested the allelic χ², Cochran–Armitage trend, Wald, and Fisher's exact tests. We demonstrate that rare variation shows marked deviation from the expected distributional behavior for each test, with fewer minor alleles corresponding to a greater degree of test-statistic deflation. The effect becomes more pronounced at progressively smaller α levels. We also show that the Wald test is particularly deflated at α levels consistent with genome-wide association significance, much more so than the other association tests considered. In general, these classical association tests are inappropriate for the analysis of variants for which the minor allele is observed fewer than 80 times, largely irrespective of sample size.
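
As a rough illustration of the deflation described above, the following sketch computes the exact type I error rate of the 1-df allelic χ² test for a rare variant under the null. It is not the authors' simulation procedure: it swaps Monte Carlo simulation for exact enumeration over binomial allele counts, and the group size (1,000 alleles each in cases and controls), the minor allele frequency (0.01), and all function names are assumptions made for this example.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(X >= x) for a 1-df chi-square variate."""
    return math.erfc(math.sqrt(x / 2.0))


def allelic_chi2(a, b, n):
    """Pearson chi-square for a 2x2 allele-count table.

    a, b: minor-allele counts in cases and controls (assumed inputs).
    n:    number of alleles sampled in each group (equal groups assumed).
    """
    m = a + b
    if m == 0 or m == 2 * n:
        return 0.0  # monomorphic sample: treat as no evidence
    return 2.0 * n * (a - b) ** 2 / (m * (2 * n - m))


def binom_pmf(k, n, p):
    """Binomial probability computed on the log scale to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def exact_type1_rate(n_alleles, maf, alpha):
    """Probability under the null that the allelic test has p <= alpha."""
    pmf = [binom_pmf(k, n_alleles, maf) for k in range(n_alleles + 1)]
    # Keep only counts whose probability does not underflow to zero.
    support = [(k, w) for k, w in enumerate(pmf) if w > 0.0]
    rate = 0.0
    for a, wa in support:
        for b, wb in support:
            if chi2_sf_1df(allelic_chi2(a, b, n_alleles)) <= alpha:
                rate += wa * wb
    return rate


if __name__ == "__main__":
    for alpha in (0.05, 1e-4, 5e-8):
        rate = exact_type1_rate(1000, 0.01, alpha)
        print(f"alpha={alpha:g}  exact rate={rate:.3g}  ratio={rate / alpha:.3g}")
```

A ratio below 1 means the test rejects less often than its nominal α level. Under these assumed settings about 20 minor alleles are expected in total, well under the 80-observation guideline given in the abstract.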

Information

Type: Articles
Copyright © The Authors 2014

TABLE 1 Number of Significant 1-df Allelic χ², Cochran–Armitage Trend, Wald, and Fisher's Exact Test Statistics

FIGURE 1 Quantiles for null distributions of the 1-df allelic χ², Cochran–Armitage trend, Wald, and Fisher's exact tests.

TABLE 2 Number of Significant 1-df Wald Statistics for Logistic Regression Models Featuring a Null Covariate

FIGURE 2 Quantiles for null distributions of 1-df Wald statistics for logistic regression models featuring a null covariate.

TABLE 3 Number of Additional Permutations Required to Establish Significance at Given α-Levels

Supplementary material: Bigdeli Supplementary Material (PDF, 120.1 KB)