
Rare high-impact disease variants: properties and identifications

Published online by Cambridge University Press: 21 March 2016

LEEYOUNG PARK*
Affiliation:
Natural Science Research Institute, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, Seoul, 120-749, Korea
JU HAN KIM*
Affiliation:
Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul 110-799, Korea; Systems Biomedical Informatics National Core Research Center (SBI-NCRC), Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 110-799, Korea
*Corresponding authors: Leeyoung Park PhD and Ju Han Kim MD, PhD. Tel: (82)2-2123-3530 and (82)2-3668-7674. Fax: (82)2-313-8892 and (82)2-747-8928. E-mail: lypark@yonsei.ac.kr and juhan@snu.ac.kr

Summary

Although many genome-wide association studies have been performed, the identification of disease polymorphisms remains important. It is now suspected that many rare disease variants induce the association signals of common variants in linkage disequilibrium (LD). Based on recent developments in genetic models, the current study explains the existence of rare variants with high impacts and common variants with low impacts. Disease variants are neither necessary nor sufficient for disease because of gene–gene or gene–environment interactions. A new method was developed on this theoretical basis to identify both rare and common disease variants from their genotypes. Common disease variants were identified with relatively small odds ratios and relatively small sample sizes, except in specific situations in which the disease variants were in strong LD with a variant of higher frequency. Rare disease variants with small impacts were difficult to identify without increasing sample sizes; however, the method was reasonably accurate for rare disease variants with high impacts. For rare variants, dominant variants generally showed better Type II error rates than recessive variants; however, the trend was reversed for common variants. Type II error rates increased in gene regions containing more than two disease variants because the more common variant, rather than both disease variants, was usually identified. The proposed method would be useful for identifying common disease variants with small impacts and rare disease variants with large impacts when disease variants have the same effects on disease presentation.
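
The dependence of Type II error rates on allele frequency, odds ratio, inheritance mode and sample size can be illustrated with a minimal sketch. This is not the method proposed in the paper: it assumes Hardy–Weinberg genotype frequencies in controls and substitutes a textbook normal approximation for a two-sided two-proportion test on carrier status. The function names (`carrier_frequency`, `case_carrier_frequency`, `type_ii_error`) and the significance level of 0.05 are hypothetical choices made for illustration only.

```python
import math
from statistics import NormalDist

def carrier_frequency(q, mode):
    """At-risk genotype frequency for risk-allele frequency q (assumes Hardy-Weinberg)."""
    if mode == "dominant":   # one or two copies of the risk allele
        return 1.0 - (1.0 - q) ** 2
    if mode == "recessive":  # two copies of the risk allele
        return q ** 2
    raise ValueError("mode must be 'dominant' or 'recessive'")

def case_carrier_frequency(p0, odds_ratio):
    """Carrier frequency in cases implied by control frequency p0 and an odds ratio."""
    return odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)

def type_ii_error(q, mode, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate Type II error of a two-sided two-proportion z-test.

    Illustrative assumption only: a generic normal approximation,
    not the identification procedure described in the paper.
    """
    p0 = carrier_frequency(q, mode)
    p1 = case_carrier_frequency(p0, odds_ratio)
    # Pooled carrier frequency under the null hypothesis of no association.
    p_bar = (n_cases * p1 + n_controls * p0) / (n_cases + n_controls)
    se_null = math.sqrt(p_bar * (1.0 - p_bar) * (1.0 / n_cases + 1.0 / n_controls))
    se_alt = math.sqrt(p1 * (1.0 - p1) / n_cases + p0 * (1.0 - p0) / n_controls)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    power = NormalDist().cdf((abs(p1 - p0) - z_alpha * se_null) / se_alt)
    return 1.0 - power

if __name__ == "__main__":
    # Rare dominant variant (q = 0.01) with small and large impacts, 500 cases and 500 controls.
    for oratio in (1.5, 5.0):
        print(oratio, round(type_ii_error(0.01, "dominant", oratio, 500, 500), 3))
```

Under these assumptions, the approximate Type II error for a rare variant falls when either the odds ratio or the sample sizes increase, which is the qualitative pattern the summary describes for rare variants with small versus large impacts.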

Information

Type
Research Papers
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © Cambridge University Press 2016

Fig. 1. Procedure of identifying disease variants in case–control associations.

Fig. 2. A sufficient causal component model and genotype frequencies of a genetic component of G × G or G × E.

Fig. 3. Changes in odds ratios depending on allele frequencies and proportions in population lifetime incidence. (a) Dominant genes; (b) recessive genes.

Fig. 4. Type II error rates depending on allele frequencies and odds ratios. (a) Dominant variants; (b) recessive variants.

Fig. 5. Type II error rates depending on sample sizes and odds ratios. (a) Dominant variants when both case and control sample sizes increase; (b) dominant variants when case sample size is fixed at 500 and control sample sizes increase; (c) dominant variants when control sample size is fixed at 500 and case sample sizes increase; (d) recessive variants when both case and control sample sizes increase; (e) recessive variants when case sample size is fixed at 500 and control sample sizes increase; (f) recessive variants when control sample size is fixed at 500 and case sample sizes increase.

Fig. 6. Two-disease-variant models for a fixed variant and a variant with various allele frequencies, in which the solid line indicates Type II error rates and the dashed line indicates the probability when only one of two disease variants is identified as a disease variant. (a) Dominant genes; (b) recessive genes.

Fig. 7. Type II error rates for two-disease-variant models depending on various sample sizes. (a) Dominant variants when both case and control sample sizes increase; (b) dominant variants when case sample size is fixed at 500 and control sample sizes increase; (c) dominant variants when control sample size is fixed at 500 and case sample sizes increase; (d) recessive variants when both case and control sample sizes increase; (e) recessive variants when case sample size is fixed at 500 and control sample sizes increase; (f) recessive variants when control sample size is fixed at 500 and case sample sizes increase.

Supplementary material

Park and Kim supplementary material S1: File (15.1 KB)
Park and Kim supplementary material S2: Supplementary Figure, Image (30.9 KB)
Park and Kim supplementary material S3: Supplementary Figure, Image (305.7 KB)
Park and Kim supplementary material S4: Supplementary Table, File (137.6 KB)
Park and Kim supplementary material S5: Supplementary Table, File (22.9 KB)