
Genetic Similarity Assessment of Twin-Family Populations by Custom-Designed Genotyping Array

Published online by Cambridge University Press:  05 August 2019

Jeffrey J. Beck*
Affiliation:
Avera Institute for Human Genetics, Avera McKennan Hospital & University Health Center, Sioux Falls, SD, USA
Jouke-Jan Hottenga
Affiliation:
Department of Biological Psychology, Behavioral and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
Hamdi Mbarek
Affiliation:
Department of Biological Psychology, Behavioral and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
Casey T. Finnicum
Affiliation:
Avera Institute for Human Genetics, Avera McKennan Hospital & University Health Center, Sioux Falls, SD, USA
Erik A. Ehli
Affiliation:
Avera Institute for Human Genetics, Avera McKennan Hospital & University Health Center, Sioux Falls, SD, USA
Yoon-Mi Hur
Affiliation:
Department of Education, Education Research Institute, Mokpo National University, Jeonnam, South Korea
Nicholas G. Martin
Affiliation:
QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
Eco J.C. de Geus
Affiliation:
Department of Biological Psychology, Behavioral and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
Dorret I. Boomsma
Affiliation:
Avera Institute for Human Genetics, Avera McKennan Hospital & University Health Center, Sioux Falls, SD, USA; Department of Biological Psychology, Behavioral and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
Gareth E. Davies
Affiliation:
Avera Institute for Human Genetics, Avera McKennan Hospital & University Health Center, Sioux Falls, SD, USA; Department of Biological Psychology, Behavioral and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
*Author for correspondence: Jeffrey J. Beck, Email: jeffrey.beck@avera.org

Abstract

Twin registries often take part in large collaborative projects and are major contributors to genome-wide association (GWA) meta-analysis studies. In this article, we describe genotyping of twin-family populations from Australia, the Midwestern USA (Avera Twin Register) and the Netherlands (Netherlands Twin Register), as well as a sample of mothers of twins from Nigeria, to assess the extent, if any, of genetic differences between them. Genotyping in all cohorts was done using a custom-designed Illumina Global Screening Array (GSA), optimized to improve imputation quality for population-specific GWA studies. We investigated the degree of genetic similarity between the populations using several measures of population variation, with genotype data generated from the GSA. Visualization of principal component analysis (PCA) revealed that the Australian, Dutch and Midwestern American populations exhibit negligible interpopulation stratification when compared to each other, to a reference European population and to globally distant populations. Estimates of fixation indices (FST values) between the Australian, Midwestern American and Dutch populations suggest minimal genetic differentiation compared to the estimates between each population and a genetically distinct cohort (i.e., samples from Nigeria genotyped on the GSA). Thus, results from this study demonstrate that genotype data from the Australian, Dutch and Midwestern American twin-family populations can reasonably be combined for joint genetic analysis.

Type: Articles
Copyright: © The Author(s) 2019

Table 1. Characteristics of samples genotyped on GSA per cohort and tissue

Table 2. Content and marker selection categories of the custom-designed Illumina GSA

Table 3. Imputation quality metrics per minor allele frequency bin for the GSA

Table 4. Genotype concordance metrics for GSA-mimicked, genotyped GoNL SNPs that were reimputed with 1000G reference panel

Fig. 1. Genetic ancestry of Midwestern American, Australian and Dutch subjects. Shown are the results from PCA using autosomal genotyped SNPs after quality control, filtering, pruning and exclusion of long-range LD (109,702 markers). Ancestry outliers were removed prior to performing PCA. PC1 and PC2 represent the first and second PCs and account for 18.864% and 11.919% of the variation, respectively.

Fig. 2. Genetic ancestry of Midwestern American, Australian, Dutch and Nigerian subjects. Shown are the results from the PCA using all autosomal genotyped SNPs after quality control, filtering, pruning and exclusion of long-range LD (109,702 markers). Ancestry outliers were removed prior to performing PCA. PC1 and PC2 represent the first and second PCs and account for 67.765% and 6.568% of the variation, respectively.

Fig. 3. Projection of PCs for Midwestern American, Australian, Dutch and Nigerian subjects onto HGDP populations. Shown are the results from the PCA using autosomal genotyped SNPs that were in common with HGDP after quality control, filtering and exclusion of long-range LD (54,820 markers). Ancestry outliers were removed prior to performing the PCA. PC1 and PC2 represent the first and second PCs and account for 38.048% and 28.811% of the variation, respectively.
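
The PC1 and PC2 percentages quoted in the PCA captions above are shares of variance explained. As an illustration only, the following Python sketch runs a PCA on simulated 0/1/2 genotypes and reports those shares. The function name `pca_variance_explained`, the simulated allele frequencies and the normalisation (each component's share of the total variance across all components) are assumptions made for this example; the software and normalisation behind the published figures are not stated in this section.

```python
import numpy as np


def pca_variance_explained(genotypes, n_components=2):
    """Return PC scores and the fraction of variance explained per component.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts, no missing data.
    """
    g = np.asarray(genotypes, dtype=float)
    # Drop monomorphic SNPs, which carry no information and would divide by zero.
    freq = g.mean(axis=0) / 2.0
    keep = (freq > 0) & (freq < 1)
    g, freq = g[:, keep], freq[keep]
    # Centre each SNP and scale by its expected binomial standard deviation.
    z = (g - 2 * freq) / np.sqrt(2 * freq * (1 - freq))
    # SVD of the standardised matrix; squared singular values are proportional
    # to the variance captured by each component.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]


# Hypothetical data: two simulated groups of 40 individuals whose allele
# frequencies differ slightly at 500 SNPs.
rng = np.random.default_rng(0)
n_snps = 500
p_a = rng.uniform(0.1, 0.9, n_snps)
p_b = np.clip(p_a + rng.normal(0, 0.15, n_snps), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, p_a, (40, n_snps)),
    rng.binomial(2, p_b, (40, n_snps)),
])
scores, explained = pca_variance_explained(geno)
print("PC1 and PC2 share of variance:", explained)
```

Plotting `scores[:, 0]` against `scores[:, 1]` and colouring points by group gives the kind of scatter plot the captions describe. Tightly overlapping clusters correspond to negligible stratification.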

Table 5. Statistical significance of differences between populations

Fig. 4. Results of the case-control GWAS between Midwestern American (cases), Australian (controls) and Dutch (controls) populations. (A) Manhattan plot of the case-control GWAS of Midwestern Americans (227 cases) and Dutch (6139 controls) using 228,025 variants after MAF > 0.10 filter. (B) QQ-plot of observed versus expected p values of the association results between Midwestern American and Dutch populations (λ = 1.159). (C) Manhattan plot of the case-control GWAS of Midwestern Americans (227 cases) and Australians (1581 controls) using 228,166 variants after MAF > 0.10 filter. (D) QQ-plot of observed versus expected p values of the association results between Midwestern American and Australian populations (λ = 1.153). Shown in each Manhattan plot is a blue line depicting a suggestive level of statistical significance (p = 1 × 10⁻⁵). In panel (C), the red line represents a genome-wide level of statistical significance (p = 5 × 10⁻⁸). The rs numbers point to the chromosomal region that reached the genome-wide significance level. Variants with a MAF < 0.10 were excluded. All related individuals and ancestry outliers were removed prior to performing the associations.
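
The λ values quoted for the QQ-plots are genomic inflation factors. A common definition is the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The sketch below uses that definition with only the Python standard library. The function name `genomic_inflation` and the simulated p values are assumptions for illustration, and this section does not confirm that the published λ values were computed in exactly this way.

```python
import random
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided association p values.

    Each p value is converted to a 1-df chi-square statistic via the squared
    normal quantile; lambda is the observed median over the null median.
    """
    nd = NormalDist()
    # inv_cdf(p / 2) is the (negative) z-score whose square is the 1-df
    # chi-square statistic with upper-tail probability p.
    chi2_obs = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values if 0.0 < p <= 1.0]
    if not chi2_obs:
        raise ValueError("no valid p values in (0, 1]")
    chi2_null_median = nd.inv_cdf(0.75) ** 2  # about 0.455
    return median(chi2_obs) / chi2_null_median


# Hypothetical p values: uniform draws mimic a well-calibrated test, and
# raising them to a power above 1 shrinks them to mimic mild inflation.
rng = random.Random(0)
null_p = [rng.random() for _ in range(20000)]
inflated_p = [p ** 1.2 for p in null_p]

lambda_null = genomic_inflation(null_p)
lambda_inflated = genomic_inflation(inflated_p)
print("lambda (null):", lambda_null)
print("lambda (inflated):", lambda_inflated)
```

A value near 1 indicates that the bulk of the test statistics follow the null distribution. Values above 1 indicate systematic inflation, for example from residual population stratification between cases and controls.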

Table 6. FST between Midwestern American, Dutch, Australian, and Nigerian populations
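
To illustrate how a pairwise FST like those in Table 6 can be estimated from genotype data, the sketch below implements Hudson's estimator, combined across SNPs as a ratio of sums. This is a swapped-in technique: the estimator and software used for Table 6 are not stated in this section. The function name `hudson_fst` and the simulated populations are likewise hypothetical.

```python
import numpy as np


def hudson_fst(geno_a, geno_b):
    """Hudson's FST between two populations, as a ratio of sums over SNPs.

    geno_a, geno_b: (n_individuals, n_snps) arrays of 0/1/2 allele counts
    for the same SNPs, with no missing data.
    """
    a = np.asarray(geno_a, dtype=float)
    b = np.asarray(geno_b, dtype=float)
    n_a, n_b = 2 * a.shape[0], 2 * b.shape[0]  # sampled chromosomes
    p_a, p_b = a.mean(axis=0) / 2.0, b.mean(axis=0) / 2.0
    # Numerator: squared frequency difference, corrected for sampling noise.
    num = ((p_a - p_b) ** 2
           - p_a * (1 - p_a) / (n_a - 1)
           - p_b * (1 - p_b) / (n_b - 1))
    # Denominator: probability that one allele drawn from each population differs.
    den = p_a * (1 - p_b) + p_b * (1 - p_a)
    return float(num.sum() / den.sum())


# Hypothetical populations: "near" barely differs from the base population,
# while "far" has strongly perturbed allele frequencies.
rng = np.random.default_rng(1)
n_snps, n_ind = 2000, 200
p_base = rng.uniform(0.1, 0.9, n_snps)
p_near = np.clip(p_base + rng.normal(0, 0.01, n_snps), 0.01, 0.99)
p_far = np.clip(p_base + rng.normal(0, 0.20, n_snps), 0.01, 0.99)

base = rng.binomial(2, p_base, (n_ind, n_snps))
near = rng.binomial(2, p_near, (n_ind, n_snps))
far = rng.binomial(2, p_far, (n_ind, n_snps))

fst_near = hudson_fst(base, near)
fst_far = hudson_fst(base, far)
print("FST base vs near:", fst_near)
print("FST base vs far:", fst_far)
```

The pattern the abstract describes corresponds to `fst_near` being close to zero while `fst_far` is clearly larger.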

Supplementary material

Beck et al. supplementary material 1 (File, 2 KB)

Beck et al. supplementary material 2 (File, 4 KB)