
Determining Zygosity in Infant Twins – Revisiting the Questionnaire Approach

Published online by Cambridge University Press:  12 July 2021

Irzam Hardiansyah*
Affiliation:
Centre of Neurodevelopmental Disorders (KIND), Division of Neuropsychiatry, Department of Women’s and Children’s Health, Karolinska Institutet, Stockholm, Sweden
Linnea Hamrefors
Affiliation:
Centre of Neurodevelopmental Disorders (KIND), Division of Neuropsychiatry, Department of Women’s and Children’s Health, Karolinska Institutet, Stockholm, Sweden
Monica Siqueiros
Affiliation:
Centre of Neurodevelopmental Disorders (KIND), Division of Neuropsychiatry, Department of Women’s and Children’s Health, Karolinska Institutet, Stockholm, Sweden; Division of Interdisciplinary Brain Sciences, Department of Psychiatry and Behavioural Sciences, Stanford University, Stanford, California, USA
Terje Falck-Ytter
Affiliation:
Centre of Neurodevelopmental Disorders (KIND), Division of Neuropsychiatry, Department of Women’s and Children’s Health, Karolinska Institutet, Stockholm, Sweden; Developmental and Neurodiversity Lab (DIVE), Division of Developmental Psychology, Department of Psychology, Uppsala University, Uppsala, Sweden
Kristiina Tammimies
Affiliation:
Centre of Neurodevelopmental Disorders (KIND), Division of Neuropsychiatry, Department of Women’s and Children’s Health, Karolinska Institutet, Stockholm, Sweden
*Author for correspondence: Irzam Hardiansyah, Email: irzam.hardiansyah@ki.se

Abstract

Accurate zygosity determination is a fundamental step in twin research. Although DNA-based testing is the gold standard for determining zygosity, collecting biological samples is not feasible in all research settings or all families. Previous work has demonstrated the feasibility of zygosity estimation based on questionnaire (physical similarity) data in older twins, but the extent to which this is also a reliable approach in infancy is less well established. Here, we report the accuracy of different questionnaire-based zygosity determination approaches (traditional and machine learning) in 5.5-month-old twins. The participant cohort comprised 284 infant twin pairs (128 dizygotic and 156 monozygotic) who participated in the Babytwins Study Sweden (BATSS). Manual scoring based on an established technique validated in older twins accurately predicted 90.49% of the zygosities with a sensitivity of 91.65% and specificity of 89.06%. The machine learning approach improved the prediction accuracy to 93.10%, with a sensitivity of 91.30% and specificity of 94.29%. Additionally, we quantified the systematic impact of zygosity misclassification on estimates of genetic and environmental influences using simulation-based sensitivity analysis on a separate data set to show the implication of our machine learning accuracy gain. In conclusion, our study demonstrates the feasibility of determining zygosity in very young infant twins using a questionnaire with four items and builds a scalable machine learning model with better metrics, thus offering a viable alternative to DNA tests in large-scale infant twin studies.

Information

Type
Articles
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Table 1. Performance of the manual algorithm, thresholds across twins and in the DZ/MZ group respectively

Fig. 1. Comparative bootstrapped performance of the three ML-based and manual approaches. Bootstrapped distributions for the manual approach are approximately Gaussian, while those for the three ML approaches are right-skewed. The manual method is the baseline for the comparisons. In all cases, MZ is the positive class; thus, sensitivity is the MZ accuracy, while specificity is the DZ accuracy. Note: ML, machine learning; MZ, monozygotic; DZ, dizygotic.
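
The convention stated in this caption (MZ as the positive class, so sensitivity is the MZ accuracy and specificity is the DZ accuracy) can be made concrete with a minimal sketch. The function name `zygosity_metrics` and the toy labels are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from typing import Dict, Sequence


def zygosity_metrics(true_labels: Sequence[str],
                     predicted_labels: Sequence[str]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity with MZ as the positive class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted labels must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == "MZ" and p == "MZ")  # MZ called MZ
    fn = sum(1 for t, p in pairs if t == "MZ" and p == "DZ")  # MZ called DZ
    tn = sum(1 for t, p in pairs if t == "DZ" and p == "DZ")  # DZ called DZ
    fp = sum(1 for t, p in pairs if t == "DZ" and p == "MZ")  # DZ called MZ

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both MZ and DZ pairs are needed to compute the metrics")

    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of MZ pairs classified correctly
        "specificity": tn / (tn + fp),   # share of DZ pairs classified correctly
    }


# Hypothetical toy labels (not study data): 4 MZ pairs and 4 DZ pairs.
true = ["MZ", "MZ", "MZ", "MZ", "DZ", "DZ", "DZ", "DZ"]
pred = ["MZ", "MZ", "MZ", "DZ", "DZ", "DZ", "MZ", "MZ"]
metrics = zygosity_metrics(true, pred)
print(metrics)
```

Swapping the positive class to DZ would simply exchange the sensitivity and specificity values, which is why the caption fixes the convention explicitly.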

Table 2. Performance of the three ML classifiers for binary classification of zygosity

Fig. 2. Illustrations of misclassification impact on heritability model estimates. (a, top-left) Biaxis plot of rMZ (blue line, left y-axis), rDZ and Δr (red line and green line, respectively, both right y-axis) along with their respective confidence band as a linear function of zygosity prediction error; (b, top-right) Biaxis plot of A (blue line, left y-axis) and C (red line, right y-axis) along with their respective confidence band as a quadratic function of zygosity prediction error; (c, bottom) Probability of false detection of C in ACE model grows much faster than a linear rate with increasing zygosity prediction error, although the nominal probability remains very small in the shown error range. The first two plots are from the Australian twin BMI data set, the third from all three data sets. Note: A, additive genetic variance; C, common (or shared) environmental factors; E, specific (or nonshared) environmental factors plus measurement error.
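
As a rough intuition for why misclassification pulls rMZ and rDZ towards each other, the following sketch mixes the two true correlations and applies Falconer's formulas. It is not the simulation-based analysis reported in the figure: it assumes equal numbers of MZ and DZ pairs, a symmetric error rate, equal trait means and variances in both groups, and closed-form Falconer estimates instead of a fitted ACE model. All function names and input values are hypothetical.

```python
from typing import Dict, Tuple


def observed_correlations(r_mz: float, r_dz: float,
                          error_rate: float) -> Tuple[float, float]:
    """Twin correlations seen after symmetric zygosity misclassification.

    Assumption: equal group sizes, so a fraction `error_rate` of each
    labelled group actually belongs to the other zygosity, and both
    groups share the same trait mean and variance.
    """
    if not 0.0 <= error_rate <= 0.5:
        raise ValueError("error_rate is expected to lie in [0, 0.5]")
    r_mz_obs = (1 - error_rate) * r_mz + error_rate * r_dz
    r_dz_obs = (1 - error_rate) * r_dz + error_rate * r_mz
    return r_mz_obs, r_dz_obs


def falconer_estimates(r_mz: float, r_dz: float) -> Dict[str, float]:
    """Falconer's closed-form estimates of A, C and E from twin correlations."""
    return {
        "A": 2 * (r_mz - r_dz),   # additive genetic variance
        "C": 2 * r_dz - r_mz,     # shared environment
        "E": 1 - r_mz,            # nonshared environment plus error
    }


# Hypothetical true correlations, chosen only for illustration.
true_estimates = falconer_estimates(0.8, 0.5)
r_mz_obs, r_dz_obs = observed_correlations(0.8, 0.5, error_rate=0.1)
biased_estimates = falconer_estimates(r_mz_obs, r_dz_obs)
print(true_estimates, biased_estimates)
```

Under these assumptions the observed difference in correlations shrinks by a factor of (1 - 2 × error rate), so A is underestimated and C is overestimated. The quadratic trends in panel (b) come from the study's fitted models and are not reproduced by this simplification.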

Supplementary material: File

Hardiansyah et al. supplementary material
