The Twins Early Development Study (TEDS), one of the largest twin cohorts in the world, investigates how genetic and environmental factors shape individual differences in cognitive abilities, educational achievement, behavior and emotions in the context of typical development. All twins born in England and Wales between 1994 and 1996 were invited to take part in the study, with over 13,000 families participating in the first wave of data collection when twins were around 18 months old. These twins have been followed longitudinally throughout childhood to emerging adulthood, with data collected from the twins themselves, including pioneering web-based cognitive testing, and from their parents and teachers. The data are also increasingly linked to national databases such as the National Pupil Database (NPD). We are currently studying TEDS participants in their early twenties, an age of great change known as emerging adulthood (Arnett, 2006). Here, we update the previous sample descriptions (Haworth, Davis, & Plomin, 2013; Oliver & Plomin, 2007; Trouton, Spinath, & Plomin, 2002) to include information about the latest waves of data collection, additional genotyping and links to national databases.
TEDS Recruitment from Infancy to Emerging Adulthood
All twins born between 1994 and 1996 in England and Wales, as identified through birth records, were invited via their parents to participate. The invitations were sent to families by the UK Office for National Statistics after screening for infant mortality, and 16,810 families expressed interest in taking part. TEDS conducted the first wave of data collection when twins were around 18 months old, obtaining demographic information, data about pregnancy and childbirth, and questions related to zygosity. After this initial wave of assessment, data have been provided at ages 2, 3, 4, 7, 8, 9, 10, 12, 14, 16, 18 and 21. Data collection has been conducted by posting questionnaires and test booklets, by telephone, by web-based platforms and, at age 21, by a smartphone app. Questionnaires and test booklets have been made available across all data collections for those preferring that method of response. Zygosity was assigned using a parent-reported questionnaire of physical similarity, which has been found to be over 95% accurate (Price et al., 2000). DNA testing was undertaken where zygosity was not clear.
In addition to the main waves of assessment, several smaller studies and spin-offs have also been conducted, as described below. Not all twins have been invited to participate in some waves of data collection, in part for financial reasons. For example, the youngest (1996) birth cohort has not been included in all assessment waves.
Informed consent was obtained from parents during childhood and from twins themselves from age 16 onwards prior to each wave of data collection. Participants were also informed of their right to withdraw from the study prior to each wave. To increase participation rates and decrease attrition, participants receive small gifts or shopping vouchers to reimburse them for their time. Other inducements include prize draws, hand-written birthday cards, annual TEDS newsletters, and the offer of work experience and career advice. In addition, TEDS researchers engage with participants via social media, actively maintaining Twitter and Facebook accounts.
These strategies have been successful in sustaining the study sample over 25 years of data collection. As shown in Table 1, although there has been some attrition, more than 8000 twin-pairs continue to take part. Furthermore, the sample remains fairly representative of the population in England and Wales in terms of ethnicity and family socioeconomic factors. For example, the proportion of families where mothers had an A-level qualification or higher was 37.8% in early childhood, 39.9% in middle childhood, 40.6% in adolescence and 41.3% in emerging adulthood, using the socioeconomic measures collected at first contact, which is comparable to UK averages for this cohort (37.9%).
Notes: Early childhood refers to families who provided any data when the twins were aged 2, 3 or 4 years; middle childhood refers to families who provided any data when the twins were aged 7, 8, 9 or 10 years; adolescence refers to families who provided any data when the twins were aged 12, 14 or 16; emerging adulthood refers to families who provided any data when the twins were aged 18–23 years.
a Only active families were contacted.
b 50% national equivalent refers to working mothers with their youngest child under the age of 2 in the UK; this is slightly higher than at TEDS, which is expected because TEDS families have multiple births.
When we compared the sample at each age to the sample at first contact, although significant differences emerged between these groups, the mean differences were generally small. The largest differences were for socioeconomic factors such as parental education and occupation, but even here the differences are less than half a standard deviation on average (see Supplementary Tables S1 and S2 for details). Logistic regression analyses indicated that family socioeconomic factors explain little variance in missing data, with the greatest variance explained (5%) by parental occupation in the emerging adulthood data (see Supplementary Table S3 for details).
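As a purely illustrative sketch of this kind of attrition analysis (not the code used for Supplementary Table S3), a logistic regression of a missingness indicator on a single socioeconomic index can be fitted and summarized as follows. The data, the variable names `ses` and `missing`, and the choice of McFadden's pseudo-R² as the measure of variance explained are all assumptions made for the example; the measure used in the supplementary analyses is not stated here.

```python
import math
from typing import List, Tuple


def fit_logistic(x: List[float], y: List[int], iters: int = 25) -> Tuple[float, float]:
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve (X'WX) * delta = gradient for the 2x2 case.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(x: List[float], y: List[int], b0: float, b1: float) -> float:
    """Bernoulli log-likelihood of the fitted model."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def mcfadden_r2(x: List[float], y: List[int]) -> float:
    """McFadden pseudo-R2: 1 - LL(model) / LL(intercept-only model)."""
    b0, b1 = fit_logistic(x, y)
    p_bar = sum(y) / len(y)
    ll_null = sum(y) * math.log(p_bar) + (len(y) - sum(y)) * math.log(1.0 - p_bar)
    return 1.0 - log_likelihood(x, y, b0, b1) / ll_null


# Hypothetical data: standardized SES index and missing-at-follow-up flag (1 = missing).
ses = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
missing = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

b0, b1 = fit_logistic(ses, missing)
r2 = mcfadden_r2(ses, missing)
print(f"slope = {b1:.3f}, McFadden pseudo-R2 = {r2:.3f}")
```

A negative slope in such a model would indicate that lower-SES families are more likely to have missing data, and the pseudo-R² gives one possible index of how much of the missingness the predictor accounts for.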
The sample remains fairly representative when considering family socioeconomic measures collected in emerging adulthood; 42.3% of mothers and 41.2% of fathers had tertiary education (national statistics: 46% and 44.6%, respectively; https://data.oecd.org/eduatt/adult-education-level.htm). Furthermore, when comparing statistics collected from twins themselves, 48% have an A-level qualification or above, which is similar to the national average of 42.1% (https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/502158/SFR03_2016__A_level_and_other_level_3_results_in_England_SFR_revised.pdf).
Measures in TEDS
Cognitive, emotional and behavioral data have been collected from the twins, their parents and their teachers over more than 20 years, with 13 major waves of assessment as illustrated in Figure 1; this was made possible largely by 25 years of continuous funding from the UK Medical Research Council.
The measures collected can be divided into six broad categories: academic achievement, cognitive development (including language, reading and mathematics), psychopathology (emotional and behavioral development), the environment (school, home and life events), physical health, and wellbeing, personality and motivation. The rings in Figure 1 represent the waves of assessment across six broad categories of measures from first contact (inner ring) to age 21 (outer ring). The major measures and timeline of data collections are presented in Supplementary Figure S1.
Measures included UK National Curriculum (NC) teacher-assessed grades, which were obtained from teachers at ages 7, 8, 9 and 10, and from both teachers and parents at ages 12 and 14. In addition, national exam results at ages 16 and 18 were collected from parents or from twins themselves. For 12,533 twins who provided written consent in 2016, we obtained official NC results from age 7 to 18 from the NPD (https://data.gov.uk/dataset/9e0a13ef-64c4-4541-a97a-f87cc4032210/national-pupil-database), which also includes information about school demographics. The TEDS data on reported achievement correlate very highly with NPD data. For example, General Certificate of Secondary Education (GCSE) grades collected at age 16 by TEDS and by NPD correlated .98 for English, .99 for Mathematics and .95 for Science (Rimfeld et al., 2018). We have also obtained information about diagnosed learning difficulties.
Cognitive development was assessed at ages 2, 3, 4, 7, 9, 10, 12, 14 and 16. As the TEDS sample is very large and geographically dispersed, in-person cognitive testing was not feasible for the whole sample. However, in-home testing was conducted for a subsample of TEDS twins (∼800 twin-pairs) at age 4.5, focusing on general cognitive ability and language ability (Colledge et al., 2002; Viding et al., 2003). As detailed in Supplementary Figure S1, cognitive data consisted of a mixture of phone, booklet and web-based data collection. At ages 2, 3 and 4, cognitive data were collected by booklets sent to the twins’ parent, who administered the tests. Telephone testing was conducted for the first time when twins were age 7, and web-based tests were launched at age 10. At each stage, we validated the phone, web and booklet testing against in-person cognitive testing, which showed high agreement in early childhood (Oliver et al., 2002; Saudino et al., 1998), in middle childhood (Petrill, Rempell, Oliver, & Plomin, 2002) and in adolescence (Haworth et al., 2007). In addition, measures assessing spatial ability and navigation skills were collected from a subsample of TEDS twins when they were between 19 and 20 years of age.
Mental health has been assessed longitudinally from age 2 to 21 using quantitative measures of behaviors widely seen in the population, including emotional symptoms (depression, anxiety and fear), conduct problems (aggressiveness, rule-breaking), hyperactivity and inattention, and prosocial behavior. There are also measures of less common behaviors including autistic symptoms, psychopathy, psychotic experiences and eating disorder symptoms. Many of these data are available from multiple raters (self-rated, parent-rated and teacher-rated).
The environment — school, home and life events
Measures of school environment were obtained at ages 9, 10, 12, 14 and 16, assessing features such as student–teacher relationships, school resources and information about homework. For those participants who gave consent to link our data to NPD, we were able to further link it to ratings of school quality by inspectors from the government agency Ofsted (https://www.gov.uk/government/organisations/ofsted). Ofsted obtains rich data about school environment, including quality of teaching, wellbeing and behavior of students at school, and alternative school provision (such as extracurricular and after-school care). Home environment was assessed at all assessment waves, covering features such as chaos at home, parent–child interactions, parental feelings and discipline, and parental monitoring and control. Life events were reported by parents at ages 4, 7, 16 and 21, and by the twins themselves at ages 16 and 21.
Physical health and wellbeing
Physical health measures were collected longitudinally from age 2 to 21, including anthropometric data such as height and weight, and various health outcomes including hearing, speech and eyesight problems, and aches, pains and medical conditions. In the near future, we plan to link TEDS data to participants’ medical records. Wellbeing was assessed at ages 16 and 21, including happiness, psychological needs, daily hassles, relationships and financial situation. Relationships including peer problems were assessed at ages 8, 12, 14, 16 and 18.
Personality and motivation
Personality was assessed by the Big 5 inventory at ages 16 and 21, with a broader range of aspects of personality (e.g., curiosity, optimism and gratitude) at age 16. Motivation with regard to school and work engagement was assessed at ages 9, 12 and 16. At age 21, we asked about issues especially relevant to emerging adulthood such as ambition, purpose in life and attitudes about education, marriage, occupation and religion.
In addition to the main TEDS assessments, there have been several studies using subsamples of TEDS twins as well as spin-off projects. The most substantial of these is the Environmental Risk project (E-Risk; http://eriskstudy.com). Other spin-off studies, primarily exploring a specific phenotype in greater depth than was feasible in the entire sample, included food and activity preferences (Wardle, Guthrie, Sanderson, Birch, & Plomin, 2001), anxiety (Eley, Gregory, Clark, & Ehlers, 2007; Waszczuk, Zavos, Gregory, & Eley, 2014), second language learning (Dale, Harlaar, & Plomin, 2012), attachment (Fearon, Shmueli-Goetz, Viding, Fonagy, & Plomin, 2014), importance of nonshared environment in educational achievement (Asbury, Moran, & Plomin, 2016) and daily changes in positive and negative affect (Zheng, Plomin, & von Stumm, 2016), among others. More recently, a core spin-off was launched collecting data from children of TEDS twins (CoTEDS; https://www.teds.ac.uk/co-teds). CoTEDS is the first study to collect data from both twin parents and their offspring from birth. To date, more than 450 children of TEDS twins have enrolled in the study (Ahmadzadeh et al., 2019).
DNA Studies in TEDS
DNA samples have been obtained from 12,500 individuals and genotyped on one of two DNA microarrays (Affymetrix GeneChip 6.0 or Illumina HumanOmniExpressExome chips). After stringent quality control, the total sample size available for genomic analyses was 10,346 (including 7026 unrelated individuals and 3320 additional dizygotic [DZ] co-twins). Of these, 7289 individuals were genotyped on Illumina arrays, and 3057 individuals were genotyped on Affymetrix arrays (see Selzam, McAdams et al., 2018, for a detailed description).
Methodological advances in genomics in recent years have been substantial, most notably the ability to calculate genome-wide polygenic scores (GPSs) that capitalize on the summary statistics of genome-wide association studies. This method offers an individualized prediction of complex traits such as academic achievement, wellbeing and mental health outcomes from DNA alone, without knowing anything about the biological pathways between genes and behavior. To facilitate research and collaboration capitalizing on both powerful GPS and the rich longitudinal data in TEDS, more than 300 GPSs have been created for a wide range of educational, psychiatric, psychological and anthropometric phenotypes throughout development (see Selzam, Coleman, Caspi, Moffitt, & Plomin, 2018, for details about the polygenic score calculation; see section below on Collaborations for detail on how to access these data).
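In its simplest form, a polygenic score is a weighted sum of an individual's effect-allele counts, with weights taken from genome-wide association summary statistics. The sketch below illustrates only this basic idea; the SNP identifiers, effect sizes and function name are hypothetical, and the procedure actually used to create the TEDS scores is the one described in the cited paper, which this example does not attempt to reproduce (methods used in practice typically also adjust the weights, for instance for linkage disequilibrium).

```python
from typing import Dict


def polygenic_score(allele_counts: Dict[str, int], effect_sizes: Dict[str, float]) -> float:
    """Sum of (effect-allele count x GWAS effect size) over SNPs present in both inputs.

    allele_counts: SNP id -> number of effect alleles carried (0, 1 or 2).
    effect_sizes:  SNP id -> per-allele effect estimate from GWAS summary statistics.
    """
    shared = sorted(allele_counts.keys() & effect_sizes.keys())
    return sum(allele_counts[snp] * effect_sizes[snp] for snp in shared)


# Hypothetical summary statistics and genotype for one individual.
gwas_betas = {"rs0001": 0.02, "rs0002": -0.01, "rs0003": 0.03}
person = {"rs0001": 2, "rs0002": 1, "rs0003": 0, "rs9999": 1}  # rs9999 has no weight and is ignored

score = polygenic_score(person, gwas_betas)
print(round(score, 4))
```

Scores computed this way are usually standardized within the sample before being related to a phenotype.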
Genotyped DZ twins
The availability of genotyped DZ twin data makes it possible to study within- as well as between-family factors. An advantage of using DZ twins over nontwin siblings is that unlike siblings, DZ twins are the same age. TEDS has a sample of 3320 genotyped DZ twin-pairs; the sample of genotyped DZ twins is also reasonably representative of the population in England and Wales (see Supplementary Tables S1–S4 for details).
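One common way to separate within- from between-family effects with DZ pairs is to split each twin's score into the pair mean (between-family component) and the deviation from that mean (within-family component), and to relate within-pair differences in a predictor to within-pair differences in an outcome. The following is a minimal sketch under these assumptions; the function names and the numbers are hypothetical and are not drawn from TEDS analyses.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (twin 1 value, twin 2 value) for one DZ pair


def split_within_between(pairs: List[Pair]) -> List[Tuple[float, float, float]]:
    """Return (pair mean, twin 1 deviation, twin 2 deviation) for each pair."""
    result = []
    for a, b in pairs:
        family_mean = (a + b) / 2
        result.append((family_mean, a - family_mean, b - family_mean))
    return result


def within_pair_slope(scores: List[Pair], outcomes: List[Pair]) -> float:
    """Regress within-pair outcome differences on within-pair score differences.

    Uses least squares through the origin, because differencing removes
    everything the two twins share (the between-family component).
    """
    dx = [a - b for a, b in scores]
    dy = [a - b for a, b in outcomes]
    denom = sum(d * d for d in dx)
    if denom == 0:
        raise ValueError("no within-pair variation in the predictor")
    return sum(x * y for x, y in zip(dx, dy)) / denom


# Hypothetical polygenic scores and outcomes for two DZ pairs.
scores = [(1.0, 3.0), (0.0, 1.0)]
outcomes = [(2.0, 6.0), (1.0, 3.0)]

print(split_within_between(scores))
print(within_pair_slope(scores, outcomes))
```

Because DZ co-twins are the same age and share their family environment, a within-pair estimate of this kind is not confounded by age or by factors that differ between families.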
The major impact of TEDS has been in the field of education and for the public more generally. Educational achievement was a relatively unexplored area of development before TEDS. TEDS research has shown that educationally relevant traits are among the most heritable behavioral traits from the early school years through compulsory education (Rimfeld et al., 2018) to university and beyond (Smith-Woolley, Ayorech, Dale, von Stumm, & Plomin, 2018) for both test scores and teacher ratings (Rimfeld et al., 2019). Genetics not only influences achievement but also permeates the choice of academic subjects (Rimfeld, Ayorech, Dale, Kovas, & Plomin, 2016) and the impact of personality on educational achievement (Smith-Woolley, Selzam, & Plomin, 2019). TEDS has combined twin and genomic analyses to discover novel findings about the importance of genetics on differences in school performance for selective versus nonselective schools (Smith-Woolley, Pingault et al., 2018) and on intergenerational educational mobility (Ayorech, Plomin, & von Stumm, 2019).
A major reason for the greater acceptance of genetics in education is that TEDS first showed in 2017 that DNA (a polygenic score based on educational attainment) predicts 9% of the variance in tested school performance (GCSE scores at age 16; Selzam et al., 2017). By 2019, the predictive power of the educational attainment polygenic score rose to 15% (Allegrini et al., 2019), making it the strongest polygenic score predictor in the behavioral sciences. TEDS has also shown that low socioeconomic status greatly decreases the chances that children with high educational attainment polygenic scores will go to university (Ayorech et al., 2019).
The impact of the DNA revolution in the behavioral sciences depends on the predictive power of polygenic scores, which is limited by single nucleotide polymorphism heritability. TEDS has been at the forefront of attempts to improve prediction using multipolygenic scores (Krapohl et al., 2018), extracting longitudinal stability (Cheesman et al., 2018) and comparing different methods of polygenic score creation (Allegrini et al., 2019). TEDS also pioneered new applications of polygenic scores exploring gene–environment correlation (Selzam, McAdams et al., 2018) and ‘p’, a genetically driven general psychopathology factor (Selzam, Coleman et al., 2018).
TEDS is also providing a lasting scientific legacy in fostering the next generation of researchers. Over the last two decades, 35 students have completed PhDs primarily using TEDS data, and now many ‘students of TEDS students’ have also completed their PhDs using TEDS data, a process that we hope will continue for many more generations.
Collaborations Past, Present and Future
TEDS has been a valuable resource for researchers across the world for more than two decades, and we have made the data freely and widely accessible (for details, see https://www.teds.ac.uk/researchers/teds-data-access-policy). Thus far, TEDS data have contributed to 426 scientific papers led by 145 researchers in 50 research institutions. Figure 2 illustrates the substantive breadth of TEDS papers. The relative size of the outer circle indicates the proportion of published papers using the broad phenotype domain as the primary phenotype. The largest proportion of papers has reported on cognitive phenotypes, followed closely by mental health phenotypes. The width of the bands (flows) connecting the broad phenotype categories indicates the multivariate nature of many TEDS papers and also points to relationships that need more attention such as the links between mental health and educational achievement.
We are currently collaborating with a wide array of researchers, including some of the largest and most influential international genome-wide meta-analytic studies. These include the Genetics of Language Consortium (http://genlang.org), the Childhood and Adolescence Psychopathology Consortium (CAPICE; http://www.capice-project.eu/index.php), Aggression in Children: Unravelling Gene–Environment Interplay to Inform Treatment and Intervention Strategies (ACTION; http://www.action-euproject.eu), the Early Genetics and Lifecourse Epidemiology consortium (EAGLE; https://www.wikigenes.org/e/art/e/348.html) and the Early Growth Genetics consortium (EGG; https://www.egg-consortium.org). As the twins have now reached adulthood, their data will also increasingly be used in adult genome-wide meta-analyses across the world, for example, through the Psychiatric Genomics Consortium (PGC; http://www.med.unc.edu/pgc).
We warmly encourage investigators and students from around the world to consider using the dataset. The TEDS data dictionary provides an overview of measures available by wave, down to the detail of each specific variable, and is publicly available via this link: http://www.teds.ac.uk/datadictionary. In addition, a substantial dataset has been released to the public about educational achievement and cognitive ability at ages 7, 9 and 10 to accompany the monograph summarizing the findings in this developmental stage in middle childhood (Kovas, Haworth, Dale, & Plomin, 2007). Data requests can be submitted using our updated data sharing protocols (see https://www.teds.ac.uk/researchers/teds-data-access-policy for details).
We are actively collaborating to ensure that our assessments increasingly dovetail with those of other major British cohorts such as ALSPAC (Fraser et al., 2013; see http://www.bristol.ac.uk/alspac/ for details), Twins UK (Moayyeri, Hammond, Valdes, & Spector, 2012; see http://twinsuk.ac.uk for details) and UK Biobank (Sudlow et al., 2015; see https://www.ukbiobank.ac.uk for details) to enable cross-study replications.
It is our hope that TEDS continues to be at the forefront of research studying how genetic and environmental factors shape individual differences in cognitive, emotional, behavioral and health outcomes in the population. In addition to plans for new data collections about adult outcomes, we are actively working to link our data to national data registers in the UK, such as Higher Education Student Data and National Health Service records including those from primary and secondary care.
The DNA revolution has led to powerful genomic predictors of cognitive and behavioral outcomes. This relatively new form of data also enriches research on the developmental interplay between genes and environment using longitudinal datasets such as TEDS. TEDS datasets allow a multimethod approach to study how genotypes develop into phenotypes.
To contribute to global scientific efforts toward open science, TEDS is committed to the Open Science Framework (OSF). Since spring 2019, all collaborators have been required to register their analysis plans with OSF prior to obtaining TEDS data (https://www.teds.ac.uk/researchers/teds-data-access-policy) in order to improve transparency of methodology, promote scientific rigor and enhance research quality.
TEDS offers an outstanding resource with rich phenotypes collected longitudinally from infancy to emerging adulthood, enabling genetically sensitive investigations using quantitative genetic and genomic analyses, and actively encourages collaborations. TEDS is committed to advancing our understanding of gene–environment interplay in the development of individual differences across a wide range of phenotypes and is part of global efforts for scientific discovery.
To view supplementary material for this article, please visit https://doi.org/10.1017/thg.2019.56.
We gratefully acknowledge the ongoing contribution of the participants in the Twins Early Development Study (TEDS) and their families.
TEDS is supported by a program grant to RP from the UK Medical Research Council (MR/M021475/1 and previously G0901245), with additional support from the US National Institutes of Health (AG046938). The research leading to these results has also received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007–2013)/grant agreement no. 602768. RP is supported by a Medical Research Council Professorship award (G19/2). TE is part-funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. MM is partly supported by a David Wechsler Early Career Grant for Innovative Work in Cognition. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Conflict of interest
The authors declare no competing interest.
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional guides on the care and use of laboratory animals.