Nearly all first depressive episodes are triggered by a stressful life event (e.g. death of a loved one, divorce, job loss) or a stressful period of life (e.g. starting college, medical internship, military deployment). We have long known that individuals with high genetic liability, as measured by family history, are prone to become depressed in response to stress, indicating the presence of gene × environment (G × E) interaction in depression (Kendler 2004). However, until recently, advances in our knowledge about the nature of these G × E interactions had stalled.
Specifically, after many smaller genome-wide association studies (GWAS) with no or few significant findings, recent GWAS of depression have utilised very large samples to identify a growing number of associated loci, with a recent meta-analysis finding 178 depression-associated loci in a sample of >1 million participants (Levey 2020). This progress opens the door to gaining insights into G × E interactions in depression but requires different study designs and approaches to maximise understanding.
Two major forms of G × E interaction: diathesis–stress and differential susceptibility
How genes and environment interact can be simple or complex (Haldane 1938). Figure 1 illustrates additive effects and the two simplest forms of G × E interaction (diathesis–stress and differential susceptibility).
In the diathesis–stress model (Fig. 1(b)), the vulnerability or risk allele increases the likelihood of an environmental stressor leading to the outcome (such as depression), whereas the resilience allele leaves the individual unaffected. Such risk alleles should be identifiable by traditional GWAS with a large enough sample, even if the environment is not known. These vulnerability alleles are, however, more easily detectable in studies in which all participants are exposed to the risk environment, such as post-traumatic stress disorder (PTSD) studies where all participants were deployed in military action, addiction studies in which all participants are from families who drink or smoke, or special studies of stress-induced depression such as the Intern Health Study (Sen 2010) or studies of depression after childbirth or major medical stressors such as mastectomy or myocardial infarction.
In the differential susceptibility model (Belsky 2009) (Fig. 1(c)), individuals differ in ‘plasticity’, and positive and negative environments can have opposite effects – people without the plasticity allele are less affected by the environment, whereas those with the plasticity allele are at higher risk in the negative environment but also benefit more from the positive environment. GWAS in this scenario are predicted to give no or contradictory findings, as results depend on the environment – a study among people who do not drink or who live under low stress may give opposite findings from a study performed in an environment with high alcohol use or high stress, and large meta-analyses that mix all types of environment are not likely to yield any significant findings. The association of the serotonin transporter promoter polymorphism with depression in the presence of stress may be a case in point, as the risk variant under stress is also more protective in a beneficial environment (van IJzendoorn 2012), and meta-analyses have come to contradictory conclusions (Karg 2011; Culverhouse 2018).
From a translational perspective, if genetic risk is primarily due to plasticity factors, it suggests that individuals at high genetic risk might also benefit disproportionately from behavioural or environmental interventions (Belsky 2009).
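The contrast between the three patterns in Fig. 1 can be made concrete with a toy calculation. The sketch below is illustrative only: the function names, the coefficients and the coding of the environment from −1 (supportive) to +1 (adverse) are assumptions chosen for simplicity, not values estimated from any study. It also simplifies the differential susceptibility model by letting non-carriers be entirely unaffected by the environment.

```python
# Toy risk functions for the three patterns in Fig. 1.
# All coefficients are arbitrary assumptions; env runs from -1 (supportive)
# to +1 (adverse); allele is 1 for carriers and 0 for non-carriers.

def additive(allele: int, env: float) -> float:
    """Gene and environment each shift risk independently."""
    return 0.2 * allele + 0.3 * env


def diathesis_stress(allele: int, env: float) -> float:
    """The vulnerability allele matters only under adversity (env > 0)."""
    return 0.3 * allele * max(env, 0.0)


def differential_susceptibility(allele: int, env: float) -> float:
    """The plasticity allele amplifies the environment in both directions."""
    return 0.3 * allele * env


def marginal_allele_effect(model, frac_adverse: float) -> float:
    """Average carrier-minus-non-carrier difference in a population where a
    fraction frac_adverse lives at env = +1 and the rest at env = -1.
    This is what an analysis that ignores the environment would see."""
    def carrier_difference(env: float) -> float:
        return model(1, env) - model(0, env)

    return (frac_adverse * carrier_difference(1.0)
            + (1.0 - frac_adverse) * carrier_difference(-1.0))


if __name__ == "__main__":
    for model in (additive, diathesis_stress, differential_susceptibility):
        for frac in (0.1, 0.5, 0.9):
            effect = marginal_allele_effect(model, frac)
            print(f"{model.__name__:28s} adverse={frac:.1f} effect={effect:+.2f}")
```

Under these assumptions the vulnerability allele shows a positive marginal effect whenever some participants are exposed to adversity, whereas the plasticity allele's marginal effect changes sign with the mix of environments and cancels in an evenly mixed sample, which is the pattern the text predicts for pooled GWAS.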
An illustrative example is the study of single-nucleotide polymorphisms (SNPs) in the gene for the α5 subunit of the nicotinic acetylcholine receptor that had long been associated with number of cigarettes smoked (Li 2009). A recent study (Taylor 2014) investigated the effect of this locus on weight (body mass index, BMI). As expected from the effect of smoking in reducing body weight, the study found that the risk haplotype (the minor allele) was associated with reduced BMI in current smokers. However, against expectations, the authors found that people who had never smoked showed an increase in BMI for the same allele (Fig. 2).
To see even a nominally significant effect of the locus on BMI without stratifying by smoking status – as is common in GWAS of BMI – would require a sample of >750 000 participants. In fact, GWAS of obesity/BMI/weight have not identified this locus (Locke 2015; Yengo 2018). Further, one would expect contradictory results if a study were performed in populations with low versus high prevalence of smoking. Hence, future genetic studies of BMI should consider smoking status.
High stress paradigms: physician stress, childbirth and PTSD
One way to overcome the challenge of detecting the effect of genetic alleles on the trait is to sample at one extreme of the environment (Kendler 2006). For major depressive disorder (MDD), it is clear that stress is a major trigger, hence one currently used paradigm to study factors involved in depression is to use a population under high stress. Examples are the first year of medical residency, as in the Intern Health Study (Sen 2010), military deployment of veterans (Nievergelt 2019) or the specific stressor of childbirth as a trigger of post-partum depression (Bauer 2019). In the Intern Health Study, we found that polygenic risk scores from large case–control GWAS in MDD could predict risk but were particularly strong in predicting resilience to MDD under this stress (Fang 2020). Such studies do not study the interaction per se, but use of a high-stress paradigm favours identification of genetic factors that act in the presence of stress, regardless of whether they are vulnerability or plasticity factors. Such studies where one stressor is consistently present may be helpful to identify both genetic and environmental resilience factors (Kim-Cohen 2012), a topic of great interest in positive psychology and of clinical relevance.
FKBP5 and GABRA2: two examples of working G × E models
Historical candidate genes of stress and depression in general have not fared well (Border 2019), but the following two examples seem to have survived the test of time.
FKBP5, encoding FK506-binding protein 51
Stress activates the hypothalamic–pituitary–adrenal (HPA) axis, which in turn stimulates the release of glucocorticoids. These bind to glucocorticoid receptors, which activate a number of genes, including FKBP5 (which encodes the FK506-binding protein 51 (FKBP51)), that then dampen the HPA axis (Matosin 2018; Menke 2019). FKBP51 is a co-chaperone of the glucocorticoid receptor. Its gene, FKBP5, is activated by the glucocorticoid receptor. This activation is dampened by methylation (Fig. 3). The rarer T allele of one SNP in the enhancer of FKBP5 increases the binding of the transcriptional machinery and hence increases the glucocorticoid-receptor mediated HPA dysregulation of the stress response and increases risk for MDD in those subjected to childhood trauma (Matosin 2018), and for PTSD (Fig. 3; Kang 2019). The mechanism appears to be that the C allele can be methylated, e.g. by a stressful experience such as PTSD, leading to a dampened HPA axis response, whereas the risk T allele cannot be methylated and its carriers are less buffered against the traumatic experience.
GABRA2, encoding the α2 subunit of the GABAA receptor
The GABRA2 gene emerged as a candidate risk gene from linkage studies (Long 1998; Edenberg 2004), and meta-analyses confirmed its association with alcoholism (Zintzaras 2012; Li 2014), which we showed was in part mediated through impulsivity (Villafuerte 2012, 2013). Although GABRA2 appeared as a traditional risk vulnerability factor in the original studies, which were based in families with high levels of alcoholism, our studies show that the GABRA2 ‘risk’ haplotype is best understood as a plasticity allele, as individuals with the plasticity allele are more influenced by parents (Trucco 2016) or by peers (Villafuerte 2014; Trucco 2017), in both a negative and a positive manner (Fig. 4). A recent study of 11 000 individuals confirmed GABRA2 as a genetic plasticity factor with regard to behavioural outcomes (Schlomer 2020), whereas GWAS of alcoholism or alcohol intake (Gelernter 2019; Kranzler 2019) did not identify GABRA2 as a risk variant. The molecular mechanism is still not understood as the plasticity allele of GABRA2 is a haplotype involving >200 SNPs (Enoch 2008), none of them known to be functional.
From large GWAS hits to polygenic effects
GWAS, as successful as they have been in identifying thousands of risk variants (Mills 2019), have one huge disadvantage: by testing millions of SNPs, the number of statistical tests is large, and therefore a P-value of 5 × 10⁻⁸, correcting for 1 million independent tests, is considered the statistical threshold of significance. That means that extremely large samples (in the hundreds of thousands) are needed to identify >100 genome-wide significant hits, which usually explain only a few per cent of the risk. There is a linear relationship between significant loci and sample size, suggesting that many more potential hits can be found (Visscher 2012).
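The genome-wide threshold quoted above follows from a Bonferroni correction. As a minimal sketch, assuming the conventional family-wise error rate of 0.05 (a value not stated in the text) and the 1 million independent tests mentioned above, the arithmetic is:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumed family-wise error rate of 0.05 across 1 million independent tests.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
print(f"{genome_wide:.0e}")  # prints 5e-08
```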
Polygenic risk scores (PRS) have started to overcome this dilemma (Wray 2007). A PRS is calculated from a large GWAS on all or a subset of SNPs (e.g. the top 1% or top 10%). After accounting for the first 5–10 principal components, all selected SNPs are weighted by effect size on the trait of interest, accounting for the direction of the effect. In this manner, a participant in a new study not overlapping the original GWAS can be assigned a single PRS, and where that score falls within the distribution of the group indicates the degree to which they are at risk. For example, in the Intern Health Study, we found that medical interns in the lowest 5% of PRS were highly significantly protected from depression during the internship year, whereas those in the top 5% were at significantly higher risk, but the difference between the top 5%, top 30% and average PRS was small (Fang 2020), indicating that in a high-stress environment, genetic differences of resilience are more prominent.
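The core of the PRS calculation described above is a weighted sum. The following is a minimal sketch under stated assumptions: the SNP weights, the genotypes and the participant labels are invented for illustration, and the sketch omits SNP selection, the adjustment for principal components and any handling of correlated SNPs that a real pipeline would need.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele counts (0, 1 or 2 per SNP), each weighted by its
    signed GWAS effect size, so protective alleles lower the score."""
    if len(dosages) != len(weights):
        raise ValueError("need one weight per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


def percentile_rank(score, cohort_scores):
    """Percentage of the cohort with a score strictly below the given score."""
    below = sum(1 for s in cohort_scores if s < score)
    return 100.0 * below / len(cohort_scores)


# Hypothetical signed effect sizes for three SNPs from a discovery GWAS.
weights = [0.02, -0.01, 0.03]

# Hypothetical effect-allele counts for four participants in a new cohort.
cohort = {
    "A": [2, 0, 1],
    "B": [0, 2, 0],
    "C": [1, 1, 1],
    "D": [2, 1, 2],
}

scores = {pid: polygenic_score(g, weights) for pid, g in cohort.items()}
for pid, score in scores.items():
    rank = percentile_rank(score, list(scores.values()))
    print(f"{pid}: PRS={score:+.2f}, percentile={rank:.0f}")
```

Ranking each participant within the cohort, rather than reading the raw score, mirrors how the text interprets a PRS: by where it falls relative to the rest of the group.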
Moving from candidate genes to GWEIS
Given the tentative evidence from a few candidate genes for plasticity in interaction with the environment, the field is now moving to genome-wide interaction studies, originally called gene–environment-wide interaction studies (GEWIS) (Thomas 2010) but now more commonly known as genome-wide environment interaction studies (GWEIS). A recent GWEIS considering the interaction between depression and life stress in two UK population-based cohorts (Generation Scotland and the UK Biobank) has not, however, yielded genome-wide significant SNPs, although gene-specific assessment put some candidate genes into prominence (Arnau-Soler 2019). As with G × E studies of candidate genes, whether asking about significant life events in the past 6 months is the best way of measuring stress in such large data-sets remains to be seen. In our opinion, we can have more confidence in stress measured during combat than in stress that happens to be reported at the time of recruitment in such a large population-wide data-set across many age groups and life risks.
Not all environment is purely environmental, and not all genetics is genetic!
The ability to perform genetic analysis using polygenic risk scores (PRS) on large samples has opened a new field, studying the influence of genes on continuous traits (e.g. educational achievement; Selzam 2017) that are known to have many socioeconomic influences. Such genetic studies are highly controversial. Since ‘risk of’ a continuous trait such as educational achievement is meaningless, it is more appropriate to use the broader term polygenic score (PGS) (also called the genome-wide polygenic score, GPS).
We think of genetic factors as clearly distinct from environmental factors. However, when it comes to behaviour, the lines are not distinct. For lifestyle choices and many other environmental factors, genetics may contribute predictors of behaviours and propensities to certain behaviours that may usually be seen as environmental: the propensity for smoking clearly has genetic contributions, ‘accident-proneness’ in children may be linked to personality traits of ‘sensation seeking’ or low ‘harm avoidance’, and two well-recognised ‘environmental’ factors contributing to depression, experience of childhood trauma and life stressors (Kendler 1998), actually have a genetic propensity.
On the other hand, although overall population effects are usually accounted for by removing the effects of the first principal components when calculating PGS, some social factors may remain. So when the PGS for educational attainment explains 8% of a 16-year-old's test score (Selzam 2017), this does not necessarily mean that 8% of the score is predicted genetically. For example, socioeconomic background and local migrations may reflect long-standing biases in society of how people migrated and which schools they can attend (Abdellaoui 2019).
This new field of social genomics (Adam 2019) will return to old controversies of nature versus nurture with new insights. By placing as much attention on analysing the environment as on the millions of genetic alleles, a better recognition of G × E interactions will bring both genetic and sociological advances. Personalised medicine when it comes to depression may not be limited to ‘which drug works best’ for a given patient, but will include recognition that some patients may respond differently to various types of behavioural therapy. Especially when it comes to behaviour, our genetic background does not determine our destiny; rather, our destiny is plastic, and modifying it may involve optimising environments according to our genetic background.
M.B. wrote the first draft and designed Figs 1, 2 and 4. S.S. extensively edited the manuscript.
S.S. gratefully acknowledges funding from the National Institute of Mental Health (grant MH101459).
Declaration of interest