Over the past two decades a growing body of research has linked low birth weight to a variety of adult health outcomes, first and most compellingly for cardiovascular disease (Barker et al. 1989; Barker, 2004; Huxley et al. 2007), but also for hypertension and type II diabetes (Barker et al. 2006; Whincup et al. 2008) and less clearly some mental disorders (Jones et al. 1998; Breslau & Chilcoat, 2000). These findings support what is variously termed the ‘Barker hypothesis’, ‘foetal origins hypothesis’ or ‘developmental origins of adult health and disease hypothesis’ (Barker, 2007). It describes developmental plasticity in the foetus and young infant as a mechanism for permanently adjusting or ‘programming’ aspects of its physiology in response to the intrauterine and post-natal environment. This may provide adaptive fitness: exposure to poor nutrition in early life may programme glucose–insulin metabolism to maximise fitness by increased insulin resistance. The survival advantage gained in better handling early malnutrition may come at the cost of increased susceptibility to chronic disease in later life (Barker, 1995; Heijmans et al. 2008). However, the evidence is not universally supportive: a meta-analysis of >50 studies assessing the impact of low birth weight on hypertension concluded that previously reported effects were likely to be attributable to random error, reporting biases, inappropriate control of confounders and a weakening of the effect reported by studies over time (Huxley et al. 2002).
The authors suggested that the evidence for other associations between low birth weight and adult health (e.g. depression) should be submitted to similarly critical scrutiny.
Population-based cohort studies have presented mixed results and there is currently no consensus whether low birth weight is associated with depression in later life (Thompson et al. 2001; Gale & Martyn, 2004; Osler et al. 2005; Inskip et al. 2008). High-risk populations, including those exposed to famine, provide particularly strong evidence of an association. For example, Brown et al. (2000) found that men and women exposed to the Dutch Hunger Winter of 1944–45 during their second or third trimesters were at significantly increased risk of affective disorder [odds ratio (OR) 1.54, 95% confidence interval (CI) 1.12–2.13 for exposure in the second trimester]. However, the interpretation of this finding is complex and difficult to generalise to less extreme conditions. Famine no doubt represents a complex cluster of exposures, which in the perinatal period could include extreme material deprivation, maternal ill health and severe stress. There is a need to consolidate the epidemiological data from population studies under more normal circumstances.
We conducted a systematic review of the published literature to determine whether there was evidence to support an association between low birth weight, defined as <2500 g, and depression or psychological distress in later life.
Inclusion of studies
A literature search was carried out using the PsycINFO (1967–2011), EMBASE (1980–2011) and Medline (1950–2011) databases and the Cochrane Library (final searches completed on 12 April 2011). The search string used was [Birth weight] AND [depression OR depressive disorder OR affective disorder OR psychological distress]. On completion of the literature search, a citation search was carried out using all the included records.
Inclusion criteria were that: (i) the study reported primary research; (ii) birth weight was the exposure variable; (iii) depression, depressive symptoms or psychological distress were the outcomes and were measured in the participant using self-rating scales or diagnostic interviews. This included anxiety and depression subscales of broader psychopathology questionnaires. We included the broader category of ‘psychological distress’ in recognition that some measures (for example, the Malaise Inventory; Rutter, 1970) are designed to capture symptoms of anxiety and depression together and therefore overlap considerably with more specific depression scales or diagnostic interviews. We conducted a sensitivity analysis to determine whether inclusion of this broader category impacted on our results.
We included both cohort and case–control studies. Studies of clinical cohorts of premature or low weight births compared to normal weight births were included. We excluded other cohorts at ‘high risk’ for the exposure, such as studies on specific risk subgroups (e.g. malnourished mothers, famine). We also excluded studies that used health register data to identify outcomes.
Records were screened for inclusion by title and abstract. The full text of shortlisted records was independently assessed for eligibility by two investigators (W.W. and W.L.). Where more than one study reported analyses from the same sample, the study that (a) best fitted the inclusion criteria and then (b) included the largest sample size in analyses was selected and the other(s) excluded. We used a data extraction sheet to record key information such as study design, the method used to record birth weight and the method used to ascertain depression. We assessed study quality on a number of criteria: whether the participation rate was reported; the participation rate; correction for gestational age; correction for other confounders; and sample size. Any discrepancies were discussed and, if necessary, referred to the senior author (M.H.).
Researchers have taken differing approaches to presenting the relationship between birth weight and later outcomes. Some have assessed a binary outcome (below or above a certain level) whereas others have modelled the impact of birth weight expressed as a continuous variable. Whilst the latter approach has the advantage of greater statistical power if the underlying relationship is a linear one, it is unclear whether the effect of birth weight on later outcomes would indeed be linear across the observed range of weights. We therefore extracted binary variables for exposure and outcome. This allowed us to test the hypothesis that an identifiable low birth weight (commonly accepted as <2500 g) could be associated with subsequent adult depression. For studies that used continuous variables in their published analyses, authors were contacted to ask for analyses using binary variables.
The principal analysis was on the binary exposure and binary outcome. A random effects meta-analysis was conducted to combine the estimates from the studies. Random effects was preferred over fixed effects as we expected moderate heterogeneity in effect sizes; however, we report estimates using both methods. Adjusted log odds ratios were used where available and, where unavailable, raw summary data were used to calculate the unadjusted log odds ratio. Studies were categorised by whether estimates were crude or adjusted for possible confounders and, if they were, whether they adjusted for gestation. We calculated the I² statistic to estimate the extent of heterogeneity. All analyses were carried out using Stata 10 (StataCorp, USA).
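As an illustration only, the pooling steps described above can be sketched in a few lines of Python. The source does not state which random effects estimator was used; the sketch assumes the DerSimonian-Laird method, and the function names and the 2×2 counts in the usage example are hypothetical rather than taken from any included study.

```python
import math


def log_or_from_counts(a, b, c, d):
    """Unadjusted log odds ratio and its standard error from a 2x2 table.

    a: low birth weight, depressed      b: low birth weight, not depressed
    c: normal birth weight, depressed   d: normal birth weight, not depressed
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def pool(estimates):
    """Pool (log_or, se) pairs by inverse-variance weighting.

    Assumption: DerSimonian-Laird estimate of the between-study variance.
    """
    y = [est for est, _ in estimates]
    w = [1 / se ** 2 for _, se in estimates]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed effects mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    # Random effects weights add tau2 to each within-study variance
    w_re = [1 / (se ** 2 + tau2) for _, se in estimates]
    random = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_random = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(random),
        "random_ci": (math.exp(random - 1.96 * se_random),
                      math.exp(random + 1.96 * se_random)),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical 2x2 counts for three imaginary studies (not real data)
    tables = [(12, 88, 90, 910), (30, 170, 240, 1760), (8, 42, 100, 850)]
    print(pool([log_or_from_counts(*t) for t in tables]))
```

Where a study supplies an adjusted odds ratio with a confidence interval instead of counts, its log odds ratio and standard error would be passed to `pool` directly.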
A funnel plot and a trim-and-fill technique were used to assess and adjust for publication bias (Egger et al. 1997; Duval & Tweedie, 2000). Hypothesised sources of heterogeneity were explored using meta-regression. These were: (i) the degree of adjustment of study estimates (i.e. unadjusted; adjusted for variables not including gestation; adjusted for variables including gestation); (ii) the outcome measure (depression; psychological distress); (iii) sampling strategy (representative; selective); (iv) the proportion of women in the sample of each study. Studies were categorised by sampling strategy according to whether this was representative of the local population or selective, for example, by gender or low birth weight. Population-based cohorts, case–control studies nested within these and unselected hospital-based cohorts were classed as ‘representative’. Split-cohort designs (comparing very low birth weight babies with healthy weight babies), single gender studies and other approaches were classed as ‘selective’.
Description of eligible studies
The search strategy returned 1739 records. Altogether, 26 met all criteria after two stages of screening (see Fig. 1 and Table 1 for study details). Of the 26 studies, 11 supported the hypothesis that low birth weight is associated with later depression and 15 did not find a statistically significant association. Of the 15 null studies, 13 reported positive findings for subgroups or other measures in the abstract. The mean age of participants within each study ranged from 11 to 85 years. Two studies used a sample restricted by gender (women only) (Inskip et al. 2008; Gudmundsson et al. 2011). Nineteen studies reported on samples in Europe, five in North America and two in Australia.
adj, Adjusted; gest, gestation; OR, odds ratio; HR, hazard ratio; CI, confidence interval; LBW, low birth weight (<2500 g); ELBW, extremely low birth weight (<1000 g); Dep, depression; Dis, psychological distress; BDI, Beck Depression Inventory; CAS, Children Assessment Schedule; CDI, Children's Depression Inventory; CES-D, Center for Epidemiologic Studies – Depression; CIDI, Composite International Diagnostic Interview; DIS, Diagnostic Interview Schedule; GHQ, General Health Questionnaire; GDS, Geriatric Depression Scale; GMS, Geriatric Mental State B; HADS, Hospital Anxiety and Depression Scale; Kiddie-SADS-E, Schedule for Affective Disorders and Schizophrenia for School-Aged Children, Epidemiologic Version; OCHS-R, Ontario Child Health Study-Revised; PSE, Present State Examination; PSF, Psychiatric Symptom Frequency; SADS-LA, Schedule for Affective Disorders and Schizophrenia-Lifetime Version Modified for the Study of Anxiety Disorders; SCID, Structured Clinical Interview for DSM Disorders; SCL, Symptom Check List; YASR, Achenbach Young Adult Self-Report.
Table 1 lists outcome measures used in the included studies, categorised by depression or psychological distress. Eleven studies reported measures of psychological distress as the outcome. These measures included the combined anxiety and depression score on the Hospital Anxiety and Depression Scale (HADS; Berle et al. 2006), and other scales such as the Achenbach Young Adult Self-Report (Hack et al. 2004) or the Hopkins Symptom Checklist (Haavind et al. 2007).
Fifteen studies reported measures of depression as an outcome. Depression scales included screening tools such as the HADS (Mallen et al. 2008) and the Center for Epidemiologic Studies – Depression (Alati et al. 2007). One scale, the Malaise Inventory, was used as a measure of psychological distress by one study (Cheung et al. 2002) and as caseness for depression by another (Gale & Martyn, 2004).
Twenty-two studies had a cohort design and 14 utilised non-selective sampling strategies and contemporaneous birth records to obtain birth weight. These included prospective studies that took all births from an entire population or a hospital and followed the sample to adult life and a retrospective cohort study that used birth records (Mallen et al. 2008).
Four cohort studies were of a split-cohort design, whereby cohorts of exposed and non-exposed individuals were compared (Elgen et al. 2002; Saigal et al. 2003, 2007; Hack et al. 2004; Raikkonen et al. 2008). Two studies were cross-sectional with self-reported birth weights (Bellingham-Young & Adamson-Macedo, 2003; Inskip et al. 2008). Two case–control studies were also included, of which one was nested in a cohort study (Patton et al. 2004) and one was not (Preti et al. 2000).
The overall quality of studies was variable. The sample size for 28% of the studies was <500. Of the 18 studies included in the meta-analysis, 6% failed to report a participation rate. The remainder reported participation rates varying from 24% to 95%, with a median of 72%, and four studies reported participation rates of <50%. The participation rate was calculated as the proportion of the population who could participate and did so fully (in some instances, these figures will have included deaths and other losses to follow-up). Non-participation may lead to the under-ascertainment of individuals with depression, as has been demonstrated in a study of non-participants (Knudsen et al. 2010). Altogether, 67% failed to control for gestational age and only 28% considered other confounders, with a wide range of confounders considered. No study simultaneously controlled for gestational age, maternal socio-economic status, maternal smoking, family history of depression and gender, all of which could plausibly confound the association between low birth weight and depression.
Association between low birth weight and depression
Estimates derived from appropriate binary measures of birth weight and depression were available for 18 of the 26 studies (Fig. 2), including nine for which authors supplied the required information by correspondence (16 were contacted).
The random effects estimate for all 18 studies was OR 1.15 (95% CI 1.00–1.32), with moderate heterogeneity (I² = 34.5%, 95% CI <0.1–62.9, χ² = 25.96, df = 17, p = 0.075). The fixed effects model gave the slightly smaller OR and CI of 1.12 (1.01–1.23).
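The reported heterogeneity figures are internally consistent, which can be checked with the usual definition of I² as the share of the heterogeneity statistic in excess of its degrees of freedom. The short Python check below assumes that definition; the function name is ours, not the authors'.

```python
def i_squared(q, df):
    """I-squared as a percentage: 100 * (Q - df) / Q, floored at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Heterogeneity statistic and degrees of freedom as reported in the text
print(round(i_squared(25.96, 17), 1))  # prints 34.5
```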
A funnel plot (Fig. 3) provided evidence of publication bias and the trim-and-fill technique was used to impute small missing null or negative studies. This imputed four missing studies, which resulted in a non-significant association at the 5% level (OR 1.08, 95% CI 0.92–1.27). The fixed effects model using trim and fill gave a similar effect size (OR 1.08, 95% CI 0.98–1.19). The results were similar if the analysis was restricted to studies with the outcome of case-level depression only (not shown).
Investigation of heterogeneity using meta-regression
Studies were compared on two quality measures: first, the degree of adjustment for confounding and, second, the representativeness of the study samples. The six studies that published crude estimates (Hack et al. 2004; Patton et al. 2004; Batstra et al. 2006; Colman et al. 2007; Herva et al. 2008; Inskip et al. 2008) were compared to the six studies that corrected for variables not including gestation (Fan & Eaton, 2001; Thompson et al. 2001; Elgen et al. 2002; Mallen et al. 2008; Raikkonen et al. 2008; Vasiliadis et al. 2008). The ratio of their estimates from meta-regression was 1.05 (95% CI 0.62–1.77), compatible with no difference. Similarly, when the six studies' crude estimates were compared to those from the seven studies that corrected for variables including gestation (Gale & Martyn, 2004; Wiles et al. 2005; Berle et al. 2006; Alati et al. 2007; Haavind et al. 2007; Raikkonen et al. 2007), no significant difference was found (OR 1.10, 95% CI 0.72–1.68).
Likewise, studies using more representative samples were similar to those using selective methods (OR 1.29, 95% CI 0.90–1.87).
Additional potential sources of heterogeneity investigated were the outcome used and the gender mix of the samples. Studies where psychological distress was the outcome were found to have similar findings to studies where depression was the outcome (OR 1.18, 95% CI 0.85–1.64). Finally, the proportion of women in the sample of each study (fitted as a continuous variable) was not associated with variation in the association between low birth weight and depression (OR 0.82, 95% CI 0.29–2.33). The ratio describes the predicted relationship of estimates from notional all-female and all-male studies in the meta-analysis.
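To make the interpretation of these ratios concrete, the sketch below fits a single-covariate meta-regression of study log odds ratios by inverse-variance weighted least squares and exponentiates the slope. It is a simplified fixed effects version written for illustration; the source does not say which meta-regression estimator was used, and the function name and example values are hypothetical. For a binary covariate (coded 0 or 1) the result is the ratio of pooled odds ratios between the two groups of studies; for the proportion of women (ranging from 0 to 1) it is the predicted ratio between a notional all-female and a notional all-male study.

```python
import math


def meta_regression_ratio(log_ors, ses, covariate):
    """Ratio of odds ratios per unit of covariate, with a 95% CI.

    Assumption: inverse-variance weights with no between-study variance
    term (a fixed effects meta-regression).
    """
    w = [1 / s ** 2 for s in ses]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    if sxx == 0:
        raise ValueError("covariate does not vary across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, log_ors))
    beta = sxy / sxx                 # slope on the log odds ratio scale
    se_beta = math.sqrt(1 / sxx)     # standard error with known variances
    ratio = math.exp(beta)
    ci = (math.exp(beta - 1.96 * se_beta), math.exp(beta + 1.96 * se_beta))
    return ratio, ci


if __name__ == "__main__":
    # Hypothetical studies: covariate 0 = crude estimate, 1 = adjusted
    print(meta_regression_ratio([0.10, 0.25, 0.05, 0.20],
                                [0.20, 0.15, 0.25, 0.10],
                                [0, 0, 1, 1]))
```

A confidence interval that spans 1, as in all of the comparisons reported here, is compatible with no difference between the groups of studies.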
Studies not included in the meta-analysis
Of the eight papers excluded from the meta-analysis, two reported analyses that did not show a significant association, five reported analyses that did, and one reported an association for women but not men.
In our meta-analysis of 18 studies, we have demonstrated a weak association between low birth weight and later depression, including psychological distress, which became non-significant after correction for probable publication bias. The same was true if the analysis was restricted to studies examining depression only. The interpretation of our findings depends on whether correction for publication bias is seen as appropriate. We think that publication and reporting biases are present for two reasons. First, the funnel plot (Fig. 3) suggested a small number of missing small null studies, which would be expected if publication bias is operating. Second, of the 13 studies that reported null findings in relation to our predetermined definitions of low birth weight and depression outcomes, the majority (n = 11) emphasised ‘positive’ findings in their abstracts, such as subgroup analyses or secondary exposures. Only two of these null studies did not prioritise other positive findings (Inskip et al. 2008; Vasiliadis et al. 2008). This suggests a preference for authors and journals to report positive findings while null findings are often downplayed, a phenomenon widely recognised (Fanelli, 2010). When we corrected for publication bias by using ‘trim-and-fill’ techniques, the effect became statistically non-significant and the effect size reduced. Steichen (2000) notes that the trim-and-fill procedure makes numerous assumptions and it therefore requires careful interpretation rather than being seen as an infallible correction, particularly given the fact that there was some heterogeneity between studies.
Nonetheless, we think there is evidence of reporting and publication bias and that our meta-analysis should be interpreted as indicating no compelling association between low birth weight and adult depression or psychological distress in the populations studied.
There was, however, heterogeneity in the studies reviewed, with a wide age range of participants, outcomes used and populations. There may therefore be subgroups where the association between low birth weight and depression is stronger than that seen overall. Our exploration of modifier variables using meta-regression did not find that any explained a significant amount of the heterogeneity in the association. These included quality indicators such as the degree to which possible confounders were controlled for, the outcome measures used and how much the design resembled a population-based study, and other factors such as the proportion of women in the sample. However, in an exploration of heterogeneity one is dependent upon the quality of the reporting in the source studies and some variables that might plausibly contribute to heterogeneity (e.g. family history of depression, smoking, maternal physical health) were not routinely reported.
Another potential source of heterogeneity could have been that some studies included extremely low birth weight or very low birth weight infants recruited from specialist hospital environments and compared these to controls of normal birth weight. These study samples would have comprised markedly premature births compared to term births. This contrasts with population studies where the low birth weight infants would have been less severely underweight and premature. It would be expected that such studies would be more likely to show an association of low birth weight with depression, because of confounding by the challenge of being born developmentally immature and therefore more likely to suffer health difficulties (Irving et al. 2000; Walther et al. 2000). In fact, of three such studies, none supported the hypothesis.
Two studies did not record birth weight from contemporaneous records, relying on self- or parent-report for the majority of recruited subjects. The use of self-report measures of birth weight may lead to recall bias, particularly if the study hypothesis was known to participants. Inskip et al. (2008), in their survey of 7020 British women, reported satisfactory agreement (r = 0.87) for the 1729 women for whom birth weights by both recall and record were available. Thus, if this approach has led to such bias, it is likely to have had only a minor impact on a minority of studies.
Outcome measures and outcome severity may have contributed to the heterogeneity of results, although we did not find this in our meta-regression of the variables described above. The pooled estimate remained statistically non-significant following adjustment for publication bias for both depression and psychological distress outcomes. However, depression is a relapsing and remitting illness and most studies simply report the association between low birth weight and a single ‘snapshot’ of depression ascertained at one point in adult life. It is of note that Colman et al., by identifying patterns of depression (and other psychiatric disorders) over the life-course, found that those with the worst trajectories (of recurrent affective symptoms) had the lowest birth weight (Colman et al. 2007). Thus, the ascertainment of depression in many of these studies may not have been sufficiently sensitive to identify a subtle underlying effect. Psychological distress, a broad category used in research to capture a range of symptoms of anxiety and depression, was included in this review. Sensitivity analyses indicated that including this broader term had no impact on the observed effect.
We excluded health register studies from our review because depression is largely undiagnosed in the general population and only a minority of severe and complex cases present to secondary mental health care (Goldberg & Huxley, 1992). Three health register studies were, however, identified in our search strategy. One, by Osler et al., followed men born in Copenhagen in 1953 using a hospital discharge diagnosis of depression and found no association with low birth weight. Larsen et al. (2010) used Danish birth records linked with hospitalisations for affective disorder (including bipolar illness) and found a modest effect (OR 1.15, 95% CI 1.01–1.31). Abel et al. (2010) reported a cohort study that combined Danish and Swedish births, resulting in a sample of 1.49 million, followed to their late 20s for a diagnosis of affective disorders (including bipolar disorder). The study found modest, statistically significant effects (OR 1.37, 95% CI 1.13–1.66). These studies of case registers provide mixed results, but Abel's paper suggests that low birth weight has a modest effect when the outcome is defined as severe affective disorders leading to contact with secondary care services.
Strengths and limitations
Strengths of this review include what is, to our knowledge, the first use of the meta-analytic method to evaluate evidence for this hypothesis and a broad search strategy suited to a very heterogeneous field. The studies we included had sufficient power to detect a small effect. The heterogeneity of study designs is a limitation, although we were able to investigate differences between studies using meta-regression. Variations in study design did not seem to impact on the overall effect, although meta-regression may have limited statistical power.
We opted for a binary approach to the categorisation of birth weight using a recognised cut-point for low birth weight (<2500 g). The main reason for so doing was that a continuous approach would have assumed each decrement of birth weight represented an equal increment in the odds of later disorder. Considering the large number of non-pathological constitutional influences on birth weight, especially within the normal ranges, we decided this would have been inappropriate.
As in any meta-analysis, our results are dependent on the quality of the underlying studies, which were variable. Many studies had low statistical power and poor or inadequately reported follow-up rates. Further, many did not control for confounding and it is possible that there may be other confounders that have been neither considered nor controlled for. The estimate was adjusted for gestation in only 39% of studies. As with all meta-analyses, publication and reporting biases are an important problem. We detected probable publication bias with an under-reporting of small null or negative studies. We suggest that authors or journals might prefer not to publish small null studies when it was obvious that the study was underpowered. We also suggest that an absolute publication bias may be operating, whereby the probability of publication is influenced by a study's finding irrespective of sample size. This may be expected in observational work where large studies using secondary datasets may be carried out cheaply.
We therefore conclude that although there is a small statistically significant impact of low birth weight on adult depression this requires careful interpretation. We suggest that for mild to moderate unipolar depression detected in general population samples, publication bias is a sufficient explanation for this effect and that the conservative interpretation (i.e. that low birth weight is not associated with adult depression) is the most reasonable conclusion. However, there is evidence from one very large case register study that affective disorders (including bipolar disorder) of sufficient severity to lead to contact with psychiatric services are associated with low birth weight. A less conservative interpretation would be to ignore the possible impact of publication bias and report a small and probably clinically insignificant effect that may be accounted for by uncorrected (and possibly unknown) confounders. We conclude that there is insufficient evidence to support the hypothesis that low birth weight is associated with adult depression.
We thank the many authors of the original papers who answered our enquiries. W.W. is supported by the NIHR (National Institute of Health Research) Academic Clinical Fellowship Award. W.L. is supported by the UK Medical Research Council. I.C. is supported by a Population Health Investigator Award from the Alberta Heritage Foundation for Medical Research and a New Investigator Award from the Canadian Institutes of Health Research. R.H. is supported by the UK Medical Research Council. M.H. is supported by the NIHR Biomedical Research Centre for Mental Health at the South London and Maudsley NHS Foundation Trust and Institute of Psychiatry, King's College London and is a NIHR Senior Investigator.
Declaration of Interest