Biomarker-based subtyping of depression and anxiety disorders using Latent Class Analysis. A NESDA study

Background Etiological research of depression and anxiety disorders has been hampered by diagnostic heterogeneity. In order to address this, researchers have tried to identify more homogeneous patient subgroups. This work has predominantly focused on explaining interpersonal heterogeneity based on clinical features (i.e. symptom profiles). However, to explain interpersonal variations in underlying pathophysiological mechanisms, it might be more effective to take biological heterogeneity as the point of departure when trying to identify subgroups. Therefore, this study aimed to identify data-driven subgroups of patients based on biomarker profiles. Methods Data of patients with a current depressive and/or anxiety disorder came from the Netherlands Study of Depression and Anxiety, a large, multi-site naturalistic cohort study (n = 1460). Thirty-six biomarkers (e.g. leptin, brain-derived neurotrophic factor, tryptophan) were measured, as well as sociodemographic and clinical characteristics. Latent class analysis of the discretized (lower 10%, middle, upper 10%) biomarkers were used to identify different patient clusters. Results The analyses resulted in three classes, which were primarily characterized by different levels of metabolic health: ‘lean’ (21.6%), ‘average’ (62.2%) and ‘overweight’ (16.2%). Inspection of the classes’ clinical features showed the highest levels of psychopathology, severity and medication use in the overweight class. Conclusions The identified classes were strongly tied to general (metabolic) health, and did not reflect any natural cutoffs along the lines of the traditional diagnostic classifications. Our analyses suggested that especially poor metabolic health could be seen as a distal marker for depression and anxiety, suggesting a relationship between the ‘overweight’ subtype and internalizing psychopathology.

mechanisms (Hasler et al., 2004). Subtypes might be more suitable to investigate etiological heterogeneity when biological profiles are used as the point of departure.
It has been acknowledged that theories assuming that there is a single biological disturbance underlying depression in all patients (e.g. monoamine deficiency) have limited validity (Kendell and Jablensky, 2003;Hasler, 2010;Kapur et al., 2012). Rather, the specific disturbances are likely to differ between patients, even those with similar symptomatology and/or diagnoses (Insel et al., 2010). If we consider this, the low observed effect sizes in treatment trials could be explained by the fact that only in part of the patients treated with a certain compound (e.g. selective serotonin reuptake inhibitors), the treatment actually affects their individual biological disturbances (Tang et al., 2010). Identification of more homogeneous biomarker-based subtypes in the patient population could help to gain more insight into patient-specific etiological mechanisms and to better target treatments to those that are likely to benefit (Meyer and Ginsburg, 2002;Bartova et al., 2010;Leuchter et al., 2010;Simon and Perlis, 2010;Simon, 2011;e Silva, 2013;Miller and O'Callaghan, 2013;Ozomaro et al., 2013;Cuijpers et al., 2016;Korte et al., 2015).
To our knowledge, the existence of depression/anxiety subtypes with different biological disturbance profiles and their manifest clinical characteristics have not been extensively studied before. Therefore, this study aimed to identify biomarker-based subtypes using latent class analysis (LCA) a large and wellphenotyped sample of depressive and/or anxiety patients (n = 1460). In this sample, 36 biomarkers were measured, representing different underlying mechanisms that have previously been found to be relevant to depression and anxiety disorders (e.g. hypothalamus-pituitary-adrenal axis function, inflammation). Next, in order to investigate the clinical relevance and utility of the identified biomarker-based subtypes, clinical characteristics and symptomatology were compared across the identified classes.

Participants and procedures
The Netherlands Study of Depression and Anxiety (NESDA) is a multisite naturalistic cohort study that aims to examine the longterm course of depressive and anxiety disorders. A detailed description of the NESDA study design and sampling procedures can be found elsewhere . In brief, the NESDA cohort consists of 2981 subjects, aged 18-65 years, with (a history of) anxiety and/or depressive disorder and healthy controls. The research protocol was approved by the Medical Ethical Committees of participating institutes, and after complete description of the study, all respondents provided written informed consent. For the present study, all 1460 subjects with a current (last month) diagnosis of a depressive or anxiety disorder according to the Composite International Diagnostic Interview (CIDI; WHO version 2.1) were selected. For additional analyses (see below), 634 healthy control subjects were added to the dataset.

Measurements
During a 4-hour baseline assessment including interviews, a medical examination, a cognitive computer task and collection of blood samples, extensive information was gathered about key (mental) health outcomes and demographic, psychosocial, clinical and biological determinants. Additional measures (written questionnaires and saliva samples) were carried out at home by the participants. Diagnostic and statistical manual of mental disorders-fourth edition (DSM-IV) diagnoses of depressive [minor depression, dysthymia and major depressive disorder (MDD)] and anxiety disorders (generalized anxiety disorder, social phobia, agoraphobia and panic disorder) were established using the Composite International Diagnostic Interview (CIDI) 2.1. Depression symptom severity was measured with the 30-item Inventory of Depressive Symptomatology Self Report (IDS-SR) (Rush et al., 1996). Anxiety symptom severity was measured with the Beck Anxiety Inventory (BAI) (Beck et al., 1988). Sociodemographic information included age, sex, number of chronic diseases, drinking and smoking behavior. The biological data consisted of waist circumference, body mass index (BMI), blood pressure (systolic/diastolic and the ankle-brachial index), heart rate variability and markers from blood and saliva samples. A summary of the biological methods is provided below. For more details, see online Supplementary Table S1.

Blood markers
Venous blood was drawn prior to the interview session (between 8:00 and 9:00 am) after an overnight fast. Venous blood samples were transferred to a local lab for routine assessments; serum and plasma were spun down within an hour and stored at −80°C for later analyses. The routine assays included hematological variables (hemoglobin, hematocrit, and erythrocyte count), liver function (γ-glutamyltransferase, aspartate aminotransferase, alanine aminotransferase) and kidney function (creatinine), as well as markers related to the metabolic syndrome (MetS) and obesity (glucose, cholesterol, triglycerides, HDL-and LDL-cholesterol, thyroid-stimulating hormone, free thyroxine). Additional biomarkers are described below. Most of these markers' associations with mental-health outcomes have previously been established (Chaves et al., 2004;Vancampfort et al., 2014;Hiles et al., 2016).
Inflammation. C-reactive protein (hsCRP) was measured in duplicate by an in-house enzyme-linked immunosorbent assay (ELISA), based on purified protein and polyclonal anti-hsCRP antibodies (Dako, Glostrup, Denmark). Plasma IL-6 levels were measured in duplicate by a high sensitivity ELISA (PeliKine Compact TM ELISA, Sanquin, Amsterdam, The Netherlands). Plasma tumor necrosis factor alpha (TNF-α) levels were assayed in duplicate using a high-sensitivity solid phase ELISA (Quanti-kine® HS Human TNF-α Immunoassay, R&D systems Inc, Minneapolis, MN, USA). Tryptophan and (OH-)kynurenine concentrations were assayed by an automated online solid-phase extraction-liquid chromatographic-tandem mass spectrometric (XLC-MS/MS) method. Both inflammation and degradation of tryptophan along the kynurenine pathway have been found to be associated with depression in this sample Quak et al., 2014).
Neuroplasticity. Brain-derived neurotrophic factor (BDNF) protein levels were measured in serum samples using the Emax Immuno Assay system from Promega (Madison, WI, USA). IGF-I (nmol/l) was assayed centrally by chemiluminescence immunoassay of EDTA plasma on the Liaison autoanalyzer (DiaSorin, S.p.A., Italy). Both measures have previously been found to be associated with depression with melancholic features in this sample (Patas et al., 2014).
Mineral balance. Measurements of parathyroid hormone (PTH) were performed at the Endocrine Laboratory of the VU University Medical Center. PTH levels were measured in EDTA plasma using an intact PTH assay. Intact PTH levels were measured using an immunometric assay (Architect, Abbott Diagnostics, Abbott Park, IL). Vitamin D was measured based on circulating levels of 25(OH)D, extracted and analysed by XLC-MS/MSa (Spark Holland, Emmen, the Netherlands) and coupled to a Quattro Premier XE tandem mass spectrometer (Waters Corp., Milford, MA, USA). Vitamin D has previously been found to be associated with depression in this sample (Milaneschi et al., 2014).
Steroid hormones. Dehydroepiandrosterone and its sulfate conjugate (DHEA/-S) were determined using a delayed one-step immunoassay with a chemiflex assay protocol. Sex Hormone Binding Globulin (SHBG) was determined using a two-step immunoassay with a chemiflex assay protocol. Total estradiol (E2) was determined using a delayed one-step immunoassay with a chemiflex assay protocol. Previously, steroid hormones were found to be associated with anxiety and depression in this sample (Giltay et al., 2012;Morita et al., 2014).
Leptin. Plasma leptin concentrations were measured in EDTA plasma using an ELISA kit (Human Leptin ELISA Kit; Linco Research, Inc, St. Charles, Missouri). Leptin has previously been found to be associated with depression (with atypical features) in this sample (Milaneschi et al., 2017).

Saliva markers
On a single day, prior to the first face-to-face assessment session, participants themselves collected saliva samples at home using Salivettes (Sarstedt AG and Co, Nürmbrecht, Germany  (Carroll, 1982) was carried out by oral administration of a 0.5 mg dexamethasone pill directly after T 6 and a final cortisol measurement the next morning at awakening (T 7 ). The saliva samples were used to assess levels of cortisol, amylase and testosterone. These measures have previously been observed to be associated with depression and anxiety in this sample (van Santen et al., 2011;Giltay et al., 2012;Veen et al., 2013).
Heart rate variability A heart rate recording was performed with the Vrije Universiteit Ambulatory Monitoring System (Cuijpers et al., 2013) (VU-AMS). Subjects were wearing the VU-AMS device during a large part of the NESDA baseline assessment, while participating in different assessment parts (i.e. medical examination, interviewing, and a computer task). The start of the various assessment parts was marked with an event marker to divide the total recording into fixed periods. Movement registration through vertical accelerometry was used to excise periods where subjects were non-stationary. For this project, VU-AMS heart rate variability (HRV) during resting baseline, and HRV change from baseline to two stress conditions (interview and stressful computer task) were used. HRV has previously been found to be associated with depression in this sample (Licht et al., 2008).

Statistical analyses
LCA was used to identify data-driven subgroups with distinct biological profiles. LCA assumes that an unobserved, latent categorical variable explains the association among a set of observed variables. The input variables are listed in online Supplementary  Table S1. Models with increasing numbers of classes were estimated and compared. The optimal model selection was based on the highest entropy and the lowest Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Relative entropy is a measure of classification accuracy (range: 0-1), with values closer to 1 indicating greater accuracy. Lower AIC and BIC values indicate that the model provides a better description of the data. Because these measures do not always agree, parsimony and interpretability were also taken into account. To avoid convergence on a solution at a local maximum (Nylund et al., 2007) LCA was run using up to 1000 random starting values and 250 final stage optimizations. LCA was conducted using Mplus version 5 (Muthen and Muthen, 2007). To investigate the influence of the different variables on the model solution Cramer's V (Cramér, 1946) was calculated for each biomarker variable. After the optimal model was identified, subjects were assigned to their most likely class based on their highest posterior class probability. Differences between classes (in biomarkers, sociodemographics, DSM-IV diagnoses and clinical characteristics) were investigated by using two-tailed χ 2 statistics for categorical variables and one-way analysis of variance statistics for continuous variables, or by using the Kruskall-Wallis test if the outcomes were not normally distributed. The False Discovery Rate controlling procedure was used to counteract the problem of multiple comparisons (Benjamini and Hochberg, 1995). All comparisons were conducted using SPSS for Windows, version 23 (IBM Inc., USA). To evaluate if the identified class structure was specifically informative about the biological heterogeneity in depressed/anxious patients or was more broadly reflective of normal biological variations in the population as a whole, all analyses were rerun in a dataset including both patients and healthy controls (n = 2094).

Data preprocessing
Clinically determined cut-off values to categorize the biomarker variables were available for some but not all biomarkers. Therefore, an alternative approach was taken, categorizing every variable based on percentiles, with the lowest and highest scoring 10 th percentile of all subjects being coded as −1 and 1, respectively, and the middle 80% being coded 0. The 10 th percentile cutoff was chosen to make sure that especially the more extreme, potentially clinically relevant variations in biomarker levels would be represented in the eventual LCA model. Setting the cutoffs at higher percentile values would lead to biomarker variables with more subjects in both the lower and upper category, but with a more within-category variation of biomarker levels. This makes the categorization potentially less useful for differentiation between subjects with distinct biomarker patterns as relevant interpersonal differences are eliminated by pooling patients with different biomarker levels in the same category. Indeed, exploratory analyses using 25 th and 50 th percentiles as cutoffs led models with many classes (i.e. the AIC and BIC kept decreasing with each class addition) that could not be easily distinguished from each other in terms of their specific biomarker patterns. In the final coding scheme, the number of subjects per decile varied between Psychological Medicine 8% and 12%, because in some cases a large number of subjects had the same score, and we chose to use the percentile closest to 10. The above-described coding was done separately for different sex/age (<30, 30-50, >50) strata, because the distributions of some biological variables are known to differ across sex and age (e.g. testosterone). Differences between classes on other potentially relevant covariates were investigated after identification of the optimal latent class model.
To ensure that the model solution would not be driven by the fact that some variables were essentially measuring the same thing, all variables with a correlation above 0.75 before recoding were summed after categorization (i.e. the hematological markers, the heart rate variability reactivity in both test conditions, systolic and diastolic blood pressure, aspartate and alanine aminotransferase). Subjects with a score of ⩽−1 got a value of −1, those with a score of 0 got a value of 0 and subjects with a score of ⩾1 got a value of 1 on a newly constructed compound variable. Recoding was done using SPSS for Windows, version 23 (IBM Inc., USA). The final dataset included 36 biological variables.

Results
The LCA results are shown in Table 1. The lowest BIC combined with adequate entropy indicated that the 3-class model best described the data. Although the AIC decreased for the more complex models up to six classes, these models were less optimal in terms of parsimony and showed only marginally higher   Table S1. Abbreviations: Blood pressure, combination of systolic and diastolic pressure; T4, free thyroxine; TSH, thyroid stimulating hormone; PTH, parathyroid hormone; CAR, cortisol awakening response; AUCg, area under the curve with respect to the ground, AUCi, area under the curve with respect to increase; DHEA(-S), dehydroepiandrosterone(-sulphate); HRV, heart rate variability; HRV reactivity, combination of HRV reactivity in both stress situations; BDNF, brain-derived neurotrophic factor, IGF-1 insulin-like growth factor 1; E2, estradiol; SHBG, sex hormone binding globulin; IL-6, interleukin 6; TNFa, tumor necrosis factor alpha; hsCRP, C-reactive protein; GAMMA, gamma-glutamyltransferase; ASAT/ALAT, combination of aspartate aminotransferase and, alanine aminotransferase; hematology, combination of hemoglobin, hematocrit, and erythrocyte values.

620
Lian Beijers et al. entropy. For each class, Fig. 1 shows the probabilities of biomarker scores in the top or bottom 10 th percentile. Table 2 shows the distribution of biomarkers across the three latent classes. Most biomarkers differed among classes, with the notable exception of the cortisol-related biomarkers. Waist circumference, BMI and leptin had the largest effect (Cramer's V > 0.5) on the model solution and were used to inform class labeling. Other variables (e.g. measures on inflammation and steroid hormones) showed smaller, but still meaningful, variation between classes (Cramer's V = 0.335-0.209, see Table 3). The first class was labeled 'lean' (n = 311) because it was associated with a healthy BMI and a comparatively high probability of being in the bottom 10 th percentile for the obesity/MetS markers. The second class was labeled 'average' (n = 910) because it showed a low probability for extreme scores on any of the biomarkers. The third class was labeled 'overweight' (n = 239) and was characterized by a pattern of probabilities almost perpendicular to the lean class, with comparatively high probabilities to be in the upper 10% on obesity and MetS-related markers. Table 4 shows the distribution of clinical characteristics across the three identified latent classes. As expected because of the prestratified processing of the data, the classes did not differ with respect to gender or age. The lean and average classes were similar in terms of diagnoses, whereas subjects in the overweight class were more likely to suffer from MDD. Persons in the overweight class were also more likely to endorse the atypical depressive subtype and to use psychotropic medication, specifically tricyclic antidepressants. Overweight subjects also had higher scores on both the IDS and the BAI. The lean class had a lower age of onset compared to the other classes (i.e. 18.1 v. 21.6 v. 22.6). There were no differences in course or diagnoses at 2 and 6 years follow-up (see online Supplementary Table S2). Individual symptoms of depression and anxiety did not differ between classes (data not shown).
LCA in the combined sample of patients and healthy controls (see online Supplementary Table S3) again showed a 3-class model to be optimal. The classes were very similar to the original model with regard to their respective biomarker profiles (see online Supplementary Fig. S1). When compared across the classes, the percentage of subjects with current psychopathology was higher in the overweight class, whereas the percentages of patients and controls in the lean class were equal (see online Supplementary Table S4).

Discussion
This study aimed to use LCA in to identify subgroups with different biomarker profiles in a large sample of patients with depressive and anxiety disorders. A lean (21.6%), an average (62.2%) and an overweight (16.2%) class were identified. Overall, the model seemed to reflect somatic health status, which was confirmed by the observation of similar classes when healthy controls were included in the sample. Still, the fact that the subjects with psychopathology were relatively likely to be in the 'overweight' class compared with healthy controls, indicates a possible connection between these disorders and the overweight biomarker profile. Furthermore, we found that the lean group had a significantly lower age of onset compared to the other two groups (see Table 4). Although the mean ages of onset are all in early adulthood and relatively close to each other, it cannot be excluded that there may be subtle differences in etiological mechanisms.
These bottom-up findings align with findings from research that used a more top-down approach and showed an association between depression and anxiety and obesity/MetS in this sample (van Reedt Dortland et al., 2009;Van Reedt Dortland et al., 2010), and with results from large-scale meta-analyses (Blaine, 2008;Luppino et al., 2010;Sardinha and Nardi, 2014;Vancampfort et al., 2014;Ghanei Gheshlagh et al., 2015;Mannan et al., 2016). The exact mechanisms behind these associations are unclear, partially because the (bi)directional nature of the association is still a point of discussion (Blaine, 2008;Ghanei Gheshlagh et al., 2015;Hiles et al., 2016;Mannan et al., 2016). Obesity might cause depression and anxiety through social or biological mechanisms, but it might also be that the excessive weight gain is caused by the unhealthy lifestyle habits of patients. Medication use is another risk factor, as it is becoming apparent that many commonly used psychotropic medications such as antidepressants are associated with cardiometabolic risk factors such as insulin resistance, obesity, and dyslipidemia (Abosi et al. 2018). Evidence from the current sample supports this hypothesis. We found that psychotropic medication use is higher in the overweight subtype, and previous research in this sample showed a directional relationship between medication use and waist circumference and MetS (Hiles et al., 2016). Another interesting hypothesis is that depression and the MetS share a common cause, for instance, genetic risk factors (Bornstein et al., 2006). Indeed, some evidence indicates that MetS and depression are caused by similar alterations of the stress system, including the hypothalamus-pituitary-adrenal axis, the autonomic nervous system, the immune system, and platelet and endothelial function (Van Reedt Dortland et al., 2010;Luppino et al., 2014;Marazziti et al., 2014;Vancampfort et al., 2014;Vogelzangs et al., 2014).
However, we found no association between the overweight subtype and course or diagnoses at 2-and 6-year follow-up. It is possible that no such association exists, but there are alternative explanations. It is possible that other causal factors (not shared between depression and the MetS) obfuscated the results. Furthermore, in this sample, it has been shown that especially MDD subjects (with and without comorbid anxiety) are more likely to change weight compared to controls Gibson-Smith et al., 2016). If subsequent biomarker changes also occurred, these subjects might have switched subtypes, and it might be that this change in subtype would have been more informative with respect to course than baseline subtype membership. Unfortunately, the current data did not allow for investigation of subtype changes because biomarker data needed to do so was not available at follow-up.
In accordance with previous symptom-based subtyping research (Van Loo et al., 2012;Rhebergen et al., 2013;de Vos et al., 2015;Wardenaar et al., 2017) this study does not provide a bottom-up validation for the DSM-diagnosis categories of depression and anxiety disorders as patients with different diagnoses did not cluster into distinct biological classes. Also, in contrast to previous research (Cizza et al., 2012;Lamers et al., 2013;Łojko et al., 2015;Milaneschi et al., 2017), we did not find an association between the latent classes and atypical v. melancholic features of depression. When comparing melancholic and atypical symptomatology on the IDS-SR (Rush et al., 1996) between classes, the only significant difference was that the overweight class showed a higher percentage of atypical specifiers compared to the lean class. However, in absolute terms, both atypical and melancholic IDS-SR counts were highest for the overweight class (although differences on the melancholic subscale were no longer significant after correction for multiple comparisons).
Overall, the results indicate that biomarker heterogeneity among depressive and/or anxiety patients mostly reflects variations in somatic health that extend into the part of the population without mental health problems. However, this does not mean that part of these biological variations is not also related to psychological health. As stated above, variations in somatic health are known to be strongly related to variations in psychopathology. Additionally, there may be smaller but still relevant associations between specific sets of biomarkers and depression/ anxiety that were not detected in the current analyses but could be of strong interest for the development of more personalized etiological models. A future methodological challenge lies in better investigating if generic somatic-health-related biological effects can be separated from more specific psychopathologyrelated biological effects. Possibilities for this may lie in the use of more flexible clustering algorithms, but also in the combination of biomarker and clinical data in the identification of subtypes. With regard to the biomarkers that could be investigated deeper, the current results suggested that there were several biomarkers that had smaller effects in the LCA results than the MetS biomarkers, but could still be potentially interesting as targets for further research, such as inflammation-related markers (e.g. Kynurenine, hsCRP) or sex-related markers (e.g. SHGB, testosterone).

Strengths and limitations
Strengths of the current study included the large sample size, the broad range of available biomarkers, and the presence of thorough clinical assessments. However, the results should also be interpreted in the context of several limitations. First, the results apply to a group of outpatients and results cannot be directly translated to other groups (e.g. inpatients). Second, LCA makes very strong assumptions (e.g. local independence) that enable LCA to estimate interpretable but strongly simplified models. However, we believe that (the violation of) the assumption of local independence is not a driving force in our model, because we made sure not to include strongly correlated pairs of biomarkers and found classes that were not defined by specific clusters of correlated biomarkers but rather by a collection of biomarkers from different domains (see Table 2). Third, the possible negative  influence of underweight could not be evaluated because people with a BMI < 18.5 were rare compared with overweight persons. Fourth, to facilitate the analyses, continuous biomarker measurements were coded to a discrete scale, possibly leading to a loss of information. Latent Profile Analysis (using continuous variables) usually provides a more nuanced representation of the data. Unfortunately, this technique could not be applied to our dataset because it is very sensitive to non-normal distributions (Morgan et al., 2016). Fifth, coding was stratified for sex and age groups, but other unknown/unmeasured factors were not considered. Finally, the biological data needed to estimate the subtypes was not available at follow-up, making it impossible to investigate subtype stability over time and the effects of subtype changes over time.
In conclusion, three biological subgroups were identified with LCA among depressive and/or anxiety patients. These subgroups showed classes that (1) were strongly tied to general (metabolic) health, (2) did not reflect any natural cutoffs along the lines of the traditional diagnostic classifications, and (3) showed that especially poor metabolic health had a strong relationship with depression and anxiety and could, therefore, be seen as a distal marker for these types of psychopathology.
Supplementary Material. The supplementary material for this article can be found at https://doi.org/10.1017/S0033291718001307