Challenging the negative learning bias hypothesis of depression: reversal learning in a naturalistic psychiatric sample

Background: Classic theories posit that depression is driven by a negative learning bias. Most studies supporting this proposition used small and selected samples, excluding patients with comorbidities. However, comorbidity between psychiatric disorders occurs in up to 70% of the population. Therefore, the generalizability of the negative bias hypothesis to a naturalistic psychiatric sample, as well as the specificity of the bias to depression, remains unclear. In the present study, we tested the negative learning bias hypothesis in a large naturalistic sample of psychiatric patients, including depression, anxiety, addiction, attention-deficit/hyperactivity disorder, and/or autism. First, we assessed whether the negative bias hypothesis of depression generalized to a heterogeneous (and hence more naturalistic) depression sample compared with controls. Second, we assessed whether negative bias extends to other psychiatric disorders. Third, we adopted a dimensional approach, by using symptom severity as a way to assess associations across the sample.
Methods: We administered a probabilistic reversal learning task to 217 patients and 81 healthy controls. According to the negative bias hypothesis, participants with depression should exhibit enhanced learning and flexibility based on punishment v. reward. We combined analyses of traditional measures with more sensitive computational modeling.
Results: In contrast to previous findings, this sample of depressed patients with psychiatric comorbidities did not show a negative learning bias.
Conclusions: These results speak against the generalizability of the negative learning bias hypothesis to depressed patients with comorbidities. This study highlights the importance of investigating unselected samples of psychiatric patients, which represent the vast majority of the psychiatric population.


Introduction
Major depressive disorder (MDD) is a highly debilitating psychiatric condition, with an estimated yearly prevalence of 4.4% worldwide (WHO, 2017). Prior studies that have attempted to clarify the neurobiological and cognitive mechanisms underlying depression have mainly focused on selected patient samples, in which comorbid psychiatric disorders were either absent or not described (Admon et al., 2017; Elliott, Sahakian, Herrod, Robbins, and Paykel, 1997; Harlé, Guo, Zhang, Paulus, and Yu, 2017; Liu et al., 2017; Robinson, Cools, Carlisi, Sahakian, and Drevets, 2012; Rothkirch et al., 2017; Taylor Tavares et al., 2008). The current paper aims to extend these results by investigating a heterogeneous sample of depressed patients, with a high and well-defined level of comorbidities.
MDD has long been characterized by an imbalance between decreased reward and increased punishment sensitivity (Admon & Pizzagalli, 2015; Eshel & Roiser, 2010). One of the two key symptoms of MDD is anhedonia, which, according to the Diagnostic and Statistical Manual of Mental Disorders 5th edition (DSM-5; American Psychiatric Association, 2013) refers to a diminished interest or pleasure (in almost all activities). Translated towards reward mechanisms, this can be understood as a reduced capacity to anticipate and experience pleasure from reward. A considerable amount of research has focused on reward processing in depression, and in general, these studies find reward learning deficits; a blunted response towards rewarding information and decreased reward sensitivity (Admon & Pizzagalli, 2015; Eshel & Roiser, 2010; Robinson et al., 2012; Safra, Chevallier, & Palminteri, 2019; Timmer, Sescousse, van der Schaaf, Esselink, & Cools, 2017). For example, when asked to respond to certain pictures, never-depressed individuals respond faster to pictures that have been rewarded more often during previous trials. MDD patients do not show this biased learning, implying they do not learn from reward as well as never-depressed individuals (Pizzagalli, Iosifescu, Hallett, Ratner, & Fava, 2008b). This learning deficit has also been shown in individuals who were remitted from depression, indicating that a previous depressive episode may have an enduring effect on reward learning (Pechtel, Dutra, Goetz, & Pizzagalli, 2013; Whitton et al., 2016).
Another key symptom of MDD is increased sensitivity to negative information, a characteristic also termed negative bias (Eshel & Roiser, 2010; Robinson et al., 2012). Several studies have investigated the negative bias hypothesis by using a probabilistic reversal learning (PRL) paradigm, a computer task which measures sensitivity to punishment and reward feedback. These studies observed enhanced sensitivity to punishment in MDD: depressed individuals exhibited a greater tendency to reverse responding upon punishment relative to reward (Murphy, Michael, Robbins, & Sahakian, 2003; Taylor Tavares et al., 2008). This negative learning bias is consistent with the larger body of literature on negative information processing biases in MDD in the cognitive domains of attention, interpretation, and memory (Everaert, Podina, & Koster, 2017; Gotlib & Joormann, 2010; LeMoult & Gotlib, 2019; Mathews & MacLeod, 2005; Vrijsen et al., 2017). Negative bias research has generally focused on the processing of emotional words and pictures, e.g. self-descriptive words, emotional expressions. Thus, individuals show biased learning from positive and negative feedback but also differences in preferential processing of positive and negative emotional information. Furthermore, it is assumed that explicit feedback (such as the word 'correct' or 'incorrect') may still be interpreted with emotional quality, and can influence, for example, motivation (Roiser & Sahakian, 2013). However, most of these studies used selected samples (with regard to comorbidity, severity, age and/or medication), which limits the generalizability of the findings. Furthermore, they mostly used coarse, aggregate measures of behavior (i.e. participant learning scores) (Murphy et al., 2003; Robinson et al., 2012; Taylor Tavares et al., 2008).
The use of computational models might be a more sensitive approach to detect latent biases in trial-by-trial behavior (Robinson & Chase, 2017). Accordingly, in the present study, we recruited a large naturalistic sample of psychiatric patients, characterized by well-diagnosed comorbidity of a number of common psychiatric disorders, i.e. MDD, anxiety disorder, addictive disorder, attention-deficit/hyperactivity disorder (ADHD), and autism spectrum disorder (ASD). Investigating mechanisms of depression in a more ecologically valid group of patients (Goldberg & Fawcett, 2012; Kessler et al., 2003; Lamers et al., 2011; Rommelse, Geurts, Franke, Buitelaar, & Hartman, 2011) also allows us to assess whether the deficit is specific to depression, or reflects nonspecific psychiatric vulnerability.
We combined analyses of classic aggregate behavioral measures of punishment and reward sensitivity to enable comparison with prior work (Murphy et al., 2003; Taylor Tavares et al., 2008) with computational reinforcement learning modeling (den Ouden et al., 2013). This modeling allowed us to compute parameters reflecting positive and negative learning rate as well as decision variability. Learning rate indexes the degree to which people update their expectations about reward or punishment based on having received unexpected rewards and punishments in the past, in short, their speed of learning from experience. Decision variability indexes the degree to which choices are in line with their expectations, with high variability corresponding to high choice randomness putatively reflecting poor ability to translate value into action. Critically, recent studies with selected MDD samples using a similar approach have observed enhanced decision variability rather than changes in reward or punishment learning rate (Harlé et al., 2017; Huys, Pizzagalli, Bogdan, & Dayan, 2013; Kunisato et al., 2012) and have associated this with increased ratings of anhedonia.
We compared aggregate behavioral measures and model-based parameters of reward and punishment sensitivity adopting two strategies. First, we used a classic group comparison strategy: we contrasted patients with MDD, patients without MDD and healthy controls (HC). Second, we adopted a dimensional approach in line with the recommendations of Research Domain Criteria (RDoC) guidelines (Insel et al., 2010; National Institute of Mental Health, 2008), which aims to stimulate patient group stratification based on core brain-behavior dimensions rather than discrete diagnostic categories. We assessed the relationship between the outcome variables and symptom severity across the whole group (controls and patients), as measured with questionnaires.
Next, we investigated the specificity of the effects to MDD, by performing an additional set of analyses. First, we examined whether parallel effects were observed when stratifying the group by the other diagnoses present in our sample. Additionally, we assessed whether any of the effects could be accounted for by (i) medication use, or (ii) general psychiatric disease severity, in terms of the total number of other diagnoses. Finally, we examined whether negative learning bias was present in a patient subsample that was matched to previous studies based on age, comorbidity and medication use. This enabled more direct comparison with previous results.

General procedure
The present study is part of a cohort study run by the Department of Psychiatry of the Radboud university medical center (Radboudumc), Nijmegen, The Netherlands. The MIND-Set study (Measuring Integrated Novel Dimensions in Neurodevelopmental and Stress-related Mental Disorders) is an ongoing observational cross-sectional study that assesses clinical, biological, behavioral, and neuroimaging data (online Supplementary Table S1). Data collection includes a set of neuropsychological measures, among them a PRL task, which was used to answer the current research questions.
All adult outpatients (age range 18-78, mean age of 40) with a diagnosis of a current depressive disorder (MDD or dysthymia), anxiety disorder, addictive disorder, ADHD and/or ASD were eligible for participation in the MIND-Set study. The present study was conducted during the DSM-IV/DSM-5 transition period. Diagnoses of MDD, dysthymia, anxiety disorder, addictive disorder, and ADHD were therefore established by DSM-IV criteria, and ASD by DSM-5 criteria. Depressive disorders, anxiety disorders and psychotic disorders were assessed with the Structured Clinical Interview for DSM-IV AXIS I Disorder (SCID-I; First, Spitzer, Gibbon, and Williams, 1996); ADHD with the Diagnostic Interview for Adult ADHD, second edition (DIVA 2.0; Kooij and Francken, 2010); ASD with the Dutch Diagnostic Interview for Adult Autism Spectrum Disorders (NIDA; Vuijk, 2016); and addictive disorders with the Measurements in the Addictions for Triage and Evaluation and criminality (MATE-crimi; Schippers, Broekman, and Buchholz, 2011). Participants were excluded if they had a current psychotic disorder according to the SCID section B, an IQ estimation <70, a sensorimotor disability interfering with participation, were mentally incompetent to sign informed consent or had insufficient knowledge of the Dutch language. The study has been approved by the local ethics committee (Commissie Mensgebonden Onderzoek Arnhem-Nijmegen, NL 55618.091.015). Written informed consent for participating in this study was obtained after the diagnostic procedure (online Supplementary Methods).

Participants
For the current project, we included data from patients who were enrolled between June 2016 and December 2017. In this timeframe, 311 patients were included in the study, of whom 217 participated in the neuropsychological assessment (see online Supplementary Table S2 for an overview of the sample size per diagnostic category). This sample was divided into three groups: (i) the No MDD group (n = 61, patients with disorders other than current or remitted depression), (ii) the Remitted MDD group (n = 55, patients who had at least one previous depressive episode but did not meet the criteria for a depressive episode at the time of inclusion), and (iii) the Current MDD group [n = 101, patients with a current depressive disorder (with or without past depressive episodes)]. This division took into account a possible vulnerability from previous depressive episodes in the patients who had remitted from depression. Comorbidity with other disorders (anxiety and/or addiction and/or ADHD and/or ASD) was possible in all three groups (Fig. 1).
In addition, healthy controls were included from October 2016 until June 2019, during which the data of 101 participants were collected. In the current study, we were able to match 81 healthy participants with no current or lifetime psychiatric diagnosis to the patient sample based on age, gender and education level. Healthy controls underwent the same testing procedure as the patients (see online Supplementary Methods).

Probabilistic reversal learning task
Subjects performed a probabilistic reversal learning task, a well-established paradigm investigating learning and behavioral adaptation based on reward and punishment (Cools, Barker, Sahakian, & Robbins, 2001; den Ouden et al., 2013; Swainson et al., 2000) (Fig. 2a). During the task, participants were presented with two squares. They needed to choose one, after which they received feedback; either a reward or a punishment. Subjects were instructed to choose the square that was rewarded most often. They needed to learn this by trial-and-error (see online Supplementary Material for subject instruction). The feedback was probabilistic: the square selected by the participant on the first trial was considered the correct square and rewarded on 80% of the trials. In the remaining 20% of the cases, selecting the same square was punished (despite the response being correct). This punishment feedback was therefore misleading, and should be ignored (participants should not switch responses). This was not explicitly stated. After the first 40 trials of the task, the 'acquisition phase', a 'reversal phase' started (also 40 trials). The probabilities switched in this phase (unbeknownst to the participants) so that the previously rewarded square was now punished on 80% of the trials (and vice-versa, the previously punished square was now rewarded on 80% of the trials). Before starting the experiment, we informed participants that the correct response could change, but they were not aware of how often or when this would occur (Fig. 2b).
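The feedback schedule described above can be illustrated with a minimal Python sketch. This is not the task code used in the study: the function names, the 0/1 coding of the two squares, the +1/-1 coding of feedback, and the assumption that the incorrect square is rewarded on the remaining 20% of trials in both phases are ours, chosen only for illustration.

```python
import random

N_ACQUISITION = 40   # trials before the reversal (from the task description)
N_REVERSAL = 40      # trials after the reversal
P_REWARD = 0.8       # reward probability for the currently correct square


def correct_square(trial, first_choice):
    """Return the correct square (0 or 1) on a 0-indexed trial.

    The square chosen on the first trial is correct during acquisition;
    the other square becomes correct during the reversal phase.
    """
    if trial < N_ACQUISITION:
        return first_choice
    return 1 - first_choice


def feedback(choice, trial, first_choice, rng):
    """Draw probabilistic feedback: +1 for reward, -1 for punishment.

    Assumption: the incorrect square is rewarded with probability 1 - P_REWARD.
    """
    is_correct = choice == correct_square(trial, first_choice)
    p = P_REWARD if is_correct else 1 - P_REWARD
    return 1 if rng.random() < p else -1


def run_always_stay(first_choice=0, seed=1):
    """Hypothetical agent that never switches; returns reward rate per phase."""
    rng = random.Random(seed)
    outcomes = [feedback(first_choice, t, first_choice, rng)
                for t in range(N_ACQUISITION + N_REVERSAL)]
    acq_rate = outcomes[:N_ACQUISITION].count(1) / N_ACQUISITION
    rev_rate = outcomes[N_ACQUISITION:].count(1) / N_REVERSAL
    return acq_rate, rev_rate
```

An agent that never switches should collect mostly rewards in the acquisition phase and mostly punishments after the reversal, which is why adapting to the reversal while ignoring occasional misleading punishment is the core demand of the task.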

Additional measures
Socio-demographic information
Information on age, gender and level of education was obtained. Education was divided into four levels: (almost) no education (elementary education or education not finished), low (lower vocational and general secondary education), middle (intermediate vocational and higher secondary education) and high (higher vocational education and university) (Ikram et al., 2015).

Background neuropsychology
Verbal IQ was determined with the National Adult Reading Test score (NART score, Dutch version, Schmand, Lindeboom, and Van Harskamp, 1992). To assess the functional specificity of any effects, we also measured the total number of errors from the spatial working memory task (SWM errors) from the Cambridge Neuropsychological Test Automated Battery (CANTAB®; Cambridge Cognition 2019) to measure working memory capacity as an additional measure of cognitive ability.

Medication
Medication use was assessed during the clinical diagnostic procedure, and again during the neuropsychological assessment to monitor changes in medication use (online Supplementary Table S1). Of the 218 patients, 55 used a selective serotonin reuptake inhibitor (SSRI), 36 used benzodiazepines, 22 used antipsychotics, 18 used dopaminergic medication (e.g. methylphenidate), 12 used opioids, 9 used tricyclic antidepressants, and 4 used lithium at the time of the neuropsychological assessment. There were 65 participants who did not use any medication.

Aggregate behavioral outcome measures
Based on previous work, we computed the following aggregate behavioral measures of reward and punishment learning and reversal: (i) Error rate: the total number of errors during the acquisition and reversal phase (z-transformed); (ii) Probabilistic switch rate, defined as the number of errors after misleading feedback divided by the total number of misleading feedback trials (Murphy et al., 2003; Taylor Tavares et al., 2008); (iii) Win-stay and lose-shift rates, computed as the proportion of trials after a reward (v. punishment) on which the same square was chosen again (v. not chosen again), irrespective of whether this was the correct square or not (den Ouden et al., 2013); (iv) Perseveration errors, computed as any sequence of two or more errors in the reversal phase (z-transformed) (den Ouden et al., 2013). This outcome measure was taken as an index of behavior after the reversal; (v) Number of participants that reached a learning criterion of eight consecutive correct responses during acquisition. Although it is an arbitrary measure, it has been used before (den Ouden et al., 2013; Swainson et al., 2000). It requires the participant to ignore at least two instances of misleading feedback.
Fig. 2. (a) Participants had to choose one of the squares, by pressing the corresponding arrow key on the keyboard (left, right, up, down). Squares were shown until a response was given. Subsequently, the feedback was given, which could be a reward (a green smiley accompanied by a high sound) or a punishment (a red sad smiley accompanied by a low sound), which was shown for 1500 ms. The next trial started after 1000 ms. (b) During the acquisition phase, the square that was selected first (here yellow) would be rewarded on 80% of the trials. During the reversal phase, the previously punished square would now be rewarded on 80% of the trials.
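To make the definitions of the aggregate measures concrete, the following Python sketch computes win-stay and lose-shift rates, the probabilistic switch rate, and the learning criterion from trial-wise data. It is an illustrative reading of the definitions in the text, not the authors' analysis code. The function names, the 0/1 coding of squares, the +1/-1 coding of feedback, and the choice to score an "error" on the following trial against that trial's correct square are assumptions.

```python
def win_stay_lose_shift(choices, outcomes):
    """Return (win-stay rate, lose-shift rate).

    choices: list of chosen squares (0 or 1), one per trial.
    outcomes: list of feedback, +1 (reward) or -1 (punishment).
    """
    wins = stays_after_win = losses = shifts_after_loss = 0
    for t in range(len(choices) - 1):
        stayed = choices[t + 1] == choices[t]
        if outcomes[t] == 1:
            wins += 1
            stays_after_win += stayed
        else:
            losses += 1
            shifts_after_loss += not stayed
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


def probabilistic_switch_rate(choices, outcomes, correct):
    """Errors after misleading feedback / number of misleading feedback trials.

    Misleading feedback is a punishment received after a correct choice.
    correct: list with the correct square (0 or 1) on each trial.
    """
    misleading = errors_after = 0
    for t in range(len(choices) - 1):
        if choices[t] == correct[t] and outcomes[t] == -1:
            misleading += 1
            errors_after += choices[t + 1] != correct[t + 1]
    return errors_after / misleading if misleading else float("nan")


def reached_criterion(choices, correct, n_acquisition=40, run_length=8):
    """True if run_length consecutive correct responses occur in acquisition."""
    run = 0
    for choice, target in zip(choices[:n_acquisition], correct[:n_acquisition]):
        run = run + 1 if choice == target else 0
        if run >= run_length:
            return True
    return False
```

For a toy sequence `choices = [0, 0, 1, 0, 0, 0]` with `outcomes = [1, -1, -1, 1, -1, 1]` and square 0 always correct, the participant stays after both rewards, shifts after two of three punishments, and makes an error after one of the two misleading punishments.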

Computational modeling
In addition, we employed an established reinforcement learning model with dual learning rates, from here referred to as the reward-punishment (RP) model (Frank, Moustafa, Haughey, Curran, & Hutchison, 2007; Rescorla & Wagner, 1972). This approach estimates three key parameters: punishment learning rate, reward learning rate, and decision variability. Punishment (v. reward) learning rate reflects the degree to which participants update the value of an action depending on previous experience with unexpected punishment (v. reward). A high learning rate indexes greater weight on an unexpected outcome, thus faster updating of action value. Decision variability reflects the stochasticity of choices given this expected value and is indexed by the softmax beta parameter: a high beta means lower decision variability (choosing the best option more consistently); a lower beta means that decisions are more random (see online Supplementary Materials for a detailed description).
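As a rough illustration of how such a dual-learning-rate model works, the Python sketch below implements a Rescorla-Wagner value update with separate learning rates for rewards and punishments, together with a two-option softmax choice rule. It is a sketch under stated assumptions and not the model specification from the Supplementary Materials: the +1/-1 outcome coding, the initial values of zero, and all function and parameter names are hypothetical.

```python
import math


def softmax_prob(q_chosen, q_other, beta):
    """Probability of choosing the option with value q_chosen (two options).

    Higher beta makes choices more deterministic (lower decision variability).
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_chosen - q_other)))


def rp_update(q, choice, outcome, alpha_reward, alpha_punish):
    """Return updated values after one trial.

    The prediction error (outcome - q[choice]) is weighted by the reward
    learning rate after a reward (+1) and by the punishment learning rate
    after a punishment (-1). Only the chosen option is updated.
    """
    alpha = alpha_reward if outcome == 1 else alpha_punish
    updated = list(q)
    updated[choice] += alpha * (outcome - updated[choice])
    return updated


def negative_log_likelihood(choices, outcomes, alpha_reward, alpha_punish, beta):
    """Negative log-likelihood of a choice sequence under the model."""
    q = [0.0, 0.0]  # assumed initial values
    nll = 0.0
    for choice, outcome in zip(choices, outcomes):
        nll -= math.log(softmax_prob(q[choice], q[1 - choice], beta))
        q = rp_update(q, choice, outcome, alpha_reward, alpha_punish)
    return nll
```

Parameter estimates would then be obtained by minimizing this negative log-likelihood (or a hierarchical or Bayesian equivalent) per participant; the fitting procedure itself is not shown here because the text does not specify it.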

Statistical analyses
Demographic information
Gender, age, verbal IQ, education level, spatial working memory task (SWM) errors, and depressive symptom severity (IDS-SR total score) were submitted to an analysis of variance (ANOVA) to compare the four groups (No MDD, Remitted MDD, Current MDD, HC). A chi-square test was used to compare the total number of diagnoses between groups, as an index of general psychiatric severity (a maximum of six concurrent disorders: current MDD, remitted MDD, anxiety disorder, addictive disorder, ADHD, and ASD).

Behavioral outcome measures
Acquisition and reversal errors were submitted to repeated-measures ANOVA with the task phase as a within-subjects factor and group as a between-subjects factor. The proportion of subjects passing the learning criterion within each group was analyzed with a chi-square test. Win-stay and lose-shift rates were submitted to repeated-measures ANOVA with error type as a within-subjects factor and group as a between-subjects factor. Model-based reward and punishment learning rates were also submitted to repeated-measures ANOVA with the learning rate as a within-subjects factor and group as a between-subjects factor.
Finally, probabilistic switch rate, perseverative errors, and model-based decision variability were all submitted to separate ANOVAs with the group as a between-subjects factor.

Dimensional analyses
Spearman's partial correlations were computed across the whole sample, exploring the relationship between the different psychiatric symptom ratings (IDS-SR, ASI, CAARS, AQ-50) and the behavioral (probabilistic switch rate, win-stay, lose-shift) and computational (reward learning rate, punishment learning rate, decision variability) outcome measures. We included age, gender and working memory capacity (SWM total errors) as covariates of no interest in all analyses. Significant effects (p < 0.05) were further investigated with follow-up t tests. Whenever there was unequal variance (measured with Levene's Test) between the groups, we present the results from the t tests that used the Welch-Satterthwaite correction, as implemented in SPSS. We used Bayesian ANOVAs to quantify the evidence in support of the null (no difference between groups) or alternative (a difference between patients and HC) hypotheses for the main behavioral and computational outcome measures (JASP Team, 2019).

Specificity analyses
We analyzed the outcome measures as a function of the other diagnoses present in our sample and compared them with the HC group. Behavioral and computational dependent variables were assessed with a multivariate ANOVA using either anxiety disorder (present/absent/HC), addictive disorder (present/absent/HC), ADHD (present/absent/HC) or ASD (present/absent/HC) as between-subjects factor. To correct for multiple comparisons, we divided the significance threshold (0.05) by the number of tests, i.e. the number of outcome measures that were tested [4: error type (2 levels: win-stay and lose-shift), probabilistic switch rate, learning rate (2 levels: reward and punishment), decision variability] multiplied by the number of diagnoses (4: anxiety disorder, addictive disorder, ADHD, ASD). This yielded a corrected threshold of 0.05/16, i.e. a p value of 0.003.
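The correction described above is simple arithmetic, sketched here in Python for clarity. The function names are ours; the numbers (a threshold of 0.05, four outcome measures, four diagnoses) come from the text.

```python
def corrected_threshold(alpha, n_outcome_measures, n_diagnoses):
    """Bonferroni-style threshold: alpha divided by the total number of tests."""
    return alpha / (n_outcome_measures * n_diagnoses)


def survives_correction(p_value, threshold):
    """True if an observed p value falls below the corrected threshold."""
    return p_value < threshold


threshold = corrected_threshold(0.05, 4, 4)  # 0.05 / 16 = 0.003125
```

Under this threshold, an uncorrected p value of 0.020 would not count as significant, since 0.020 is larger than 0.003125.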
Next, we explored whether any of the effects of diagnosis we observed could be accounted for by general psychiatric severity (indexed by the total number of diagnoses), or by type of medication used. We were specifically interested in the effects of the commonly used SSRIs, given our previous results from a genetic study on PRL (den Ouden et al., 2013). This study revealed an effect of a common single nucleotide polymorphism in the gene encoding the serotonin transporter (SERT: 5HTTLPR plus rs25531) in the healthy subject population on the lose-shift rate. Therefore, we compared our outcome measures in patients who used an SSRI (n = 54), with patients who used other medication (n = 98), and with patients who did not use any medication (n = 65).
Furthermore, we specifically utilized the probabilistic switch rate (the number of errors after misleading feedback divided by the total number of misleading feedback trials) to enable comparison with two previous studies that examined PRL in depression. First, Murphy et al. (2003) examined 27 medicated MDD patients (age 26-59, antidepressant or mood-stabilizing medication) without comorbidities. Second, Taylor Tavares et al. (2008) examined 13 unmedicated MDD patients (age 18-55), also without comorbidity. These samples differ in sample size, medication use and comorbidity compared with the current sample. We therefore also analyzed probabilistic switch rate within a subsample of the HC and a subsample of the patients with characteristics comparable to those of the previous studies. Specifically, we analyzed the data of participants between 18 and 59 years old, and patients who only had a current MDD (recurrent or first episode) without any comorbidity. This yielded a sample of 24 patients, who were then divided based on whether they used an SSRI (n = 10) or not (n = 14). These patients were then compared with a subset of the HC in the same age range (n = 63). Although we could only compare probabilistic switch rate with the previous studies, we analyzed all outcome measures with Group (HC, Clean MDD no SSRI, Clean MDD SSRI) as between-subjects factor.
For completeness, we also report statistics for all direct comparisons, not corrected for multiple tests. Specifically, we compared all negative learning bias measures (i.e. probabilistic switch rate, lose-shift rate and punishment learning rate) between the following groups: Current MDD v. HC, Current MDD v. No MDD, and Current MDD v. Remitted MDD. See the online Supplementary Material for a detailed report.
Finally, to assess whether the findings were driven by individuals who may have used a different strategy to complete the task, or who did not completely understand the task, we restricted our analyses to participants who passed a learning criterion of eight consecutive correct responses in the acquisition phase. We examined the effect of Group and depressive symptoms on the different outcome measures, but only in participants who passed this criterion (online Supplementary Material).
Win-stay rate was higher than lose-shift rate [main effect of error type: F(1,285) = 46.76, p < 0.001, ηp² = 0.141], but did not differ between the groups [main effect of Group: F(3,285) = 1.04, p = 0.378, ηp² = 0.011], nor was there a significant interaction between Group and error type [F(3,285) = 0.64, p = 0.589, ηp² = 0.007]. Critically, the groups also did not differ in terms of probabilistic switch rate [F(3,285) = 1.73, p = 0.161, ηp² = 0.018] (Fig. 3a-c). These results were confirmed by a Bayesian ANOVA, where the Bayes Factor was 12.66 for probabilistic switch rate and 58.82 for error type. This meant that there was strong evidence in favor of the null hypothesis (no difference between the groups) compared with the alternative hypothesis (a difference between the groups).

Computational modeling parameters
There was neither a difference between reward and punishment learning rate [F(1,285) = 0.09, p = 0.764, ηp² < 0.001], nor a difference between the four groups [F(3,285) = 1.69, p = 0.170, ηp² = 0.017], nor an interaction [F(3,285) = 0.333, p = 0.802, ηp² = 0.003] (Figs 3d and 3e). Furthermore, decision variability did not differ significantly between the four groups, F(3,285) = 2.16, p = 0.093, ηp² = 0.022 (Fig. 3f). These results were confirmed by a Bayesian ANOVA, which showed that the null hypothesis was 17.24 and 4.5 times more likely than the alternative hypothesis for learning rate and decision variability, respectively. This was considered moderate to strong evidence that there was no difference between the groups.

Dimensional analyses
In addition to the group-wise comparisons, we investigated the associations of depressive, anxiety, ADHD and autism symptom

Specificity analyses
Probabilistic reversal learning deficits as a function of the other diagnoses
Next, we repeated the analyses using the other diagnoses (anxiety disorder, addictive disorder, ADHD, ASD) as between-subjects grouping factor (online Supplementary Table S4). Figure 1 shows the overlap between the different disorders. The primary rationale for these analyses was to investigate whether any observed effects of MDD were specific to MDD or extended to other psychiatric disorders. There was no effect of any grouping on the computational modeling parameters. However, there was a significant difference when the group was stratified based on ASD [F(2,280) = 3.96, p = 0.020, ηp² = 0.028]. Patients with ASD exhibited a higher probabilistic switch rate (i.e. more errors after misleading negative feedback) compared with patients without ASD, t(209) = 2.04, p = 0.042 [the difference between HC and patients with ASD, and HC and patients without ASD was not significant, t(136) = 1.26, p = 0.209 and t(233) = 0.62, p = 0.534 respectively]. However, these effects did not survive correction for multiple comparisons. No other significant effects were found (Figs 4a-d, online Supplementary Table S4).

Probabilistic switch rate increases with general psychiatric severity (number of diagnoses)
We did not find effects of medication on our outcome measures of interest (online Supplementary Results). However, there was a significant effect of number of diagnoses on the probabilistic switch rate [F(1,215) = 5.67, p = 0.018, ηp² = 0.026]; patients with more diagnoses exhibited a higher probabilistic switch rate (Fig. 4e). We found no other effect of general psychiatric severity (online Supplementary Results).

Comparison of probabilistic switch rate with previous studies
Probabilistic switch rate was not significantly different between HC, MDD patients without SSRI use and MDD patients with SSRI use [main effect of Group: F(2,84) = 0.03, p = 0.970, ηp² = 0.001]. In Fig. 4f we present the mean probabilistic switch rates of the current study together with those of Murphy et al. (2003) and Taylor Tavares et al. (2008). Additionally, we compared win-stay and lose-shift rate, and reward and punishment learning rate with the results from non-depressed individuals from three other studies (online Supplementary Table S5). On average, our participants performed like those in other studies. Moreover, the degree of individual variability in these measures, as indexed by the standard deviations, also resembles that reported previously.

Demographic information
There were no significant differences between the four groups in terms of age, gender, IQ, education level, errors on the spatial working memory task, number of perseverative errors, or number of people that reached the learning criterion (Table 1). As expected, there was a significant group difference in depressive symptom severity (IDS-SR total score), with lower ratings in HCs. Furthermore, there was a significant difference between the groups in terms of the total number of diagnoses, with fewer diagnoses in the No MDD group than the Remitted MDD group, χ²(4) = 38.9, p < 0.001, and than the Current MDD group, χ²(5) = 63.8, p < 0.001. There was no significant difference in the number of diagnoses between the Remitted MDD group and the Current MDD group, χ²(4) = 6.3, p = 0.176.

Discussion
In the present study, we assessed the generalizability of the negative learning bias hypothesis of depression from selected depressed patient samples to a large, heterogeneous sample of depressed patients with high levels of specified comorbidities, by measuring learning from punishment v. learning from reward. In contrast to previous studies focusing on selected and smaller samples, patients with MDD did not exhibit a negative bias compared with HC in terms of any of the behavioral and computational measures indexing increased learning from punishment (punishment learning rate, lose-shift behavior, probabilistic switch rate). The severity of depressive symptoms was not associated with any of the behavioral or computational model-derived measures.
The negative bias hypothesis is a dominant and enduring account of MDD, which is grounded in evidence from studies using a variety of cognitive tasks, including learning paradigms (Beck, 1986, 2008; Eshel & Roiser, 2010; Gotlib & Joormann, 2010). We find no evidence in support of this negative learning bias account in this naturalistic sample of psychiatric patients. The critical difference with previous studies is the presence of comorbid psychiatric disorders. It is possible that learning deficits that have been associated with the other disorders have an influence during this task as well. For example, increased reward sensitivity has been found in addiction (Nusslock & Alloy, 2017) and atypical reward processing has been found in ADHD (Luman, Oosterlaan, & Sergeant, 2005; Thoma, Edel, Suchan, & Bellebaum, 2015), which has also been associated with motivation deficits in ADHD (Volkow et al., 2011). Previous studies have generally found decreased reward sensitivity and learning in MDD (Admon & Pizzagalli, 2015), which has been associated with one of the main symptoms of depression, anhedonia (Huys et al., 2013; Pizzagalli, Goetz, Ostacher, Iosifescu, & Perlis, 2008a; Robinson & Chase, 2017). In contrast, we did not find evidence for reduced learning from or insensitivity to reward in patients with depression, nor for an association between the outcome measures and the level of anhedonia symptoms (see online Supplementary Material). However, we did not measure anhedonia with a specific questionnaire and can therefore not draw any definitive conclusions about a possible specific effect of anhedonia (as opposed to a more general effect of depression) on reward learning. It would be interesting to examine this with a questionnaire that measures anhedonia symptoms, such as the Snaith-Hamilton Pleasure Scale (SHAPS, Snaith et al., 1995).
To investigate whether comorbidity and medication status affected our findings, we performed supplementary analyses, selecting a subset of healthy controls and patients that matched the samples of Murphy et al. (2003) and Taylor Tavares et al. (2008) based on age, comorbidity, and medication use (Fig. 4f). Surprisingly, these analyses also revealed no effects on any of the outcome measures indexing negative bias. In previous studies, comorbid anxiety disorders are often not excluded, whereas ADHD and ASD are not mentioned and possibly not measured. Given the substantial overlap between diagnoses in our sample, it is possible that these disorders are underdiagnosed in other studies. Furthermore, we performed rigorous screenings on these disorders for our healthy control group, which was quite large compared to other studies. Notably, and in line with our results, two recent studies using computational modeling also did not find an enhanced punishment learning rate in MDD (Huys et al., 2013; Kunisato et al., 2012).
Additionally, we examined whether the negative learning bias hypothesis is specific to depression, or whether it extends to other psychiatric diagnoses. When the sample was divided based on the other major psychiatric disorders, we found marginally increased switching after misleading negative feedback in patients with ASD, an effect generally consistent with a previous finding of increased attention to negative social-emotional images in ASD (Unruh, Bodfish, & Gotham, 2018). However, we note that caution is warranted when interpreting these findings for two reasons. First, the marginal effects did not survive our significance threshold when correcting for multiple comparisons, which is appropriate particularly given the exploratory nature of these supplementary analyses. Second, the probabilistic switch rate in the ASD group did not differ from that of HC. These findings might suggest the presence of underestimated and unmeasured comorbidity in previous samples.
Interestingly, we also found evidence indicating that patients with more disorders responded more to misleading negative feedback. However, the number of diagnoses did not have an effect on any of the other measures indexing negative bias.
We had a limited number of exclusion criteria (e.g. current nonaffective psychotic disorder or mental retardation). Naturalistic sampling, as done in the current study, is an advantage, but can also have its drawbacks. There are several factors which are more easily monitored in a smaller sample. For instance, while we did record medication use, we did not specifically examine the effects of daily dosage and the number of years used. Additionally, some patients had experience with one or more behavioral therapies, which might also influence behavior. Future studies should attempt to control these factors, or potentially investigate their effect in heterogeneous clinical samples. Another limitation of the current study is that negative learning bias is measured with only one task. We acknowledge that the present lack of a negative bias might not generalize to other tasks such as those measuring emotional processing (Everaert et al., 2017; Gaddy & Ingram, 2014; LeMoult & Gotlib, 2019; Mathews & MacLeod, 2005). The use of this particular task was motivated by previous literature, enabling us to compare these results with previous studies that were also done in smaller groups of patients with MDD (Murphy et al., 2003; Taylor Tavares et al., 2008). Of note, in a follow-up study in which we employed a slightly different (deterministic) reversal-learning task, we again were unable to provide evidence for changes in punishment (or reward) learning (Brolsma et al., Submitted). Finally, one should note that we used one prior to fit the data of both the patients and the control participants. This is considered a stricter test of the differences, as the use of two (or more) priors increases the chance of false positives. Yet, this approach may increase the chance of false negatives, and such different procedures should be addressed in future work. Notably, in line with our modeling results, we did not observe any difference in the behavioral outcome measures in our data.
Our results have implications for the negative learning bias hypothesis of MDD. We do not find any evidence for a negative learning bias in MDD. If anything, our results suggest that negative learning bias is associated with ASD. We cannot exclude the possibility that previously reported negative learning bias in patients with MDD reflects comorbid (undiagnosed) ASD, because screening for ASD diagnosis in those previous studies (Admon et al., 2017; Elliott et al., 1997; Liu et al., 2017; Robinson et al., 2012; Taylor Tavares et al., 2008) was not reported. However, and importantly, the evidence for negative bias in depression in other cognitive domains such as attention, interpretation and memory is rather robust (Everaert et al., 2017; Gaddy & Ingram, 2014; Peckham, McHugh, & Otto, 2010), and has more recently been extended to other psychiatric disorders (e.g. anxiety disorders, substance abuse disorders, ADHD symptoms, ASD) (Bar-Haim, Lamy, Pergamin, Bakermans-Kranenburg, & van IJzendoorn, 2007; Field, Munafò, & Franken, 2009; Gotham, Unruh, & Lord, 2015; Unruh et al., 2018; Vrijsen et al., 2017, 2018). The expression of negative bias may depend on the cognitive domain studied, and remain limited to affective domains, without impacting learning. Understanding the complexity of cognitive mechanisms underlying depression with the purpose of predicting psychiatric vulnerability therefore requires investigating symptomatology in large and naturalistic samples.
Supplementary material. The supplementary material for this article can be found at https://doi.org/10.1017/S0033291720001956