
The selective serotonin reuptake inhibitor sertraline alters learning from aversive reinforcements in patients with depression: evidence from a randomized controlled trial

Published online by Cambridge University Press:  17 April 2024

Jolanda Malamud*
Affiliation:
Applied Computational Psychiatry Lab, Mental Health Neuroscience Department, Division of Psychiatry and Max Planck Centre for Computational Psychiatry and Ageing Research, Queen Square Institute of Neurology, University College London, London, UK
Gemma Lewis
Affiliation:
Division of Psychiatry, University College London, London, UK
Michael Moutoussis
Affiliation:
Max Planck UCL Centre for Computational Psychiatry & Ageing Research, University College London, London, UK; Wellcome Centre for Human Neuroimaging, Queen Square Institute of Neurology, University College London, London, UK
Larisa Duffy
Affiliation:
Division of Psychiatry, University College London, London, UK
Jessica Bone
Affiliation:
Division of Psychiatry, University College London, London, UK; Research Department of Behavioural Science and Health, Institute of Epidemiology, University College London, London, UK
Ramya Srinivasan
Affiliation:
Division of Psychiatry, University College London, London, UK
Glyn Lewis
Affiliation:
Division of Psychiatry, University College London, London, UK
Quentin J. M. Huys
Affiliation:
Applied Computational Psychiatry Lab, Mental Health Neuroscience Department, Division of Psychiatry and Max Planck Centre for Computational Psychiatry and Ageing Research, Queen Square Institute of Neurology, University College London, London, UK
*Corresponding author: Jolanda Malamud; Email: j.malamud@ucl.ac.uk

Abstract

Background

Selective serotonin reuptake inhibitors (SSRIs) are first-line pharmacological treatments for depression and anxiety. However, little is known about how pharmacological action is related to cognitive and affective processes. Here, we examine whether specific reinforcement learning processes mediate the treatment effects of SSRIs.

Methods

The PANDA trial was a multicentre, double-blind, randomized clinical trial in UK primary care comparing the SSRI sertraline with placebo for depression and anxiety. Participants (N = 655) performed an affective Go/NoGo task three times during the trial and computational models were used to infer reinforcement learning processes.

Results

Task performance was poor: only 54% of the task runs were informative, with more informative task runs in the placebo than in the active group. There was no evidence for the preregistered hypothesis that Pavlovian inhibition was affected by sertraline. Exploratory analyses revealed that in the sertraline group, early increases in Pavlovian inhibition were associated with improvements in depression after 12 weeks. Furthermore, sertraline increased how fast participants learned from losses, and faster learning from losses was associated with more severe generalized anxiety symptoms.

Conclusions

The study findings indicate a relationship between aversive reinforcement learning mechanisms and aspects of depression, anxiety, and SSRI treatment, but these relationships did not align with the initial hypotheses. Poor task performance limits the interpretability and likely generalizability of the findings, and highlights the critical importance of developing acceptable and reliable tasks for use in clinical studies.

Funding

This article presents research supported by NIHR Program Grants for Applied Research (RP-PG-0610-10048), the NIHR BRC, and UCL, with additional support from IMPRS COMP2PSYCH (JM, QH) and a Wellcome Trust grant (QH).

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press
Figure 1. Task and performance. (a) The Go/NoGo task consisted of four different conditions. On each trial, one of four possible fractal images was shown. Actions were required in response to a circle that followed the fractal image after a variable delay. After a brief delay, the outcome was presented: a green upward arrow for a win, a red downward arrow for a loss, or a horizontal bar for a neutral outcome. In the go-to-win condition, pressing the key (‘go’) led to a reward with 80% probability and a neutral outcome with 20% probability, and vice versa if participants did not press the key (‘nogo’). In the go-to-avoid condition, pressing the key (‘go’) led to a neutral outcome with 80% probability and a loss with 20% probability. In the nogo-to-win condition, not pressing the key (‘nogo’) led to a reward with 80% probability and a neutral outcome with 20% probability. In the nogo-to-avoid condition, not pressing the key (‘nogo’) led to a neutral outcome with 80% probability and a loss with 20% probability. Each task administration consisted of 96 trials, with 24 trials per condition. (b) Mean percentage of correct responses in each of the four conditions. Black dots depict participants and black error bars depict standard deviation of the mean (s.d.). Dashed lines depict chance level. Post hoc comparisons using repeated-measures t tests showed a significant difference in accuracy between Pavlovian-congruent conditions (go to win and nogo to avoid) and Pavlovian-incongruent conditions (go to avoid and nogo to win). Significance ∗ ≤ 0.05, ∗∗ ≤ 0.01, ∗∗∗ ≤ 0.001, ∗∗∗∗ ≤ 0.0001.
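The outcome contingencies in the Figure 1 caption can be sketched as a small simulation. This is an illustrative sketch only, not the trial's task code: the names `CONDITIONS`, `sample_outcome` and `make_schedule` are hypothetical, the fully shuffled trial order is an assumption, and the caption states the reversed probabilities for the non-preferred action only for the go-to-win condition, so mirroring them in the other three conditions is also an assumption.

```python
import random

# Probability that the condition's correct action yields the better outcome
# (80%, as stated in the caption).
P_BETTER_IF_CORRECT = 0.8

# condition -> (correct action, better outcome, worse outcome)
CONDITIONS = {
    "go_to_win":     ("go",   "win",     "neutral"),
    "go_to_avoid":   ("go",   "neutral", "loss"),
    "nogo_to_win":   ("nogo", "win",     "neutral"),
    "nogo_to_avoid": ("nogo", "neutral", "loss"),
}


def sample_outcome(condition, action, rng):
    """Draw one outcome for an action ('go' or 'nogo') in a condition."""
    correct, better, worse = CONDITIONS[condition]
    # Assumption: the incorrect action mirrors the probabilities
    # (20% better outcome, 80% worse outcome) in every condition.
    p_better = P_BETTER_IF_CORRECT if action == correct else 1.0 - P_BETTER_IF_CORRECT
    return better if rng.random() < p_better else worse


def make_schedule(rng, trials_per_condition=24):
    """Build one task run: 4 conditions x 24 trials = 96 trials."""
    schedule = [c for c in CONDITIONS for _ in range(trials_per_condition)]
    rng.shuffle(schedule)  # assumption: trial order is fully randomized
    return schedule


if __name__ == "__main__":
    rng = random.Random(1)
    run = make_schedule(rng)
    # A simulated participant who always presses the key.
    outcomes = [sample_outcome(cond, "go", rng) for cond in run]
    print(len(run), outcomes[:5])
```

A participant who always presses the key would do well in the two go conditions and poorly in the two nogo conditions, which is why accuracy has to be read per condition against the chance level shown in panel (b).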

Table 1. Baseline characteristics for participants providing informative Go/NoGo task data (N = 435)

Figure 2. Computational modeling of the Go/NoGo task. (a) shows the differences in integrated Bayesian Information Criterion (iBIC) scores for all models tested compared to the most parsimonious model (red star), where a smaller iBIC score indicates a more parsimonious model. All models are modified Q-learning models (Rescorla-Wagner, RW) with two pairs of action-values (‘go’ and ‘nogo’) for each stimulus. The y-axis shows the number of free parameters for each model. The most parsimonious model includes separate learning rates for rewards and punishments, win and loss sensitivities, appetitive and aversive Pavlovian biases, irreducible noise, and a constant bias factor added to the action-value for ‘go’. (b) shows the histogram of the difference between the integrated log-likelihood (iLL) of the most parsimonious model and the iLL of the random baseline model. Datasets were declared informative if the data were more than three times more likely to have occurred under the most parsimonious model than under the random baseline model (vertical red dashed line). (c) The four subplots show the average learning curves in blue (averaged over participants; solid line) for each condition separately. Each row of the raster images shows the choices of one participant. ‘Go’ responses are depicted in white, and ‘nogo’ responses are depicted in grey. Additionally, the average ‘go’ probability was separated into included datasets (orange) and excluded datasets (green). The solid line refers to empirical data and the dashed line to simulated data from the most parsimonious model. Informative datasets (orange) show that participants, on average, seem to learn over trials, a pattern that the most parsimonious model captures well qualitatively. In contrast, the average ‘go’ probability of non-informative (excluded) datasets (green) shows no systematic change over trials, and hence no learning. Further, it is well captured by the random baseline model.
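To make the ingredients listed in the Figure 2 caption concrete, the sketch below shows one way such a model and the inclusion rule could be written. It is a minimal illustration under stated assumptions, not the authors' implementation: the parameter names, the way the Pavlovian term and the noise enter the choice rule, and the function names are all hypothetical and follow the form commonly used for Go/NoGo models. The caption confirms only which components the model has and that a dataset was informative when it was more than three times more likely under the model than under the random baseline, which corresponds to a log-likelihood difference above log(3).

```python
import math


def go_probability(q_go, q_nogo, v_stim, p):
    """Probability of a 'go' response for one stimulus (assumed form).

    q_go, q_nogo: learned action values; v_stim: learned stimulus value;
    p: dict of hypothetical parameter names.
    """
    # Assumed: the appetitive bias scales positive stimulus values and the
    # aversive bias scales negative ones, so aversive cues suppress 'go'.
    pav = p["pav_app"] if v_stim > 0 else p["pav_av"]
    w_go = q_go + p["go_bias"] + pav * v_stim
    # Two-action softmax written as a logistic function.
    p_soft = 1.0 / (1.0 + math.exp(q_nogo - w_go))
    # Irreducible noise mixes the softmax with a uniform choice.
    return (1.0 - p["noise"]) * p_soft + p["noise"] * 0.5


def rw_update(value, outcome, lr, sensitivity):
    """Rescorla-Wagner update; outcome is +1 (win), 0 (neutral) or -1 (loss).

    The caller passes the reward or punishment learning rate and sensitivity;
    how these are selected on neutral trials is left open here.
    """
    return value + lr * (sensitivity * outcome - value)


def is_informative(ill_model, ill_random, factor=3.0):
    """True if the data are more than `factor` times more likely under the model."""
    return ill_model - ill_random > math.log(factor)


if __name__ == "__main__":
    params = {"pav_app": 0.3, "pav_av": 0.6, "go_bias": 0.2, "noise": 0.05}
    # After some losses the stimulus value is negative, which lowers p('go').
    v = rw_update(0.0, -1, lr=0.3, sensitivity=2.0)
    print(go_probability(0.0, 0.0, v, params))
    # Assumed baseline for illustration: 96 trials chosen with p = 0.5 each.
    print(is_informative(-60.0, 96 * math.log(0.5)))
```

The integrated log-likelihoods passed to `is_informative` would come from model fitting, which is not shown; the numbers in the usage lines are made up for illustration.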

Table 2. Mixed-effect linear models testing our pre-registered hypotheses (informative data only: N_task runs = 886; N_patients = 435, 66% of those randomized)

Figure 3. Effects of sertraline on RL parameters. (a) Shows the aversive Pavlovian bias at baseline and at the follow-ups, separated by drug group (blue, left = placebo; red, right = sertraline). (b) Shows the change in aversive Pavlovian bias between sessions, separately for each drug group. (c) Early changes in the aversive Pavlovian bias predict treatment outcome. This panel shows the relation between the change in the aversive Pavlovian bias from baseline to week 2 and the log-transformed BDI total score (only for participants who had an informative task run at baseline and week 2), with the placebo group in blue and the sertraline group in red. An interaction effect was observed between group and early change in the aversive Pavlovian bias in predicting depression at 12 weeks, driven by a significant association in the sertraline group between the early change and log-transformed BDI total score at 12 weeks. (d) Shows the loss learning rate at baseline and at the follow-ups, separated by drug group (blue, left = placebo; red, right = sertraline). (e) Shows the change in loss learning rate between sessions, separately for each drug group. Significance * ≤ 0.05, ** ≤ 0.01, *** ≤ 0.001, **** ≤ 0.0001.

Supplementary material: File

Malamud et al. supplementary material
