Diminished prospective mental representations of reward mediate reward learning strategies among youth with internalizing symptoms

Published online by Cambridge University Press:  07 March 2023

Josh M. Cisler*
Affiliation:
Department of Psychiatry and Behavioral Sciences, Dell Medical School, University of Texas at Austin, USA; Institute for Early Life Adversity Research, Dell Medical School, University of Texas at Austin, USA
Amanda J. F. Tamman
Affiliation:
Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
Greg A. Fonzo
Affiliation:
Department of Psychiatry and Behavioral Sciences, Dell Medical School, University of Texas at Austin, USA; Institute for Early Life Adversity Research, Dell Medical School, University of Texas at Austin, USA; Center for Psychedelic Research and Therapy, Dell Medical School, University of Texas at Austin, USA
*Author for correspondence: Josh M. Cisler, E-mail: josh.cisler@austin.utexas.edu

Abstract

Background

Adolescent internalizing symptoms and trauma exposure have been linked with altered reward learning processes and decreased ventral striatal responses to rewarding cues. Recent computational work on decision-making highlights an important role for prospective representations of the imagined outcomes of different choices. This study tested whether internalizing symptoms and trauma exposure among youth impact the generation of prospective reward representations during decision-making and potentially mediate altered behavioral strategies during reward learning.

Methods

Sixty-one adolescent females with varying exposure to interpersonal violence (n = 31 with histories of physical or sexual assault) and varying severity of internalizing symptoms completed a social reward learning task during fMRI. Multivariate pattern analyses (MVPA) were used to decode neural reward representations at the time of choice.

Results

MVPA demonstrated that rewarding outcomes could be accurately decoded within several large-scale distributed networks (e.g. frontoparietal and striatum networks), that these reward representations were reactivated prospectively at the time of choice in proportion to the expected probability of receiving reward, and that youth with behavioral strategies that favored exploiting high reward options demonstrated greater prospective generation of reward representations. Youth internalizing symptoms, but not trauma exposure characteristics, were negatively associated with both the behavioral strategy of exploiting high reward options and the prospective generation of reward representations in the striatum.

Conclusions

These data suggest diminished prospective mental simulation of reward as a mechanism of altered reward learning strategies among youth with internalizing symptoms.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press
Table 1. Clinical and demographic characteristics of the participants

Figure 1

Fig. 1. (a) Depiction of the social reward three-arm bandit task. Participants completed 90 trials. Each trial began with the presentation of three faces, and participants chose one face in which to invest $10. The choice phase lasted until participants made a selection, which was then marked with a blue box for 1 s. An anticipation phase followed, consisting of a jittered fixation cross for 1.5–3 s while participants waited for the outcome of the choice. The outcome phase then showed a binary return of either $20 (net increase of $10) or no return (net loss of $10). It presented the outcome of the trial (win or loss) for 2 s and updated the points total for 1 s, followed by a jittered fixation cross of 1.5–3 s before the next trial. (b) Depiction of the MVPA pipeline. For each ICA network separately, trial × voxel matrices of beta coefficients for reward outcomes during the task are created for all participants except one left-out participant. Support vector machine classifiers are then trained on these data, resulting in a decoder for reward outcomes. Next, this reward decoder is applied to the trial × voxel matrix of beta coefficients at the time of choice for the participant who was left out of training. This yields a prediction of the degree to which the reward representations are active at the time of choice, which can be compared with the magnitude of reward the participant was expecting for that choice. This process is repeated until each participant has served as the left-out test participant.
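
The leave-one-participant-out logic in panel (b) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes synthetic data, and it swaps the support vector machine for a simple class-centroid linear decoder so that the example needs only the Python standard library. All names (`train_centroid_decoder`, `outcome_data`, `choice_data`, and the array sizes) are hypothetical.

```python
import random

def train_centroid_decoder(X, y):
    """Fit a linear decoder whose hyperplane bisects the two class means.

    Stand-in for the SVM named in the caption (an assumption of this sketch).
    """
    n_vox = len(X[0])
    pos = [x for x, label in zip(X, y) if label == 1]  # rewarded trials
    neg = [x for x, label in zip(X, y) if label == 0]  # unrewarded trials
    mu_pos = [sum(x[v] for x in pos) / len(pos) for v in range(n_vox)]
    mu_neg = [sum(x[v] for x in neg) / len(neg) for v in range(n_vox)]
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    b = -sum(wi * (p + n) / 2 for wi, p, n in zip(w, mu_pos, mu_neg))
    return w, b

def decision_value(w, b, x):
    """Signed score relative to the hyperplane; positive means reward-like."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def leave_one_participant_out(outcome_data, choice_data):
    """Train on outcome-phase patterns from all other participants, then
    score the held-out participant's choice-phase patterns."""
    scores = {}
    for held_out in outcome_data:
        X_train, y_train = [], []
        for pid, (X, y) in outcome_data.items():
            if pid != held_out:
                X_train.extend(X)
                y_train.extend(y)
        w, b = train_centroid_decoder(X_train, y_train)
        scores[held_out] = [decision_value(w, b, x) for x in choice_data[held_out]]
    return scores

# Synthetic stand-in data (hypothetical sizes, not the study's).
rng = random.Random(0)
N_PARTICIPANTS, N_TRIALS, N_VOXELS = 5, 12, 8

def fake_pattern(signal):
    """One trial's voxel pattern: a shared signal plus Gaussian noise."""
    return [signal + rng.gauss(0, 1) for _ in range(N_VOXELS)]

outcome_data, choice_data = {}, {}
for pid in range(N_PARTICIPANTS):
    labels = [t % 2 for t in range(N_TRIALS)]  # alternate reward / no reward
    outcome_data[pid] = ([fake_pattern(1.0 if r else -1.0) for r in labels], labels)
    choice_data[pid] = [fake_pattern(0.0) for _ in range(N_TRIALS)]

scores = leave_one_participant_out(outcome_data, choice_data)
```

Each entry of `scores` holds one decision value per choice trial for a participant whose data never entered the training set, which is the property the caption describes.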

Figure 2

Fig. 2. (a) Akaike Information Criterion values of model fit for the compared models. We tested a factorial manipulation of anticorrelated or non-anticorrelated models (denoted A+ or A−) and risk-sensitive or non-risk-sensitive models (denoted RS+ or RS−). Consistent with our past studies using MATLAB's fmincon for model fitting (Cisler et al., 2019; Ross et al., 2018), our updated approach using hierarchical Bayesian inference (Piray et al., 2019) similarly demonstrated that the anticorrelated and risk-sensitive model fit the data best. (b) There were no relationships between Childhood Trauma Questionnaire total severity scores and softmax βs, which represent individual differences in exploitation / exploration strategies on the task. (c) There was a significant inverse relationship between CBCL internalizing symptoms and softmax βs, suggesting decreased exploitative behavior among those with greater internalizing symptoms.
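
To make the two quantities in this caption concrete, the sketch below shows a generic softmax choice rule, in which a larger β concentrates choices on the highest-valued arm (exploitation), and the standard AIC formula, 2k − 2 ln L. It is illustrative only: the arm values are made up, and the anticorrelated and risk-sensitive terms of the models compared here are not reproduced.

```python
import math

def softmax_choice_probs(values, beta):
    """Probability of choosing each arm given expected values and inverse
    temperature beta. beta = 0 gives uniform (exploratory) choice."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate better fit
    after penalizing the number of free parameters."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical expected values for the three arms.
values = [0.7, 0.4, 0.2]
for beta in (0.0, 2.0, 10.0):
    probs = softmax_choice_probs(values, beta)
    print(beta, [round(p, 3) for p in probs])
```

Under this rule, a participant with a lower fitted β spreads choices more evenly across arms even when one arm has a clearly higher expected value.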

Figure 3

Fig. 3. (a) Depiction of spatial maps from the Independent Component Analysis. (b) Reward decoding performance for each ICA network. Decoding performance was defined as the mean of sensitivity and specificity in correctly classifying reward outcomes from the left-out participant using the model trained on the remaining participants' data. (c) β coefficients reflecting the degree to which value expectation, derived from the computational model, of the chosen arm on the task predicted the magnitude of MVPA-predicted reward representations (i.e. SVM hyperplane predictions) at the time of choice. All networks demonstrated significant coupling between reward expectation and magnitude of reward representations when controlling for multiple comparisons. (d) ICA networks demonstrating significant interactions between softmax βs and coupling between reward expectation and magnitude of reward representations (i.e. SVM hyperplane predictions), suggesting that those who generated greater reward representations in proportion to expected reward also tended to use behavioral strategies to exploit high reward arms.
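
The decoding performance metric in panel (b), the mean of sensitivity and specificity, can be computed as in the short sketch below. The function name and the toy labels are illustrative assumptions; labels are coded 1 for reward and 0 for no reward.

```python
def decoding_performance(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = reward)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # rewarded trials correctly classified
    specificity = tn / (tn + fp)  # unrewarded trials correctly classified
    return (sensitivity + specificity) / 2

# A decoder that always predicts "reward" scores at chance (0.5) on this
# metric, even though its raw accuracy on these toy labels is 0.75.
always_reward = decoding_performance([1, 1, 1, 0], [1, 1, 1, 1])
```

One reason to average the two rates rather than report raw accuracy is that the metric stays at 0.5 for a trivial decoder regardless of how many trials fall in each class.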

Figure 4

Fig. 4. (a) Scatter plot depicting relationship between CBCL internalizing symptoms and coupling between MVPA reward representations during choice and reward expectations. (b) Even though we controlled for CTQ total severity in our primary analyses, we conducted an additional analysis differentiating effects of assault exposure and internalizing symptoms. We used a median split to identify control adolescents with low v. high internalizing symptoms, and separately used a median split to identify assaulted adolescents with low v. high internalizing symptoms. Separating the sample in this manner allows differentiation of impacts due to assault exposure and internalizing symptoms. If coupling of prospective reward representations in the striatum were more strongly associated with assault exposure, we would expect that both assault groups would demonstrate impairment relative to both control groups, with relative homogeneity within groups. By contrast, if coupling of prospective reward representations in the striatum were more strongly associated with internalizing symptoms, we would instead expect coupling of prospective reward representations to follow the pattern of internalizing symptoms across the groups in accordance with panel B. (c) As can be seen in Fig. 4c, individual differences in coupling with prospective reward representations clearly tracked individual differences in internalizing symptoms and not assault exposure, t(51) = −3.14, p = 0.003 (regression model with group coded as follows in accordance with differences in CBCL internalizing symptoms [see panel B]: control low symptoms = 0, control high symptoms = 1; assault low symptoms = 1, assault high symptoms = 2).
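
The group coding used for the regression in panel (c) can be written out as below. The coding itself (0, 1, 1, 2) comes from the caption; the coupling values are made-up numbers chosen only to show the direction of the effect, and the plain least-squares slope is a simplification of whatever regression software was actually used.

```python
# Ordinal code per group, following the caption: groups are ordered by
# CBCL internalizing symptoms rather than by assault exposure.
GROUP_CODE = {
    ("control", "low"): 0,
    ("control", "high"): 1,
    ("assault", "low"): 1,
    ("assault", "high"): 2,
}

def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical group-level coupling values (NOT the study's data).
toy_coupling = {
    ("control", "low"): 0.4,
    ("control", "high"): 0.2,
    ("assault", "low"): 0.2,
    ("assault", "high"): 0.0,
}
codes = [GROUP_CODE[g] for g in toy_coupling]
coupling = [toy_coupling[g] for g in toy_coupling]
slope = ols_slope(codes, coupling)
```

A negative slope under this coding means coupling falls as symptom level rises across groups, with the control high-symptom and assault low-symptom groups treated as equivalent.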

Figure 5

Fig. 5. (a) Graphical depiction of the mediation model, in which internalizing symptoms predict decreased coupling between MVPA reward representations and expectations of reward in the striatum (i.e. path a), and decreased coupling of reward representations in the striatum predicts decreased choices to exploit high reward arms on the task (i.e. path b). Path c refers to the total effect of internalizing symptoms on behavioral strategies on the task, and path c’ refers to the direct effect after accounting for the indirect effect (i.e. path ab) through MVPA reward representations. (b) The significance of the indirect effect was tested with 50 000 bootstrap iterations, which demonstrated that the 95% confidence interval does not include zero.
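
A percentile bootstrap of the indirect effect (path ab) might look like the sketch below. It is a generic illustration under several assumptions: the data are simulated, no covariates are included, the confidence interval is a simple percentile interval, and the example runs 2000 resamples for speed rather than the 50 000 reported in the caption. All function and variable names are hypothetical.

```python
import random

def _centered(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def indirect_effect(x, m, y):
    """Return a*b, where a is the slope of m on x and b is the coefficient
    of m in the regression of y on x and m. Returns None if degenerate."""
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        return None
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the indirect effect, resampling
    participants with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        est = indirect_effect([x[i] for i in idx],
                              [m[i] for i in idx],
                              [y[i] for i in idx])
        if est is not None:  # skip degenerate resamples
            estimates.append(est)
    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lo, hi

# Simulated data: symptoms (x) lower coupling (m), and coupling raises
# the exploitation parameter (y). Effect sizes are arbitrary.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(61)]
m = [-0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.6 * mi + rng.gauss(0, 1) for mi in m]
ci_low, ci_high = bootstrap_ci(x, m, y)
```

The indirect effect is judged significant when the interval `(ci_low, ci_high)` excludes zero, which is the criterion stated in panel (b).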

Supplementary material: File

Cisler et al. supplementary material
