Introduction
Bipolar disorder (BD) is a chronic, highly relapsing disorder in which episodes of depression and (hypo)mania alternate with euthymic phases. Growing evidence suggests that BD follows a progressive course [Reference Grewal, McKinlay, Kapczinski, Pfaffenseller and Wollenhaupt-Aguiar1, Reference Passos, Mwangi, Vieta, Berk and Kapczinski2], forming the basis for clinical staging models [Reference Berk, Conus, Lucas, Hallam, Malhi and Dodd3, Reference Kapczinski, Dias, Kauer-Sant’Anna, Frey, Grassi-Oliveira and Colom4], which divide the course of disease into different phases in order to better predict prognosis and treatment response and to counteract further disease progression by promoting early interventions [Reference Berk, Post, Ratheesh, Gliddon, Singh and Vieta5, Reference Cosci and Fava6]. These models agree in describing a prodromal phase, initial full episodes of mania or depression, and later stages marked by more frequent, longer, and more severe relapses, potentially leading to a persistent stage characterized by limited symptomatic and functional recovery. The two most discussed models focus on either the number of recurrent episodes and quality of remissions [Reference Berk, Conus, Lucas, Hallam, Malhi and Dodd3] or on inter-episode symptoms and functional impairment [Reference Kapczinski, Dias, Kauer-Sant’Anna, Frey, Grassi-Oliveira and Colom4]. The biological basis for these models lies in the concept of neuroprogression, which assumes that pathological brain changes progress alongside worsening clinical features, including cognitive and functional decline [Reference Berk7]. While several studies have attempted to empirically validate staging models based on clinical features [e.g. Reference van der Markt, Klumpers, Dols, Draisma, Boks and van Bergen8–Reference Magalhães, Dodd, Nierenberg and Berk15], their biological basis in terms of neuroprogression, for example, by linking stages to brain structural changes, remains insufficiently understood, limiting clinical utility [Reference Passos, Mwangi, Vieta, Berk and Kapczinski2, Reference Alda and Kapczinski16, Reference Malhi, Rosenberg and Gershon17].
Instead, a current debate concerns the progression of cognitive functions. Although longitudinal studies [Reference Strejilevich, Samamé and Martino18–Reference Macoveanu, Damgaard, Ysbæk-Nielsen, Frangou, Yatham and Chakrabarty24], including meta-analyses [Reference Samamé, Martino and Strejilevich25–Reference Bora and Özerdem27], largely suggest cognitive stability in BD and thus do not support the assumption of neuroprogression [Reference Samamé28–Reference Strejilevich, Samamé and Quiroz30], cross-sectional studies have been interpreted as evidence of a progressive deterioration of cognitive functions [e.g. Reference Passos, Mwangi, Vieta, Berk and Kapczinski2, Reference Czepielewski, Massuda, Goi, Sulzbach-Vianna, Reckziegel and Costanzi31–Reference Torres, DeFreitas, DeFreitas, Kauer-Sant’Anna, Bond and Honer35], and the lack of longitudinal evidence has been attributed to methodological limitations [Reference Yatham, Schaffer, Kessing, Miskowiak, Kapczinski and Vieta36, Reference Vieta37]. However, although cognitive deficits often correlate with structural and functional brain changes, they provide only an indirect indication of neuroprogression. A comprehensive understanding requires direct examination of the underlying neurobiological processes. In this context, white matter (WM) microstructural alterations appear particularly promising, as recent evidence increasingly points to their role in both the pathophysiology of BD [e.g. Reference Thiel, Meinert, Winter, Lemke, Waltemate and Breuer38–Reference Thiel, Lemke, Winter, Flinkenflügel, Waltemate and Bonnekoh40] and cognitive performance [Reference Holleran, Kelly, Alloza, Agartz, Andreassen and Arango41, Reference Meinert, Nowack, Grotegerd, Repple, Winter and Abheiden42].
To date, however, only one study has compared WM integrity across different stages of BD, identifying lower WM integrity in the sagittal striatum and corpus callosum in later stages of BD compared to earlier stages [Reference Tanrıkulu, İnanlı, Arslan, Çalışkan, Çiçek and Eren43]. Building upon these findings, our study aims to investigate in more depth whether the concept of disease progression can be biologically supported by alterations in WM microstructure, assessed through diffusion tensor imaging (DTI).
Most previous studies on neuroprogressive effects on cognition or function have used a simple classification, comparing patients with their first episode to those with multiple episodes [e.g. Reference Tanrıkulu, İnanlı, Arslan, Çalışkan, Çiçek and Eren43–Reference Huang, Chen, Hsu, Tsai and Bai47]. However, even if the measure of the number of previous manic episodes is convincing due to its simplicity, intuition, and clinical relevance, this classification reflects only part of the proposed staging models [Reference Magalhães, Dodd, Nierenberg and Berk15, Reference Tremain, Fletcher and Murray48, Reference Tremain, Fletcher, Scott, McEnery, Berk and Murray49] and does not capture crucial aspects of disease progression. It fails to consider the variability in the degree of remission between episodes – ranging from clearly separated episodes to persistent forms – as well as differences in inter-episode functioning [Reference Kapczinski, Dias, Kauer-Sant’Anna, Frey, Grassi-Oliveira and Colom4].
To address these gaps, our study uses different criteria to approximate disease progression based on the staging models postulated by Berk et al. [Reference Berk, Conus, Lucas, Hallam, Malhi and Dodd3] and Kapczinski et al. [Reference Kapczinski, Dias, Kauer-Sant’Anna, Frey, Grassi-Oliveira and Colom4]. Given the challenges of operationalizing and validating detailed staging models, we adopted the International Society for Bipolar Disorders’ (ISBD) recommendation [Reference Kapczinski, Magalhães, Balanzá-Martinez, Dias, Frangou and Gama50] to broadly categorize BD into earlier and later stages. This simplified approach follows the call for caution in the application of complex but still incomplete models [Reference Malhi, Bell, Morris and Hamilton51] and allows the exploratory investigation of WM microstructural changes associated with disease progression. As in previous studies, our first approach was to compare patients after their first manic episode with those who had already experienced multiple manic episodes in their lives. Second, the quality of remission was used, which in the staging models is assumed to decrease as BD progresses [Reference Alda and Kapczinski16, Reference Bauer, Andreassen, Geddes, Vedel Kessing, Lewitzka and Schulze52]. Finally, impairment of the patient’s interepisodic psychosocial functioning was used as an indicator of disease progression, including only euthymic patients. Testing the hypothesis of neuroprogression, we expected patients at later stages of the disease to show WM microstructural alterations that differ from patients at earlier stages. Specifically, we expected BD patients with multiple manic episodes to show lower WM microstructural integrity compared to BD patients who experienced only their first manic episode. Additionally, we hypothesized that BD patients not achieving complete remission between episodes show lower WM microstructural integrity compared to BD patients achieving full remission. 
Furthermore, greater functional impairment in euthymic patients was expected to relate negatively to WM integrity.
Methods
Participants
One hundred fifty-three BD patients (n = 79 female, Mage = 41.0 years, SDage = 12.0 years) and 153 HCs (n = 78 female, Mage = 41.2 years, SDage = 13.2 years) matched by age, sex, and site were drawn from the baseline assessment of the Marburg-Münster Affective Disorders Cohort Study (MACS) [Reference Kircher, Wöhr, Nenadic, Schwarting, Schratt and Alferink53] (Table 1). Participants aged 18–65 were recruited in Münster and Marburg through local psychiatric hospitals and newspaper advertisements. The study was approved by the ethics committees of University of Marburg (AZ: 07/14) and Münster (2014-422-b-S), in accordance with the Declaration of Helsinki. All participants gave written informed consent and received financial compensation. Exclusion criteria included usual magnetic resonance imaging (MRI) contraindications, head trauma, and past and current neurological, cardiovascular, or other serious illnesses, and current substance dependence. HC had no lifetime mental disorder and no current intake of psychotropic medication. Diagnoses or lack thereof were assessed using the Structured Clinical Interview for DSM-IV-TR for Axis I disorder (SCID-I) [Reference Wittchen, Wunderlich, Gruschwitz and Zaudig54], conducted by trained personnel.
Table 1. Demographic and clinical characteristics of BD patients and HC

Note: Data are mean ± SD or frequencies. BD, bipolar disorder; HC, healthy controls; HDRS, 21-item Hamilton Depression Rating Scale; GAF, Global Assessment of Functioning; n/a, not applicable; YMRS, Young Mania Rating Scale. +Calculated using the paired two-tailed Student’s t test. $Calculated using the χ2 test. *Derived from Item 90 of the Operational Criteria (OPCRIT) Checklist for Affective and Psychotic Illness.
Clinical characteristics
During the interview, patients provided retrospective self-reports on their previous course of illness, including the number of manic episodes and quality of remission between previous episodes, assessed with the Operational Criteria (OPCRIT) Checklist for Affective and Psychotic Illness [Reference McGuffin, Farmer and Harvey55]. Patients were categorized as “single episode with good remission,” “multiple episodes with good remission between episodes,” “multiple episodes with partial remission between episodes” or “persistent chronic illness.” For analysis, the first two categories were combined into “good remission” (BDrem), while the latter two were grouped under “chronic course” (BDchron) to ensure balanced group sizes and simplify interpretability by using conceptually meaningful groups. The presence and severity of acute depressive and manic symptoms were assessed using the 21-item Hamilton Depression Rating Scale (HDRS) [Reference Hamilton56] and the Young Mania Rating Scale (YMRS) [Reference Young, Biggs, Ziegler and Meyer57], respectively. The level of functioning was assessed via the Global Assessment of Functioning (GAF) [Reference Saß, Wittchen, Zaudig and Houben58]. The type and amount of current medication were assessed and summarized in one medication load index, as described earlier [Reference Hassel, Almeida, Kerr, Nau, Ladouceur and Fissell59] (Supplement 1).
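The collapsing of the four OPCRIT course categories into two analysis groups can be illustrated with a minimal sketch. The dictionary and function names are hypothetical and only restate the grouping rule given above; they are not taken from the study's analysis code.

```python
# Hypothetical lookup table: the keys are the four course categories quoted
# in the text, the values are the two analysis groups they were merged into.
COURSE_TO_GROUP = {
    "single episode with good remission": "BDrem",
    "multiple episodes with good remission between episodes": "BDrem",
    "multiple episodes with partial remission between episodes": "BDchron",
    "persistent chronic illness": "BDchron",
}


def remission_group(course_category: str) -> str:
    """Map one course-of-illness category to its binary analysis group."""
    # Normalize case and surrounding whitespace before the lookup.
    return COURSE_TO_GROUP[course_category.strip().lower()]
```

An unknown category raises a `KeyError`, which makes unexpected coding values visible instead of silently assigning them to a group.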
DTI data acquisition and pre-processing
DTI data were acquired with two 3 T whole body MR scanners (Marburg: Tim Trio, Siemens, Erlangen, Germany; Münster: Prisma fit, Siemens, Erlangen, Germany). All images were thoroughly quality controlled according to the published protocol of the MACS study [Reference Vogelbacher, Möbius, Sommer, Schuster, Dannlowski and Kircher60]. Due to changes of the body coil (BC) and gradient coil (GC) in the MRI scanner in Marburg, we controlled for four different scanner settings, including three dummy-coded variables (BC and GC pre change, BC post and GC pre change, BC and GC post change) with Münster as the reference category as covariates in all analyses, as previously recommended [Reference Vogelbacher, Möbius, Sommer, Schuster, Dannlowski and Kircher60].
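The dummy coding of the four scanner settings described above can be sketched as follows. The setting labels are hypothetical placeholders; only the scheme itself (three indicator variables, Münster as the reference category) comes from the text.

```python
# Hypothetical labels for the four scanner settings; Münster is the reference.
SETTINGS = (
    "muenster",
    "marburg_bc_pre_gc_pre",
    "marburg_bc_post_gc_pre",
    "marburg_bc_post_gc_post",
)
REFERENCE = "muenster"


def dummy_code(setting: str) -> list:
    """Return the three 0/1 indicators for one scan; the reference is all zeros."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown scanner setting: {setting!r}")
    # One indicator per non-reference level, in the order given in SETTINGS.
    return [1 if setting == level else 0 for level in SETTINGS if level != REFERENCE]
```

With this coding, each Marburg coefficient is interpreted as the offset of that scanner configuration relative to Münster.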
Preprocessing, quality assurance, and analyses were performed in FSL6.0.1 (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/) [Reference Jenkinson, Beckmann, Behrens, Woolrich and Smith61–Reference Woolrich, Jbabdi, Patenaude, Chappell, Makni and Behrens63] and followed published protocols described elsewhere [Reference Thiel, Meinert, Winter, Lemke, Waltemate and Breuer38, Reference Vogelbacher, Möbius, Sommer, Schuster, Dannlowski and Kircher60]. For details on DTI acquisition, quality assurance, preprocessing, and analysis, see Supplement 2. DTI metrics (fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (AD)) were calculated based on the computed diffusion tensor. MD, RD, and AD were analyzed in the same way as FA (Supplement 3), but we focus on FA as the most widely employed DTI measure. It captures water diffusion directionality on a scale from 0 (isotropic diffusion) to 1 (completely anisotropic diffusion) and is hypothesized to reflect fiber density and degree of myelination [Reference Feldman, Yeatman, Lee, Barde and Gaman-Bean64, Reference Alexander, Lee, Lazar and Field65].
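For orientation, the four scalar metrics can be written in terms of the three eigenvalues of the diffusion tensor. The sketch below uses the standard textbook definitions for a single voxel; it is an illustration only, since in the study these maps were produced by the FSL pipeline, and the function name is hypothetical.

```python
import math


def dti_metrics(l1: float, l2: float, l3: float) -> dict:
    """Compute FA, MD, AD, and RD from the three tensor eigenvalues of one voxel."""
    # Sort so that l1 is the principal (largest) eigenvalue.
    l1, l2, l3 = sorted((l1, l2, l3), reverse=True)
    md = (l1 + l2 + l3) / 3.0          # mean diffusivity
    ad = l1                            # axial diffusivity
    rd = (l2 + l3) / 2.0               # radial diffusivity
    denom = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denom == 0:
        fa = 0.0                       # no diffusion signal: define FA as 0
    else:
        spread = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
        fa = math.sqrt(1.5 * spread / denom)
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}
```

Equal eigenvalues give FA = 0 (isotropic diffusion), whereas a tensor with a single non-zero eigenvalue gives FA = 1, matching the scale described above.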
Statistical analyses
Demographic, clinical, and cognitive data were analyzed in R (version 4.2.2; R Core Team, 2022) via RStudio.
Analyses of DTI data
The DTI data were analyzed using the tract-based spatial statistics (TBSS) technique implemented in FSL, which reduces partial volume effects and misalignment during registration [Reference Smith, Jenkinson, Johansen-Berg, Rueckert, Nichols and Mackay66]. The analyses were adjusted for alpha inflation using the non-parametric permutation tests implemented in FSL randomise [Reference Winkler, Ridgway, Webster, Smith and Nichols67]. Threshold-Free Cluster Enhancement (TFCE) was applied with 5000 permutations per test [Reference Smith and Nichols68] and corrected for familywise error (FWE; p < 0.05). Additionally, false discovery rate (FDR) correction [Reference Benjamini and Hochberg69] was applied to all post-hoc tests. In all analyses, age, sex, total intracranial volume (TIV), and the scanner variables were included as covariates. All analyses that yielded significant results in the BD patient groups were further checked for robustness by including additional covariates (current symptom severity, age of onset, lifetime comorbidity, BD subtype, and medication; Supplement 5). To investigate WM integrity in association with disease progression of BD, we used three different approaches:
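A permutation test of this kind could be invoked roughly as sketched below. The helper assembles an argument list for FSL's `randomise` with 5000 permutations and the TFCE option intended for TBSS skeleton data (`--T2`). All file names are hypothetical placeholders following common TBSS naming, and the exact call used in the study is not reported here, so this is an assumption-laden sketch.

```python
def randomise_command(input_4d, output_prefix, mask, design_mat,
                      t_contrasts, f_contrasts, n_perm=5000):
    """Build the argument list for a TFCE permutation test on skeletonised data."""
    return [
        "randomise",
        "-i", input_4d,        # 4D image of skeletonised FA maps, one volume per subject
        "-o", output_prefix,   # prefix for all output images
        "-m", mask,            # skeleton mask restricting the analysis
        "-d", design_mat,      # design matrix (groups plus covariates)
        "-t", t_contrasts,     # t-contrast file
        "-f", f_contrasts,     # F-contrast file
        "-n", str(n_perm),     # number of permutations
        "--T2",                # TFCE with settings intended for TBSS skeletons
    ]


# Hypothetical file names; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = randomise_command("all_FA_skeletonised", "tbss_fa", "mean_FA_skeleton_mask",
                        "design.mat", "design.con", "design.fts")
```

Building the command as a list rather than a shell string avoids quoting problems when paths contain spaces.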
Analysis 1: Number of manic episodes. First, we categorized patients based on the lifetime number of manic episodes: n = 40 patients had experienced only one (hypo)manic episode in their lives (BDfirst), and n = 113 patients had experienced two or more manic episodes (BDmultiple) (Table 2). A one-factorial analysis of covariance (ANCOVA) was performed with group as the independent variable (HC vs. BDfirst vs. BDmultiple) and FA as the dependent variable (F-test), followed by post-hoc pairwise t-tests.
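One common way to set up such a three-group comparison is a design matrix with one indicator column per group followed by the covariate columns, with each pairwise t-contrast placing +1 and -1 on the two groups being compared. The sketch below generates these contrast rows under that assumption; the column layout and the count of six covariates (age, sex, TIV, three scanner dummies) are inferred from the text rather than taken from the actual design files.

```python
GROUPS = ("HC", "BDfirst", "BDmultiple")
N_COVARIATES = 6  # assumed: age, sex, TIV, and three scanner dummy variables


def pairwise_t_contrasts(groups, n_covariates):
    """Return one directional contrast row per ordered pair of groups."""
    contrasts = {}
    for i, higher in enumerate(groups):
        for j, lower in enumerate(groups):
            if i == j:
                continue
            row = [0] * (len(groups) + n_covariates)
            row[i] = 1    # group expected to show higher FA
            row[j] = -1   # group expected to show lower FA
            contrasts[f"{higher} > {lower}"] = row
    return contrasts


CONTRASTS = pairwise_t_contrasts(GROUPS, N_COVARIATES)
```

Covariate columns always receive zeros, so they are adjusted for but not tested.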
Table 2. Demographic and clinical characteristics of BD patients depending on the number of manic episodes or the quality of remission between previous episodes

Note: Data are mean ± SD or frequencies. BD = bipolar disorder, BDfirst = BD patients who experienced only one hypomanic or manic episode in their lives, BDmultiple = BD patients who already experienced two or more manic episodes, BDrem = BD patients who experienced a good remission between previous episodes in their lives, BDchron = BD patients who achieved only partial remission between episodes or had already developed a chronic course (derived from Item 90 of the Operational Criteria (OPCRIT) Checklist for Affective and Psychotic Illness, which assesses the previous course of the illness), HDRS = 21-item Hamilton Depression Rating Scale, GAF = Global Assessment of Functioning, YMRS = Young Mania Rating Scale. +Calculated using the paired two-tailed Student’s t test. $Calculated using the χ2 test. ~Calculated using the Mann–Whitney U test.
Analysis 2: Quality of remission. Second, the patients were categorized into two groups based on their previous quality of remission as assessed using the OPCRIT checklist: n = 85 patients (BDrem) experienced a good remission between previous episodes in their lives. In contrast, n = 68 patients achieved only partial remission between episodes or had already developed a chronic course (BDchron) (Table 2). Again, a one-factorial ANCOVA was performed with group as the independent variable (HC vs. BDrem vs. BDchron) and FA as the dependent variable (F-test), followed by post-hoc pairwise t-tests.
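The FDR correction applied to the post-hoc tests follows the Benjamini-Hochberg procedure cited above. A minimal sketch of the adjusted p-value computation is given below; which p-values were entered together as one family is not specified in the text, so the input is left to the caller.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step down from the largest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A test is declared significant at a given FDR level when its adjusted p-value falls below that level.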
Analysis 3: Level of functioning. Finally, to investigate whether the level of functioning was related to WM microstructure, a linear regression model was used to test the association between the GAF score and FA. As outlined, we focus on interepisodic functional levels. Therefore, only BD patients who were (partially) remitted at the time of measurement were included in this analysis (n = 75, Supplementary Table S1). For completeness, the analysis was also conducted on the full sample (Supplement 4).
Results
Analysis 1. WM microstructural differences among HC, BDfirst, and BDmultiple
There was a significant main effect of group in FA (F-contrast: p tfce-FWE = 0.001, total k = 6843 voxels in seven clusters, Figure 1A and B, Supplementary Table S2), which was further examined with pairwise comparisons. These revealed significantly lower FA values in BDmultiple compared with HC in one large bilateral cluster (d = 0.21, p tfce-FWE < 0.001, k = 45,480 voxels) as well as compared with BDfirst (d = 0.30, p tfce-FWE = 0.003, k = 23,478 voxels in seven clusters, Figure 1C), with both differences remaining significant after FDR correction (p = 0.003 and p = 0.005). In contrast, BDfirst patients did not show significantly different FA values compared with HC (p tfce-FWE = 0.688). Both effects were mainly localized in the corpus callosum, the corona radiata, and the superior longitudinal fasciculus (Supplementary Table S3). The difference between the two BD groups remained significant even when adjusting for additional clinical characteristics (Supplementary Table S2). There also emerged a significant main effect of group for RD and MD, reflected by significantly higher values for BDmultiple compared with HC and BDfirst. No effects were found for AD (Supplement 3).

Figure 1. Differences in FA between HC and BD categorized into stages based on the number of manic episodes. Note. (A) Mean fractional anisotropy (FA) across healthy controls (HC), patients with bipolar disorder (BD) who have only experienced a first manic episode (BD-first), and patients with BD who have already experienced multiple manic episodes (BD-multiple). The mean FA value was obtained from FA values of all the voxels that showed a significant main effect of diagnosis (ptfce-FWE < 0.05). Error bars represent 95% confidence intervals. p-values were obtained from pairwise post hoc t-contrasts. (B) Density estimation plots of FA values for the three groups: HC, BD-first, and BD-multiple. (C) Higher FA in BD-first compared with BD-multiple. Statistically significant clusters from the post-hoc t-contrast are displayed on the MNI152 template using MRIcroGL (version 1.2). Highlighted areas represent voxels (using FSL’s ‘fill’ command for better visualization), where significant differences between groups (p tfce-FWE < 0.05) were detected. MNI = Montreal Neurological Institute.
Analysis 2. WM microstructural differences among HC, BDrem, and BDchron
When categorizing BD patients based on their previous remission quality, the F-contrast again revealed a significant main effect of group in FA (p tfce-FWE = 0.005, total k = 1764 voxels in five clusters, Figure 2A and B, Supplementary Table S2). Pairwise post-hoc t-contrasts revealed significantly higher FA values for HC compared with BDchron (d = 0.26, p tfce-FWE < 0.001, total k = 39,297 voxels in one cluster, Figure 2D), as well as BDrem, albeit less pronounced (d = 0.33, p tfce-FWE = 0.031, total k = 1426 voxels in four clusters, Figure 2C). Both effects remained significant after FDR correction (p = 0.003 and p = 0.047) and were mainly localized in the corpus callosum and the corona radiata, whereas the comparison between HC and BDchron also included the internal and external capsule, posterior thalamic radiation, and superior longitudinal fasciculus (Supplementary Table S3). BDchron showed lower FA values compared to BDrem, although this difference did not reach statistical significance (p tfce-FWE = 0.075). There also emerged a significant main effect of group for RD, reflected by significantly higher values for BDchron compared with HC and BDrem. No effects were found for MD and AD (Supplement 3).

Figure 2. Differences in FA between HC and BD categorized into stages based on the quality of remission between episodes. Note: (A) Mean fractional anisotropy (FA) across healthy controls (HC), patients with bipolar disorder (BD) achieving stable remission between episodes (BD-rem), and patients with BD achieving partial or no remission between episodes (BD-chron). The mean FA value was obtained from FA values of all the voxels that showed a significant main effect of diagnosis (p tfce-FWE < 0.05). Error bars represent 95% confidence intervals. p-values were obtained from pairwise post hoc t-contrasts. (B) Density estimation plots of FA values for the three groups HC, BD-rem, and BD-chron. (C-D) Higher FA in HC compared with BD-rem (C) or BD-chron (D). Statistically significant clusters from the post-hoc t-contrasts are displayed on the MNI152 template using MRIcroGL (version 1.2). Highlighted areas represent voxels (using FSL’s “fill” command for better visualization), where significant differences between groups (p tfce-FWE < 0.05) were detected. MNI, Montreal Neurological Institute.
Analysis 3. Association between GAF scores and WM microstructure in remitted BD patients
As explained, we focus on interepisodic functional level, reporting results from euthymic subjects. The linear regression analysis investigating an association between the GAF score and FA in euthymic BD patients yielded a significant positive association (ptfce-FWE < 0.001, one cluster with k = 43,114 voxels, Figure 3), which remained significant when additionally controlling for clinical characteristics (Supplementary Table S2). A negative association was found for MD and RD, while no effect emerged for AD (Supplement 3). The effects were mainly localized in the corpus callosum, corona radiata, and superior longitudinal fasciculus (Supplementary Table S3).

Figure 3. Positive association between GAF scores and FA in euthymic BD patients. Note: (A) Scatterplot depicting the cross-sectional association between GAF scores and fractional anisotropy (FA) in euthymic patients with bipolar disorder (BD). Each datapoint represents one participant. Lines and shaded areas indicate the mean association between FA and GAF scores as well as the confidence intervals. The FA value was obtained from the FA values of all the voxels that showed a significant positive association (ptfce-FWE < 0.05). (B) Statistically significant clusters from the positive association effect are displayed on the MNI152 template using MRIcroGL (version 1.2). Highlighted areas represent voxels (using FSL’s “fill” command for better visualization), where a significant association between variables (p tfce-FWE < 0.05) was detected. MNI = Montreal Neurological Institute.
Discussion
This study was the first to comprehensively investigate whether disease progression in BD is reflected in WM integrity, using three approaches derived from staging models. Overall, our results support our hypothesis that early and late stages of BD differ in WM microstructure, as we found higher WM integrity in patients with fewer manic episodes and in those with higher levels of functioning outside of acute episodes. Although no clear group differences emerged when remission between episodes was used as a progression criterion, these results provide initial evidence that WM alterations may relate to illness progression.
Our finding of lower WM integrity in patients with multiple manic episodes compared to patients with a first manic episode aligns with other studies [Reference Tanrıkulu, İnanlı, Arslan, Çalışkan, Çiçek and Eren43, Reference Lavagnino, Cao, Mwangi, Wu, Sanches and Zunta-Soares70], which reported localized effects in the corpus callosum. Although this fiber tract was also found to be centrally affected in all our analyses, we observed more global and widespread WM microstructure impairments throughout the brain, involving various projection, association, and commissural pathways. As both the significance of this widespread effect and the specific role of the corpus callosum have been addressed in our previous research on BD [Reference Thiel, Meinert, Winter, Lemke, Waltemate and Breuer38, Reference Thiel, Lemke, Winter, Flinkenflügel, Waltemate and Bonnekoh40], they will not be further discussed here. Although our remission-based approach showed no significant differences between early and late stages in direct comparisons, patients with partial or no remission showed widespread WM alterations compared to HC, whereas patients with stable remission differed from HC only in small local clusters. This contrast between local and global effects also indicates greater WM impairments in later stages. The positive correlation between WM microstructure and global functioning, as measured by the GAF score, in remitted BD patients highlights the clinical significance of these WM impairments. Although we are the first to investigate this specific relationship, our finding aligns with studies linking GAF and WM volume [Reference Ferro, Bonivento, Delvecchio, Bellani, Perlini and Dusi71, Reference Forcada, Papachristou, Mur, Christodoulou, Jogia and Reichenberg72].
Cognitive impairments or residual depressive symptoms – key predictors of euthymic functional impairment [Reference Bonnín, Jiménez, Solé, Torrent, Radua and Reinares73–Reference Léda-Rêgo, Bezerra-Filho and Miranda-Scippa78] – may mediate this correlation. Cognitive deficits in particular have been associated with WM microstructural impairment in BD patients [Reference Caruana, Carruthers, Berk, Rossell and Van Rheenen79]. Overall, our findings on functioning should be interpreted with caution, as interepisodic functioning served more as an indirect measure of disease progression. Furthermore, impairments in this domain can occur independently of the disease – even in HC – and we lack data on the patients’ premorbid functioning.
Regarding the differences between our two categorical approaches, it can first be assumed that the episode-based categorization rather measures the effects of repeated acute stress, whereas the second approach is more likely to capture the effects of chronic stress, caused by incomplete remission with persistent residual symptoms, which seems to be less clearly associated with WM alterations. Moreover, the first approach differentiates more in earlier stages, while the second approach focuses more on later stages. Patients were more likely to be classified in the latter group in the first approach than in the second, which is also reflected in the differences in the respective group sizes. The fact that significant differences were found within BD patients using the first approach, but not the second, may lend support to the hypothesis discussed in the literature that WM abnormalities tend to emerge earlier in the course of the disease, while later stages no longer lead to significant changes [Reference Duarte, Massuda, Goi, Vianna-Sulzbach, Colombo and Kapczinski80–Reference Moorhead, McKirdy, Sussmann, Hall, Lawrie and Johnstone83].
All our analyses underwent comprehensive robustness checks, accounting for clinical features previously shown to influence WM integrity, including pharmacological treatment [Reference Favre, Pauling, Stout, Hozer, Sarrazin and Abé39, Reference Hafeman, Chang, Garrett, Sanders and Phillips84], BD subtype [Reference Thiel, Lemke, Winter, Flinkenflügel, Waltemate and Bonnekoh40], or age of onset [Reference Favre, Pauling, Stout, Hozer, Sarrazin and Abé39, Reference Canales-Rodríguez, Verdolini, Alonso-Lana, Torres, Panicalli and Argila-Plaza85]. We therefore conclude that the alterations identified are independently associated with the variables used to assess disease progression. However, since this is a cross-sectional study, our findings can be interpreted in two ways: either as an indication of neuroprogression, that is, cumulative changes due to repeated experiences of episodes or symptoms [e.g. Reference Tanrıkulu, İnanlı, Arslan, Çalışkan, Çiçek and Eren43, Reference Huang, Chen, Hsu, Tsai and Bai47, Reference Lavagnino, Cao, Mwangi, Wu, Sanches and Zunta-Soares70], or as trait-like characteristics of clinical subtypes that already differ a priori in the studied characteristics (i.e. patients with pronounced WM alterations are also more likely to reach later stages) [Reference Passos, Mwangi, Vieta, Berk and Kapczinski2, Reference Martino, Samamé, Marengo, Igoa and Strejilevich32]. Both interpretations are supported by the broader research on cognitive deficits in BD and seem plausible, especially given the lack of longitudinal DTI studies. Importantly, the interpretation of clinical subtypes does not rule out the presence of neuroprogression in certain subgroups, but rather reflects the heterogeneity of BD [Reference Alda and Kapczinski16]. 
This heterogeneity results not only from different forms of progression but also from factors such as BD subtypes, predominant polarity, age of onset, response to treatment, psychotic features, suicide attempts, or rapid cycling [Reference Alda and Kapczinski16, Reference Hozer and Houenou86, Reference Chen, Tu, Huang, Bai, Su and Chen87]. Neurobiological differences between such subtypes support this approach of clinical subtypes [Reference Hozer and Houenou86–Reference Sweet, Gao, Chen, Tatsuoka, Calabrese and Sajatovic91], such as our prior finding of more severe WM impairments in BD subtype I versus II [Reference Thiel, Lemke, Winter, Flinkenflügel, Waltemate and Bonnekoh40]. Moreover, our categorization by manic episodes did not consider the possible contribution of depressive episodes, which raises the factor of predominant manic polarity as an explanation for the observed differences as well as the discrepancies between approaches. This is illustrated by the fact that some patients with only one manic episode were classified as BDchron (Table 1), likely due to a disease course dominated by depressive episodes. There are some studies suggesting that a predominant manic polarity may be associated with more severe progressive impairment compared to a predominant depressive polarity [Reference Belizario, Gigante, de Almeida Rocca and Lafer92, Reference Abé, Ekman, Sellgren, Petrovic, Ingvar and Landén93], possibly due to the frequent occurrence of psychotic symptoms during mania [Reference Aminoff, Onyeka, Ødegaard, Simonsen, Lagerberg and Andreassen94] and the use of certain medications such as antipsychotics or anticonvulsants [Reference Favre, Pauling, Stout, Hozer, Sarrazin and Abé39, Reference Canales-Rodríguez, Verdolini, Alonso-Lana, Torres, Panicalli and Argila-Plaza85, Reference Sehmbi, Rowley, Minuzzi, Kapczinski, Kwiecien and Bock95].
The findings of this study should be interpreted with certain limitations in mind. One crucial limitation of our study is its cross-sectional design, which does not allow causal inferences. The question of neuroprogression is inherently a longitudinal one, which cannot be fully answered by cross-sectional studies. Future longitudinal studies in BD are therefore urgently needed. In addition, one limitation lies in the use of simplified staging approaches. Instead of employing a detailed staging model, we relied on broad dichotomous classifications, which miss subtle differences in disease progression as each category encompasses a wide range of clinical severity. While this approach aligns with the broader recommendations of the ISBD [Reference Kapczinski, Magalhães, Balanzá-Martinez, Dias, Frangou and Gama50], a more nuanced method combining multiple variables would have been more sound. In the future, detailed clinically validated staging models may provide better insight into the associated neurobiological changes [Reference Berk, Post, Ratheesh, Gliddon, Singh and Vieta5]. Furthermore, even though we made several robustness checks, not all possible influences on the results could be excluded, as our sample was very heterogeneous. For example, previous disease history, such as depressive episodes prior to diagnosis, specific comorbidities, or previous psychopharmacological treatments, may have undetected influences. Finally, the use of more advanced tractography techniques could provide a more detailed and hypothesis-driven investigation across specific WM pathways, thus usefully complementing the whole-brain voxel-wise analysis that we performed via TBSS [Reference Preti, Baglio, Laganà, Griffanti, Nemni and Clerici96].
In conclusion, our study provides important insights into the relationship between WM microstructure and the clinical course of BD. Patients in advanced stages showed lower WM integrity compared to HC and, partially, to patients in earlier stages, and lower WM integrity was associated with poorer functioning. Owing to the cross-sectional design, it remains open whether these results are truly indicative of a progressive course. Nevertheless, our findings highlight the clinical relevance of WM alterations. They not only advance our understanding of the biological mechanisms underlying disease progression but may also inform future clinical practice. Incorporating patterns of WM alterations associated with early versus late stages into clinical assessments could enable more accurate evaluation of disease progression and earlier identification of patients at high risk for rapid progression and functional impairment, and could support the implementation of personalized, stage-specific treatment strategies. Although further research is needed before these findings can be directly applied in clinical practice, integrating WM alterations into refined and validated staging models could increase their diagnostic and prognostic utility while keeping in mind the complexity and heterogeneity of BD.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1192/j.eurpsy.2025.10105.
Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
We are deeply indebted to all the study participants, the recruitment sites, and their staff. Detailed acknowledgements of the FOR2107 consortium can be found at www.for2107.de/acknowledgements.
Financial support
This work is part of the German multicenter consortium “Neurobiology of Affective Disorders. A translational perspective on brain structure and function,” funded by the German Research Foundation (Deutsche Forschungsgemeinschaft DFG; Forschungsgruppe/Research Unit FOR2107): Tilo Kircher (TK, speaker FOR2107, grant numbers KI588/14-1, KI588/14-2, KI588/15-1, KI588/17-1), Udo Dannlowski (co-speaker FOR2107, grant numbers DA1151/5-1, DA1151/5-2, DA1151/6-1, DA1151/9-1, DA1151/10-1, DA1151/11-1), Igor Nenadić (grant numbers NE2254/1-2, NE2254/3-1, NE2254/4-1), Tim Hahn (grant number HA7070/2-2). This work was further supported by the DFG grant SFB/TRR 393, project grant no 521379614, as well as ME62262-1 (awarded to SM), and the Interdisciplinary Center for Clinical Research (IZKF) of the medical faculty of Münster (grant Dan3/022/22 to UD), as well as the DYNAMIC center, funded by the LOEWE program of the Hessian Ministry of Science and Arts (grant number: LOEWE1/16/519/03/09.001(0009)/98). Further, this work was in part funded by the Else Kröner-Fresenius-Stiftung (grant no 2023_EKEA.153 awarded to SM) and the Innovative Medical Research (IMF) of the medical faculty of the University of Münster (grant no ME122205, ME122405 awarded to SM). PK was supported by the European Union – NextGenerationEU and the Romanian Government (contract no. 760246/28.12.2023/28.12.2023, code PNRR-III-C9-2023-I8-CF103/31.07.2023).
Competing interests
Tilo Kircher received unrestricted educational grants from Servier, Janssen, Recordati, Aristo, Otsuka, and Neuraxpharm. This funding is not associated with the current work. The corresponding author confirms that no other authors have any potential conflicts of interest.
 
 




 
              