White matter microstructure predicts foreign language learning in army interpreters

Adult foreign language acquisition is challenging, and the degree of success varies among individuals. Anatomical differences in brain structure prior to training can partly explain why some learn more than others. We followed a sample of conscript interpreters undergoing intense language training to study learning-related changes in white-matter microstructure (FA, MD, RD and AD) and associations between pre-training differences in brain structure and acquired language proficiency. No evidence for changes in white matter microstructure relative to a control group was found. Starting values of RD, AD and MD were positively related to final test scores of language proficiency, corroborating earlier findings in the field and highlighting the need for further study of how initial brain structure influences and interacts with learning outcomes.

Functional and structural neuroimaging can help explain why some learn better than others. Yang, Gates, Molenaar and Li (2015) used functional Magnetic Resonance Imaging (fMRI) to study native English speakers as they were taught a new tonal vocabulary over the course of six weeks. Their results show that successful and less successful learners differed in brain activation prior to training, with greater neural activity during tone discrimination for successful learners, and more activity during pitch discrimination for less successful learners. Over time successful learners also showed a more coherent and integrated functional network. Shepard, Wang and Wong (2012) used graph theory to look at functional network characteristics (using fMRI) in relation to learning outcomes following an auditory pitch discrimination task. They found that successful learners showed reduced local efficiency but increased global efficiency in a core network of auditory language areas.
Other studies have combined measures of functional (using fMRI) and structural (using Diffusion Tensor Imaging: DTI) brain connectivity. López-Barrosso and colleagues (2011) showed that efficient word learners rely on fast and efficient communication between temporal and frontal areas in the left hemisphere whilst more recent findings (Ripollés et al., 2017) have linked brain connectivity in temporal pathways to correct identification of words and meanings. Qi et al. (2015) compared DTI measurements of known language tracts to the outcome of four weeks of Mandarin training and saw that language learning was correlated with DTI measures in the right hemisphere, but not in the left. These findings, taken together with work by Xiang et al. (2012), show that individual differences in brain connectivity can help explain why some people learn a language more successfully than others. Structural brain change (in the form of alterations in grey and white matter microstructure) can also occur as an effect of learning a new language (see Li, Legault and Litcofsky (2014) for review). We identified local increases in grey matter volume as an effect of learning a new language (Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg & Lövdén, 2012). Intense language learning in a group of conscript interpreters at the Swedish Armed Forces Language School led to regional changes in cortical thickness as well as changes in right hippocampal volume. Importantly, these changes were related to learning outcomes, with larger increases in right hippocampal volume and the superior temporal gyrus in interpreters who became more proficient in their assigned language. White matter microstructure has also been known to change in response to language training.
Schlegel, Rudelson and Tse (2012) collected monthly DTI scans of English speakers who underwent a 9-month intensive course in Modern Standard Chinese and saw that white matter networks reorganized progressively during that time period. Hosoda and colleagues (2013) investigated a larger cohort of participants (n = 137) using both grey matter and white matter measures. They found that grey matter structure was initially related to vocabulary competence but did not predict the outcome of a 16-week second-language training program. They did, however, see changes in both white matter microstructure and regional grey matter volume, primarily in the right hemisphere, as an effect of training. Language studies can also lead to brain changes on the functional level (e.g., Paulesu, Vallar, Berlingeri, Signorini, Vitali, Burani, Perani & Fazio, 2009; Shtyrov, 2011; Shtyrov, Nikulin & Pulvermüller, 2010).
The highlighted studies point towards the importance of the language network as a determinant for later language performance. And even more importantly, they illustrate the ability for said network to change in response to demands. The present study follows a select group of multilinguals as their language networks are pushed at a high pace, allowing us to observe how an experienced brain responds to training. Studying this group can lend valuable insight into how learning affects other advanced groups that learn multiple languages over the course of their careers, such as interpreters or diplomats. Our earlier findings (Mårtensson et al., 2012) were limited to grey matter structure. In light of the functional and structural findings presented above it is relevant to investigate whether a) white matter and grey matter microstructure will act as a predictor of learning outcomes and b) whether white matter microstructure will change in response to language training in a group of multilingual individuals who were pre-selected for their language ability prior to training.
Grey matter and white matter measurements from the same population as in Mårtensson et al. (2012) were used for this study. Individuals with strong language ability and knowledge in at least 2 foreign languages enrolled at the Swedish Armed Forces Language School and studied at a pace of 350-500 new words per week. Increases in right hippocampal volume and changes in cortical thickness of the inferior frontal gyrus (IFG), superior temporal gyrus (STG) and the middle frontal gyrus (MFG) were observed following 3 months of language learning. These areas hold key roles in sensorimotor aspects of language (Demonet, 2005; Hickok & Poeppel, 2007; Price, 2010). More specifically, the IFG is believed to be involved in the articulatory network (predominantly left hemisphere) whilst the STG is concerned with spectrotemporal analysis (bilaterally). Both belong to the same frontal network of areas regulating speech processing (Hickok & Poeppel, 2007). The hippocampus is also believed to be involved in rapid learning of new words (Davis & Gaskell, 2009).
There are known connections (see Friederici (2009), for review) between the IFG and the STG, such as the Arcuate Fasciculus and the Superior Longitudinal Fasciculus (SLF). The arcuate fasciculus forms parts of the SLF and has traditionally been claimed to connect Broca's and Wernicke's areas (Catani & Thiebaut de Schotten, 2008), a statement that has been challenged as advancements in neuroimaging have taken place (Bernal & Altman, 2010). Another pathway that is believed to be related to language is the uncinate fasciculus. It connects the anterior temporal lobe with the medial and lateral orbitofrontal cortex (Catani & Thiebaut de Schotten, 2008).
Grey matter consists primarily of neuronal cell bodies, dendrites, and axon collaterals along with glia (mainly astrocytes), synapses, and capillaries whilst white matter is dominated by myelinated axons (Kassem, Lagopoulos, Stait-Gardner, Price, Chohan, Arnold, Hatton & Bennett, 2012; Zatorre, Fields & Johansen-Berg, 2012). White matter microstructure is often measured using diffusion tensor imaging (DTI). DTI is sensitive to hindrance of water diffusion that results from tissue boundaries and is often quantified using mean diffusivity (MD), fractional anisotropy (FA), radial diffusivity (RD) and axial diffusivity (AD). MD quantifies free diffusion of water within a voxel (Beaulieu, 2002) whilst FA measures the directionality of diffusion. FA and MD are calculated from AD and RD, which measure diffusion parallel to and perpendicular to axonal fibers, respectively (Sen & Basser, 2005).
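The relationship between the four scalar measures can be illustrated with a short sketch. It is not taken from the study: the function name and the example eigenvalues are assumptions, and the formulas are the standard textbook definitions in terms of the three eigenvalues of the diffusion tensor, sorted so that the largest corresponds to the principal diffusion direction.

```python
import math


def dti_scalars(eigenvalues):
    """Return AD, RD, MD and FA for one voxel from its three tensor eigenvalues."""
    l1, l2, l3 = sorted(eigenvalues, reverse=True)
    ad = l1                      # axial diffusivity: along the principal axis
    rd = (l2 + l3) / 2.0         # radial diffusivity: mean of the two minor axes
    md = (ad + 2.0 * rd) / 3.0   # mean diffusivity: average of all three eigenvalues
    numerator = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    # FA is 0 for perfectly isotropic diffusion and approaches 1 for a single direction.
    fa = math.sqrt(1.5 * numerator / denominator) if denominator > 0 else 0.0
    return {"AD": ad, "RD": rd, "MD": md, "FA": fa}


# Hypothetical eigenvalues (units of 10^-3 mm^2/s) for an anisotropic voxel.
example = dti_scalars([1.7, 0.3, 0.2])
```

One property worth noting in this sketch is that FA is invariant to scaling: if AD and RD rise in the same proportion, MD rises while FA stays the same.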
Considering earlier studies from the field, and the findings by Mårtensson et al. (2012) specifically, we hypothesize that grey and white matter will be predictive of later learning outcomes as measured by language proficiency and that white matter microstructure will change in response to training. Left and right hippocampal volumes are expected to be predictive of later language proficiency, as are areas in the language network (Friederici, 2011;Friederici & Gierhan, 2013). Brain structure will be measured globally using Tract-Based Spatial Statistics (TBSS: white matter) and using FreeSurfer's cortical stream (grey matter). Changes are expected to be larger in, but not restricted to, the left hemisphere and with projections towards the right temporal lobe in light of the findings of Mårtensson et al. (2012).

Participants
Fourteen (6 female) right-handed and MRI-eligible volunteers were recruited from the Swedish Armed Forces Intelligence and Security Centre. Conscripts at the center were selected from all Swedish 18-year-old men and women who had willfully decided to undergo military training. Screening before entry was based on school achievements, study skills, emotional stability, and intelligence. Students are required to have top grades in at least 2 foreign languages (making all students at least tri-lingual) and are given a week to learn 350 non-words of Finnish origin. The purpose of this vocabulary test is to select recruits that can manage the very high pace of the academy. Out of roughly a hundred select applicants the academy chooses between 20-30 students in an average year (see Dahlquist, 2004 for reference). Eight of the interpreters studied Dari, four studied Arabic, and two studied Russian. No interpreter had prior knowledge in his or her assigned language.
Controls (n = 17, 10 female) were students of cognitive science or medicine at Umeå University. The control group was recruited to be comparable to interpreters in age, intelligence, years of education, and emotional stability. Controls were measured before and after their semester, which matched closely in time to the measurements of interpreters. The time between measurements was 3 months. The groups were equivalent in age, years of education, intelligence and emotional stability. See Table 1.
Johan Mårtensson et al.

Behavioral measures
Raven's advanced progressive matrices
The 18 odd-numbered items from set II of this test (Raven, 2000) were administered to both groups at pretest. Participants had 10 minutes to complete the task, and the dependent variable was the number of correctly selected patterns.

Anxiety ratings
A Swedish translation of the STAI Y-2 (Spielberger, Gorsuch & Lushene, 1970) was modified and presented to participants who rated anxiety (Filaire, Sagnol, Ferrand, Maso & Lac, 2001) they had experienced in the past month. The questionnaire was administered at pretest.

Proficiency
Our proficiency measure consisted of grades from the mid-year exam at the interpreter academy. This test was performed a few weeks after posttest. This exam is especially important to the interpreters because those who fail are forced to leave the academy (none of our participants had to leave, indicating that they studied hard). The exam itself consists of one written and one oral test. The written language test includes translating full sentences and texts and the oral test includes non-simultaneous interpreting. Both tests have been developed to measure the ability of actual language use in the demanding circumstances a military interpreter might find herself in, and as such they measure a broad spectrum of language abilities. For our proficiency measure we used the means of the oral and written exam, with a scale between 1 (least proficient) and 10 (most proficient). During the stay at the academy the interpreters underwent similar tests, with oral and written exams interleaved (one per week). The teachers had no insight into the findings from the study when grading the students. The tests were developed at the academy (which has been active since 1957) by language teachers, most of whom work part time as lecturers at Swedish universities.

Struggle
To investigate whether training-induced white matter changes were connected to increased knowledge and/or a large amount of effort, we asked the head teacher at the academy to subjectively rate the amount of effort needed to stay at the academy at post-test. The question (translated from Swedish) was: "Judge how large effort was needed for each participant to achieve the goals of the interpreter academy and to be allowed to stay in the program" and was rated on a Likert scale of 1 (little effort) to 9 (large effort). No participant scored below 6 on this scale, which led to a restriction in range.

Known languages
All participants knew at least two languages (Swedish and English) with additional languages known by both interpreters and controls. In effect, all participants were at least trilingual, which is of relevance since learning and using a second language can affect white matter microstructure (Pliatsikas, Moschopoulou & Saddy, 2015). Statistically the groups did not differ in terms of number of languages known (see Table 1). However, it is difficult to rule out the possibility that there actually might have been differences between the groups, since the participants were simply asked to note down any additional languages they knew (aside from Swedish and English) and not their ability in each additional language.

MR Acquisition
Images were acquired at Umeå center for Functional Brain Imaging (UFBI) on a GE Discovery MR 750, 3 Tesla scanner with a 32-channel phased-array head coil. For Diffusion Weighted Images at pre- and post-test a Spin Echo refocused EPI sequence was used. Images had a slice thickness of 2 mm. Sequence parameters were: TR = 8000 ms, 64 slices with no gap, acquisition matrix 128 × 128 interpolated to 256 × 256 matrix with a FOV of 250 mm, TE = 84.4 ms, 4 repetitions of 24 independent directions, b = 1000 s/mm² and 4 b = 0 images, Dual Spin Echo switched on and ASSET acceleration factor 2.

DTI Preprocessing
Diffusion-weighted images were analyzed using the FSL software package (http://www.fmrib.ox.ac.uk/fsl). The four subject-specific images were averaged to improve signal-to-noise ratio and the resulting output image was corrected for possible head movement. This was done using FLIRT from FSL (Jenkinson & Smith, 2001; Jenkinson, Bannister, Brady & Smith, 2002) with the mean of the b = 0 images from each run being used as the reference using 6 degrees of freedom followed by interpolation using nearest neighbor. Images from all participants were inspected for motion and no participant was excluded. A mean image was calculated from the b = 0 image of each run and used as a brain mask. The resulting data was then processed via dtifit and voxelwise statistical analysis was carried out using TBSS v1.2 (Smith, Jenkinson, Johansen-Berg, Rueckert, Nichols, Mackay, Watkins, Ciccarelli, Cader, Matthews & Behrens, 2006; Smith, Jenkinson, Woolrich, Beckmann, Behrens, Johansen-Berg, Bannister, De Luca, Drobnjak, Flitney & Niazy, 2004): all FA images were aligned into 1 × 1 × 1 mm standard space with the FMRIB58_FA image as target; the most typical subject (the participant that corresponds the most to the rest of the sample) was then used for group-wide alignment into standard space; the resulting image was fed into a tract skeleton generation program and a threshold set at 0.2 to exclude grey matter voxels or cerebrospinal fluid. The MD, RD and AD images made use of the non-linear transformation and projection vectors extracted during preprocessing of the FA images but were otherwise treated in the same way as above.

Difference images
For each participant, a difference image was calculated from the skeletonized images (by subtracting the pretest image from the posttest image). This was done for FA, MD, RD and AD values respectively. Interpreters were then compared against controls on these difference images by means of voxelwise permutation-based inference (5000 permutations). Significant differences in this test thus reflect differences between the groups in the amount of changes between pretest and posttest (i.e., a group by time interaction). Threshold-Free Cluster Enhancement (TFCE) was used and only TFCE p-value images, fully corrected for multiple comparisons across space and at p < 0.05 were considered.
Bilingualism: Language and Cognition
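The logic of testing a group by time interaction on difference scores can be sketched as follows. This is a deliberately simplified illustration, assuming a single summary value per participant rather than a voxelwise skeleton, and it does not reproduce TFCE or the correction across space. The function names and any input values are hypothetical.

```python
import random


def change_scores(pre, post):
    """Posttest minus pretest, one value per participant."""
    return [after - before for before, after in zip(pre, post)]


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(_mean(group_a) - _mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        permuted = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        if permuted >= observed:
            at_least_as_extreme += 1
    # Count the observed labelling itself so the p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

Passing the change scores of interpreters and controls as the two groups tests whether the amount of change differs between them, which is what a group by time interaction means in this design.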

Pretest differences
Due to the select nature of the interpreters as compared to the control population we compared FA, MD, RD and AD between groups at pretest using the criteria above.

Behavioral correlations to change
To measure whether difference images correlated with either Struggle or Proficiency in the interpreters a third analysis was carried out, using the difference images but with Struggle and Proficiency added as covariates (in two separate runs).

Predictive value of white matter microstructure
A fourth analysis was performed to evaluate whether FA, MD, RD or AD values at pretest were related to language proficiency at posttest. Within the group of interpreters, demeaned proficiency values were added as a covariate to the same type of analysis and then tested against a null distribution generated from 5000 iterations. The mean values from the resulting skeletonized MD, RD and AD values were then exported (using fslstats with the -M option) into SPSS version 26 (IBM Corp, 2019) for outlier analysis (using the SPSS boxplot feature, which highlights outliers lying more than 1.5 times the interquartile range beyond the quartiles), and comparison to prior grey-matter findings from Mårtensson et al. (2012). A second run was performed to account for measures of intelligence.
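Two of the steps above, demeaning a covariate and flagging outliers with the 1.5 × interquartile range rule, can be sketched in a few lines. This is an illustrative approximation only: the function names are hypothetical, and the quartile definition used here (Python's inclusive method) may differ slightly from the one used by the SPSS boxplot feature.

```python
import statistics


def demean(values):
    """Center a covariate on zero by subtracting its mean."""
    center = statistics.fmean(values)
    return [v - center for v in values]


def iqr_outliers(values, factor=1.5):
    """Return values lying more than `factor` * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [v for v in values if v < low or v > high]
```

Demeaning leaves the differences between participants unchanged and only shifts the covariate so that its mean is zero.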

Pre-test analysis of FreeSurfer data from Mårtensson et al. (2012)
For completeness, and in light of earlier findings that found grey matter to be predictive in a similar population (Hosoda et al., 2013), the published cortical and hippocampal data from the present cohort were revisited to investigate whether cortical thickness or hippocampal volume was predictive of later language proficiency. All scans were manually inspected for artifacts and no scan was rejected. The volumes were then analyzed as follows: a vertex-wise general linear model analysis was performed with Proficiency as the dependent variable and cortical thickness as the independent variable. For subcortical grey matter structure, a linear regression was performed using Proficiency as the dependent variable and right and left hippocampal volume as independent variables. See Mårtensson et al. (2012) for further details, and the results below.

Pretest differences in white matter microstructure (RD) between interpreters and controls but no change over time
No statistically significant pretest differences were found between controls and interpreters for FA, MD or AD. However, interpreters had relatively lower RD values bilaterally in several areas involved in the language network (Friederici, 2011; Friederici & Gierhan, 2013): forceps minor, inferior fronto-occipital fasciculus, superior longitudinal fasciculus, uncinate fasciculus, anterior thalamic radiation, inferior longitudinal fasciculus with a larger area visible in the left hemisphere, the corticospinal tract as well as the cingulum (see Figure 1A). Controls did not have lower RD values compared to interpreters in any area. Calculated difference images (post − pre) showed no selective changes over time for FA, MD, RD or AD for interpreters relative to controls. Hence, no evidence for changes in white matter microstructure over time was found.
An additional analysis compared RD at posttest between both groups to see whether the initial difference between the populations was still present. There was no difference between the groups for RD at posttest (at p < 0.05) but a tendency (p > 0.1) for interpreters to have lower RD in the same networks at posttest as at pretest, again indicating that no real change between the groups occurred as an effect of training.
Within the group of interpreters neither language proficiency (Proficiency) nor a subjective measure of the effort needed to stay at the academy (Struggle) correlated with changes in white matter microstructure.

Grey matter did not predict later language proficiency
No predictive value of cortical thickness for later proficiency was found at p < .001, using a cluster-extent threshold of 100 vertices. In addition, no predictive value was found for left and right hippocampal volume in relation to final proficiency.

White matter microstructure predicts later language proficiency
Within the group of interpreters, higher language proficiency was related to higher values of MD, RD and AD (p < .05, fully corrected for multiple comparisons) at pretest. Notably, baseline MD was higher for more proficient interpreters in areas that partly overlap with the pretest differences between interpreters and controls shown in Figure 1: forceps minor, anterior thalamic radiation, inferior fronto-occipital fasciculus, corticospinal tract, hippocampal part of the cingulum and uncinate fasciculus bilaterally. Additionally, higher MD values were found in tracts stretching along the superior longitudinal fasciculus bilaterally as well as the inferior longitudinal fasciculus in the left hemisphere (see Figure 1B-D).
RD and AD showed similar trends where higher values were positively correlated with higher Proficiency. Effects for RD and AD were limited to the forceps minor bilaterally (RD) as well as parts of the inferior fronto-occipital fasciculus (RD, AD), anterior thalamic radiation (AD) and uncinate fasciculus in the left hemisphere (RD) or bilaterally (AD). No effect was found for FA. This is indicative of crossing fibers, which have been known to cause RD and AD to increase or decrease simultaneously in the same areas (Vos, Jones, Jeurissen, Viergever & Leemans, 2012). The effects in RD and AD were strongly correlated (r(12) = .787, p = .001).
The areas where MD, RD or AD correlated with language proficiency were compared to changes in grey matter from Mårtensson et al. (2012). No effects were found when controlling for multiple comparisons using Bonferroni correction.
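As a reminder of what the Bonferroni correction does, the sketch below divides the significance level by the number of tests and flags the p-values that remain below the adjusted threshold. The function name and any input values are hypothetical. The number of comparisons made in the study is not stated here, so none is assumed.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that survives Bonferroni correction."""
    # The per-test threshold shrinks with the number of tests performed.
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```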
To rule out that the correlations between DTI parameters and proficiency were driven by age differences within the group of interpreters, a bivariate correlation was performed between chronological age and the mean MD, RD and AD values from the respective skeletonized images. The significant areas from each analysis (MD, RD and AD) were then used as masks on the individual skeletonized images and the mean of all the voxels within the mask exported to SPSS for analysis. Neither MD (r = .125, p = .671), RD (r = .237, p = .414) nor AD (r = .185, p = .526) correlated with age.
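The two computational steps in this control analysis, averaging the voxels inside a mask and correlating the resulting per-participant means with age, can be sketched as below. The sketch assumes flattened lists of voxel values and a binary mask of the same length. The function names are hypothetical, and the p-values reported above (which rest on a t distribution with n − 2 degrees of freedom) are not computed here.

```python
import math


def masked_mean(voxel_values, mask):
    """Mean of the voxel values at positions where the mask is non-zero."""
    selected = [v for v, m in zip(voxel_values, mask) if m]
    return sum(selected) / len(selected)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

With 14 interpreters, a correlation of this kind has 12 degrees of freedom, matching the r(12) notation used for the RD and AD effects.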

The effects were not related to general intelligence
Proficiency was used once more as a covariate whilst regressing out individual scores from Raven's progressive matrices. The resulting skeletonized image was visually indistinguishable from that of the original analysis.

Discussion
We found that interpreters and controls differed in white matter microstructure at baseline, and that roughly the same networks were predictive of later learning proficiency in the interpreters. We did not find support for our hypothesis that white matter microstructure would change over time as an effect of training, possibly because of the prior language expertise in the interpreter population as compared to other study populations (Schlegel et al., 2012).
Of the areas differing between controls and interpreters at baseline, only the superior longitudinal fasciculus (Friederici, 2009) is strongly related to language, but it overlaps with the Arcuate fasciculus that has been tied to language performance (López-Barrosso et al., 2011, but not in Ripollés et al., 2017) and has been seen to change in response to language learning (Hosoda et al., 2013). However, the uncinate fasciculus is believed to be related to language (Catani & Mesulam, 2008), was recently implicated as relevant for language learning (Ripollés et al., 2017) and belongs to the limbic system, which includes the hippocampus (Catani & Thiebaut de Schotten, 2008). The right hippocampus increased in volume in the same population as described here (Mårtensson et al., 2012). The inferior longitudinal fasciculus connects the limbic system to visual areas (Fox, Iaria & Barton, 2008) and has been observed to change in response to (Hosoda et al., 2013) and be relevant for (Ripollés et al., 2017; Qi et al., 2015) language learning in the right hemisphere. The inferior fronto-occipital fasciculus has been implicated in semantic processing (Duffau, Gatignol, Mandonnet, Peruzzi, Tzourio-Mazoyer & Capelle, 2005; Duffau, Gatignol, Moritz-Gasser & Mandonnet, 2009; Mandonnet, Nouet, Gatignol, Capelle & Duffau, 2007; Ripollés et al., 2017), which is likely of relevance for interpreters with high demands on vocabulary acquisition.
Lower RD values have been connected to higher levels of myelin (Song, Sun, Ju, Lin, Cross & Neufeld, 2003; Song, Sun, Ramsbottom, Chang, Russell & Cross, 2002; Song, Yoshino, Le, Lin, Sun, Cross & Armstrong, 2005). Thus, the present findings, where controls had relatively higher values as compared to interpreters, could be taken as a cautious indication of higher degree of myelination in the group of interpreters in white-matter pathways connecting known language areas as well as areas that have been related to language learning outcomes (Schlegel et al., 2012; Mårtensson et al., 2012; Hosoda et al., 2013; López-Barrosso et al., 2011; Qi et al., 2015; Ripollés et al., 2017). It should be noted, however, that these areas are far from language specific, and are used in a wide array of higher cognitive functions. White matter microstructure (MD, RD and AD) before training was related to later language performance, which is in line with findings pointing towards the importance of brain structure at baseline in relation to learning outcomes in general (Zatorre, 2013) and for language learning specifically (Schlegel et al., 2012; Hosoda et al., 2013; López-Barrosso et al., 2011; Qi et al., 2015; Ripollés et al., 2017). Intuitively, we would expect higher FA values to be correlated positively with performance, such as observed when Qi and colleagues (2015) measured local white matter microstructure in twenty-one English native speakers who were taught Mandarin for four weeks. Instead, we saw a relation between MD and language proficiency in the interpreters, with relatively higher MD values foreshadowing higher performance. The areas overlap with regions where interpreters differed from controls in RD, but are more widespread.
Fig. 2. (A, top; B, left) Mean diffusivity in a range of areas related to language and cognitive function was positively correlated with achieved language proficiency within the interpreter group. Similar effects, but to a much lesser extent, were found for axial (C, D) and radial (E, F) diffusivity.
Increased MD has been observed in reduced membrane density (Sen & Basser, 2005), which should be compared to FA that is known to increase with maturation (Zatorre et al., 2012). As such our findings appear counterintuitive when compared to the supposed greater myelination of interpreters compared to controls in the same areas. Earlier training studies have shown both decrease (Taubert, Draganski, Anwander, Müller, Horstmann, Villringer & Ragert, 2010) and increase (Scholz, Klein, Behrens & Johansen-Berg, 2009) in FA values as an effect of training (Schlegel et al., 2012;Hosoda et al., 2013). Halwani, Loui, Rüber and Schlaug (2011) compared singers to instrumental musicians. Singers showed lower FA values in the arcuate fasciculus. Within the group of singers, however, the inverse was true; singers with more experience showed lower FA values. The authors of the study conclude that with more experience it is likely that microstructural complexity increases and FA values decrease.
Relatively larger values of MD, however, point towards less microstructural complexity and in the direction of large dominating fiber volumes. Since all our MD values are contained within skeletonized tracts where smaller fibers have been averaged out across participants, large MD values in themselves are not unexpected. Relatively smaller values on the other hand could point towards larger amounts of crossing fibers in less proficient interpreters. Crossing fibers have also been known to cause RD and AD to increase or decrease simultaneously in the same areas (Vos et al., 2012), which we observe in this study. This might also help explain the counterintuitive finding that RD was lower in interpreters as compared to controls, but higher in interpreters who became more proficient in the end. Perhaps there was more room for change in those interpreters, perhaps there was higher microstructural complexity. Diffusion imaging measures the effects of myelin, cell membranes, and other small structures on diffusion within relatively large voxels (2 mm) resulting in the issue of measuring microscopic anatomical factors at a macroscopic level of detail (Mori & Zhang, 2006). With the level of detail that we present in this study we cannot delve further into the neurophysiological underpinnings of our DTI findings.
It should be noted that Mårtensson et al. (2012) found selective increases in grey matter volume in the left STG, IFG, MFG and the right hippocampus for interpreters. All of these areas are adjacent to the effects observed in mean diffusivity. This makes sense, since the arcuate fasciculus is believed to connect the STG (and MTG) with the IFG and MFG, along with the premotor cortex (Catani, Jones & Ffytche, 2005; Catani, Allin, Husain, Pugliese, Mesulam, Murray & Jones, 2007). The uncinate fasciculus, in turn, is believed to connect limbic areas such as the hippocampus to frontal areas of the cortex (Catani & Thiebaut de Schotten, 2008) that correspond reasonably well with the areas observed in the current findings along with those of Schlegel and colleagues (2012). A notable difference to the grey matter findings is that the initial white matter differences between interpreters and controls are mostly bilateral whilst the grey matter changes were mainly left hemispheric (Mårtensson et al., 2012).
In contrast to some previous reports (Schlegel et al., 2012;Hosoda et al., 2013), no white matter changes were found following intensive language training. This may be due to the extensive prior experience with languages within the group of interpreters, when compared to the control group. Compared to the participants in Schlegel et al. (2012), the interpreters presumably had more extensive experience with foreign language learning. Inconsistencies could also be due to small sample size, with lower sample sizes in the range reported here and earlier (Mårtensson et al., 2012) demonstrating excessive variability when compared to larger samples (Munson & Hernandez, 2019). Hosoda et al. (2013) used a considerably larger number of participants (n = 137) whilst Schlegel et al. (2012) measured each participant (n = 27) nine times over the course of nine months.
The differences between the methods used are also worth mentioning. TBSS measures changes to large tracts common to all participants, whilst Schlegel et al. (2012) measured tracts in predefined regions of interest by means of fiber-tracking as well as brain-wide analysis of white matter voxels. Training characteristics may have further contributed to the differences in findings, as the participants from Hosoda et al. (2013) knew at least 2 languages prior to training. Participants exhibited training-related changes mainly in the right hemisphere as opposed to the predominantly left hemispheric changes from Mårtensson et al. (2012), which can perhaps be taken as an indication that different demands have been placed on the two populations.
We found that white matter microstructure in language-related areas predicts later language proficiency, consistent (to some extent) with earlier findings (Schlegel et al., 2012; Hosoda et al., 2013; López-Barrosso et al., 2011; Qi et al., 2015; Ripollés et al., 2017). Our results also fall in line with earlier cross-sectional findings showing that faster language learners have greater density of white matter in areas related to auditory processing (Golestani, Molko, Dehaene, LeBihan & Pallier, 2007) and that there are structural differences between early and late literates (Carreiras, Seghier, Baquero, Estévez, Lozano, Devlin & Price, 2009) as well as between simultaneous interpreters and controls (Elmer, Hänggi, Meyer & Jäncke, 2011).
These results are limited to a select and small group of individuals with knowledge of several foreign languages prior to training. However, they contribute to the increasing number of studies that have shown that recording white matter microstructure in the language network can be a valuable tool in understanding future learning capacity. The findings also highlight the need for further study of how initial brain structure influences and interacts with learning outcomes. Starting values need to be taken into consideration when looking at why some individuals have an easier time learning a language than others.
This work was supported by the Sofja Kovalevskaja Award (to Martin Lövdén) from the Alexander von Humboldt foundation donated by the German Federal Ministry for Education and Research, the Swedish Research Council (421-2005-2018, 421-2010-1250), the Linnaeus environment Thinking in Time: Cognition, Communication and Learning, financed by the Swedish Research Council (349-2007-869), and grants from the Swedish Research Council and the Umeå School of Education to LN.