The attrition of school-learned foreign languages: A multilingual perspective

Abstract
In the vast body of research on language learning, there is still surprisingly little work on the attrition or retention of second/foreign languages, particularly in multilinguals, once learning and/or use of these languages ceases. The present study focuses on foreign language attrition and examines lexical diversity and (dis)fluency in the oral productions of 114 multilingual young adults, first language German speakers who learned English as their first (FL1) and French or Italian as their second foreign language (FL2), shortly before and approximately 16 months after graduation from upper secondary school. The level of foreign language use after graduation was found to have a noticeable impact on the measured change in output quality in the FL2, but only a minor one in the FL1, where participants’ initial proficiency was considerably higher. The amount of use in the FL1 had no visible connection with attrition/maintenance in a rarely used FL2. Those participants who felt their speaking skills in one of their foreign languages had improved were correct in their self-assessment, but the degree to which the remaining subjects felt their speaking skills had deteriorated was not reflected in their productions.

For many people around the world, learning foreign languages has become a normal and important part of their lives. In Austria, where a large majority of the population speaks German either exclusively (approximately 89%; 95% among citizens) or in combination with another language in their daily lives (Statistik Austria, 2007a, 2007b), children begin learning their first foreign language (FL1) in primary school and may add a second (FL2) and even third foreign language (FL3) by graduation from upper secondary school. It remains an open question, however, how much of this school-learned foreign language knowledge, proficiency, and skill is retained later in life. In general, the phenomenon of foreign language attrition is still underresearched (Herdina & Jessner, 2013; Mehotcheva, 2010), and research on the attrition of more than one (foreign) language within the individual is virtually nonexistent to date.
To contribute to filling this research gap, the present paper takes a multilingual approach based on the dynamic model of multilingualism (DMM; Herdina & Jessner, 2002) and uses data from the Linguistic Awareness in Language Attriters (LAILA) project to study the development of oral production skills in the first and second school-learned foreign languages (FL1 English and FL2 French or Italian) of multilingual young adults after formal learning had ended.

Defining language attrition
Until the 1980s, research into language attrition included societal language shift, loss and death, as well as pathological language loss (Lambert & Freed, 1982), but in more recent decades work in this field has focused exclusively on "nonpathological decrease in proficiency in a language that had previously been acquired by an individual" (Köpke & Schmid, 2004, p. 5; see also de Bot & Weltens, 1995) or, more precisely, on "the decline of any language (L1 or L2), skill, or portion thereof in a healthy individual speaker" (Ecke, 2004, p. 322). The latter definition rightly emphasizes that the term attrition does not necessarily denote a decrease in global language proficiency; instead, attrition may only affect certain language skills, and even those only partly. In any case, these processes all lead to a "reduction or simplification of language systems and/or the impairment of access to them, [which] is assumed to be a normal, often inevitable aspect of language development in the lifespan of a bi- or multilingual speaker" (Ecke & Hall, 2013, p. 735).
As is evident in this last definition, more recent research, in particular from the psycholinguistic branch of attrition research, also stresses that language knowledge is not necessarily fully lost from memory, nor is it irretrievably covered up and obliterated by more newly acquired knowledge. Instead, knowledge that is rarely or not used becomes less interconnected and therefore more difficult to access for the language user (for detailed information on the savings paradigm, see, e.g., de Bot, Martens, & Stoessel, 2004; Nelson, 1978; for the activation threshold hypothesis, see Paradis, 1993, 2004, 2007, and below; for more on theories of forgetting and language attrition, see Ecke, 2004).
Like the theories and paradigms referred to above, many of the theories, hypotheses, and approaches within attrition research are borrowed from related fields and disciplines, and a range of biological and cognitive as well as linguistic and extralinguistic factors have been studied and found to play a role in language attrition (for detailed overviews, see, e.g., Bardovi-Harlig & Stringer, 2010; Köpke, 2007; Schmid & Köpke, 2019). It should be noted, however, that to date most researchers study language attrition in bilinguals and focus upon a maximum of two languages (one attrition, one acquisition) or two language systems in interaction. From the present research perspective, these researchers are taking a bilingual approach and not a multilingual approach (see Jessner & Megens, 2019; Megens, 2020).

A multilingual approach to language growth and decline
Traditional models of language acquisition tend to focus on increase, but often do not account for (or simply ignore) decline. Moreover, language systems within the individual are treated as separate from each other. In the DMM (Herdina & Jessner, 2002) a person's multilingual language system consists of nested subsystems (e.g., the different languages spoken by an individual), which themselves consist of further subsystems (e.g., syntax, morphology, etc.). All of these are in constant interaction with each other and with their environment in an ongoing process of adjustment and reorganization. Language acquisition involves "non-linear and reversible processes: that is, development refers to both acquisition and attrition" (Jessner, Megens, & Graus, 2016, p. 196), and attrition (or negative growth) is thus an integral, normal and expected part of (multilingual) language development (see Jessner, 2003, 2008). Because of these constant processes of adaptation and change, multilingual systems show a sensitive dependence on initial conditions (see Aronin & Jessner, 2015), such as the proficiency in a given language before the onset of non-use.
The main driving factor behind these continuous changes is the adjustment of the individual's language systems to his or her perceived communicative needs, which in turn are influenced by both internal (e.g., psychological) and external (e.g., social, cultural, environmental, and circumstantial) factors. System stability (or instability) depends strongly on language maintenance: a fundamental assumption in the DMM is that a particular (sub)system will erode if insufficient energy and time are invested in it.
Use, or more specifically, effort, which can be conscious and intentional or less deliberate, is necessary not just in building and improving language systems but also in maintaining what has already been achieved (see also de Bot, 2004). This language maintenance effort (LME) can be seen as composed of or dependent on
• The language use factor: (re)activation and renewal of various parts of the linguistic system/subsystem(s), e.g., through actual use of the language in communication or other activities; and
• The linguistic hypothesis verification or corroboration factor: the renewal of parts of the individual's (explicit knowledge of a) linguistic subsystem by means of a verification of hypotheses concerning the language system (Herdina & Jessner, 2002, p. 99); also called the language awareness factor.
If LME is responsible for maintaining the stability of a language system, it logically follows that its absence can be considered as the core of attrition. This is particularly relevant in multilingual speakers, where multiple subsystems compete for cognitive capacities in terms of online processing or recall as well as for limited time and resources in terms of LME. A more holistic and multilingual approach considers the user's repertoire as a whole wherever possible, rather than studying an individual language system in isolation. Such an approach will better allow us to study:
• language attrition in multilinguals: multilinguals forgetting any of their three or more languages; and
• multilingual attrition: multilinguals forgetting two or more of their (foreign) languages (see Megens, 2020).

Types of attrition: L1 attrition, non-L1 attrition, foreign language attrition
Since the emergence of language attrition as an independent field of research, most studies have examined the attrition of the L1 in an L2 environment, usually the attrition of (bilingual) immigrants' first or native language after migration (for an annotated bibliography up to 2018, see Schmid, 2019a). This research into L1 attrition is distinguished from research on "L2 attrition," a term generally used to apply to all languages that are learned or acquired after (early) childhood, usually in addition to the L1. While we find this differentiation between L1 and non-L1 a useful jumping-off point, the broad application of the term "L2" or "second language" to mean any language that does not fit the category of "L1" is problematic because it may be taken to mean that, by and large, all L2s are created equal and can be treated as such. To begin with, attrition studies often give no indication whether this "L2" is one of only two languages in the individual's repertoire, or if there are three, four, or more languages at play. This means there is no systematic differentiation between purely bilingual settings and tri-/multilingual ones. More recent work tends to rectify this omission, and authors usually number individuals' languages as L2, L3, L4 ... Ln to indicate the order of acquisition, with the term "L3" increasingly serving as shorthand for any language beyond the second. Beyond chronological acquisition order, however, we often still have little or no information on or systematic classification of various types of non-L1, even though these can differ strongly in terms of age of onset, length and manner of acquisition, amount of exposure and use, dominance, and so forth, particularly in multilingual contexts. It is not our intention to draw a strict line between language acquisition and language learning (for a discussion of this, see, e.g., Herdina & Jessner, 2002, p. 34 ff.; Paradis, 2009), as we prefer to see these concepts as two ends of a continuum rather than as discrete categories. Nonetheless, we agree with Schmid and Mehotcheva (2012) in pointing out that the amount and quality of input, exposure, and use in the case of a language learned in an explicit, formal, instructed-learning setting such as a school or university classroom will generally differ substantially from languages that are (also) learned and used in a more implicit, naturalistic way, as is the case in immersion/submersion or migration contexts. Within the broader category of what we will term non-L1 attrition, which may pertain to any non-L1, and thus includes most of what is traditionally referred to as "L2 attrition," we therefore distinguish the subcategory of foreign language (FL) attrition, which focuses specifically on those languages that have been acquired/learned, usually with intentional effort, in a formal classroom/school learning setting, but which do not form a substantial part of the learner's everyday life outside this setting (for a further suggestion on differentiating between different types of non-L1, see Mehotcheva & Köpke, 2019).

Foreign language attrition: previous research
This section will outline important findings from non-L1 and FL attrition studies with a focus on aspects relevant to the present study, its design, and research questions.

Rate of attrition
In terms of a forgetting curve, the psychologist Ebbinghaus (1885) predicted that attrition would set in immediately after learning stops, be rapid at first, but then slow down. This rapid onset followed by a slowdown or even stability is confirmed in some attrition studies: Bahrick (1984a) found that attrition of Spanish foreign language skills was heaviest in the first 5 to 6 years after learning ceased, and that the remainder, which he called "permastore-content," was "immune to further losses for at least a quarter century" (p. 111); after this period, however, attrition became stronger again. Similarly, Weltens (1989) found that proficiency in school-learned FL French in Dutch multilingual students dropped off within the first 2 years, but then leveled off in the 2 years that followed.
In contrast, other scholars found an initial plateau (of 6 months to a few years) within which there was relatively little attrition and after which attrition set in (Kaufman & Aronoff, 1991; Kuhberg, 1992; Tomiyama, 1999; van Ginkel & van der Linden, 1996, as cited in Wang, 2010; Weltens & Cohen, 1989; for a discussion on these "two seemingly opposing views" on "rate of attrition," see also Mehotcheva & Köpke, 2019, pp. 336-337). Wang (2010) points out, however, that subjects in these studies had high or very high proficiency levels, and that this initial plateau may be seen as an extension of acquisition after which normal forgetting sets in (see also Weltens & Cohen, 1989).

Areas of linguistic knowledge
Looking at specific areas of language knowledge (i.e., lexicon, syntax, phonology, or morphology), the lexicon is thought to be the area in which attrition is evident soonest (e.g., de Bot & Weltens, 1995; Hagège, 2000; Paradis, 2007; Schmid, 2007). Other studies, however, found morphology to be the first linguistic category to be affected by attrition (e.g., Kuhberg, 1992). Moorcroft and Gardner (1987) suggest that this order is likely dependent on the participants' proficiency levels: low-proficiency learners have grammars that are still unstable and therefore more vulnerable, while high-proficiency learners have relatively stable grammars and a larger vocabulary at their disposal, which means their lexical knowledge is more vulnerable (p. 337).

Productive versus receptive skills
Bahrick's investigation of over 500 individuals in the United States whose instruction in Spanish had occurred from 1 to 50 years prior to being tested found that attrition affected "smaller portions of recognition vocabulary than of recall vocabulary" (1984a, p. 116). He concluded that the absolute amount of attrition was the same for both types of vocabulary and attributed the difference to subjects' recognition (i.e., receptive) vocabulary being larger than their recall (i.e., productive) vocabulary to start with. An alternative interpretation of the results is that while reception (e.g., recognition of a lexical item) is based on stimulation from the outside, production, which involves the recall of lexical items, requires an impulse from within (see Paradis, 2004). FL production, particularly online and in real time, as is the case with spontaneous speech, depends more strongly on rapid and effective access to language stored in memory (particularly lexis) than does comprehension (reading or listening). If attrition consists (at least in part) of impaired access to language in memory, it logically follows that productive skills are more vulnerable to attrition than receptive ones. These findings are supported by a number of studies (e.g., Grendel, 1993; Hakuta & D'Andrea, 1992; Hansen, 1999, 2011; Murtagh, 2003; Weltens, 1989; Weltens & Grendel, 1993), which found little or no attrition in receptive (lexical) skills. Weltens (1989), for instance, used a combination of longitudinal and cross-sectional design and a variety of receptive tests to investigate the attrition of FL French (chronologically the subjects' L3, L4, or L5) receptive skills in 150 Dutch (L1) multilingual secondary school graduates 2 and 4 years after the end of formal instruction. He found very little attrition in the lexical and in the grammatical area and partly ascribed this to the absence of time pressure during testing. When Grendel (1993) used Weltens's (1989) study design in a similar population, she therefore decided to use a lexical decision paradigm, which included a time limit. She, too, however, found almost no signs of attrition. For this reason, Weltens and Grendel (1993, p. 154) concluded that "future studies of language attrition should focus on language production."

Subjective perception of attrition
Several studies (Murtagh, 2003; Weltens, 1989; Weltens & Grendel, 1993; see also Waas, 1996, for L1 attrition) have found that participants tended to overestimate or at least overstate how much their language had attrited, and that the amount of self-reported loss was not reflected in the test measures used. Some scholars therefore consider self-evaluation of attrition as unreliable ("not always a valid measure for assessing attrition"; Schmid & Mehotcheva, 2012, p. 114). However, particularly in situations involving productive rather than receptive skills and/or time pressure (as is the case in spontaneous speech), language users' self-assessment may actually (and accurately) reflect the increased mental effort necessary to produce the language, due chiefly to the reduced ease of access to language knowledge in the mind. The subjective feeling of loss in itself may therefore be considered an early sign of attrition, even if the output quality is not measurably different from before the onset of attrition or if the measured degree of deterioration is far smaller than what subjects report.

Language use and contact
Attrition is believed to be influenced by both linguistic and extralinguistic factors, but as set out above, the factor that would appear most obvious as exerting the strongest influence in attrition is (lack of) language use and contact: It stands to reason that the less a language is used, the more likely it is to deteriorate and vice versa. This prediction is formulated in theoretical models such as the DMM (see above) and the activation threshold hypothesis (ATH; Paradis, 1993, 2004). According to the ATH, items in memory require a certain level of neural impulses to activate them; the activation threshold level necessary to do this goes down each time an item is activated (i.e., access to it becomes faster and easier) but rises over time between activations. Moreover, the activation of a specific item raises the threshold of other items that are in competition with it (e.g., equivalents in another language). This means recency and frequency of use are vital in keeping knowledge accessible and the activation threshold low. A little-used item can still be stored in memory, but accessing it becomes more effortful or at least slower when the activation threshold has risen due, for instance, to lack of use.
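The threshold dynamics attributed to the ATH in the preceding paragraph can be illustrated with a small toy simulation. This is only a sketch under loudly stated assumptions: the ATH as summarized here does not specify numerical values, so the constants `USE_DROP`, `DISUSE_RISE`, `INHIBITION`, and `FLOOR`, the linear update rule, and the two example words are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A lexical item with an activation threshold (arbitrary units)."""
    name: str
    threshold: float = 1.0
    # Competing items, e.g. translation equivalents in another language.
    competitors: list = field(default_factory=list, repr=False)


# Illustrative assumptions only; none of these values come from the ATH literature.
USE_DROP = 0.3      # one activation lowers the item's own threshold by this much
DISUSE_RISE = 0.05  # threshold rise per time step in which the item is not used
INHIBITION = 0.1    # one activation raises each competitor's threshold by this much
FLOOR = 0.1         # thresholds never fall below this value


def activate(item):
    """Lower the item's own threshold and raise that of its competitors."""
    item.threshold = max(FLOOR, item.threshold - USE_DROP)
    for rival in item.competitors:
        rival.threshold += INHIBITION


def tick(items, used=None):
    """Advance one time step; `used` is the item activated in this step, if any."""
    if used is not None:
        activate(used)
    for item in items:
        if item is not used:
            item.threshold += DISUSE_RISE


english = Item("dog")    # hypothetical FL1 word, used at every step
french = Item("chien")   # hypothetical FL2 equivalent, never used
english.competitors.append(french)
french.competitors.append(english)

for _ in range(5):
    tick([english, french], used=english)

print(english.name, round(english.threshold, 2))
print(french.name, round(french.threshold, 2))
```

In this toy setting the frequently used item settles at the floor value, while the unused competitor's threshold climbs through both disuse and inhibition, which mirrors the qualitative claim that recency and frequency of use keep knowledge accessible.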
Some studies on L1 attrition (e.g., Hulsen, 2000; Köpke, 1999) found that less use of a language was associated with stronger attrition, but Schmid (2007) found that the amount of L1 use in daily life did not appear to have any predictive power. However, Schmid (along with Köpke, 2007) argues that a more in-depth examination of the type and quality of language contact might reveal more than focusing purely on amount/frequency. Using only receptive skills for consuming television, for example, is likely to have a very different effect than writing or interacting orally in a language (for detailed information on the factor of "language use and contact" in L1 attrition, see Schmid, 2019b).
While such findings should logically also apply to non-L1 attrition, some important differences must be pointed out. The level of proficiency achieved by most FL learners before the onset of attrition is usually far lower than that of an L1 user or of a non-L1 user for whom the language is present and important in daily life; the attrition process is therefore bound to be dissimilar due to this difference in initial proficiency (see below). Moreover, it is also more likely for FL learners to lose all contact with a language acquired in a school/instructional setting once this learning ceases than it is for an L1 speaker/daily non-L1 user to lose all contact with that language. That being said, however, research on FL attrition has "so far failed to validate [the] assumption of the inevitability of language attrition" (Schmid & Mehotcheva, 2012, p. 102) once an FL is no longer used or studied. Inversely, simply having any language contact (as opposed to none at all) does not prevent attrition (Bahrick, 1984a, 1984b; Mehotcheva, 2010; Xu, 2010).
Whether less contact/use means more attrition is still not entirely clear, and it appears, counterintuitively, that this may not be the case. Xu (2010) investigated the attrition and retention of school-learned English in Chinese and Dutch university students 2 years after instructed learning had ended. The study found attrition in both participant groups (in all four skills among Chinese participants but only in writing among the Dutch), but language contact did not predict performance in either group. In Mehotcheva (2010), among Dutch and German multilinguals who learned Spanish at university as an Ln (L5 for most participants), length of attrition (the time between the end of active Spanish use and testing) had no discernible influence on attrition, even when initial proficiency was controlled for, and more rehearsal (amount of language use/contact) was not, on the whole, associated with higher retention. Similar results were found by Bahrick (1984a, 1984b) and Murtagh (2003).
Particularly in multilinguals, however, it is also important to consider the amount/frequency/quality of language contact not only with the specific (single) language that is under investigation, but also with other (foreign) languages, as contact with one language may not simply curtail time spent engaged with another but also displace/inhibit access to the latter (see DMM and ATH above).

Initial proficiency
Except in the case of prepubescent children, research into the attrition of an L1 normally assumes that a language has been "fully" acquired before the period of reduced/non-use sets in. This, of course, is not the case for most non-L1s and even less so for FLs. Initial proficiency, the level at which a language has been mastered before the onset of disuse, has been found to be an influential factor in language attrition: across various FL (and similar non-L1) attrition studies (e.g., Bahrick, 1984a; Gardner, Lalonde, Moorcroft, & Evers, 1985; Godsall-Myers, 1981; Grendel, 1993; Murtagh, 2003; Nagasawa, 1999; Weltens, 1989), higher initial proficiency has been associated with better FL retention, particularly in the long term. More recently, Mehotcheva (2010) also found this to be true in multilingual learners who had up to seven FLs in their repertoire: In their retention of university-learned Spanish after a stay-abroad program, initial proficiency was "the most salient predictor of language retention with high proficiency at onset leading to better retention of the language" (p. 154), whereas no firm conclusion could be drawn for length of attrition or amount of language contact. Similarly, in Xu's (2010) study, initial proficiency was found to be a strong influencing factor for both Chinese and Dutch learners of English, but language contact was not.
Even so, determining the influence of initial proficiency is not without its difficulties. Hansen (1999) examined the Japanese of LDS missionaries (who had learned this language as young adults while working in the target culture) several decades after their return and argues that it is the length of exposure to a language rather than proficiency per se that contributes to higher retention. However, as Schmid (2006, p. 77) points out, these two factors can be expected to correlate. Overall, it must be noted that it is difficult if not impossible, even in settings where a group of learners of a similar age have had roughly the same amount and type of exposure to a particular FL, to disentangle initial proficiency from these and additional factors such as general intelligence, language aptitude, or attitude and motivation, all of which may also contribute to the retention of a learned language.

Language attrition in multilinguals and multilingual attrition
Most of the published studies on foreign language attrition whose participants were multilingual (and they are fairly few in number) focus on the development of one particular language; that is, they focus on language attrition in multilinguals, but not on multilingual attrition (see Megens, 2020). Bahrick (1984a, 1984b), Weltens (1989), and Mehotcheva (2010) are described above. Another small-scale study that deserves mention is Nakuma (1997), who investigated the attrition of communicative competences in the L3/Ln Spanish of 13 professionals in Ghana. A decade after university graduation, Nakuma found a quite significant loss in those who had not had contact with Spanish and some gains in those who still used Spanish for business purposes, which led him to conclude that "a far greater amount of effort and time is required to maintain, and eventually surpass, one's own level in a language than is required to lose it, after a given level of capability has been achieved" (1997, pp. 219-220), a notion which underlines the importance of LME in multilingual systems (Megens, 2020).
Attrition studies that look at two or more languages within the same multilingual individual(s) are even fewer in number (see Jessner & Megens, 2019); the only three we are aware of are case studies that include only one or two children who acquire (and lose) their languages in a naturalistic setting (Faingold, 1999; Cohen, 1989) or that focus on tip-of-the-tongue states in five languages over 10 years in a single multilingual adult (Ecke & Hall, 2013). To date, we are not aware of any studies that examine multilingual foreign language attrition, that is, the attrition of two or more FLs longitudinally within a group of multilinguals.

Study and methods
Background: Foreign language learning in Austrian schools
Children in Austria come into (playful, nonintensive) contact with their first foreign language in primary school and begin learning it more intensively from Grade 5 onwards; this FL1 is virtually always English (98%; Statistik Austria, 2016). Those who continue on past the mandatory 9 years of schooling will therefore have had at least 8 years of FL1 English instruction by the time they graduate from upper secondary school at the end of Grade 12 or 13. The FL2 (most commonly Italian, French, or Spanish, more rarely Latin or Russian; Statistik Austria, 2016) is usually added in Grade 7 or 9, so students will typically have learned it for at least 4 years by graduation, though the total learning time for the FL2 (both in terms of years and hours per week) varies far more than for the FL1. A third, and far more rarely a fourth and even fifth foreign language may be added at any point (often as an elective), depending on school type. Austrian pupils are generally required to pass written and oral exams in one to three foreign languages (again, depending on school type) as part of their Matura, the school-leaving exam that also counts as university entrance qualification. The expected level of proficiency at graduation is set at level B2 of the Common European Framework of Reference (CEFR) for the FL1 and at level B1 for the FL2 (see BMB Bundesministerium für Bildung, 2017, for details).

Research questions and predictions
The study was guided by the following research questions:
RQ1 Will participants show signs of attrition in their spontaneous oral production approximately 1.5 years after FL learning has ceased?
RQ2 Does the amount of use after formal learning ceases have any influence on the further development of an FL? If so, does this low/high amount of use have a similar impact on the development of the FL1 English and the FL2 French/Italian, or are there differences?
RQ3 What role does initial proficiency play in the attrition process?
RQ4 Will high continued use of one FL have an impact on the development of the other, rarely used FL? If so, does high continued use of one FL expedite or mitigate the attrition of another, rarely used FL?
RQ5 Will participants' own assessment of the attrition/growth of their FL skills reflect the changes measured in their productions?
Based on the literature review and the assumptions of the DMM, we make the following predictions: unless they continue to use the respective FL, participants will show signs of attrition in their oral productions in both FL1 and FL2. Lower use of either language will result in stronger attrition in that language, and high use of one FL coupled with low use in the other FL will further expedite the attrition/displacement of the latter language. However, we also assume that attrition will be dependent on initial proficiency, and that loss will be stronger in the FL2, where proficiency at graduation is considerably lower than in the FL1. We furthermore assume that participants' self-assessment of their speaking skills and of how strongly these have attrited will not necessarily be reflected in the measured level of language attrition, as participants will tend to overestimate the amount of attrition that has occurred.

Participants
The sample chosen for the present paper consisted of 114 participants, 80 female and 34 male, drawn from the much larger LAILA data pool on the basis of their linguistic profile: all had German as their sole L1, and all were learning English as their FL1 and either French (n = 58) or Italian (n = 56) as their FL2. For the purposes of this study, no differentiation was made in the analysis between French and Italian, and all statistical values pertaining thereto are subsumed under "FL2." At the first test time (see below), participants were in their last year at one of 17 upper secondary schools in Tyrol and their median age was 18 years (M = 17.86, SD = 0.7). They had been learning English in school as their FL1 for between 7 and 13 years (Mdn = 9, M = 9.33, SD = 1.38) and their FL2 for between 3 and 8 years (Mdn = 4, M = 4.8, SD = 1.1). Of the 114 participants, 37 had learned (or were still learning) a third foreign language (FL3; either French, Italian, or Spanish) at school for between 2 and 3 years (M = 2.6, SD = 0.5). None mentioned an active FL4.

Testing procedure
The first test time (T1; baseline) took place in the last months before the participants' Matura, which is usually held in May through June. Approximately one year after this exam, participants were contacted and invited back for a second test time (T2) and offered a financial recompense high enough to mitigate the self-selection of those graduates particularly good at or interested in foreign languages. The time between July 1 in the year of T1 (the definite start date of the post-high school life for all participants) and T2 ranged from 14 to 19 months (M = 16.4; Mdn = 16.5; SD = 1.3). T2 thus took place well over a year after all formal FL learning in school had fully ceased. Tasks and testing procedures were virtually identical at both test times, the main difference being that at T1, testing took place at school during school hours, while at T2, participants came to the University of Innsbruck for testing.
At T1, written tasks, including the participant questionnaire, were completed online in school computer labs in groups with a tester on hand to help in case of problems or questions. Oral language testing sessions were conducted individually in a separate room in one-on-one, face-to-face sessions with a trained tester and were recorded on audio and video. At T2, written online tasks were completed at home in advance before oral tasks were done at the university. While the full LAILA test battery included a large number of tasks, only those test instruments and questionnaire items relevant to the present paper are described here.

Questionnaire
Before testing, participants completed a detailed online questionnaire (in their L1 German) on their language background, learning experience, habits (including the effort put into learning and using the FLs voluntarily outside school), and language skills, as well as their self-evaluation of their habits, skills, and proficiency. This questionnaire contained 35 items (some of which had to be answered separately for each language) and was self-designed by the project team to fit the specific profile of the participants and the foci of the study while keeping it to an acceptable length. 8 The questionnaire at T2 was similar to that at T1, but contained additional detailed questions on what participants had been doing and to what extent they had been using their language(s) since graduation, as well as a self-evaluation of how their language skills had developed and changed in that time period.
Initial proficiency: Self-assessed foreign language ability and effort to T1 (baseline)
At T1, participants were asked to indicate how they rated their own proficiency in each of the four language skills (listening, speaking, reading, and writing) in each of their FLs on a 5-point Likert scale that ranged from 1 (very low) to 5 (very good). This means that T1 FL ability speaking had a maximum of 5 points, and T1 FL ability total (the sum of the scores for all four skills) had a maximum of 20 points for each language. Participants were also asked how often (on a 5-point Likert scale ranging from 1 = not at all to 5 = once or more per week) they did the four following activities in each FL outside school hours and without having to do them for school: (a) read magazines/newspapers/online articles or blogs; (b) read novels or other literature; (c) watch movies, TV series, or other programs; or (d) interact with native or nonnative speakers in that language. Cronbach's α for these four items at T1 was 0.72 in the FL1 and 0.784 in the FL2. T1 FL effort, the sum score of these four 1-5 Likert scales (maximum score of 20 points), indicated to what degree the respective FL was a part of a participant's life outside school and how much use and effort the individuals put in voluntarily.
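As an illustration of the scoring described above, the following minimal sketch sums four 1-5 Likert items into an effort score and computes Cronbach's α with the standard formula α = k/(k - 1) × (1 - Σ item variances / variance of sum scores). The function names and the example responses are hypothetical and are not taken from the study's data or scripts.

```python
from statistics import variance


def effort_score(items):
    """Sum four 1-5 Likert responses into one effort score (maximum 20)."""
    if len(items) != 4 or any(i not in (1, 2, 3, 4, 5) for i in items):
        raise ValueError("expected four responses on a 1-5 scale")
    return sum(items)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-response lists."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the participants' sum scores
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three participants, four activity items each
responses = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
alpha = cronbach_alpha(responses)
```

In this constructed example the four items vary in perfect lockstep, so α reaches its maximum of 1; real questionnaire data would yield lower values such as those reported above.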

FL use and effort after graduation
At T2, participants were asked to indicate how much they had used their FL1 English and FL2 French/Italian since graduation (FL use) on a 6-point Likert scale ranging from 1 (practically never) to 6 (daily or nearly daily) and to provide more details in an open-ended comment box. Based on their score (1-6) for FL use, participants were assigned to one of three groups in the FL1 and FL2, respectively: low (1-2), middle (3-4), or high (5-6). The number of participants in each FL use group is presented in Table 1. In their comments, 25 participants mentioned some form of formal learning of their FL1 and/or FL2 after graduation (11 FL1 only, 7 FL2 only, and 7 both). Most of these rated their use of this FL as "high"; among those who rated themselves 4 or lower (3 in the FL1, 6 in the FL2), this formal learning had been brief or nonintensive, or had only begun very recently. Those self-rated "high" users of the FL1 English who had no formal language learning after graduation mentioned English uses such as: communication on extended travel; work abroad or au pair stays; use for work or communication with relatives, partners, friends, or flatmates; English course literature or lectures in their (nonlanguage) field of studies; or preferred consumption of entertainment media (specifically series) in English. For the FL2, the number of participants who rated their FL use as "high" (n = 17) is small: eight of these mention some form of ongoing formal learning, and four others mention use for work. Participants who rated their FL use since graduation as "middle" or "low," particularly in the FL2, often explicitly point out in their comments that they use this FL only occasionally or (virtually) never.
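The assignment of the 6-point FL use rating to the three use groups can be sketched as follows. This is only an illustration of the cut-offs named above (1-2 low, 3-4 middle, 5-6 high); the function name and the sample ratings are hypothetical.

```python
from collections import Counter


def fl_use_group(score):
    """Map a 1-6 FL use rating to the low/middle/high use group."""
    if score not in (1, 2, 3, 4, 5, 6):
        raise ValueError("FL use is rated on a 1-6 scale")
    if score <= 2:
        return "low"
    if score <= 4:
        return "middle"
    return "high"


# Hypothetical ratings for a handful of participants
ratings = [1, 2, 2, 4, 6]
group_sizes = Counter(fl_use_group(r) for r in ratings)
```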
At T2, participants were also asked how much FL effort (for wording and scoring, see above) they had generally made in the time since graduation. Cronbach's α for the four component factors of FL effort total was 0.792 in the FL1 and 0.878 in the FL2 at T2; Cronbach's α for these four factors plus FL use was 0.841 in the FL1 and 0.893 in the FL2. A correlation analysis between these two self-reported measures (FL use and FL effort since graduation) found a highly significant and strong positive correlation in both the FL1 English (ρ = .69, n = 110, p < .001) and FL2 French/Italian (ρ = .67, n = 114, p < .001). Therefore, FL use alone was used for further analysis. Four candidates did not indicate their use of FL1 (English) since graduation; two of these were in the FL2-low and two in the FL2-middle use groups.

Self-assessed change in foreign language ability since graduation
At T2, participants were asked to indicate how they felt their proficiency in each of the four language skills (listening, speaking, reading, and writing) in each of their FLs had changed since graduation on the following 7-point Likert scale:
-3 I can barely do this anymore/cannot do this at all anymore
-2 has got noticeably weaker
-1 has got a bit weaker
0 has stayed about the same
1 has got a bit stronger
2 has got noticeably stronger
3 has got a whole lot stronger
The scores in each language (FL1 English and FL2 French/Italian) on change in FL ability speaking (-3 to 3) and the sum of responses to all four skills, change in FL ability total (-12 to 12), were used for further analysis.

Spontaneous oral production tasks
Oral tasks
In the FL1 (English) "Facebook task," an image prompt task, participants were given a maximum of 3 min to describe, interpret, and comment on a one-picture cartoon from the webcomic Joy of Tech (Geekculture, 2007) that showed three well-dressed people begging for money on a sidewalk with cardboard signs indicating that they had lost their jobs due to foolish missteps on Facebook or YouTube. This task was geared toward FL users at the expected proficiency level of B2 on the CEFR. In the FL2 (French or Italian) "Bike task," a picture story task, participants were presented with an ordered sequence of four hand-drawn images (made by Megens, 2011) showing a couple watching the TV weather forecast, preparing for an outing and then cycling in the rain. Participants had a maximum of 3 min to narrate a story along the pictured storyline in any way they wished (e.g., including dialogue between the couple in the picture), though preferably in the past tense. This task was less sophisticated in its scope than the FL1 task; it was geared toward language users at the expected level of B1 (CEFR) but could also be completed by users with a lower level of proficiency and was therefore well suited to the present study's participants.

Measuring FL attrition in spontaneous oral production
The present study targets complexity ("the extent to which the language produced in performing a task is elaborate and varied"; Ellis, 2003, p. 340), and fluency ("the extent to which the language produced in performing a task manifests pausing, hesitation, or reformulation"; Ellis, 2003, p. 342; see also Schmid, 2011). The measures listed in Table 2 are based on previous research that analyzed proficiency development in both spoken and written data (e.g., Housen & Kuiken, 2009; Kormos & Dénes, 2004; Larsen-Freeman & Cameron, 2008; Megens, 2011, 2020).
Lexical diversity, that is, the variety of words used in a given text, is often measured using the traditional type-token ratio (TTR), or the number of different words (types) 9 divided by the total number of words written or uttered (tokens); the highest possible value (indicating no repetitions at all) is therefore 1. The TTR is purely quantitative and does not give any information on aspects such as the register, sophistication, frequency, or uniqueness of the lexis used (see Meara & Bell, 2001; Read, 2000). Its major weakness, however, is that type repetition inevitably occurs in texts longer than a few sentences and increases with the length of the text, which means that TTR decreases in longer text samples and is thus not stable when applied to data samples of varying length (for criticism of the TTR and suggestions for alternatives, see, e.g., Daller, van Hout, & Treffers-Daller, 2003; Vermeer, 2000).
In our analysis, we therefore used the "sophisticated type-token ratio [STTR] - word types per square root of two times the words," that is, STTR = types / √(2 × tokens).
Following the notion that "the first sign of language attrition then is not the loss of certain items but rather an increase in the length of time needed for their retrieval," previous studies on attrition (including Grendel, 1993; Kormos & Dénes, 2004; Megens, 2011, 2020; Nagasawa, 1999; Schmid, 2007; Schmid & Beers Fägersten, 2010; Tomiyama, 1999; Waas, 1996) have focused on the "quantification of hesitation variables in spontaneous speech" (Hansen, 2001, p. 63). In the present study, the following measures of (dis)fluency and hesitation were targeted and coded:
• filled pauses/filler words: thinking sounds and nonword vocalizations, irrespective of their pronunciations, such as "ahm" or "hmm," and words or phrases in the target language of the task obviously being used as a filler (and not used in a meaningful utterance), such as "well," "alors," "allora," "yeah," "oui," or "I don't know"
• false starts: where speakers begin an utterance, but break off within a word
• repetitions: retracing without correction (the speaker begins to say something, stops, then repeats the earlier material without change)
• corrections: partial repetition of the preceding material with a correction
• full and complete reformulations of the preceding message, without specific corrections.

Transcription and coding of oral tasks
The recordings of the spontaneous oral production tasks (four "performances" of 2-3 min per participant) were transcribed and simultaneously coded using guidelines based on the conventions of the CHAT transcription format set forth in the manual for the CHILDES project (MacWhinney, 2000). Utterances that did not pertain to the task or that were in a nontarget language (usually clarification questions or comments directed at the tester) were excluded from analysis. For each performance, the overall number of tokens and types (excluding filler words, see above) produced within the time limit of three minutes were counted and used to measure length of production (tokens) and to calculate the STTR. To account for the variance in speech sample length, disfluency markers are given as occurrences per 30 tokens (for detailed information on transcription and coding conventions, see Megens, 2020). These analyses and calculations were carried out in the CLAN software program.
Statistical analysis was performed in R (R Core Team, 2015) and IBM SPSS. Nonparametric tests were used where assumptions (e.g., normal distribution) for parametric tests were not met on all variables or where analysis used subgroups that were smaller than 30 cases. The α level for statistical significance was set at p < .05 unless otherwise indicated.
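The lexical diversity and disfluency measures can be illustrated with a short sketch. It assumes that STTR is computed as types divided by the square root of two times the tokens, following the verbal definition given above, and that a type is simply a distinct lowercased word form, which may differ from the study's own coding conventions. The study itself used CLAN for these counts; the function names and the sample utterance here are hypothetical.

```python
import math


def count_types_tokens(words, fillers=frozenset()):
    """Count types and tokens after dropping filler words (assumed coding)."""
    kept = [w.lower() for w in words if w.lower() not in fillers]
    return len(set(kept)), len(kept)


def sttr(types, tokens):
    """Types per square root of two times the tokens."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return types / math.sqrt(2 * tokens)


def per_30_tokens(occurrences, tokens):
    """Normalize a disfluency count to occurrences per 30 tokens."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return occurrences * 30 / tokens


# Hypothetical utterance with one filler sound
types, tokens = count_types_tokens("the dog saw the cat ahm".split(), {"ahm"})
diversity = sttr(types, tokens)
```

Because the denominator grows with the square root of sample length rather than with length itself, this ratio penalizes longer samples less heavily than the plain TTR does.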

Delta values: Measuring change over a period of reduced/non-use
To assess the changes (i.e., improvement or decline) in the quality of the spontaneous oral performance between T1 and T2, a delta value (δ) was computed for each of the indicators described above by subtracting a participant's score at T1 from their score at T2. Comparing each person at T2 to him/herself more than a year earlier allowed us to look less at the absolute quality produced at a given test time, and more at the relative, intraindividual changes, thereby mitigating individual differences in language proficiency at T1 (baseline) or in general speaking style (e.g., individual speaking speed in any language, the frequent or rare use of filler sounds, etc.). A (significant) decrease in a participant's STTR between T1 and T2 would indicate lexical attrition; a (significant) increase in disfluency markers would indicate loss of fluency and thus attrition.
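A minimal sketch of the delta computation, assuming each participant's scores are stored per measure at each test time. The measure names and values are hypothetical. Note that the sign of δ alone only shows the direction of change; the study relied on significance tests to decide whether a change counted as attrition.

```python
# Measures where a lower value at T2 points toward attrition
DIVERSITY_MEASURES = {"types", "tokens", "sttr"}


def deltas(scores_t1, scores_t2):
    """Return T2 minus T1 for every measure present at both test times."""
    return {m: scores_t2[m] - scores_t1[m] for m in scores_t1 if m in scores_t2}


def direction(measure, delta):
    """Describe the direction of change; says nothing about significance."""
    if delta == 0:
        return "no change"
    # Diversity measures decline with attrition; disfluency markers rise
    worse = delta < 0 if measure in DIVERSITY_MEASURES else delta > 0
    return "decline" if worse else "improvement"


# Hypothetical participant: lower STTR and more filled pauses at T2
t1 = {"sttr": 3.5, "filled_pauses": 1.5}
t2 = {"sttr": 3.0, "filled_pauses": 2.25}
change = deltas(t1, t2)
```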

Initial proficiency
Overall, participants' proficiency at T1 (baseline) was higher in their FL1 English than in their FL2 French/Italian. This difference was reflected in students' self-assessment of their own language proficiency (see Table 3) and also showed clearly in the vast majority of participants' spontaneous oral production at T1 (see Table 4). The tasks for the FL1 English and FL2 French/Italian were not the same, but even so, within the three-minute limit participants (n = 114) produced considerably more output (tokens) and a greater number of different words (types) in the FL1 than in the FL2. Disfluency markers were also more prevalent in the FL2 than in the FL1.
Change in quality of spontaneous oral production by foreign language use
Table 5 shows scores on measures of lexical diversity and (dis)fluency at T1 and T2 as well as the changes on these measures between T1 and T2 (the δ values), with participants split up into groups by FL use after graduation (low, middle, or high) within the respective FL (FL1 English or FL2 French/Italian). The dispersion is fairly large on nearly all measures within all groups, and it should be pointed out that a considerable number of candidates showed very little change between the two test times (i.e., had small δ values), particularly in the FL1. Overall, mean values here indicate that, the less the language was used, the greater the decline in quality tended to be, though this effect is far stronger and more clear-cut in the FL2 than in the FL1. A series of Wilcoxon signed rank tests comparing participants within the low/middle/high-use subgroups at T1 with themselves at T2 showed a strong difference between the FL1 and FL2, as is visible in the significances for δ values in Table 5. In the FL1, participants' performance at T2 was not significantly different from T1 on any measure in any group. By contrast, in the FL2, the decrease in types, tokens, and STTR was highly significant (p < .001) among participants with low FL2 use, while those in the middle category showed a significant decrease in the occurrence of self-corrections and reformulations (i.e., an improvement in fluency), and those with high use improved significantly on tokens, filled pauses, and self-corrections.
In order to explore whether the FL use groups (low, middle, or high) differed from each other in terms of their δ values (individual change on each measure between T1 and T2), a series of Kruskal-Wallis tests was conducted. They found no significant differences between groups on any measure in the FL1 English, but in the FL2 French/Italian, groups were significantly different from one another on types, χ²(2, n = 114) = 13.22, p = .001, tokens, χ²(2, n = 114) = 14.24, p = .001, STTR, χ²(2, n = 114) = 8.26, p = .016, and reformulations, χ²(2, n = 113) = 9.27, p = .01. A series of post hoc Mann-Whitney U tests with Bonferroni adjustment (p < .017) found significant differences for types and tokens (both p = .001, medium effect, r = .34 and .36) only between the low and high FL2 use group. The differences in reformulations were significant between the low and middle use group (p = .005 with small-medium effect, r = .29) and between the middle and high use group (p = .009 with medium effect, r = .41). This dissimilarity between the FL1 and FL2 in terms of intraindividual change in spoken performance between T1 and T2 can also be clearly seen in Figure 1, the boxplots for the delta values of STTR by FL use group.
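The adjusted threshold of .017 is consistent with dividing the overall α of .05 by the number of pairwise comparisons among three groups. The sketch below shows that arithmetic under this assumption; the function name is hypothetical.

```python
from math import comb


def bonferroni_alpha(n_groups, alpha=0.05):
    """Per-comparison alpha when all pairs of groups are compared."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    n_comparisons = comb(n_groups, 2)  # number of unordered group pairs
    return alpha / n_comparisons


three_group_alpha = bonferroni_alpha(3)  # 0.05 / 3 pairwise comparisons
four_group_alpha = bonferroni_alpha(4)   # 0.05 / 6 pairwise comparisons
```

With three use groups there are three pairwise comparisons, giving roughly .017; with four groups there are six, giving roughly .008.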
Language use and self-assessed change in language skills
As described above, participants were asked to assess how much their language skills in their FL1 and FL2 had deteriorated or improved since graduation (see bottom half of Table 3). Interestingly, a correlation analysis (Spearman's ρ) between FL use since graduation and self-assessed change in FL ability found correlations that were moderate but statistically highly significant between these two factors in both the FL1 and FL2. This was the case for self-assessed change in overall language skills (FL1 ρ = .598, n = 109, p < .001; FL2 ρ = .536, n = 113, p < .001) and for speaking skills in particular (FL1 ρ = .506, n = 109, p < .001; FL2 ρ = .512, n = 113, p < .001). In other words, how much a participant reported using a foreign language since graduation correlated moderately with how much that person felt they had improved or deteriorated in that same time period, with no visible difference between the FL1 English and FL2 French/Italian.
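For readers unfamiliar with Spearman's ρ, the following sketch computes it as the Pearson correlation of rank-transformed values, with tied values receiving the average of their ranks. This suits ordinal Likert data, where ties are frequent. It is a generic illustration with hypothetical data, not the study's analysis script, and it does not compute p values.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical FL use ratings (1-6) and self-assessed change (-3 to 3)
use = [1, 2, 3, 5, 6]
change = [-3, -2, -1, 1, 2]
rho = spearman_rho(use, change)
```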

Measured and self-assessed change in language skills
The next step in our analysis was to explore how participants' own assessment of the improvement/deterioration in their speaking skills since graduation (change in FL ability speaking) related to the actual measured change in the quality of their oral production between T1 and T2. Correlation analysis (see Table 6) found no or only a small correlation between self-assessed change in FL speaking ability and measured changes (δ values): in the FL2 (Italian or French), Spearman's ρ was .194 or smaller for every correlation and only significant in the case of tokens. In the FL1 English, correlations were also small: Spearman's ρ was never greater than .275; however, these correlations were significant for types (p = .005), tokens (p = .011), STTR (p = .03), and filled pauses (p = .003). Overall, how much participants felt they had improved or declined was not or only very weakly related to the magnitude of the changes found in the analysis of their oral production; this correlation was even weaker in the FL2 than the FL1. It is worth noting, however, that in the FL1 (n = 40), a far smaller proportion of candidates than in the FL2 (n = 94) felt their speaking skills had declined.
To examine this more closely, participants were subdivided into groups according to self-scoring on change in FL ability speaking: strong decline (-3), moderate decline (-2, -1), neutral (0), moderate improvement (1, 2) and strong improvement (3). In general, the better a group's self-assessed change in speaking skills, the higher that group's mean score was on virtually every measure at T2 (see columns for "T2" in Table 7). Intraindividual development between T1 and T2, however, was less clear-cut (see columns for "δ" in Table 7). Interestingly, in those students who felt their speaking skills had improved since graduation, this development was actually reflected in their output: mean group delta values for types, tokens, and STTR were noticeably greater here than in the decline or neutral groups, and the improvers' mean drop in disfluency markers between T1 and T2 was greater on nearly every measure than in the neutral or decline groups, though a high dispersion must be taken into consideration. As the significances for the delta values in Table 7 show, among those who believed their speaking skills had improved between T1 and T2, Wilcoxon signed rank tests found these gains to be significant for types, filled pauses, and self-corrections in both languages and in addition for tokens in the FL2.
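The subdivision by self-assessed change in speaking ability, together with the summed change score across all four skills, can be sketched as follows. The function names are hypothetical; the cut-offs and score ranges are those stated in the text.

```python
def self_assessed_group(score):
    """Map a -3..3 speaking-change rating to its self-assessment group."""
    groups = {
        -3: "strong decline",
        -2: "moderate decline",
        -1: "moderate decline",
        0: "neutral",
        1: "moderate improvement",
        2: "moderate improvement",
        3: "strong improvement",
    }
    if score not in groups:
        raise ValueError("change is rated on a -3 to 3 scale")
    return groups[score]


def change_total(listening, speaking, reading, writing):
    """Sum the four skill ratings into the total change score (-12 to 12)."""
    ratings = (listening, speaking, reading, writing)
    if any(r not in range(-3, 4) for r in ratings):
        raise ValueError("each skill is rated on a -3 to 3 scale")
    return sum(ratings)


# Hypothetical participant who feels speaking and writing have slipped
total = change_total(listening=0, speaking=-2, reading=1, writing=-1)
group = self_assessed_group(-2)
```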
In contrast, in "nonimproving" groups, the self-reported degree of decline (neutral, moderate, or strong) was not necessarily mirrored in the degree of decline indicated by mean δ scores. In the FL1, neither the neutral nor the moderate decline group performed significantly differently at T2 than at T1, with the exception of one measure (reformulations) in one group (neutral). In the FL2, those who felt their speaking skills had declined (moderately or strongly) were correct in that they did perform significantly worse on types, tokens, and STTR at T2 than they had at T1. However, this (mean) drop in scores between T1 and T2 was not greater among the self-perceived "strong decliners" than among the "moderate decliners," or even, on some measures, among the "neutrals." A Kruskal-Wallis test (see Table 8) in the FL1 (excluding the group strong decline, which only had one case) found a significant difference between the three remaining groups (moderate decline, neutral, and moderate improvement) for types and filled pauses. A series of post hoc Mann-Whitney U tests with Bonferroni adjustment (p < .017) found that for both types and filled pauses, the difference was only significant between the moderate decline and moderate improvement groups. In the FL2, a Kruskal-Wallis test found significant differences between the four groups (strong decline, moderate decline, neutral, moderate improvement) for types, tokens, and STTR. A series of post hoc Mann-Whitney U tests with Bonferroni adjustment (p < .008) found no significant differences between strong decline, moderate decline, and neutral groups; significant differences always involved the moderate improvement group (see Table 8). This means the self-rated strong attriters, moderate attriters, and neutrals were not significantly different from each other in terms of measured performance losses between T1 and T2.
Table 6. Correlation between measured changes in output quality (δ) and self-assessed quality change in foreign language speaking ability between Time 1 (T1) and Time 2 (T2; n = 113)

Relationship between first and second foreign language use
To explore a possible influence of the level of use in one FL on the attrition of another rarely used FL, the subgroup made up of those participants who reported low use of their FL2 since graduation (n = 71; see Table 1) was examined to see whether their use of their FL1 (low, middle, or high) over this same time period had any relevance for the development of their FL2. A series of analyses of variance found no significant differences between these groups on any of the δ values for lexical diversity or disfluency in the FL2. An analysis of the potential influence of use of FL2 on the development of a rarely used FL1 was not possible because out of 12 participants who reported low use of their FL1 English, 11 also reported low use of their FL2. Correlation analysis between use of FL1 and changes in oral data measures in the FL2 (and vice versa) within each of the low-use groups similarly yielded no or only small, nonsignificant correlations.

Discussion and conclusion
The aim of the present study was to examine the attrition, or maintenance, of school-learned foreign languages once formal learning had ended. More precisely, we aimed to explore if and how factors such as initial proficiency and (amount of) language use influenced the development of oral production skills in multilingual learners of more than one foreign language. To this end, we examined lexical diversity and (dis)fluency in the spontaneous foreign-language oral productions of 114 young adult multilingual L1 German speakers shortly before and then over a year after their graduation from upper secondary school, which allowed us to compare the same participants at peak proficiency to themselves post-attrition rather than using similar learners as a baseline. Taking a holistic approach to changes in the multilingual system based on the DMM, we assessed not just one language but both the FL1, English, and the FL2, French or Italian. This approach and the size of the population form a valuable contribution to the field as, to our knowledge, no previous attrition study has examined more than one foreign language in a group of individuals. Overall, some 16 months after formal foreign language learning had ceased, those participants who reported high FL use showed gains in their spontaneous oral output in terms of length of production, lexical diversity, and fluency, while those with low or moderate use of their foreign languages tended to show signs of attrition. This lends support to our general prediction (RQ1) based on the ATH (e.g., Paradis, 2004) and the DMM (Herdina & Jessner, 2002), both of which see sufficient use/activation or LME as the key factor in maintaining or increasing language proficiency.
We further expected that less use of a particular language would produce stronger attrition in that FL (RQ2), an assumption mirrored in the results of some L1 attrition studies, but contradicted by others. Our prediction was only partially confirmed: in English, the difference between individuals' performance at T1 and T2 was generally not significant; neither was the degree of improvement/attrition significantly different between participants who reported low, middle, or high use of this FL1. By contrast, in the FL2, lexical diversity deteriorated significantly among participants with low FL use, while those with middle or high use showed some significant gains in fluency. In a nutshell, in FL1 English, less use after the end of formal FL learning did not correspond strongly with greater attrition, but in the FL2 French/Italian, it did.
One possible explanation is that the nature and perception of foreign language use differed depending on the FL. Even before graduation, students used their FL1 English more outside school than they did their FL2 French or Italian, and reported FL use after graduation was generally far higher for the FL1 English (with 60 participants reporting high, 38 middle, and 12 low use) than for the FL2 French or Italian (17 high, 24 middle, and 73 low). All of this indicates a different and perhaps more important role for English as a (first foreign) language than for French or Italian in this population. To some extent, it is logical that learners would tend to use those languages they know better more (and know those they use more better), as was the case here. There may also be some disparity in how learners self-rate their FL use; for example, an objectively identical amount of use may be rated lower in the FL1 English than in the FL2.
Even so, this would not be sufficient to explain why the amount of use had little to no impact on the development of the FL1 but quite a strong one on the FL2. This disparity is most likely mainly attributable to the difference in initial proficiency (self-rated and measured) between the two FLs (RQ3), which supports our prediction that lower initial proficiency would be associated with stronger attrition, an assumption in line with previous research findings supporting the notion that initial proficiency largely predicts language attrition processes (e.g., Ecke & Hall, 2013; Mehotcheva, 2010; Murtagh, 2003; Weltens, 1989; Xu, 2010). In our case, English as an FL1, where the initial proficiency was certainly higher than in the FL2, thus appears to be less vulnerable to attrition through reduced or non-use.
It is important to note that at 14-19 months, the period of reduced/non-use was relatively short, so these results cannot tell us what might happen after a longer period of time. The present findings, however, may support the notion of there being an initial plateau of 6 months to 2 years after which attrition sets in, at least in learners with high proficiency (see, e.g., Weltens & Cohen, 1989). The results may further imply an interaction between achieved proficiency and the importance of LME in maintenance (e.g., more LME being necessary to achieve the same ends in a less developed system) that may merit closer attention in future research.
A novel approach in this study was to examine whether high use of one foreign language in a multilingual's repertoire (FL A) might have an impact on the development of another, non-used or little-used, FL B (RQ4). No evidence could be found in our data to support our prediction that high(er) use of the FL A (here the FL1 English) would be associated with a stronger decline in FL B (FL2 French/Italian); nor was there any evidence that high FL1 use had mitigated the effects of non-use or slowed the attrition process in the FL2. Effects of the degree of FL2 use on a rarely used FL1 could not be measured as the groups were too small. It is unclear, however, whether such an impact might be better visible after a longer period of attrition, in a larger sample, if other (more qualitative) methods of analysis had been used and/or if other properties and qualities of the multilingual system had been explored. This may be especially relevant in this case, where the L1 German and the FL in question (English, French, or Italian) all form part of the Indo-European language family and can therefore not only act as supporter languages in the acquisition process (e.g., Jessner, 2006) but can also counteract multilingual attrition/attrition in multilinguals due to the common core between the languages (e.g., the large number of cognates that form a considerable part of the multilingual lexicon and thus contribute to the high level of multilingual awareness, particularly crosslinguistic awareness, developed in experienced language learners; see Megens, 2020).
The current study also aimed to explore the relationship between students' self-perception of how their language skills had changed since graduation and the decline/improvements actually measured in their oral productions (RQ5). Our prediction that self-assessed development would not tally with measured development was partially correct. In both FLs, those students who felt their FL speaking skills had improved were fairly accurate in their self-evaluation: they showed gains in lexical diversity and fluency, and improvements in both these areas were greater among these self-rated "improvers" than among those who believed their speaking skills had stayed the same or attrited. Those who felt their language skills had remained the same or had gotten worse were generally right in the sense that they did not show growth in lexical diversity or reduction of disfluency (i.e., they were correct in believing that they had not improved). However, as predicted, they were less accurate in assessing the degree of attrition: particularly in the FL2, those students who felt they had declined strongly did not, on average, show a greater degree of loss between T1 and T2 than those who felt their speaking skills had gotten only slightly worse or even than those who believed theirs had stayed the same. These findings tally with earlier research such as Weltens (1989), Weltens and Grendel (1993), or Murtagh (2003), where self-rating data indicated that subjects at various training levels overestimated the amount of attrition that had occurred. These results may of course indicate nothing more than that these foreign language learners have poor self-assessment skills.
However, it is also conceivable that the reason for this overestimation of attrition lies elsewhere: rather than reflecting the actual (deterioration of) quality of their oral production, the self-analysis of these participants reflected the increased effort and mental strain it took to fulfill the same task some 16 months after graduation, which in itself can be considered an early sign of language attrition (see, e.g., de Bot & Weltens, 1995; Cohen, 1989; Ecke, 2004; Megens, 2011, 2020, on the treatment of compensatory strategies).

Outlook
The present study's results point toward a number of considerations regarding future research activities. The DMM places particular emphasis on the factor of effort, which is necessary not only in building and improving language proficiency and skills but also in maintaining what has already been achieved. So far, very few studies have been dedicated to the topic of individual language maintenance, so there is room for future work, especially from a multilingual perspective, on the role of LME in attrition processes in general, and on the more subjective examination of multilingual users' perception of the effort put into maintenance work in particular. Taking a closer look at self-assessment as a crucial indicator of multilingual development should also be part of future discussions. Self-reported data is generally considered problematic, as participants' responses may reflect what they think the testers want to hear or the image of themselves they would like to project rather than reality. In some cases (e.g., self-assessment of the degree of deterioration/improvement between T1 and T2), however, these self-reported data are interesting precisely because they are subjective and, particularly when contrasted with the harder data (e.g., of speech production quality), allow us to gain insight into phenomena that would otherwise go undetected.
Following the arguments made by de Bot (2011) and Ecke and Hall (2013), this current large-scale quantitative study could be complemented by in-depth qualitative longitudinal studies to obtain more information on the dynamics, complexity, and variability of learners' developing language systems including the temporary or permanent attrition of (foreign) languages. At the same time, we need to work from a research perspective moving from a simplistic picture of language development to a complex understanding of the multilingual mind (Aronin & Jessner, 2015). This implies that we learn to understand a multilingual person as a specific language learner/user who develops differently from mono- and often bilinguals on both the linguistic and the cognitive level.
Accordingly, future studies should explore the emergent nature of multilingual awareness and its role in multilingual development, in particular in foreign language attrition and/or maintenance processes (Megens, 2020) as well as metalinguistic and metacognitive processes that have been evidenced as influential on both the individual and group level. Metacognition, its role in multilingual development and its importance for success in lifelong learning, as pointed out in Jessner (2018), should also be part of future research and discussions.
Finally, from an applied research perspective, it is our hope that (future) language teachers increase their own knowledge of attrition and maintenance processes in order to support their students beyond the language classroom by equipping them with the skills and strategies that can help them better manage and maintain their languages once formal learning has ended, thus better preparing them for life and jobs in an increasingly multilingual society.