The importance of psychological and social factors in adult SLA: The case of productive collocation knowledge in L2 Swedish of L1 French long-term residents

Abstract The study investigates how psychological and social factors relate to productive collocation knowledge in late L2 learners of Swedish (French L1) (N = 59). The individual factors are language aptitude (measured through the LLAMA aptitude test), reported language use, social networks, acculturation, and personality. Multiple linear regression analysis showed positive effects for LLAMA D (phonetic memory), LLAMA E (sound-symbol correspondence), reported language use, and length of residence (LOR), and a negative effect for the personality variable Open-mindedness. These variables explained 63% (adjusted R²) of the variance, which represents large effects compared to other studies on individual factors. The findings confirm earlier results on the importance of language aptitude and language use for productive collocation knowledge. They also add evidence of the importance of personality and LOR. In sum, cognitive and social factors combine to explain different outcomes in adult L2 acquisition.


Introduction
Research has suggested that after some time in the teens, age effects diminish and individual variation in adult L2 learning is more dependent on social and psychological factors (cf. Hyltenstam, 2018). This study aims to contribute to this research by examining what factors best predict language proficiency among French long-term residents in Sweden. Many studies related to the critical period hypothesis have focused on grammatical intuition and different measures of phonology (e.g., Birdsong, 2005; DeKeyser, 2000). However, these studies have rarely looked into a central phenomenon for the advanced second language learner: collocations (e.g., make a decision, perfectly possible). Collocations are conventionalized word combinations that refer to a meaning unit. What they have in common is that they cannot be generated by lexical or grammatical rules and that they contribute to fluent and idiomatic language use.
Research has consistently shown that mastery of collocations correlates with measures of second language proficiency (Gyllstad, 2007; Nizonkiza, 2011). In using a conventionalized word combination, the speaker signals familiarity with, and a sense of belonging to, a specific linguistic community (Wray, 2002). As such, the use of collocations is a means of conforming to social norms and expectations. It is accordingly not unreasonable to assume that cognitive, affective, and social factors could have an effect on the successful acquisition of collocations. This area remains largely unexplored, however, and constitutes the research gap for the present study.
The study includes L2 learners of Swedish who started learning Swedish as adults. They are L1 French voluntary migrants who have spent at least 5 years in the host community, Sweden, but often longer. The participant sample is quite original with respect to mainstream second language acquisition (SLA) because it targets long-term L2 speakers having an L1 with many speakers around the world (French), in a second language setting with a very limited number of speakers in comparison (Swedish). The main research question to be answered is: What psychological and social factors predict productive collocation knowledge in long-term L2 residents?

Collocation knowledge and second language acquisition
Despite the plethora of definitions, most researchers agree that collocations consist of words that occur frequently together in a given language. The present study takes a statistical approach to defining what constitutes a collocation and considers a collocation to be any word combination in which the included words appear together more often than by chance (for details, see "Methods and Procedures") (Paquot & Granger, 2012). Research suggests that productive collocation knowledge (PCK) is the most challenging aspect of L2 vocabulary knowledge (e.g., Laufer & Waldman, 2011; Schmitt, 2014). The difficulty in acquiring collocations in an L2 is assumed to relate to the L2 learner's relative lack of exposure to the target language and to the phenomenon of L1 entrenchment (L1 influence in preferred patterns) (Ellis, 2002, 2006). Massive exposure is therefore crucial to develop collocation knowledge, a theme often discussed within usage-based approaches to SLA (e.g., Ellis & Wulff, 2015).
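The statistical notion of words appearing together "more often than by chance" is commonly quantified with a mutual information (MI) score. The following is a minimal sketch of the standard pointwise MI formula, log2(observed / expected co-occurrence). The function name and all counts are hypothetical illustrations; the exact MI computation and thresholds used for the present test are those described in Prentice and Forsberg Lundell (2021), not necessarily this one.

```python
import math


def mutual_information(pair_freq: int, freq_w1: int, freq_w2: int, corpus_size: int) -> float:
    """Pointwise MI: log2 of observed over expected co-occurrence frequency."""
    if min(pair_freq, freq_w1, freq_w2, corpus_size) <= 0:
        raise ValueError("all counts must be positive")
    # Co-occurrence frequency expected if the two words were independent.
    expected = freq_w1 * freq_w2 / corpus_size
    return math.log2(pair_freq / expected)


# Hypothetical counts, for illustration only (not taken from the study's corpora).
score = mutual_information(pair_freq=200, freq_w1=5_000, freq_w2=8_000, corpus_size=10_000_000)
print(round(score, 2))
```

A score of 0 means the pair occurs exactly as often as chance predicts; positive scores indicate attraction between the two words.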

Collocation knowledge and psychological and social factors
To date, little research has been conducted on what factors best predict collocation knowledge in an L2. Granena and Long (2013) (Spanish L2, Chinese L1) showed that in the late starter group (AO 16-29 years), language aptitude, measured by the LLAMA aptitude test, was a predictor of lexis and collocations and the subtests LLAMA D (sound recognition) and LLAMA E (sound-symbol correspondence) had the strongest effects (LLAMA D, r = .46, LLAMA E, r = .36). A similar result was found by Forsberg Lundell and Sandgren (2013), who investigated the relationship between PCK and aptitude and personality in a small sample of Swedish L1 users of L2 French (N = 13). Just like Granena and Long (2013), they found an association with LLAMA D (r = .58). In addition, they found that PCK was correlated with two dimensions of the Multicultural Personality Questionnaire (MPQ), namely Open-mindedness and Cultural Empathy. This latter result indicates that not only aptitude may be relevant for collocation knowledge but perhaps also other individual factors such as personality. González-Fernández and Schmitt (2015) also focus on PCK. Their study included 108 Spanish L1 English L2 participants of different proficiency levels, who had learned English for 13.67 years on average. The study showed that PCK was correlated with amount of everyday language exposure (r = .56). The importance of language exposure for collocational knowledge is also investigated by Dąbrowska (2019). She compares knowledge of grammar, vocabulary, and collocations in a group of English L1 speakers (N = 90) and a group of English L2 speakers (N = 67). Besides investigating their performance within the mentioned linguistic domains, she also measured the effect of individual differences in both groups. Print exposure was an important predictor for collocation knowledge, both in L1 and L2 speakers. However, a regression analysis showed that "everyday language use" was by far the strongest predictor of collocation knowledge, explaining 36% of the variance.
In the present study, language aptitude and language use will be included as primary factors, given their importance in earlier research. However, in view of the scarcity of quantitative research on individual factors and collocations, it is worthwhile including a few other factors that have yielded effects on other L2 proficiency domains.
As stated above, social integration may be important for successful acquisition of formulaic language (Dörnyei et al., 2004). Social integration is a complex phenomenon, but it is reasonable to assume that it could relate to variables such as social networks (cf. Dollmann et al., 2020) and acculturation, that is, cultural affiliation (Ryder et al., 2000). It could also be related to personality because, as Kormos (2013) notes, personality can be a decisive factor for creating opportunities for language use. In a study by Ożańska-Ponikwia and Dewaele (2012), the personality trait Openness to Experience was the strongest predictor of self-perceived proficiency of L2 English in a migratory setting. The importance of Openness and Open-mindedness is generally confirmed by Moyer (2021) in her overview of gifted language learners. Collocations, because they are typically nativelike, could be a means for and result of social integration and thus linked to the aforementioned factors.
To summarize, research on collocations and individual factors suggests an effect of language aptitude and language use, but findings from naturalistic settings point to the importance of exploring other variables.

Research Questions and Hypotheses
The research question of this study is: To what extent do the following factors predict productive collocation knowledge?
This question relates to the five psychological and social factors investigated in this study (see Table 1 for the correspondence between the investigated factors and their operationalization as independent variables in the statistical analysis). The study also includes two extraneous variables (i.e., variables that are not the focus of the investigation, but that can potentially affect the dependent variable), namely length of residence (LOR) and length of Swedish studies.
Based on the previous research, we propose the following hypothesis: PCK will be related to language aptitude (LLAMA) given its importance for collocation knowledge (Forsberg Lundell & Sandgren, 2013; Granena & Long, 2013). It will also be related to target language use (language engagement), based on the results from González-Fernández and Schmitt (2015) and Dąbrowska (2019).

Participants
The present sample included 64 French L1 Swedish L2 speakers, but 5 participants had to be excluded (see the "Data Analysis" section for more information). The remaining 59 participants consisted of 35 women and 24 men. Their mean age was 41.59 years (SD = 9.13), ranging from 27 to 71 years. Their mean LOR was 13.20 years, ranging from 5 to 50 years. All of them had finished upper secondary education in France before coming to Sweden. The participants were carefully selected based on the following sociobiographic criteria:
1. They had French as their main L1 (bilinguals from birth were not excluded unless Swedish was the other L1).
2. They had finished upper secondary education.
3. They had started learning the Swedish language no earlier than 12 years of age, to target postcritical period learners.
4. They had spent at least 5 years in Sweden.
Recruitment of participants relied on convenience sampling. In a first phase, participants were recruited through the Facebook groups Les Français à Stockholm (French people in Stockholm) and French connection. A nonnegligible portion of the participants were also recruited through snowball sampling.
The initial aim was to collect data from more than 64 participants, but this turned out to be impossible due to financial constraints. While a larger sample would have been desirable, we would like to emphasize the value of the present dataset, given the relative scarcity of data on this category of participants in SLA research (long-term residents, L2 Swedish). No power analysis was conducted before recruiting participants. Instead, the aim was to recruit as many participants as possible during the project phase.

Productive collocation knowledge
The L2 Swedish PCK test used in the present study has been validated in a prior study (Prentice & Forsberg Lundell, 2021). The test targets verb + noun collocations, such as ställa en fråga (Eng. pose a question). The test was developed based on Gyllstad (2007) for item selection. The items were extracted from newspaper corpora in the Swedish language bank (https://spraakbanken.gu.se) and items were selected based on MI scores and frequency (for details, see Prentice & Forsberg Lundell, 2021). The test had a fill-in-the-gap format. Participants were asked to supply the verb; the first letter of the verb was provided to limit the number of possible alternatives. For example: GP blir det första av de utländska medierna som får chansen att s__________ en fråga på presskonferensen. "GP [Göteborgs Posten] is the first of the foreign media getting a chance to p__________ a question at the press conference." Items were scored dichotomously (1 or 0). Besides only accepting alternatives that constitute a clear collocation (according to MI threshold and frequencies explained in Prentice & Forsberg Lundell, 2021), spelling mistakes were allowed (such as *stella instead of ställa) because they do not interfere with collocation knowledge.

Sociological questionnaires (independent variables)
The Language Engagement Questionnaire (LEQ) (McManus et al., 2014) measures language use and was developed by the LANGSNAP-project (https://langsnap.soton.ac.uk/). Participants were asked to indicate how often they carry out 23 activities in the target language, including both passive and active language use. The six response options ranged from "never" to "every day," which were then coded with numerical values ranging from 0 (never) to 5 (every day). In this study, "language engagement" was operationalized as the average of the 23 responses.
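The operationalization of language engagement can be sketched as follows. This is an illustrative reading of the description above (mean of 23 responses coded 0 to 5), not the project's own scoring script; the function name and the example responses are hypothetical.

```python
N_ITEMS = 23               # number of activities in the LEQ, per the description above
MIN_CODE, MAX_CODE = 0, 5  # 0 = "never", 5 = "every day"


def language_engagement(responses: list[int]) -> float:
    """Return the mean of the 23 coded responses (possible range 0 to 5)."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(not MIN_CODE <= r <= MAX_CODE for r in responses):
        raise ValueError("responses must be coded from 0 to 5")
    return sum(responses) / len(responses)


# Hypothetical participant: 10 daily activities, 8 mid-scale, 5 never carried out.
example = [5] * 10 + [3] * 8 + [0] * 5
print(round(language_engagement(example), 2))
```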
The Social Network Questionnaire (SNQ) (ibid.) provides detailed information about the number of people included in the participant's social networks in the target community, how they interact with these people, and in what languages. The social network variable used in this study is a numerical value that represents the number of people with whom the participant regularly interacts (only) in L2 Swedish.

Psychological tests and questionnaires (independent variables)
The LLAMA aptitude test (Meara, 2005) is one of the most recently developed language aptitude tests and has been widely used (e.g., Abrahamsson & Hyltenstam, 2008; Granena & Long, 2013). The test measures language aptitude with respect to vocabulary learning (LLAMA B), sound recognition (LLAMA D), sound-symbol correspondence (LLAMA E), and grammatical inferencing (LLAMA F).
The VIA Acculturation Questionnaire (Ryder et al., 2000) consists of 10 items assessing migrants' heritage culture attachment (VIA France) and 10 items assessing their host culture attachment (VIA Sweden). Participants were asked to express their liking for typical values, traditions, and practices for each culture on a 9-point Likert scale ranging from 1 (disagree) to 9 (fully agree).
The Multicultural Personality Questionnaire (MPQ)-Short Form (Van der Zee et al., 2013) measures an individual's potential to function in a new cultural environment. It is based on the five-factor model but has been adapted for the purpose of testing multicultural effectiveness. It measures personality along five dimensions:
• Cultural Empathy: the ability to empathize with cultural diversity and to understand feelings, beliefs, and attitudes different from one's own heritage.
• Open-mindedness: an open, unprejudiced attitude toward diversity.
• Social Initiative: the tendency to approach social situations actively, to take the initiative and engage in social situations.
• Flexibility: the ability to learn from new experiences, including adjusting behavior according to contingency and enjoying novelty and change.
• Emotional Stability: the tendency to remain calm in stressful situations and to control emotional reactions.
In addition, participants filled in a sociobiographic questionnaire based on Moyer (2004). For the purpose of the present study, the information regarding LOR and length of Swedish studies was used. Table 1 contains an overview of the factors and the corresponding variables in the statistical analysis, as well as the instrument used. Cronbach's alpha was used to calculate a measure of reliability in the cases in which the tests were compatible with this type of analysis (see Table 2). The MPQ is divided into five dimensions and the VIA questionnaire into two dimensions, and reliability coefficients were calculated for all these. It should be noted that the internal consistency for the productive collocation test is very high (0.96), whereas some of the MPQ dimensions (Cultural Empathy, Open-mindedness, and Emotional Stability) do not have very good internal consistency and results related to these dimensions should be interpreted with extra caution.
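For readers less familiar with the reliability coefficient mentioned above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the toy data are hypothetical; the study's own coefficients were presumably computed in R and are reported in Table 2.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """scores[i][j] is the score of respondent i on item j."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (three respondents, two items), for illustration only.
toy = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy), 2))
```

Values close to 1 indicate that the items vary together across respondents, which is the sense in which the collocation test (0.96) is described as highly internally consistent.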

Procedures
The data collection process was undertaken by the first and second authors, who met in person with each participant in Stockholm during 2019 and 2020. The whole session took 1.5-2 hours. The PCK test and the aptitude test were administered first because they were deemed to be more cognitively demanding than the others, and we wanted to make sure that fatigue was not an issue when performing these tests.

Data analysis
Recent recommendations from the American Statistical Association highlight the problems with significance testing, for example, the problem with deciding on whether a variable has an effect or not based on whether a p-value is above or below .05 (Wasserstein et al., 2019). These types of recommendations have also been discussed within the SLA domain by Larson-Hall and Plonsky (2015). In line with these recommendations, we will focus on estimating effect sizes and discussing the uncertainty in our measurements, rather than deciding that an effect is "significant" or not based on a p-value. Furthermore, we aim to make the effect sizes meaningful by describing their effects in terms of how much each variable needs to change to increase the PCK score by 1 SD.
To answer the research questions, two multiple linear regressions were conducted with PCK as the dependent variable. Multicollinearity was likely not a problem in either of the models, as indicated by the variance inflation factors (VIF) that were below 4 and the tolerance levels that were above .3, for all variables. Specifically, the mean VIF in Model 1 was 1.85, and in Model 2 it was 1.30. Also, inspection of Q-Q plots indicates that residuals in both models were approximately normally distributed. Finally, individual scatterplots and diagnostic plots (e.g., plots of residuals vs. fitted values) were inspected to verify that a linear model was applicable to the data. All analyses were conducted in the statistical software R version 4.1.1 (R Core Team, 2020).
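The two multicollinearity diagnostics mentioned above are reciprocals of each other: for a given predictor, VIF = 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors, and tolerance = 1 / VIF. The sketch below only illustrates this relationship; the function names are hypothetical, and the tolerance values it prints correspond to the reported mean VIFs, not to any individual variable.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """VIF of a predictor, given R² from regressing it on the other predictors."""
    if not 0 <= r_squared < 1:
        raise ValueError("R² must lie in [0, 1)")
    return 1 / (1 - r_squared)


def tolerance_from_vif(vif: float) -> float:
    """Tolerance is the reciprocal of the VIF."""
    if vif < 1:
        raise ValueError("a VIF cannot be below 1")
    return 1 / vif


# Tolerance corresponding to the mean VIFs reported for Model 1 and Model 2.
for label, mean_vif in [("Model 1", 1.85), ("Model 2", 1.30)]:
    print(label, round(tolerance_from_vif(mean_vif), 2))
```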
In total, five participants were excluded from the analyses. Four participants were excluded for missing values in one or more variable(s). The final participant was excluded after inspection of a plot of residuals versus fitted values from the regression analyses revealed that the participant was a multivariate outlier.

Results
Because the present research is largely exploratory, we present two models. First, in Model 1, all independent variables were included to explore their respective effect on PCK. Thereafter, we present Model 2 which only includes the variables that Model 1 indicated had a meaningful effect on PCK. That is, to include a variable in Model 2, we considered both the size of the effect and the width of the confidence intervals. Specifically, the confidence interval had to be narrow enough to indicate with some certainty, that the effect exists in the population, and the size of the effect had to be large enough to be meaningful for the understanding of PCK. See Table 3 for the results of both models.
Due to the relatively low number of participants in the present study, the confidence intervals are fairly broad, meaning that there is uncertainty about the size of the effects. See Table 4 for means, standard deviations, and ranges of all the variables, and Table A1 in the appendix for a full correlation matrix between all variables.

Model 1 and 2
When we included all 15 variables in Model 1, it explained 63% of the variance in PCK (adjusted R² = .63). Building on Model 1, we propose a more compact model in which we only included the 5 variables that we judged had a meaningful effect on PCK. Although Model 2 included 10 fewer variables, it still explained 63% of the variance (adjusted R² = .63). Thus, both Model 1 and Model 2 explained a large amount of the variance according to Plonsky and Ghanbar's (2018, p. 724) categorization. However, given the use of fewer variables, Model 2 is a better model of what factors are important to develop PCK.
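Why two models with very different numbers of predictors can share the same adjusted R² follows from the standard adjustment formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), which penalizes each additional predictor. The sketch below is an illustration under the assumption that this standard formula applies and that the reported values (n = 59, 15 and 5 predictors, adjusted R² = .63) are exact; the unadjusted R² values it derives are implied by the formula and are not reported in the study.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R² for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def implied_r2(adj_r2: float, n: int, p: int) -> float:
    """Invert the adjustment: unadjusted R² implied by an adjusted R²."""
    return 1 - (1 - adj_r2) * (n - p - 1) / (n - 1)


N = 59  # participants in the final sample
for label, predictors in [("Model 1", 15), ("Model 2", 5)]:
    print(label, round(implied_r2(0.63, N, predictors), 2))
```

Under these assumptions, the larger model needs a noticeably higher unadjusted R² to reach the same adjusted value, which is consistent with the judgment that the ten dropped variables add little explanatory power.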
In the following text we present the effect of each variable together with the respective research question. To make the effect sizes more meaningful, we present them in terms of how much each variable has to change for the PCK score to increase by 1 SD (9.78 in the current sample).
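The conversion described here is a simple division: the change in a predictor needed to raise PCK by 1 SD is the SD of PCK divided by that predictor's unstandardized coefficient b. A minimal sketch, using the SD (9.78) and a selection of the Model 1 coefficients reported in this section; the function name is hypothetical.

```python
PCK_SD = 9.78  # standard deviation of the PCK score in the current sample


def change_for_one_sd(b: float, sd: float = PCK_SD) -> float:
    """Change in a predictor needed to raise PCK by 1 SD.

    A negative result means the predictor has to decrease.
    """
    if b == 0:
        raise ValueError("a coefficient of zero implies no effect")
    return sd / b


# Model 1 coefficients as reported in the text.
model_1 = {
    "LLAMA D": 0.17,
    "LLAMA E": 0.18,
    "Language engagement": 3.67,
    "Open-mindedness": -5.06,
    "LOR": 0.72,
}
for name, b in model_1.items():
    print(f"{name}: {change_for_one_sd(b):.2f}")
```

Comparing each result with the range of the corresponding scale is what allows the size of an effect to be judged: a required change larger than the scale itself indicates an effect too small to be meaningful.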
To what extent does language aptitude predict PCK? Model 1. Language aptitude had an effect on PCK, but not all of its components. Specifically, Model 1 indicates that, to raise PCK by 1 SD, LLAMA D (b = .17) has to increase by 57.53 (i.e., 9.78/.17). Similarly, LLAMA E (b = .18) has to increase by 54.33. That is, although the model indicates that these two variables impact PCK, the effects are small because an individual would have to move almost the whole range of LLAMA D to increase PCK by 1 SD, and more than half the measure for LLAMA E. Nevertheless, the variables are still important for understanding PCK.
Meanwhile, LLAMA B (b = .06) and LLAMA F (b = .04) had no meaningful effects on PCK. To increase PCK by 1 SD they would have to increase by 163.00 and 244.50, respectively. That is, the effects are so small that a change of more than the scale ranges is required for PCK to increase by 1 SD.
Model 2. The effects of LLAMA D (b = .17) and E (b = .20) remained almost the same in Model 2. That is, PCK increases by 1 SD for every 57.53 points on LLAMA D and for every 48.90 points on LLAMA E. Thus, Model 2 indicates that an individual's ability both to recognize sounds and to make sound-symbol connections is important for PCK.
To what extent does reported language use predict PCK? Model 1. An individual's language engagement (b = 3.67) may have a positive effect on PCK. The Language Engagement Questionnaire (LEQ) ranges from 0 (lowest) to 5 (highest), and an increase of 2.66 is required to increase PCK by 1 SD. Although the size of the effect was not large, it is not unimportant.
Model 2. The size of the effect remained largely the same in Model 2 (b = 3.90): PCK increases by 1 SD for every 2.51 points. In other words, the model indicates that engaging with the L2 is beneficial for PCK.
To what extent do social networks predict PCK? Model 1. The number of L1 speakers in the L2 user's social network does not have a meaningful effect on PCK according to Model 1 (b = -.30). In fact, Model 1 indicates that for PCK to increase by 1 SD, the number of L2 relations has to decrease by 32.60. This effect is both small and unlikely. Thus, it is more likely that there is no effect on PCK and that the small negative effect is due to the imprecision of the measurements.
To what extent does acculturation predict PCK? Model 1. Both the VIA Sweden and VIA France scales range from 1 (lower) to 9 (higher). Model 1 indicates that acculturation has no meaningful effect on PCK. Specifically, VIA Sweden (b = .51) has to decrease by 19.18 to increase PCK by 1 SD. Similarly, VIA France (b = .79) has to increase by 12.38. That is, to increase PCK by 1 SD, the VIA measures need to change by more than their scale ranges.
To what extent does multicultural effectiveness predict PCK? Model 1. Each of the five measures in the Multicultural Personality Questionnaire (MPQ) ranges from 1 (lowest) to 5 (highest). Model 1 indicates that out of the five personality measurements, only Open-mindedness had a meaningful effect on PCK, and the effect was negative. Namely, Open-mindedness (b = -5.06) has to decrease by 1.93 to raise PCK by 1 SD.
For the remaining four measures, a change of more than the range of the MPQ scale would be needed to increase PCK by 1 SD. To raise PCK by 1 SD, Cultural empathy has to increase by 44.45, Flexibility has to decrease by 13.58, Social initiative has to decrease by 5.59, and finally, Emotional stability would have to decrease by 5.43.
Model 2. The effect was similar in size in Model 2 (b = -5.75), that is, for every 1.70 points Open-mindedness decreases, PCK will increase by 1 SD. Thus, the effect was not large, but it is still important for PCK. However, please note that the participants mainly used the higher part of the scale (M = 3.75). Only four participants scored below the midpoint of the scale (3) and these four all scored 2.88. That is, the negative effect of being Open-minded on PCK may only hold when comparing highly Open-minded individuals to those moderately open-minded.
Length of residence and length of Swedish studies Model 1. While LOR had an effect on PCK (b = .72), length of L2 Swedish studies did not (b = .41). Specifically, to increase PCK by 1 SD, an individual would have to reside in the host country an additional 13.58 years. Meanwhile, an individual would have to study the L2 for another 23.85 years. Given that the average participant in our sample had studied Swedish for 1.18 years, we judge that studying the L2 has only a very small effect, if any, on PCK.
Model 2. The size of the effect of LOR on PCK was smaller in Model 2 (b = .56). Specifically, Model 2 indicates that PCK will increase by 1 SD for every 17.47 years. Nevertheless, Model 2 still indicates that LOR is important for an individual's PCK.

Discussion and conclusion
The present study set out to investigate which factors best explain individual variation in long-term L2 users of Swedish (L1 French) with respect to productive collocation knowledge. The sample included 59 participants (F = 35, M = 24, mean age at testing 41.7 years, mean LOR 13.20 years).
It was hypothesized that both language aptitude and language engagement would be important predictors of PCK. The remaining variables were exploratory. Two multiple regression analyses were conducted to investigate which of these factors best predicts PCK. Model 1 included all the variables and explained 63% of the variance. Model 2 included only the variables that, given Model 1, seemed to have a noticeable effect. These were LLAMA D (sound-recognition), LLAMA E (sound-symbol correspondence), language engagement, MPQ Open-mindedness, and LOR. Model 2 also explained 63% of the variance, in spite of including a much smaller number of variables. In relation to results from regression analyses in the field of SLA in general, 63% is to be considered a large effect and thus quite a robust model. Due to the limited sample size and the resulting imprecision of the measurements, it is difficult to say exactly which of the factors has the largest impact; however, the beta-values suggest that language engagement, LLAMA E, and LOR have the strongest effects among all the variables (see Table 3). Interestingly, this confirms the initial hypothesis and resonates with the results from Granena and Long (2013) regarding language aptitude (LLAMA D and LLAMA E) and with those of González-Fernández and Schmitt (2015) and Dąbrowska (2019) for language use and experience. Furthermore, because both LOR and language engagement were important predictors, the data lends strong support to usage-based theories' assumptions of the importance of frequency effects in language acquisition (e.g., Ellis, 2002).
However, the study also shows, in accordance with a multifactorial approach as proposed by the Douglas Fir Group (2016) and by Alene Moyer (2004, 2021), that frequency of input and language engagement alone cannot explain learning outcome. The present study indeed shows that a psychological factor such as language aptitude is important. In addition, another psychological factor was also part of Model 2: Open-mindedness. Interestingly enough, however, the relationship was negative in this case. It was mentioned already in the results section that, when interpreting this result, we need to consider the fact that the large majority of our participants report values from 3-5 (max 5) and the lower values on the scale are not represented. A tentative interpretation would thus be that being extremely open-minded is negative for mastery of collocations, but not necessarily that being clearly close-minded is a facilitating factor. The results are thought-provoking in comparison to earlier findings on the role of personality in SLA where Openness to Experience and Open-mindedness are consistently reported as positively associated with language learning (Moyer, 2021; Ożańska-Ponikwia & Dewaele, 2012). Having conducted fieldwork with the included participants and based on the sociobiographic questionnaire, we know that some of our learners use English as a lingua franca on a daily basis. Some of these participants display cosmopolitan language ideologies and classify themselves as highly "open-minded." However, these participants typically attain only basic levels in Swedish. They reflect an international posture and one could presume that an unexpected "side-effect" of reporting to be very open-minded could be a lesser propensity to learn the local language.
It is thus possible that the negative effect of open-mindedness is not specifically related to mastery of collocations, but to language proficiency in general, in a situation in which the target language competes with the global lingua franca English. This finding requires further research into personality traits and their connections to SLA. It also suggests relationships between personality and ideological positions, which could be further explored.
All in all, the study lends support to earlier findings on the role of both language engagement and aptitude as important explanatory factors for high-level L2 proficiency and collocation knowledge in particular. More generally, it suggests that a multifactorial approach is necessary when accounting for second language proficiency in a context of mobility and migration.
A limitation of the study is the sample homogeneity and size. Nevertheless, the present study is the first to investigate the impact of multiple factors on PCK in long-term L2 users. In addition, it is rare in that it targets an L2 that competes with a global lingua franca. It is our hope that it will motivate similar studies in a multitude of L2 user contexts.