Introduction
Words are the building blocks of language, and the brain is sensitive to diverse information encoded in individual words (Ellis & Ogden, 2017), including statistical (Brysbaert et al., 2018), phonological (Yates, 2005), semantic (Kissler et al., 2006), and orthographic cues (Jalbert et al., 2011) in first language (L1) processing. A key aspect of semantics is emotional content. Recent studies have examined the processing advantage of emotion words over neutral words, focusing on two affective dimensions: valence (positivity or negativity) and arousal (emotional intensity) (Barrett & Russell, 1998; Citron et al., 2014). In psycholinguistic research, debate remains over whether positive and negative valence exert similar (Balota et al., 2007) or distinct effects (Estes & Adelman, 2008; Kousta et al., 2009) on the word processing of L1 speakers. Findings are also mixed on whether processing is modulated by arousal (Kazanas & Altarriba, 2016), linguistic features (e.g., word length) (Larsen et al., 2006), or task type (Liu et al., 2023). For second language (L2) speakers, it is unclear whether their processing advantage is reduced (Iacozza et al., 2017; Toivo & Scheepers, 2019) or comparable to that of L1 speakers (Ayçiçegi-Dinn & Caldwell-Harris, 2009). Moreover, how psycholinguistic factors affect emotion word processing in L2 speakers at different proficiency levels, and how these patterns diverge from L1 processing, remains underexplored.
To bridge these gaps, this meta-analysis compared the processing advantages of emotion words over neutral words in L1, intermediate L2, and advanced L2 speakers. We further examined whether language processing involves valence-sensitive mechanisms, as proposed by the Automatic Vigilance Theory (Pratto & John, 1991), or instead reflects general motivational engagement with both positive and negative stimuli, as suggested by the Model of Motivated Attention and Affective States (Lang et al., 1997). The moderating effects of arousal, word type, linguistic variables (frequency, word length, phonological and orthographic neighborhood sizes), and task type were also investigated. The findings will shed light on how embodied experience shapes word representation across varying levels of language proficiency.
Literature review
Emotion word processing for L1 and L2 speakers
Emotion appears to be automatically activated during L1 processing (Kissler et al., 2007; Kousta et al., 2009), but the mechanisms and extent of such activation in L2 remain debated. Behavioral studies using emotional priming (Kazanas & Altarriba, 2016), word recall (Altarriba & Basnight-Brown, 2011), emotional Stroop (Eilola et al., 2007), and lexical decision tasks (Conrad et al., 2011; Ferré et al., 2018) suggest that L2 speakers also automatically activate emotional information. For instance, Ayçiçegi-Dinn and Caldwell-Harris (2009) found comparable emotional activation in L1 and L2 speakers in a recall task. However, other studies have yielded contrasting findings. In a Stroop paradigm, Winskel (2013) observed significant emotion effects on word processing in L1 but not L2 speakers. Using eye-tracking, Tang and Ding (2024) found that, unlike L1 readers (e.g., Knickerbocker et al., 2015), L2 readers showed no processing advantage for emotion words during sentence reading. Physiological data further revealed stronger pupil dilation in L1 than in L2 speakers during the processing of emotional information, consistently observed across both word recognition (Toivo & Scheepers, 2019) and sentence reading tasks (Iacozza et al., 2017).
These divergent findings may be explained by the Emotional Contexts of Learning Theory (Caldwell-Harris et al., 2006), which posits that L2 emotion word processing is shaped by learning context. Limited emotional engagement in L2 learning leads to weaker embodiment and reduced emotional activation compared to L1 (Pavlenko, 2012), with activation further moderated by variation in language use and acquisition environment (Ahn & Jiang, 2023; Tang & Ding, 2024). Importantly, L1 and L2 speakers also differ in the cognitive resources available for emotional language processing. Evidence from the Ghent Eye-Tracking Corpus (Cop et al., 2017), where participants read entire novels in both L1 and L2, showed that processing emotionally rich narratives in L2 imposes greater cognitive demands than in L1. This aligns with multiple theoretical frameworks. According to the Shallow Structure Hypothesis (Clahsen & Felser, 2006), L2 speakers process linguistic information in a shallower and less detailed way due to limited working memory resources. Likewise, the Lexical Quality Hypothesis (Perfetti, 2007) suggests that L2 speakers’ less robust and integrated lexical representations increase cognitive load and hinder activation of emotional, semantic, and syntactic information. Therefore, L2 speakers have fewer cognitive resources left for emotional information than L1 speakers, and smaller emotion effects are expected in L2 processing (Imbault et al., 2021). Finally, L1 and L2 speakers may differ in emotional representation. The Revised Hierarchical Model (Kroll & Stewart, 1994) posits that L2 speakers access concepts via L1 translation rather than direct lexical-conceptual links. This limits their ability to effectively activate emotional content during processing.
Emotional factors that may moderate the processing advantages of emotion words
Valence
Emotion words such as joy, sadness, and surprise are categorized by valence as positive or negative. Valence plays a central role in emotion word processing (Blackett & Harnish, 2022), and two existing models offer competing accounts of its effects.
The Automatic Vigilance Theory (Pratto & John, 1991) proposes that negative stimuli receive sustained attentional processing because of their survival relevance, resulting in humans’ slower responses to negative relative to positive or neutral stimuli. Valence thus exerts an independent moderating effect beyond other psycholinguistic factors. Supporting this theory, studies with L1 speakers have shown processing disadvantages for negative words across Stroop (Winskel, 2013), lexical decision (Estes & Verges, 2008), and naming tasks (Algom et al., 2004). A large-scale study by Kuperman et al. (2014) found a monotonic effect of valence on lexical decision times, with more negative words eliciting slower responses. Similar patterns have been observed in L2 speakers: negative words often show processing disadvantages in behavioral (Ferré et al., 2018), eye-tracking (Sheikh & Titone, 2016), and neuroimaging studies (Jończyk, 2016). However, vigilance effects may be weaker in L2 due to reduced emotional grounding or embodied experience (Ferré et al., 2018; Pavlenko, 2012). For instance, Jończyk (2016) reported delayed and attenuated neural responses to negative words in L2 compared to L1 speakers.
In contrast, the Model of Motivated Attention and Affective States (Lang et al., 1997) argues that attention is drawn to motivationally relevant stimuli regardless of valence, granting both positive and negative words processing advantages over neutral ones. Knickerbocker et al. (2015) found that both types facilitated L1 reading. The model attributes prior differences between positive and negative word processing to uncontrolled psycholinguistic variables. For example, Vinson et al. (2014), using the British Lexicon Project, showed that after controlling for variables like concreteness, frequency, and neighborhood size, emotion words, whether positive or negative, were processed faster than neutral ones. Similarly, Larsen et al. (2006) and Kousta et al. (2009) reported that the processing disadvantages of negative words disappeared when controlling for word length, frequency, and orthographic neighborhoods.
Arousal
Arousal refers to the intensity of emotional activation. The dimensional model posits that arousal is an independent emotional dimension, distinct from valence, and may engage separate cognitive mechanisms in language processing (Barrett & Russell, 1999). Norming studies have identified a U-shaped relationship between valence and arousal across modalities, including affective pictures (Kurdi et al., 2017), single words (Stadthagen-Gonzalez et al., 2017), and extended verbal stimuli like idioms and sentences (Citron et al., 2016; Morid & Sabourin, 2024; Zhong et al., 2025). This pattern suggests that stimuli with stronger emotional valence, whether positive or negative, are typically associated with higher arousal levels than neutral stimuli. Previous studies have also reported mixed findings regarding the arousal effects on behavioral performance and neural representation in language processing among L1 speakers (Estes & Adelman, 2008; Kousta et al., 2009; Kuperman et al., 2014; Larsen et al., 2008). Estes and Adelman (2008), for instance, found that arousal positively predicted English word recognition speed, with high-arousal words recognized faster than calming ones. In contrast, Kuperman et al. (2014) found that high-arousal words were recognized more slowly than calming ones, though the effect was minimal (0.1% variance) in a dataset of 12,658 English words. Similarly, Kousta et al. (2009), analyzing 1,446 words, reported no significant arousal effect on reaction times when valence was controlled. Emotional arousal also affects L1 and L2 processing differently. Altarriba and Canary (2004) examined the arousal effects of emotion words in both English L1 speakers and Spanish L2 speakers of English using a lexical decision task. The results showed positive priming effects for both groups under arousal conditions compared to a baseline condition, but L2 speakers exhibited longer latencies compared to L1 speakers.
Emotion word type
Emotion words fall into two types: emotion-label words (e.g., happy, fearful), which directly denote emotional states, and emotion-laden words (e.g., coffin, murder), which imply emotions without explicit labeling (Pavlenko, 2008). The Emotion Duality Model (Imbir et al., 2019; Tang & Ding, 2024) posits that emotion-label words trigger automatic, biologically rooted responses, while emotion-laden words require mediated conceptual access and greater cognitive effort (Altarriba & Basnight-Brown, 2011). Previous studies have reported varying findings on the processing of emotion-label and emotion-laden words in L1 speakers. Zhang et al. (2020) revealed that emotion-label and emotion-laden words activated different cortical responses and demonstrated neural dissociation, while Martin and Altarriba (2017) and Vinson et al. (2014) found no significant word type effects. In L2 processing, Kazanas and Altarriba (2016) reported that Spanish-English bilinguals showed a larger priming effect for emotion-label than emotion-laden words in their L1, but not in their L2, in a lexical decision task. However, in emotion categorization tasks (Tang et al., 2023), L2 speakers exhibited higher accuracy and shorter reaction times for emotion-label words compared to emotion-laden words. Additionally, Altarriba and Basnight-Brown (2011) found that L1 and L2 speakers differed in word type-valence interactions in an Affective Simon Task. L1 speakers showed valence-color congruency effects only for negative emotion-label words, while L2 speakers exhibited expected effects for both word types, regardless of valence.
Linguistic factors that may moderate the processing advantages of emotion words
Linguistic factors such as concreteness, frequency, length, and orthographic or phonological neighborhood size are known to influence general word processing (Ferré et al., 2018; Kuperman et al., 2014). These linguistic factors also moderate emotion word processing (e.g., Kousta et al., 2009; Larsen et al., 2006; Vinson et al., 2014), as reviewed below.
Concreteness
Concrete and abstract words differ in emotional connotation. According to Vigliocco et al.’s (2009) Embodied Theory, concrete words arise from external sensory-motor experiences, while abstract words relate to internal states. Norming studies have shown that more abstract words are generally rated as more positive (Hinojosa et al., 2016; Kousta et al., 2011) and more arousing (Guasch et al., 2016; Vigliocco et al., 2014). For L1 word processing, some studies reported that concrete emotion words were processed more quickly and accurately in recall and lexical decision tasks (Ferré et al., 2018) and elicited earlier ERP components (Palazova et al., 2013) compared to abstract ones. However, when attention focuses on emotional content, abstract words may show a processing advantage over concrete words. Yao and Wang (2014) found that abstract positive words elicited faster responses in emotional priming tasks. Similarly, Jin et al. (2023) reported that abstract words more effectively captured attention and enhanced emotional evaluation than concrete words in an emotion recognition task, as reflected in ERP data. Studies have also reported that concreteness differentially modulates emotion-related effects in L1 and L2 speakers. For instance, valence effects were found to be stronger for abstract words than for concrete words in L1 speakers (Palazova et al., 2013), but stronger for concrete words in L2 speakers (Ferré et al., 2018). This contrast may reflect disembodied cognition (Pavlenko, 2012) and acquisition order (Jones & Mewhort, 2007). L2 speakers tend to have less emotional grounding than L1 speakers. As a result, concrete words, acquired earlier and tied to sensory experience, may evoke stronger emotional responses than later-learned, abstract words in L2 speakers.
Frequency
Language acquisition is shaped by repeated exposure to language, and language speakers are sensitive to the frequency of linguistic units (Ellis & Ogden, 2017). High-frequency words are processed more efficiently (Balota et al., 2004; Ellis, 2002), a pattern also observed with emotion words (Larsen et al., 2006). Frequency also moderates the effects of other psycholinguistic factors such as valence and arousal (Kuperman et al., 2014). From a cognitive perspective, low-frequency words may allow deeper processing, thereby enabling psycholinguistic variables to exert their effects over the course of processing. In contrast, some studies reported stronger valence effects for high-frequency words, particularly in lexical decision and eye-tracking tasks (e.g., Scott et al., 2009).
Length
Few studies have directly examined word length effects on emotion word processing. Nevertheless, some norming studies have explored the link between word length and emotionality. Larsen et al. (2006), for instance, found that emotion words tend to be longer. Beyond emotion words, length effects are well-documented, with longer words generally processed more slowly than shorter ones (e.g., Baddeley et al., 1975). Yet, this relationship may not be strictly linear. Using English Lexicon Project data (Balota et al., 2002), New et al. (2006) identified a U-shaped effect: 5–8-letter words yielded faster responses compared to 3–4-letter or 9–13-letter words. Similar findings were reported in Balota et al. (2007). Yap and Balota (2009) later attributed this U-shaped trend to the optimal perceptual span for medium-length words (6–9 letters).
Orthographic and phonological neighborhood sizes
Language processing is influenced by the activation of similar lexical representations, measured by orthographic and phonological neighborhood sizes. Orthographic neighborhood size refers to the number of words formed by changing one letter while maintaining letter positions (Coltheart et al., 1977). Larger orthographic neighborhoods facilitate visual word recognition (Forster & Shen, 1996) and recall (Jalbert et al., 2011). A meta-analysis of 32 studies by Larsen et al. (2006) found that emotion words had significantly smaller orthographic neighborhoods than control words. In word processing, larger orthographic neighborhoods are generally associated with faster response latencies in behavioral tasks (Forster & Shen, 1996), possibly due to stronger semantic connectivity in memory (Larsen et al., 2006). Phonological neighborhood size, the number of words differing by one phoneme (Luce & Pisoni, 1998), also affects language processing. Yates (2005) found a processing advantage for words with larger phonological neighborhoods across naming, lexical decision, and categorization tasks, whereas other studies reported inhibitory effects of larger phonological neighborhoods in lexical decision and naming tasks (Luce & Pisoni, 1998).
The role of task type in the processing advantages of emotion words
Over the past decades, various tasks, including Stroop color-naming, lexical decision (Estes & Verges, 2008), word naming (Algom et al., 2004), and eye-tracking during reading, have been used to investigate emotion word processing. These tasks fall into three categories: (1) Emotional tasks (e.g., emotion judgment task) direct attention to emotional content, requiring participants to categorize words by valence (Liu et al., 2023); (2) Irrelevant tasks (e.g., Stroop color-naming task) focus on non-emotional, non-linguistic features like color, ignoring emotional or linguistic content (Algom et al., 2004; Larsen et al., 2008); (3) Linguistic tasks (e.g., lexical decision and semantic judgment tasks) focus on semantic, grammatical, or orthographic properties (Yao et al., 2016). The rationale for using irrelevant and linguistic tasks is that if emotional information is automatically activated, differences in processing between emotion and neutral words would be observed.
Some studies have examined how task type affects emotion word processing in L1 and L2 speakers. Winskel (2013) compared Stroop (irrelevant) and emotion judgment (emotional) tasks in Thai-English bilinguals. Emotional effects appeared in both tasks for L1 (Thai), but only in the emotion judgment task for L2 (English). Winskel explained that the Stroop task taps early, automatic processing, whereas the emotion judgment task involves later, conscious processing. Unlike L1 speakers, L2 speakers may not automatically activate emotional content but can do so at later processing stages. Task type also showed significant interactions with specific emotional variables. Estes and Verges (2008) reported an interaction between valence and task type in L1 speakers: while approach effects for positive words were consistent across tasks, freezing responses to negative words appeared only when emotion was not task-relevant, such as in irrelevant or linguistic tasks. Ferré et al. (2018) further examined this interaction in both the L1 and the L2 of bilingual speakers. In L1, negative words showed a processing disadvantage relative to neutral words in emotion judgment and lexical decision tasks, but not in free recall tasks. By contrast, positive words showed a processing advantage over neutral words in lexical decision and free recall tasks, but not in emotion judgment tasks. In L2, negative word effects appeared in both lexical decision and free recall tasks, while positive word effects emerged only in lexical decision tasks.
The current study
To synthesize divergent findings, we conducted a meta-analysis on the processing advantages of emotion words. We examined how emotional factors, linguistic factors, and task type influence emotion word processing (Footnote 1) across studies involving L1 and L2 speakers. Given differences in language acquisition, we further assessed whether L1, advanced L2, and intermediate L2 speakers (Footnote 2) differ in sensitivity to these variables. Three research questions (RQs) guided this investigation:
- RQ1: To what extent do emotion words enjoy a processing advantage over neutral words?
- RQ2: To what extent do emotional factors (i.e., valence, arousal, emotion word type), linguistic factors (i.e., concreteness, word frequency, word length, orthographic and phonological neighborhood sizes), and task type moderate the processing advantage of emotion words?
- RQ3: To what extent are the main effects of the above-mentioned psycholinguistic factors on the processing advantage of emotion words moderated by language proficiency?
Methodology
Inclusion and exclusion criteria
We searched Google Scholar, Web of Science, PsycINFO, ERIC, Linguistics and Language Behavior Abstracts, and ProQuest Global Dissertations to identify studies on emotion word processing. Keywords including “emotions”, “emotion-label”, “emotion-laden”, “negative words”, “positive words”, “neutral words”, and “word processing” were used in the searches for studies published before March 1, 2025. This yielded 3,436 reports. A backward citation search (Yanagisawa & Webb, 2021) identified 241 additional reports, resulting in 3,677 studies screened using the eight inclusion criteria below.
1. We only included studies written in English and excluded those written in other languages.
2. We included studies examining the emotion word processing advantage across all languages represented in the dataset (e.g., English, Chinese, German).
3. This meta-analysis focused on the processing advantage of emotion words over neutral words. Therefore, we only included studies that involved neutral words as control groups.
4. The included studies fell into four speaker group types: (a) L1-only, (b) L2-only, (c) mixed L1 and L2 speakers of the same language, and (d) bilingual individuals with both an L1 and an L2 (Footnote 3).
5. We used response times or reaction times (both referred to as RTs hereafter) as the measures of effect sizes. Studies without reported RTs were excluded.
6. We included studies that investigated the processing of emotion words in adults and excluded those exclusively focused on children.
7. We only included data from healthy, neurotypical participants.
8. We included studies reporting sufficient data (e.g., RTs, standard deviations, or t-values) to calculate effect sizes. Studies lacking such data were excluded.
We examined the abstracts and introductions of potentially relevant studies, identifying 307 potential studies. Figure 1 presents the screening process (see Appendix S1 for more details about the PRISMA flow diagram). Through a full-text review, we then narrowed these down to 88 studies with 3,280 participants that met all our criteria, reporting 391 effect sizes. These studies comprised 85 journal papers, 2 doctoral theses, and 1 master’s thesis (see Appendix S2 for included studies for this meta-analysis).
Figure 1. PRISMA flow diagram for meta-analysis.
Coding
For all included studies, we coded outcome variables (e.g., descriptive statistics or metrics for effect size calculation), predictor variables (e.g., valence, arousal, frequency, emotion word type, length, concreteness, orthographic and phonological neighborhood sizes, task type, language proficiency, language type, language background), and study identifiers (author, year). All predictors except neighborhood size and frequency were treated as categorical, using scale midpoints when needed. If predictor values were unreported but full stimuli were available (N = 43), we retrieved values from reference databases; otherwise, values were coded as missing (see Appendix S3 for details of the coding scheme in Table 1 and Appendix S4 for the coding sheet).
Table 1. Coding scheme for predictor variables
Note:
a Each unique participant group and language spoken was treated as a separate entry. Language proficiency was coded based on self-rated proficiency, length of language learning, and age of acquisition. Participants who acquired the language from birth and reported near-ceiling proficiency or were identified as native speakers were coded as L1. Those who rated themselves in the top third and began learning before age 10 [4] (Hartshorne et al., 2018) were classified as advanced L2 speakers, and the rest were labeled as intermediate speakers. This yielded 23 advanced and 29 intermediate L2 speaker groups—a roughly balanced distribution that facilitated subsequent analyses. Of the 88 included studies, 71 involved only L1 speakers, 2 only advanced L2, and 6 only intermediate L2. Additionally, 5 studies included both L1 and advanced L2 speakers, and 4 included both L1 and intermediate L2 speakers. Language proficiency was coded categorically rather than continuously due to the absence of consistent, standardized, and psychometrically valid test scores across studies. Self-report scales also varied (e.g., 10-point in Chen et al., 2015 vs. 7-point in Naranowicz et al., 2023), limiting comparability. Categorical coding further allowed detection of non-linear effects that linear models might obscure. In addition, we used multilevel meta-regression with robust variance estimation to assess minimum detectable effects (MDE) under the unbalanced L2–L1 data structure. Results showed sufficient sensitivity to detect small effects for advanced L2–L1 comparisons (MDE = 0.13) and small-to-moderate effects for intermediate L2–L1 comparisons (MDE = 0.23), indicating that group imbalance did not preclude meaningful analyses. 
To further mitigate bias from sample imbalance, we adopted several strategies: (1) using cluster-robust variance estimation with small-sample corrections to ensure valid inference under unbalanced conditions; (2) running sensitivity analyses that excluded outliers, excluded unpublished studies, or collapsed the L2 groups; (3) disaggregating studies with multiple contrasts to increase usable data for L2 groups; and (4) applying a five-level meta-regression model to account for effect size variance, within- and between-study dependence, and language-level heterogeneity. This approach ensured accurate modeling and reduced the influence of group size imbalance on statistical inference.
b Valence, arousal, and concreteness were treated as categorical variables for several reasons. First, cross-linguistic variation in semantic interpretation complicates fine-grained numerical comparisons. Second, rating scales varied across studies, both for valence and arousal (e.g., 9-point in Sheikh & Titone, 2013; 7-point in Chen et al., 2015; 5-point in Pauligk et al., 2019) and for concreteness (e.g., binary in Spadacenta et al., 2014; 5-point in Kaltwasser et al., 2013; 7-point in Ferré et al., 2017; 9-point in Chen et al., 2015). Third, the Model of Motivated Attention and Affective States (Lang et al., 1997) predicts non-linear valence effects that are better captured through categorical rather than continuous modeling. Finally, numerical concreteness ratings were often missing (in 76.1% of studies), and categorical coding enabled broader inclusion and theory-driven contrasts.
c Length was measured in letters. To ensure consistency, we excluded languages with non-Latin orthographies (e.g., Chinese and Korean), which rely on logographic characters or syllabic block units rather than linear sequences of Latin alphabetic characters. Given the U-shaped patterns of length effects, where words of medium length have the shortest RTs as found in previous studies (Balota et al., 2007; New et al., 2006), we categorized word length into short, medium, and long bins based on tertiles.
d See Appendix S5 for the descriptions and coding of each type of task.
e Language background refers to the speakers’ first language.
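Note c describes binning word length into tertiles. The following is a minimal Python sketch of that kind of binning, not the authors' procedure (their analyses were run in R): the word lengths are made-up values, the function names are hypothetical, and the actual cut-point rule and unit of analysis (e.g., study-level mean length) may differ.

```python
import statistics

def tertile_cutpoints(lengths):
    """Return the two cut points that split the values into three roughly equal groups."""
    lower, upper = statistics.quantiles(lengths, n=3)
    return lower, upper

def length_bin(length, lower, upper):
    """Label a word length as short, medium, or long relative to the cut points."""
    if length <= lower:
        return "short"
    if length <= upper:
        return "medium"
    return "long"

# Hypothetical word lengths in letters (illustrative only, not data from the meta-analysis).
example_lengths = [3, 4, 5, 6, 7, 8, 9, 10, 11]
lower, upper = tertile_cutpoints(example_lengths)
bins = [length_bin(n, lower, upper) for n in example_lengths]
```

Coding length categorically in this way lets a model compare medium and long words against short words separately, which accommodates a U-shaped relationship that a single linear slope would miss.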
Effect size calculation
When studies reported complete RTs and standard deviations, Cohen’s d was computed as the difference between emotion word (positive or negative) and neutral word RTs, divided by the pooled standard deviation. For 12 studies reporting t-values or correlations, we converted them directly into Cohen’s d. To correct small-sample bias, we applied a correction factor J to obtain Hedges’ g (see Appendix S6 for details).
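The computations described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code (see Appendix S6 for their formulas): it assumes the independent-groups pooled standard deviation, the textbook conversions from t and r, and a sign convention (neutral minus emotion RT) in which positive values indicate a processing advantage for emotion words. All function names and numbers are hypothetical.

```python
import math

def cohens_d(mean_neutral, mean_emotion, sd_neutral, sd_emotion, n_neutral, n_emotion):
    """Standardized RT difference using the pooled standard deviation.

    Assumed sign convention: neutral minus emotion, so faster emotion-word RTs give d > 0.
    """
    pooled_var = ((n_neutral - 1) * sd_neutral ** 2 + (n_emotion - 1) * sd_emotion ** 2) / (
        n_neutral + n_emotion - 2
    )
    return (mean_neutral - mean_emotion) / math.sqrt(pooled_var)

def d_from_t(t_value, n1, n2):
    """Convert an independent-samples t-value to Cohen's d."""
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)

def d_from_r(r):
    """Convert a correlation coefficient to Cohen's d."""
    return 2.0 * r / math.sqrt(1.0 - r ** 2)

def hedges_g(d, n1, n2):
    """Apply the small-sample correction factor J = 1 - 3 / (4 * df - 1)."""
    df = n1 + n2 - 2
    correction_j = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction_j * d

# Hypothetical example: neutral words 600 ms, emotion words 580 ms, SD 50 ms, n = 20 per condition.
d = cohens_d(600.0, 580.0, 50.0, 50.0, 20, 20)
g = hedges_g(d, 20, 20)
```

Because J is always slightly below 1, Hedges' g shrinks d toward zero, and the shrinkage is largest for the smallest samples.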
Coding procedure
Two researchers with expertise in psycholinguistics and statistics collaboratively refined the coding scheme. One conducted the initial coding, and the other independently verified it based on the finalized scheme. Discrepancies were resolved through discussion. Inter-rater reliability was high (Cohen’s κ = 0.998). See Appendix S7 for details.
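For readers unfamiliar with the agreement statistic reported above, the sketch below shows how Cohen's κ is computed from two coders' categorical decisions. It is a generic illustration with made-up labels and a hypothetical function name, not the authors' coding data or script.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders on the same items."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    count_a = Counter(coder_a)
    count_b = Counter(coder_b)
    # Expected agreement if both coders assigned labels independently at their own base rates.
    expected = sum(count_a[label] * count_b[label] for label in set(count_a) | set(count_b)) / n ** 2
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical valence codes for four effect sizes.
kappa = cohens_kappa(["pos", "pos", "neg", "neg"], ["pos", "pos", "neg", "pos"])
```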
Analysis procedure
Analyses were conducted in R (v4.3.1; R Core Team, 2023) using the metafor (v3.0.2; Viechtbauer, 2010) and clubSandwich (v0.5.10; Pustejovsky, 2023) packages. To address effect size dependencies and account for language type and learner variables, we employed a five-level meta-regression model (e.g., Yanagisawa & Webb, 2021) that modeled sampling variance (Level 1), within-study variance (Level 2), between-study variance (Level 3), language background (Level 4), and language type (Level 5). Cluster-robust variance estimation (Hedges et al., 2010), with small-sample corrections (Tipton & Pustejovsky, 2015), was used to account for dependency among effect sizes and ensure valid inference under unbalanced conditions. The alpha level was set at .05 (Footnote 4).
To examine the processing advantage of emotion words, as indexed by effect sizes (RQ1), we computed weighted average effect sizes by multiplying each estimate by the inverse of its variance, summing the weighted values, and dividing by the total weight (see Appendix S6 for variance calculations and Appendix S8 for the forest plot of effect sizes and variances). To explore the moderating effects of emotional, linguistic, and task-related factors on effect sizes (RQ2), each factor was entered individually into a separate model as a predictor (Footnote 5). To assess whether these effects vary by language proficiency (RQ3), we added language proficiency and its interactions with each of the above-mentioned psycholinguistic variables as model predictors (Footnote 6).
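The inverse-variance weighting described for RQ1 can be sketched in Python as below. This is a simplified illustration with hypothetical values and function names: it uses sampling variances only, whereas the five-level model reported here also incorporates the estimated variance components in the weights.

```python
import math

def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted average of effect sizes and its standard error."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    mean = sum(w * g for w, g in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return mean, standard_error

# Hypothetical effect sizes and sampling variances (not data from the meta-analysis).
mean_g, se_g = weighted_mean_effect([0.2, 0.4], [0.01, 0.04])
```

More precise estimates (smaller variances) receive larger weights, so the pooled value of 0.24 in this toy example sits closer to 0.2 than to 0.4.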
To assess potential publication bias, we generated a funnel plot, conducted Egger’s sandwich test, and applied fail-safe N, Orwin’s fail-safe N, and the trim-and-fill method (Borenstein et al., 2009). All results indicated no publication bias (see Appendix S9). We also ran multiple sensitivity analyses, confirming the robustness of the identified effects (see Appendix S10).
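As one concrete illustration of the diagnostics listed above, the sketch below computes Orwin's fail-safe N in its standard formulation: the number of unretrieved studies with a given mean effect that would pull the observed mean effect down to a chosen criterion. The function name, the criterion value, and the example numbers are assumptions for illustration only; the values actually used are reported in Appendix S9.

```python
def orwin_failsafe_n(k, mean_effect, criterion, missing_effect=0.0):
    """Number of missing studies needed to reduce the mean effect to the criterion.

    k: number of observed effect sizes
    mean_effect: observed mean effect size
    criterion: smallest mean effect still considered meaningful (assumed value)
    missing_effect: assumed mean effect of the missing studies (default 0)
    """
    return k * (mean_effect - criterion) / (criterion - missing_effect)

# Hypothetical example: 10 effect sizes averaging 0.30, criterion 0.10, missing studies at 0.
n_missing = orwin_failsafe_n(10, 0.30, 0.10)
```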
Results
No significant overall processing advantage was found for emotion words (combined negative and positive) compared to neutral words when analyzing data pooled across all language groups (Hedges’ g = 0.030, 95% CI = [−0.097, 0.157], P = .612). Only L1 speakers showed a significant processing advantage for positive words (b = 0.198, P = .008), while negative words did not differ significantly from neutral words in any speaker group. L1 speakers also showed a significant processing advantage for positive over negative words (b = −0.179, P = .005). A significant interaction with proficiency (P = .021) further confirmed that this valence effect was stronger in L1 speakers than in intermediate L2 speakers, among whom no significant effects were observed (see Table 2).
Table 2. Results of the moderator analyses including emotional factors as predictors

Note: Valence (reference: positive); arousal (reference: low arousal); Word_type (reference: emotion-laden words); language proficiency (reference: L1 speakers).
a Redundant predictors were dropped from the model due to the absence of high-arousal cases among advanced L2 speakers.
A significant moderating effect of language proficiency (intermediate L2 vs. L1) on arousal was observed (P = .036), such that high arousal reduced the processing speed advantage of emotion words measured by RTs to a greater extent in intermediate L2 speakers than in L1 speakers. A significant effect of word type was observed only in advanced L2 speakers (b = 0.450, P = .036), indicating that emotion-label words were associated with a larger processing advantage compared to emotion-laden words. Interaction analysis with proficiency (P = .041) revealed that this word type effect was significantly stronger in advanced L2 speakers than in L1 speakers, who showed no significant effect.
When considering linguistic factors (see Table 3), a significantly greater processing advantage for longer words compared to shorter words (medium vs. short: b = 0.148, P = .015; long vs. short: b = 0.325, P = .011) was found only among advanced L2 speakers. We also observed a significant interaction between language proficiency and word length (long vs. short: P = .048), indicating that the length effects were more pronounced in advanced L2 speakers than in L1 speakers, who showed no significant word length effect. In addition, a significant processing advantage for concrete emotion words compared to abstract emotion words was found only in intermediate L2 speakers (b = 0.370, P = .030). This concreteness effect was stronger than that observed in advanced L2 speakers (P = .049) and L1 speakers (P = .042), neither of whom showed significant concreteness effects.
Table 3. Results of meta-regressions including valence, language proficiency, and linguistic factors or task type as predictors

Note: Concreteness (reference: low concreteness); length (reference: short); task (reference: emotional task); valence (reference: positive words); language proficiency (reference: L1 speakers).
Significant task effects were observed only in intermediate L2 speakers, with smaller RT-based processing advantages for emotion words in both the irrelevant task (b = −0.725, P = .007) and the linguistic task (b = −0.588, P = .007) than in the emotional task. Interaction analyses revealed that task effects (linguistic task vs. emotional task: P = .024) were greater in intermediate L2 speakers than in L1 speakers, among whom no significant task effects were found (see Table 3).
Discussion
No overall processing advantage was found for emotion words (combined positive and negative) over neutral words across language groups, consistent with mixed findings in L1 and L2 research. Some studies report advantages for emotion words regardless of valence (e.g., Kousta et al., 2009; Ferré et al., 2018), while others suggest that emotional content, especially negative content, may impair processing (e.g., Palazova et al., 2013). These inconsistencies may reflect the influence of various psycholinguistic moderators, as we discuss below.
Valence effects emerged only in L1 speakers, who showed a significant processing advantage for positive over negative words, suggesting distinct cognitive mechanisms for different valences. In contrast, among L2 speakers, emotion word processing was significantly moderated by arousal, emotion word type, word length, concreteness, and task type, indicating a reliance on integrated encoding of emotional, linguistic, and cognitive factors, while such effects were largely absent in L1 speakers. These differences may stem from emotional grounding and lexical representation quality (Ellis, 2002). First, according to the Emotional Contexts of Learning Theory (Caldwell-Harris et al., 2006), L1 speakers acquire linguistic forms, sensory-motor experiences, and emotional content concurrently in emotionally rich childhood environments (Pavlenko, 2012), enabling automatic emotional activation. In contrast, an L2 is often learned through explicit instruction in emotionally neutral contexts, offering fewer opportunities to link word meanings with autobiographical experiences (Tenderini et al., 2022; Jończyk, 2016) and resulting in weaker emotional grounding, particularly for intermediate learners. Second, emotional representations in the L2 mental lexicon may be less robust (the Lexical Quality Hypothesis; Perfetti, 2007), and emotional language processing may be shallower and more structurally constrained (the Shallow Structure Hypothesis; Clahsen & Felser, 2006). Due to weaker emotional grounding and representation, emotional activation in L2 is less automatic and demands greater cognitive effort, reducing sensitivity to emotional content compared to L1 speakers (Caldwell-Harris et al., 2006; Pavlenko, 2012).
Valence has significant effects that vary between L1 and L2 speakers
The processing advantage of positive over negative words was significant among L1 speakers. This valence effect aligns with prior findings (e.g., Algom et al., 2004; Estes & Verges, 2008; Kazanas & Altarriba, 2016; Kuperman et al., 2014) and supports the Automatic Vigilance Theory (Pratto & John, 1991), which posits that negative emotions trigger a “freeze” response in language processing (Kousta et al., 2009). However, the lack of a significant disadvantage for negative versus neutral words suggests that the findings do not fully contradict the Model of Motivated Attention and Affective States (Lang et al., 1997). Rather, both accounts may hold at once: negative words, like positive words, may enhance processing relative to neutral words, but avoidance mechanisms may offset this advantage. In either case, the results indicate that, at least for L1 speakers, language processing is deeply intertwined with emotional content (Sheikh & Titone, 2016).
However, no significant valence effects were found in either advanced or intermediate L2 speakers. Interaction analysis confirmed that valence effects were significantly stronger in L1 than in intermediate L2 speakers, who showed no effect. Our findings align with prior evidence of reduced emotional responses in L2 speakers compared to L1 speakers, as seen in pupil dilation (Toivo & Scheepers, 2019), eye-tracking (Tang & Ding, 2024), and skin conductance (Eilola & Havelka, 2011). This highlights potential heterogeneity in emotion word processing between L1 and L2 speakers. The absence of valence effects in L2 speakers may stem from reduced emotional grounding (Caldwell-Harris et al., 2006), lower-quality emotional representations (Perfetti, 2007), and shallower syntactic processing (Clahsen & Felser, 2006), all of which may limit cognitive resources for emotional processing. Notably, although advanced L2 speakers showed no significant valence effects, the interaction analysis revealed no significant group differences between advanced L2 and L1 speakers. This suggests that with increased proficiency, cognitive resources, and language experience, L2 speakers’ sensitivity to emotional valence gradually aligns with that of L1 speakers.
L2 speakers showed stronger arousal effects and word type sensitivity
The reduction in processing speed for highly arousing emotion words, as measured by RTs, was found only in intermediate L2 speakers. A significant interaction between arousal and language proficiency confirmed that the inhibitory effect of high arousal was stronger in intermediate L2 speakers than in L1 speakers, who showed no significant effect. While studies on L1 speakers have reported mixed arousal effects, with some showing facilitation (e.g., Kousta et al., 2009) and others reporting null or slight inhibitory effects (e.g., Kuperman et al., 2014), our results suggest that intermediate L2 speakers consistently exhibit a qualitative difference in their processing of high-arousal stimuli compared to L1 speakers, aligning with prior findings (Altarriba & Canary, 2004). The stronger inhibitory effect of high arousal in intermediate L2 speakers may result from less emotionally grounded access to emotion word meanings and shallower emotional representations. First, L1 speakers typically acquire language in emotionally rich contexts, forming strong links between words and emotional experiences (Caldwell-Harris et al., 2006). In contrast, L2 speakers, particularly at the intermediate level, often learn in emotionally neutral settings, limiting such associations. Therefore, they exhibit disembodied emotion processing, with weaker emotional engagement and reduced sensorimotor grounding compared to L1 speakers (Dylman & Bjärtå, 2019; Kühne & Gianelli, 2019; Pavlenko, 2008). Second, intermediate L2 speakers may have difficulty forming precise, automatized lexical representations (Perfetti, 2007) and tend to develop shallower emotional representations (Clahsen & Felser, 2006). Reduced automatic access to emotion word meanings and shallower emotional grounding increase the cognitive demands of processing high-arousal information, thereby amplifying its inhibitory effects on word processing in less proficient L2 speakers.
Regarding word type, a processing advantage for emotion-label over emotion-laden words was found only in advanced L2 speakers. They also showed greater sensitivity to word type than L1 speakers, who showed no significant effect. This pattern, consistent with prior research (Altarriba & Basnight-Brown, 2011; Kazanas & Altarriba, 2015), suggests that representational differences between the two word types are most pronounced in advanced L2 speakers. L1 speakers process both types efficiently due to fully developed emotion-semantic mappings, while intermediate L2 speakers show generally reduced effects due to weaker access to emotion meanings. In contrast, advanced L2 speakers have partially developed emotional language processing, which is sufficient for effective engagement with emotion-label words but still limited for the more context-dependent emotion-laden words. More specifically, L1 speakers’ reduced sensitivity to word type, relative to L2 speakers, may reflect differences in emotional representation (the Emotion Duality Model; Imbir et al., 2019; Tang & Ding, 2024) and distinct lexical structures in L1 and L2 systems (the Revised Hierarchical Model; Kroll & Stewart, 1994). Emotion-label words denote discrete emotional states, while emotion-laden words convey affect through contextual associations. For L1 speakers, emotionally grounded experience supports efficient extraction of affective content from both types. In contrast, L2 speakers, lacking embodied experience, find emotion-laden words, whose meanings depend on real-world associations, less accessible. They process these words with reduced automaticity and greater reliance on context, leading to stronger differentiation between word types. Notably, the word type effect was absent in intermediate L2 speakers, possibly due to limited emotional language experience. This may hinder even the processing of emotion-label words, diminishing their advantage over emotion-laden words and weakening the overall word type effect.
Taken together, our findings suggest that distinct emotional dimensions are independently represented in word processing. Their interaction with language proficiency indicates that L2 speakers, especially intermediate ones, process emotional language differently from L1 speakers. In general, L2 speakers’ emotional language processing is constrained by cognitive resources and language experience.
Multifaceted moderating effects of linguistic factors
A processing advantage for concrete over abstract emotion words was found only in intermediate L2 speakers. This concreteness effect was also stronger in intermediate L2 speakers than in both advanced L2 and L1 speakers, where no significant concreteness effect was found. These results suggest a hierarchical pattern in the acquisition of emotional lexical information, with emotional content in an L2 developing from concrete to abstract concepts. Concrete words are more easily learned, as they connect to L1-based experiences and tangible referents. In contrast, abstract words require higher-order conceptual representations and are typically acquired later, as language proficiency moves from intermediate to advanced and eventually native-like levels. Thus, emotion word processing in intermediate L2 speakers depends more heavily on concreteness than in the other language speaker groups.
Furthermore, word length mattered, but only among advanced L2 speakers, who showed greater advantages for longer emotion words. These words may be more salient in their language experience, leading to facilitation. The facilitatory effect of longer word length was more pronounced in advanced L2 speakers than in L1 speakers, who did not exhibit a significant word length effect. This may reflect L2 speakers’ greater reliance on surface-level linguistic features like word length (the Shallow Structure Hypothesis; Clahsen & Felser, 2006), rather than deeper emotional or pragmatic content, compared to L1 speakers (the Lexical Quality Hypothesis; Perfetti, 2007). Interestingly, positive length effects were absent in intermediate L2 speakers. This may be because intermediate L2 speakers are additionally constrained by limited cognitive resources compared to advanced L2 speakers. With richer language experience, advanced L2 speakers have more stable lexical representations of longer words, allowing them to process surface-level orthographic, phonological, and grammatical features with less effort. In contrast, intermediate L2 speakers, with less stable representations, may face greater cognitive load and resource depletion as word length increases (Liu et al., 2021).
In contrast, no frequency effects were found for the processing advantage of emotion words. This is surprising, given well-established frequency effects in word processing (Brysbaert et al., 2018), including for emotion words (Larsen et al., 2006). However, unlike prior studies, our focus was not on baseline word processing but on the processing advantage of emotion words over neutral words. Hence, our results suggest that frequency did not interact with the general effects of emotional information. One explanation is that over 90% of the emotion words in the meta-analysis were low-frequency (i.e., below 100 per million) and narrowly distributed, limiting the detectability of frequency effects. Another possibility is that emotional activation tied to word frequency is absent or unstable. As emotional processing is embodied, emotional activation requires real-world experiential grounding. High linguistic frequency alone, especially when it comes from decontextualized settings like classrooms, may not strengthen emotional associations, and thus may not enhance processing advantages for emotion words (Pavlenko, 2012; Jończyk, 2016).
Neither orthographic nor phonological neighborhood size significantly influenced word processing in any group. Although prior studies reported neighborhood effects on word processing (e.g., Forster & Shen, 1996; Jalbert et al., 2011; Luce & Pisoni, 1998; Yates, 2005), our meta-analysis did not replicate these findings. As discussed above, this discrepancy likely reflects our distinct focus on emotional processing advantages (i.e., emotion vs. neutral words) rather than baseline word processing. Thus, our results suggest that the richness of orthographic or phonological representations is not closely intertwined with emotional activation.
Task type showed moderating effects that differed between L1 and L2 speakers
Greater RT-based processing advantages for emotion words in emotional tasks, compared to both linguistic and irrelevant tasks, were observed only in intermediate L2 speakers. Moreover, the interaction revealed that these task effects were more pronounced in intermediate L2 speakers than in L1 speakers, among whom no significant effects were observed. With limited language experience, intermediate L2 speakers may struggle to activate emotional content when it is not the focus of the task. Similarly, Winskel (2013) found that L2 speakers do not automatically evoke emotions during early processing stages but can consciously activate them later. Specifically, emotional tasks tend to elicit stronger emotional responses by directing attention to emotional content and facilitating its retrieval. By contrast, linguistic and irrelevant tasks offer fewer cues for emotional processing, resulting in more implicit activation. Both the nature of L2 learning environments and the structure of emotional representation in the bilingual lexicon may explain intermediate L2 speakers’ differential performance across task types. First, intermediate L2 learners often acquire language through explicit instruction lacking emotionally rich, real-world experiences (the Emotional Contexts of Learning Theory; Caldwell-Harris et al., 2006), limiting their ability to automatically activate emotional content in tasks without an emotional focus. Second, learning and representation are likely interconnected: for intermediate L2 speakers, linguistic symbols and embodied emotional experiences may be processed through separate channels (the Revised Hierarchical Model; Kroll & Stewart, 1994), such that emotional activation is less efficient when it is not explicitly required by the task. This may further hinder automatic access to emotional meaning in non-emotional tasks, unlike in more proficient L2 speakers or L1 speakers.
Furthermore, the absence of task effects in advanced L2 speakers, compared to their presence in intermediate L2 speakers, highlights the role of language experience in the implicit activation of emotional content. With increased language proficiency, L2 speakers become better at implicitly processing emotional information and suppressing interference. Consequently, task-related differences in emotion word processing diminish as language proficiency increases.
Limitations and future directions
First, as in most language processing research (Collart, 2024), studies on emotion word processing have predominantly focused on English and German. Future work should include a broader range of languages to explore cross-linguistic and cross-cultural variation. Second, the structural imbalance in the current data, marked by fewer L2 studies, indicates the need for more research on L2 speakers to enhance group balance and generalizability. Third, the L2 speakers in the included studies came from varied learning environments: some had extensive immersion in the target language (e.g., Conrad et al., 2011), while others lived in bilingual contexts (Cieślicka & Guerrero, 2023). According to the Emotional Contexts of Learning Theory (Caldwell-Harris et al., 2006), cultural and social differences critically influence emotion word processing. Future research should report and control for L2 learning environments to better understand emotion word processing in L2 speakers. Fourth, the absence of consistent, standardized, and fine-grained proficiency measures across studies limited the modeling of L2 proficiency as a continuous variable. Future research should adopt uniform, validated assessments (e.g., IELTS or TOEFL) to improve comparability and allow for more flexible modeling. Fifth, inconsistent norming scales and cross-linguistic variation in emotional semantics hindered continuous modeling of valence, arousal, and concreteness. Future studies should use consistent rating scales and examine cross-linguistic differences to support fine-grained modeling.
Finally and most importantly, emotion word processing research has largely overlooked psycholinguistic features such as imageability, due to (1) limited data availability (none of the L2 studies included in this meta-analysis reported imageability ratings, and large-scale norms remain scarce), and (2) high inter-individual (Su et al., 2023) and cross-linguistic (Rofes et al., 2018) variability in ratings, which undermines effect robustness and interpretability. Likewise, AoA was excluded because it was absent in L1 studies, was rarely reported in L2 research, and falls outside the focus of this meta-analysis on lexical properties rather than proxies of cumulative learning history. Future research should improve standardization and reporting of key psycholinguistic variables across languages, reduce variability, and include features reflecting developmental trajectories.
Conclusion
This meta-analysis investigated the processing advantage of emotion words. The results suggest that, like most cognitive activities, language processing is dynamically influenced by valence and largely aligns with the principles of the Automatic Vigilance Theory (Pratto & John, 1991), which posits that negative stimuli automatically capture attention, slowing the processing of negative relative to positive words. Critically, our findings reveal that emotion word processing is modulated by multiple dimensions, including arousal, lexical category (e.g., emotion-label vs. emotion-laden words), linguistic features (e.g., word length, concreteness), and task type, all of which interact with the cognitive mechanisms underlying affective language comprehension. Last, the meta-analysis revealed significant heterogeneity across L1, advanced L2, and intermediate L2 speakers regarding the moderating effects of emotional polarity (valence), intensity (arousal), the way emotions are represented in words (word type), length, concreteness, and speakers’ attention (task type) on emotion word processing. This suggests that the processing of emotional information in language is embodied and shaped by language experience and cognitive resources. The findings are consistent with several theoretical frameworks, including the Emotional Contexts of Learning Theory (Caldwell-Harris et al., 2006), the Shallow Structure Hypothesis (Clahsen & Felser, 2006), the Revised Hierarchical Model (Kroll & Stewart, 1994), and the Lexical Quality Hypothesis (Perfetti, 2007). Our study supports the practice of treating L1 and L2 speakers as heterogeneous groups in empirical research on language processing. The comparison between intermediate and advanced L2 speakers further suggests that emotional information acquisition in L2 is a dynamic process that progressively approximates native-like levels. Therefore, emotional information should be regarded as unique and additional content in language learning. Future language teaching and research for L2 speakers should place greater emphasis on the acquisition of emotional meaning.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0142716426100630.
Replication package
The data that support the findings of this study are openly available in Open Science Framework at https://osf.io/hteg4/.
Competing interests
The authors declare none.


