1. Introduction
The relationship between language and emotion is complex and bidirectional. Human languages are rich with emotion words that directly label affective states and emotion-laden words that carry emotional connotations rooted in contextual and social cues. According to theories of embodied cognition, processing emotional language involves reenacting the sensorimotor and affective states associated with those words (Cieślicka & Guerrero, 2023; Niedenthal, 2007). In psycholinguistic research, this is often studied via the “emotion effect.” Evidence suggests that positive words are reliably associated with processing facilitation while findings for negative words remain mixed (Ferré et al., 2025). Neurocognitive studies indicate that emotional word processing engages both language areas and emotion-related circuits, supporting the idea that language meaning is partly grounded in affective experience (Barrett, 2017; Citron, 2012). In sum, the emotional meanings of words are grounded in affective experience and can influence processing, yet the valence-specific profile of the influence varies.
If the nexus of language and emotion is intriguing in one’s native tongue, it is even more so in a second language. Bilinguals often report that swearing, endearments or emotionally charged phrases feel different in L2 compared to L1, and empirical work has documented systematic differences in affective responses across languages (Dewaele, 2010; Pavlenko, 2005). A growing body of behavioral, psychophysiological and neuroimaging studies indicates that emotional words in L2 often elicit attenuated autonomic and embodied responses relative to L1, although effects vary with proficiency, context and stimulus type (Aguilar et al., 2024; Caldwell-Harris, 2014). Recent reviews and meta-analyses emphasize this complexity: they report consistent evidence for reduced peripheral emotional reactivity in L2 (e.g., skin conductance), more mixed behavioral effects (reaction times and accuracy depend on task and valence) and neural findings that point to both shared and language-specific circuits depending on task demands and the degree of learning or immersion (Aguilar et al., 2024; Ferré et al., 2025). Theoretical accounts attribute these differences to multiple factors. The Shallow Structure Processing Hypothesis (Clahsen & Felser, 2006) posits that adult L2 learners rely on less detailed syntactic parsing, focusing more on lexical-semantic cues; by analogy, L2 emotional processing might be “shallower” or less automatic, relying more on conscious appraisal than on immediate affective intuition. Additionally, bilingual emotion lexicon theories (e.g., Pavlenko, 2008) argue that emotional meanings are often learned through direct translation or classroom contexts in L2, rather than through lived emotional experiences, leading to weaker affective links.
Bilingual language control models further suggest that managing two languages recruits higher-order cognitive control networks (Abutalebi & Green, 2008; Green, 1998). As a result, L2 speakers may engage frontal–cingulate circuits (implicated in attention and control) to a greater extent to process emotional content, perhaps dampening the spontaneous emotional reaction. In sum, relative to L1, emotional word processing in L2 tends to be less intuitive and emotionally potent, often requiring more cognitive resources, which is evidenced by behavioral, psychophysiological and neural findings (Caldwell-Harris, 2014; Jończyk et al., 2025). These differences raise important questions about how adult L2 learners come to understand and internalize emotional meaning in a new language.
A key aspect of the language–emotion interface that has received little attention is the acquisition of emotional vocabulary in a second language. Studies that do focus on acquisition point to strong effects of learning context and methodology. For example, Brase and Mani (2017) demonstrated that the conditions under which emotional words are taught (e.g., multimodal affective contexts vs. decontextualized definitions) influence both immediate acquisition and later processing of L2 emotional words, suggesting that embodied or context-rich training supports stronger affective integration. Reviews integrating experimental, developmental and corpus evidence further concluded that emotional content facilitates retention and semantic integration for certain word types such as abstract and socially laden terms, yet this benefit depends on valence, arousal, cross-modal cues and the richness of associative experience (Aguilar et al., 2024; Ferré et al., 2025). Methodologically, prior studies employed diverse paradigms including explicit associative learning using faces or scenes, naturalistic immersion, translation-equivalence tasks and implicit exposure (Ferré et al., 2025; Martínez-Tomás et al., 2025), and these paradigms yield different patterns of behavioral and physiological outcomes. The present study addresses this gap by using an associative learning paradigm combined with ERP measures to track how emotional meanings emerge in the brain and whether L2 learners recruit compensatory resources relative to native speakers.
By situating our design alongside earlier acquisition work (e.g., Brase & Mani, 2017) and recent syntheses (Aguilar et al., 2024; Ferré et al., 2025), we aim to clarify which aspects of emotional meaning, whether early perceptual salience or later evaluative integration, are sensitive to language background and learning context.
Research on emotion typically follows two complementary frameworks. Dimensional accounts characterize affective states along continuous axes such as valence and arousal and have been widely used to explain broad emotion effects on lexical access (Bradley & Lang, 2000). By contrast, discrete emotion theories propose that a limited set of qualitatively distinct emotions are psychologically and sometimes biologically dissociable, each with characteristic elicitors, expressions and action tendencies (Ekman, 1992; Izard, 2007). Importantly for language research, empirical evidence suggests that discrete emotions can differentially affect behavioral and ERP indices of word processing over and above valence and arousal effects (Briesemeister et al., 2014; Silva et al., 2012). We therefore adopt a targeted discrete emotion approach by focusing on disgust and sadness. The two emotions differ in prototypical cues, action tendencies and physiological/neural correlates despite both being negatively valenced: disgust is typically an externally triggered, visceral emotion that arises in response to noxious stimuli and often involves a strong revulsion response (Rozin & Haidt, 2013). In contrast, sadness is usually an internally generated emotion arising from loss, social pain or disappointment, and it is associated with withdrawal and a subdued demeanor (Kreibig et al., 2007). The two emotions thus occupy distinct positions in the affective space: disgust is more linked to immediate survival-oriented reactions (e.g., nausea, physiological aversion), whereas sadness relates to prolonged affective states of sorrow or grief (Mikulincer & Florian, 1997; Rozin & Haidt, 2013).
Neuroimaging research has identified the anterior insula as a key hub for disgust, activated both when experiencing disgust oneself and when observing disgust in others (Wicker et al., 2003). Meanwhile, brain studies indicate that the anterior cingulate cortex (ACC) is frequently implicated in the generation or regulation of sad affect (Phan et al., 2002). Studying these two negative emotions allows us to explore whether the neural and cognitive mechanisms for learning an emotion word are general to valence and arousal or whether they also capture emotion-specific nuances. By examining discrete emotions rather than treating emotion in binary valence terms, we follow a growing body of research emphasizing that different emotions can modulate language processing in different ways (Briesemeister et al., 2014; Ferré et al., 2017).
A critical design feature of our study is the use of emotional facial expressions as stimuli for word learning. Faces are among the most salient and interpretable carriers of emotional information in human communication (Ekman, 1993). In essence, the face provides an implicit definition through affective demonstration. Prior studies found that associating novel stimuli with emotional faces can effectively imbue those stimuli with emotional value. For example, Hammerschmidt et al. (2017) showed that neutral faces paired with motivationally salient outcomes led to amplified early brain responses to those faces on subsequent presentations. This indicates that faces can serve as powerful anchors in associative learning, especially for affective material. In the context of word learning, using faces as the mediating stimuli exploits our natural abilities for emotional perception. Our previous studies employed a similar approach of pairing pseudowords with disgust and sadness faces, and found that participants not only learned the emotional meanings, but those meanings were strong enough to produce congruency effects in sentence processing (Gu et al., 2022; Gu et al., 2023). Moreover, learning through faces ensures that the new words gain an experiential association (seeing an emotional expression) rather than just a verbal definition. This might help overcome the “disembodiment” issue by grounding the L2 word’s meaning in a visible emotional event. In practical terms, facial photographs from standardized sets (such as the Karolinska Directed Emotional Faces [KDEF]) offer controlled, validated stimuli for disgusted, sad and neutral expressions.
To probe the neural correlates of affective word learning, our study employs event-related potential (ERP) measures during word processing. Several ERP components are particularly relevant to language and emotion, and they provide insight into different cognitive stages. One key component is the early posterior negativity (EPN), occurring roughly 200–300 ms after stimulus onset. The EPN is a relative negativity over occipito-temporal scalp sites and is consistently enhanced for emotionally arousing stimuli (including emotional words) compared to neutral stimuli (Farkas et al., 2020; Kissler et al., 2009). Researchers interpret the EPN as a marker of early attention capture by emotional content, essentially meaning that emotional words receive extra perceptual processing in the visual cortex due to their salience (Kissler et al., 2007; Schacht & Sommer, 2009). In addition, the emotion effects on the EPN were reported to occur at approximately the same onset as lexicality effects (for review, see Citron, 2012). A similar occipito-temporal negativity called the recognition potential, which is sensitive to the meaningfulness of visually presented words, can also be observed during the early stage of word processing (Martín-Loeches, 2007; Martín-Loeches et al., 2001). This component is useful for examining the learning effects between learned and unlearned pseudowords. The novelty P3 (P3a) is also frequently elicited in explicit oddball or orienting paradigms where infrequent and task-salient events trigger frontal orienting and evaluation processes (Polich, 2007).
Since the stimuli in our design were presented in an incidental lexical context without a classic oddball structure and our primary interest was investigating the learning effects, an early negativity is more suitable for capturing how the brain distinguishes between learned and unlearned linguistic forms. Another key component, the late positive complex (LPC), is a broad sustained positivity typically observed from 400 ms onward, often maximal over parietal scalp. The LPC has been linked to elaborative or evaluative processing of stimuli and is sensitive to the emotional significance of words (Hajcak et al., 2012; Hajcak & Foti, 2020). Generally, emotionally significant stimuli (including words) elicit a larger LPC than neutral stimuli, reflecting sustained attention or memory encoding processes (Citron, 2012; Olofsson et al., 2008), although such emotion effects might be attenuated or reversed during early semantic-affective grounding (Gu et al., 2022). In bilingual studies, the LPC can also index the amount of effort or resource allocation; for example, an L2 word might produce a larger LPC if it requires more effortful processing (Opitz & Degner, 2012). By measuring EPN and LPC responses to learned and unlearned pseudowords (and comparing across the L1 and L2 groups), we can pinpoint when and how newly learned emotional meaning manifests in neural processing. A successful encoding of a new word’s emotional meaning might be evidenced by the emergence of emotion-specific EPN and LPC effects (indicating the word is now processed akin to other emotional vocabulary; Hinojosa et al., 2020; Zhang et al., 2014).
These ERP components thus serve as temporal signatures of different cognitive processes: early affective tagging (EPN) and more reflective, possibly language-modulated emotional analysis (LPC).
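As an illustration of how window-based components such as the EPN and LPC are commonly quantified, the sketch below averages a single-channel epoch within a latency window. It is not the analysis pipeline of the present study: the function name, the 500 Hz sampling rate, the −200 ms epoch start and the 800 ms upper bound of the LPC window are all assumptions made for the example. Only the 200–300 ms EPN window and the 400 ms LPC onset come from the text.

```python
def mean_amplitude(epoch_uv, sfreq_hz, epoch_start_ms, window_ms):
    """Average the voltage (in microvolts) of one channel within a latency window."""
    start_ms, end_ms = window_ms
    # Convert latencies relative to stimulus onset into sample indices.
    first = round((start_ms - epoch_start_ms) * sfreq_hz / 1000)
    last = round((end_ms - epoch_start_ms) * sfreq_hz / 1000)
    samples = epoch_uv[first:last + 1]
    return sum(samples) / len(samples)


SFREQ_HZ = 500               # assumed sampling rate
EPOCH_START_MS = -200        # assumed pre-stimulus interval
EPN_WINDOW_MS = (200, 300)   # window given in the text
LPC_WINDOW_MS = (400, 800)   # onset from the text, upper bound assumed

# Hypothetical epoch spanning -200 to 1000 ms (601 samples): a flat 1 microvolt signal.
epoch = [1.0] * 601
print(mean_amplitude(epoch, SFREQ_HZ, EPOCH_START_MS, EPN_WINDOW_MS))
print(mean_amplitude(epoch, SFREQ_HZ, EPOCH_START_MS, LPC_WINDOW_MS))
```

Comparing such window means across conditions (e.g., emotional vs. neutral pseudowords) is one common way to operationalize the early and late effects described above.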
In addition to scalp ERPs, we consider the broader neural regions and networks that could be implicated in emotional word learning and processing, particularly differences between L1 and L2 speakers. Converging evidence from affective neuroscience highlights certain regions as central to emotion processing. The insula is one such region, especially the anterior insula, which has been repeatedly implicated in the experience and recognition of disgust (Gan et al., 2022; Jayashankar & Aziz-Zadeh, 2023). Notably, Wicker et al. (2003) found that seeing someone else’s disgust expression and feeling disgust oneself activated a common area of the anterior insula. This region is thought to integrate visceral sensory information and contribute to the subjective feeling of disgust (and other internal-state emotions). We expect that learning a word that means “disgust” might engage the insula when that word is later processed, as the word’s meaning triggers an embodied simulation of disgust. Another key region is the ACC. The ACC is often described as a hub for cognitive control and conflict monitoring, but it also plays a role in emotion, especially in monitoring pain or distress (which are relevant for sadness) and in exerting control when processing emotional stimuli (Etkin et al., 2011; Xiao et al., 2021; Zhou et al., 2024). In prior studies, processing of sad stimuli and experiences has been linked to ACC engagement (Phan et al., 2002), possibly reflecting the monitoring of emotional conflict or the regulation of affective state.
In bilingual language contexts, the ACC is frequently active due to its role in language control (e.g., inhibiting one language while using another) (Liu et al., 2021) and may become even more engaged for emotional material if the L2 speaker needs to downregulate emotional responses or compensate for lower automaticity (Sulpizio et al., 2019).
In summary, the present study addresses two critical gaps regarding the neurocognitive mechanisms of emotional word learning in L1 and L2 contexts. First, while most psycholinguistic and neurolinguistic studies on emotional words examined post hoc processing of familiar L2 emotional words, few have tracked how an initially meaningless form in L2 comes to attain emotional meaning in the mind or brain (Ferré et al., 2025; Hinojosa et al., 2020). Understanding such an L2 learning process can reveal mechanisms of semantic encoding and emotional grounding that are otherwise obscured once a word is already familiar (Ferré et al., 2010; Opitz & Degner, 2012; Pavlenko, 2008). Second, it remains unclear whether and how L2 speakers compensate for generally weaker affective resonance in their L2 lexicon. One possibility is that L2 speakers recruit additional neural resources (e.g., heightened attentional control via ACC or enhanced memory encoding efforts via hippocampal networks) to process L2 emotional words (especially those that do not follow the same orthographical structure as their L1), effectively “boosting” the signals to more native-like levels (Abutalebi & Green, 2016). By comparing L1 and L2 speakers, we aim to determine: (1) how emotional meaning learning modulates early and late brain responses to new words (e.g., EPN and LPC); (2) whether L1 and L2 speakers differ in how quickly or strongly these emotional associations are formed, as evidenced by behavioral learning performance and ERPs; (3) whether discrete emotions (disgust vs. sadness) show distinct learning outcomes or processing signatures; and (4) whether L2 speakers show evidence of extra recruitment of cognitive-control or memory-related networks relative to L1 speakers.
Ultimately, the present study aims to provide empirical data on both learning and subsequent processing that can contribute to theories of embodied bilingual cognition.
To explore these questions, we designed an experiment with two sessions: a learning session and an ERP recording session. We recruited two groups of participants: one group of native speakers of Mandarin Chinese and another group of proficient L2 speakers of Mandarin Chinese. In the learning session, participants learned the emotional meanings of a set of Chinese pseudowords. Each pseudoword was consistently paired with an emotional facial expression in an associative learning paradigm. For example, participants repeatedly saw a photograph of a person making a disgusted face together with a pseudoword on the screen, so that they could learn that this new string of characters corresponds to the concept of “disgust.” Multiple face-word pairings and repetitions were used to reinforce the association, with intermittent tests to check that participants learned the word meanings. On the next day, participants first completed a refreshing session and then underwent ERP recording while performing a modified lexical decision task. In this ERP session, the critical stimuli were: (a) learned pseudowords with disgusting, sad and neutral meanings; (b) unlearned pseudowords to examine novelty responses; (c) actual Chinese words denoting neutral concepts, which served as semantic anchors to encourage incidental engagement of semantic processing and to ensure that participants could not adopt a purely orthographic or pseudoword-detection strategy; and (d) filler items that required a response. Because participants were instructed to respond to infrequent filler items only, this design encourages implicit processing and ensures that the observed ERP modulations to the pseudowords reflect spontaneous affective grounding rather than task-induced lexical monitoring or decision-making strategies. This paradigm was also chosen to minimize motor-related neural activity that could interfere with the ERP components of interest.
By combining an associative learning paradigm with ERP recordings and a cross-language comparison, the design directly addresses our research aims in a controlled yet ecologically valid manner. Crucially, it enables us to observe not only whether learning occurred, but also how the brain signals of word processing change as a function of learned emotional meanings.
Based on the literature reviewed and the rationale above, we formulated the following hypotheses:
a. Based on novelty detection literature, we predicted that unlearned pseudowords would elicit larger negative amplitudes compared to learned pseudowords for both groups in the early time window, reflecting successful encoding and reduced novelty for familiar items.
b. In the EPN time window, we hypothesized that the brain would show rapid attentional capture for learned emotional content. Specifically, we predicted larger EPN for emotional (disgust, sadness) words compared to neutral words. We also explored the possibility of an early differentiation between disgust and sadness.
c. Because L2 processing may be less automatic, we hypothesized that L2 speakers would need to allocate more sustained cognitive resources to process the newly learned words. This would be reflected in larger LPC amplitudes for the L2 group compared to the L1 group. We focused on group differences given the potential for varied LPC patterns during early semantic-affective grounding.
d. We hypothesized that such an increased effort at the scalp level would be supported by the recruitment of a broader neural network. Therefore, we predicted that the L2 group would show greater brain activation in domain-general cognitive control, memory retrieval and emotion and semantic processing networks compared to the L1 group during later time windows.
2. Methods
2.1. Participants
A total of 24 native Chinese speakers (12 females, mean age 21.95) and 24 Chinese L2 speakers (14 females, mean age 24.74) were recruited. All L2 speakers had passed HSK level 5, which qualifies them as advanced Chinese L2 speakers (Peng et al., 2021). The 24 L2 speakers varied in their L1 (five were native Thai speakers, three Japanese, three Korean, three Russian, two Indonesian, two Mongolian, two Spanish, one Burmese, one English, one Polish and one Uzbek) due to the limited number of proficient Chinese L2 speakers available on campus. All participants were right-handed, reported normal or corrected-to-normal vision and had no history of psychiatric or neurological disorders. The study was reviewed and approved by the Biological and Medical Ethics Committee, Dalian University of Technology, China. All procedures in the study were in accordance with the Declaration of Helsinki and International Ethical Guidelines for Biomedical Research Involving Human Subjects. Written informed consent was obtained from all participants, who were paid 80 RMB for their time and involvement. Due to excessive EEG artifacts, data from two participants in the L1 group and one in the L2 group were excluded from further analysis. Thus, the final sample included 45 participants (22 L1, 23 L2). All participants completed a self-rating questionnaire to record their age of acquisition (AOA) of Mandarin Chinese and to assess their Chinese proficiency in listening, speaking, reading and writing on a 6-point scale (1 = very poor, 6 = excellent) (Yang et al., 2023). Independent-samples t-tests confirmed that the L1 group rated their Chinese proficiency significantly higher than the L2 group (see Table 1).
Table 1. Means and standard deviations of AOA and proficiency self-assessment scores

Note. a: All L1 participants started learning Chinese before primary school and could not recall the exact age; all therefore entered 1.
2.2. Materials
2.2.1. Faces
For the learning session, 48 facial images were selected from the KDEF database (Lundqvist et al., 1998), comprising equal numbers of disgusted, sad and neutral expressions. The KDEF set is a widely used and validated source for experimental research on emotion recognition and associative learning (Goeleven et al., 2008). The selected images were matched across conditions for gender and identity. Pretests based on the KDEF normative database confirmed clear emotional quality: arousal ratings differed significantly across expressions, with emotional faces rated more arousing than neutral ones (disgust vs. neutrality: t(15) = 10.60, p < .001; sadness vs. neutrality: t(15) = 9.42, p < .001; disgust vs. sadness: t(15) = 1.19, p = .468). One-way ANOVAs revealed no reliable differences in terms of recognition accuracy (hit rate) across emotions (see Table 2 for statistical details). A subsequent TOST equivalence test (equivalence bounds set to Cohen’s d ± 0.5, α = .05) failed to establish statistical equivalence for any pairwise comparison (disgust vs. sadness: TOST p-values = .083/.084, 90% CI for d = [−0.601, 2.390]; disgust vs. neutrality: TOST p-values = .741/<.001, 90% CI = [0.132, 5.349]; sadness vs. neutrality: TOST p-values = .756/<.001, 90% CI = [0.148, 5.306]). This is likely attributable to the relatively small sample size (16 per set). Nevertheless, hit rates for all three categories were relatively high, suggesting that all faces were highly recognizable and provided stable emotional anchors for the learning phase.
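The logic of the TOST procedure mentioned above can be sketched as follows. This is a minimal, standard-library-only illustration and not the analysis script used in the study. It assumes a paired design (as the reported df of 15 suggests), equivalence bounds expressed in units of the standard deviation of the difference scores, and entirely hypothetical hit rates; the function names `t_cdf` and `paired_tost` are likewise illustrative.

```python
import math
from statistics import mean, stdev


def t_cdf(t, df, steps=2000):
    """Student-t CDF via Simpson integration of the density (stdlib only)."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    area = pdf(0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + math.copysign(area, t)


def paired_tost(x, y, d_bound=0.5, alpha=0.05):
    """Two one-sided tests with bounds of +/- d_bound in SD-of-difference units."""
    diffs = [a - b for a, b in zip(x, y)]
    n, sd = len(diffs), stdev(diffs)
    se = sd / math.sqrt(n)
    delta = d_bound * sd                      # bound converted to raw units
    t_lower = (mean(diffs) + delta) / se      # H0: true difference <= -delta
    t_upper = (mean(diffs) - delta) / se      # H0: true difference >= +delta
    p_lower = 1 - t_cdf(t_lower, n - 1)
    p_upper = t_cdf(t_upper, n - 1)
    # Equivalence is declared only if BOTH one-sided tests reject.
    return p_lower, p_upper, max(p_lower, p_upper) < alpha


# Hypothetical hit rates for 16 identity-matched images per expression.
disgust = [0.90, 0.85, 0.92, 0.88, 0.95, 0.80, 0.87, 0.93,
           0.89, 0.91, 0.84, 0.90, 0.86, 0.94, 0.88, 0.92]
sadness = [0.88, 0.87, 0.90, 0.85, 0.93, 0.83, 0.85, 0.95,
           0.86, 0.92, 0.82, 0.91, 0.88, 0.90, 0.87, 0.93]
print(paired_tost(disgust, sadness))
```

Under these assumptions, with 16 pairs and bounds of ±0.5, both one-sided tests reject only when the observed mean difference is very close to zero relative to the spread of the differences, which illustrates why equivalence is hard to establish with a small image set.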
Table 2. Mean (SD) arousal and hit rates for disgusted, sad, and neutral faces

2.2.2. Sentence stimuli
To assess semantic generalization, 36 sentences describing the target emotions (disgust, sadness, neutrality) were constructed in Chinese. A separate group of 43 L2 participants, who shared the profile described above but did not take part in the learning and ERP recording sessions, rated arousal, comprehensibility and emotional category accuracy. After screening, 24 sentences (eight per emotion) were retained. Pretests confirmed that the retained sentences clearly conveyed the intended emotions. Pairwise comparisons revealed significantly higher arousal for emotional sentences than for neutral ones (disgust vs. neutrality: t(7) = 13.17, p < .001; sadness vs. neutrality: t(7) = 15.58, p < .001), but no significant difference between the two emotions (disgust vs. sadness: t(7) = 2.41, p = .063). In contrast, there were no significant differences in comprehensibility or in emotion identification accuracy. Thus, the sentences were equally easy to read and reliably communicated the intended emotion categories (see Table 3 for statistical details).
Table 3. Mean (SD) arousal, hit rates and comprehensibility for disgusting, sad and neutral sentences

2.2.3. Pseudowords
A total of 32 pseudowords were created for the study. Twenty-four pseudowords were used in the learning phase (three sets of eight), with each set sharing the same final character (i.e., “每,” “麦” or “完”). An additional eight pseudowords ending with “阿” were introduced in the ERP session as novel control items to examine the learning effect. This construction method follows previous work on pseudoword learning in Chinese, where morphological or positional regularities (e.g., shared suffixes) aid learnability and simulate real lexical patterns (Brysbaert & New, 2009; Liang et al., 2017). Importantly, the sharing of a final character within each set is consistent with existing patterns of Chinese emotional words (e.g., 厌恶, 嫌恶, 憎恶 for “disgust”; 悲伤, 忧伤, 哀伤 for “sadness”), supporting ecological validity. To ensure that the stimuli were clearly nonwords and to prevent overlap with existing lexical items, characters that typically occur in the initial position of Chinese words were deliberately placed in the second position of the pseudowords. This manipulation follows the rationale of research on character positional frequency (Liang et al., 2017), which demonstrated that incongruency between a character’s typical position and its actual position in a word increases processing difficulty during lexical acquisition. In addition, five faculty members were asked to review the pseudowords to further ensure that none of them resembled any real word. Thus, the present pseudowords were both novel and orthographically incongruent, creating a stringent test for the learning of emotional meanings.
To evaluate lexical and visual comparability across groups, pseudowords were assessed on several quantitative indices using the SUBTLEX-CH database (Cai & Brysbaert, 2010). Specifically, we examined overall character frequency (logCHR), first-character frequency (logCHR1), overall context diversity (logCHR-CD), first-character context diversity (logCHR-CD1) and stroke counts. ANOVAs revealed no significant differences across the four pseudoword groups on any of these measures, confirming that the groups were well matched in terms of frequency, diversity and visual complexity (see Table 4 for statistical details).
Table 4. Mean (SD) overall character frequency (logCHR), first-character frequency (logCHR1), overall context diversity (logCHR-CD), first-character context diversity (logCHR-CD1) and stroke counts for the four sets of pseudowords

2.3. Procedure
2.3.1. Learning session
The learning session was conducted on the first day and included both a learning phase and an evaluation phase. Participants began by reading task instructions. They were informed that they would see pseudowords paired with faces and were asked to memorize the associations.
Each learning trial began with a fixation cross at the center of the screen, followed by a face (500 ms) and then the pseudoword superimposed on the face (1000 ms). The session consisted of three training blocks: the first and second blocks each included three face–pseudoword pairs per emotion, and the third included two pairs per emotion. Each pair was repeated six times per block, yielding 54, 54 and 36 learning trials, respectively. The assignment of pseudowords to emotions was counterbalanced across participants.
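The block structure described above can be checked with a few lines of arithmetic. In this sketch the counts and durations come from the text, whereas the function names and the rotation scheme used for counterbalancing (rotating the three final-character sets across emotions) are assumptions; the text does not specify how the counterbalancing was implemented.

```python
# Counts and durations are taken from the text; all names are hypothetical.
EMOTIONS = ("disgust", "sadness", "neutral")
PAIRS_PER_EMOTION = (3, 3, 2)          # blocks 1, 2 and 3
REPETITIONS = 6                        # each pair is shown six times per block
FACE_MS, WORD_ON_FACE_MS = 500, 1000   # fixation duration is not stated in the text


def trials_per_block():
    """Return the number of learning trials in each training block."""
    return [n * len(EMOTIONS) * REPETITIONS for n in PAIRS_PER_EMOTION]


def counterbalanced_assignment(participant_index, word_sets=("每", "麦", "完")):
    """Rotate pseudoword sets across emotions (an assumed rotation scheme)."""
    shift = participant_index % len(word_sets)
    rotated = word_sets[shift:] + word_sets[:shift]
    return dict(zip(EMOTIONS, rotated))


counts = trials_per_block()
print(counts, sum(counts))
print(counterbalanced_assignment(0))
print(counterbalanced_assignment(1))
```

The per-block counts reproduce the 54, 54 and 36 trials reported above, for 144 learning trials per pass through the three blocks.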
After each learning block, participants completed a selection test in which they saw a face from the learning block with two pseudowords presented at the bottom-left corner, and had to choose the correct pairing by pressing “1” or “2” on the keyboard.
Following all learning blocks, participants completed the evaluation phase, which included three types of evaluation tests:
1. Matching test (24 trials): Participants judged whether the presented pseudoword–face pairing was correct.
2. Within-modality generalization test (12 trials): Participants saw a new face (not used in learning) expressing a target emotion and had to choose which of two pseudowords best matched it according to the learning.
3. Cross-modality generalization test (12 trials): Participants read a sentence expressing one of the target emotions and chose which pseudoword best describes the emotion of the sentence.
Across all tests, immediate feedback was provided after each response. This multi-stage design followed established associative learning paradigms for pseudoword emotion acquisition (Gu et al., Reference Gu, Liu, Wang, de Vega and Beltrán2022; Hammerschmidt et al., Reference Hammerschmidt, Sennhenn-Reulen and Schacht2017). Participants completed the learning phase and the evaluation phase twice to ensure that the correct and incorrect responses were counterbalanced across the three target emotions (the pairing between pseudowords and faces remained consistent). Examples of experimental trials are shown in Figure 1.
Outline of trials in the learning session.

2.3.2. ERP recording session
The ERP session took place the following day. Participants first completed a short refreshing task (reviewing all pseudoword–face pairings once and completing the two generalization tests from the previous day) to reinforce learning.
They were then seated in a sound-attenuated room approximately 80 cm from a computer monitor. Stimuli were presented using E-Prime 3.0 software. In the modified lexical decision task, participants silently read each stimulus and pressed “J” only when the stimulus was a vehicle name.
Each participant was presented with 24 learned pseudowords, 8 unlearned pseudowords, 8 neutral real words and 8 vehicle names. Vehicle names were repeated eight times and other stimuli four times, totaling 224 trials. Each trial began with a fixation cross (500 ms), followed by the stimulus (1000 ms), then a blank screen (500 ms) and finally a rest signal (2500 ms). Before the main task, participants completed 20 practice trials.
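As a check on the design, the sketch below recomputes the trial total and the duration of a single trial from the figures given above. The estimated task length is derived here rather than reported in the study, and it ignores practice trials and any breaks.

```python
# Trial bookkeeping for the ERP session, using only the figures given above.
stimuli = {
    # name: (number of items, repetitions per item)
    "learned_pseudowords": (24, 4),
    "unlearned_pseudowords": (8, 4),
    "neutral_real_words": (8, 4),
    "vehicle_names": (8, 8),  # targets requiring a key press
}
trials = {name: n * reps for name, (n, reps) in stimuli.items()}
total_trials = sum(trials.values())

# Event durations within one trial, in milliseconds.
trial_ms = 500 + 1000 + 500 + 2500  # fixation, stimulus, blank, rest signal
total_minutes = total_trials * trial_ms / 60000  # derived estimate, not reported

print(trials["vehicle_names"], total_trials)  # 64 224
print(trial_ms, total_minutes)                # 4500 16.8
```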
This paradigm was chosen because it allows implicit assessment of learned emotional meanings without requiring explicit emotion judgments (Citron, Reference Citron2012; Gu et al., Reference Gu, Liu, Wang, de Vega and Beltrán2022).
2.3.3. EEG recording and analysis
Electroencephalographic (EEG) data were recorded using a set of 64 electrodes placed according to the extended 10–20 positioning system. The signal was recorded with an actiCHamp amplifier (Brain Products, Germany) at a sampling rate of 1000 Hz. A vertex electrode served as the reference, and a forehead electrode served as the ground. Offline, the data were re-referenced to the average of the bilateral mastoids. Impedances were kept below 5 kΩ. The EEG was band-pass filtered online between 0.1 and 100 Hz. It was then refiltered offline with a high-pass cutoff of 0.01 Hz and a low-pass cutoff of 45 Hz using a zero-phase Hamming windowed sinc FIR filter. Data were downsampled to 250 Hz, and trials with drifting or large movement artifacts were removed by visual inspection before analysis. Ocular artifact reduction was performed through ICA component rejection using EEGLAB (Delorme & Makeig, Reference Delorme and Makeig2004). To improve ICA decomposition, data were reduced to 30 principal components via PCA prior to the ICA run. Signals exceeding ±80 μV in any channel were automatically discarded. This whole procedure rejected 9.3% of the experimental trials on average.
Continuous recordings were analyzed in pseudoword-locked −200 to 800 ms epochs. ERPs were computed using the 200 ms period before the pseudoword onset as the baseline. ERP waveforms of the two groups were obtained by averaging baseline-corrected EEG segments in two conditions for testing the learning effect (learned, unlearned) and in three conditions for testing the emotion effect (disgust, sadness and neutrality).
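The epoching and baseline correction described above can be sketched in a few lines of NumPy. This is an illustration only, not the EEGLAB pipeline used in the study: the function name, the synthetic data and the onset indices are hypothetical, and applying the ±80 μV criterion to baseline-corrected epochs is an assumption about the order of operations.

```python
import numpy as np

FS = 250                  # Hz, sampling rate after downsampling
PRE_S, POST_S = 0.2, 0.8  # epoch limits relative to pseudoword onset
REJECT_UV = 80.0          # absolute amplitude threshold in microvolts

def epoch_and_baseline(data, onsets, fs=FS):
    """Cut onset-locked epochs and subtract the prestimulus mean.

    data   : array (n_channels, n_samples), in microvolts
    onsets : sample indices of pseudoword onsets
    returns: array (n_kept_epochs, n_channels, n_times)
    """
    pre, post = int(round(PRE_S * fs)), int(round(POST_S * fs))
    epochs = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > data.shape[1]:
            continue  # skip epochs that run past the recording edges
        segment = data[:, onset - pre:onset + post].astype(float)
        baseline = segment[:, :pre].mean(axis=1, keepdims=True)
        segment = segment - baseline
        if np.abs(segment).max() > REJECT_UV:
            continue  # discard epochs exceeding the threshold in any channel
        epochs.append(segment)
    if not epochs:
        return np.empty((0, data.shape[0], pre + post))
    return np.stack(epochs)

# Synthetic demonstration data (random noise, not real EEG).
rng = np.random.default_rng(0)
data = rng.normal(0.0, 5.0, size=(64, 10 * FS))
data[3, 1200:1210] += 500.0  # inject an artifact into the third epoch
onsets = [300, 700, 1150, 1800]

epochs = epoch_and_baseline(data, onsets)
erp = epochs.mean(axis=0)    # condition average: (n_channels, n_times)
print(epochs.shape, erp.shape)
```

In this toy run the third epoch contains the injected artifact and is dropped, so three of the four epochs enter the average.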
We conducted a Monte Carlo power simulation (100 iterations) using the lme4 package in R. The simulation indicated that a total sample size of 40 provided a power of 0.99 to detect the learning effect. While the power to detect the emotion × language interaction was lower, our final sample of 45 (22 for L1, 23 for L2) was deemed sufficient to detect the learning of emotional meanings.
The EEG data were then analyzed using linear mixed-effects models (LMMs) implemented with the lme4 package 1.1.31 (Bates et al., Reference Bates, Mächler, Bolker and Walker2015) in the R environment (version 4.2.2; R Core Team, 2022). The selection of time windows and regions of interest (ROIs) was informed by both previous literature and a data-driven approach. Mean amplitudes for the EPN were extracted from 200 to 320 ms, a window identified via temporal permutation testing as the period of maximum differentiation between conditions (Junghöfer et al., Reference Junghöfer, Bradley, Elbert and Lang2001). The ROI (P7, P8, Pz, PO7, PO8, POz) was chosen to capture the characteristic posterior distribution of this component (Farkas et al., Reference Farkas, Oliver and Sabatinelli2020; Scott et al., Reference Scott, O’Donnell, Leuthold and Sereno2009). For the LPC, a window of 400–700 ms was selected, representing the sustained evaluative phase of word processing (Citron, Reference Citron2012). The ROI included centro-parietal electrodes (CP3, CP4, CPz, P3, Pz, P4), which align with the typical scalp distribution for late positive potentials associated with memory and affective evaluation (Tang et al., Reference Tang, Li, Fu, Wang, Li, Parviainen and Kärkkäinen2025). These mean amplitude values served as the dependent variable in the LMMs. All models were fit using the “bobyqa” optimizer to aid convergence. P-values were derived using the Satterthwaite approximation for degrees of freedom. Average amplitudes of the observed components were predicted by the fixed effects of group (L1, L2) and either learning (learned, unlearned) or emotion (disgust, sadness, neutrality), depending on the specific analysis. For the effect of learning, all learned pseudowords were collapsed and compared to unlearned items. For the effect of emotion, we analyzed only the learned pseudowords to examine patterns of the three emotion categories.
The interaction between group and the respective experimental factor was also included in the analysis. The models included crossed random intercepts for participants and for the selected channels. We used the hypr package 0.2.3 (Lenth, Reference Lenth2019; Schad et al., Reference Schad, Vasishth, Hohenstein and Kliegl2020) to design contrast matrices, in which categorical variables were encoded with sequential difference contrasts for both two-level (1/2, −1/2) and three-level (1/3, −1/3) factors. Maximal random structure (Brauer & Curtin, Reference Brauer and Curtin2018; Scandola & Tidoni, Reference Scandola and Tidoni2021) was considered by employing Complex Random Intercepts (CRIs; Scandola & Tidoni, Reference Scandola and Tidoni2021). In the present study, we applied full-CRI modeling, in which random slopes with interactions were replaced by different intercepts for each random term.
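To make the dependent variable concrete, the sketch below shows one way the mean amplitudes entering the LMMs could be extracted from an averaged ERP, using the EPN and LPC windows and ROIs named in this section. The function names are hypothetical, the window is treated as half-open (end sample excluded) by assumption, and one value per channel is returned on the assumption that channel served as a random factor.

```python
import numpy as np

FS = 250               # Hz
EPOCH_START_MS = -200  # first sample of the epoch relative to onset
WINDOWS_MS = {"EPN": (200, 320), "LPC": (400, 700)}
ROIS = {
    "EPN": ["P7", "P8", "Pz", "PO7", "PO8", "POz"],
    "LPC": ["CP3", "CP4", "CPz", "P3", "Pz", "P4"],
}

def window_slice(start_ms, end_ms, fs=FS, epoch_start_ms=EPOCH_START_MS):
    """Convert a latency window into sample indices within the epoch."""
    def to_idx(ms):
        return int(round((ms - epoch_start_ms) * fs / 1000))
    return slice(to_idx(start_ms), to_idx(end_ms))

def mean_amplitudes(erp, ch_names, component):
    """Return {channel: mean amplitude} for one component's ROI and window."""
    sl = window_slice(*WINDOWS_MS[component])
    return {
        ch: float(erp[ch_names.index(ch), sl].mean())
        for ch in ROIS[component]
    }

# Toy ERP whose amplitude equals its latency in ms, so results are easy to verify.
ch_names = sorted(set(ROIS["EPN"]) | set(ROIS["LPC"]))
times_ms = np.arange(250) * 1000 / FS + EPOCH_START_MS
erp = np.tile(times_ms, (len(ch_names), 1))

epn = mean_amplitudes(erp, ch_names, "EPN")
lpc = mean_amplitudes(erp, ch_names, "LPC")
print(epn["Pz"], lpc["Pz"])  # 258.0 548.0
```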
2.3.4. Source localization analysis
Source localization was carried out using Brainstorm. EEG data were co-registered with a standard anatomical template (ICBM152), and boundary element head models were constructed with OpenMEEG (Pascual-Marqui, Reference Pascual-Marqui2002). Brain activation for each experimental condition was estimated using sLORETA (Pascual-Marqui, Reference Pascual-Marqui2002) with unconstrained source orientations. Differences between each pair of experimental conditions were calculated for each participant, and ROIs were then created from sets of neighboring solution points showing significant differences (p < .05). Mean absolute values of the current densities in the ROIs were then exported for ANOVAs or t-tests.
3. Results
3.1. Behavioral results
3.1.1. Learning session
Participants in both groups completed four tests in the learning session, namely, the pseudoword selection test, the pseudoword matching test, the within-modality generalization test and the cross-modality generalization test. Results of the pseudoword selection and matching tests are presented in the Supplementary Materials. Results of the two generalization tests are reported here because they provide the critical evidence for whether the newly learned pseudowords had truly acquired transferable emotional connotations beyond the specific face–word pairings of the training phase (Gu et al., Reference Gu, Liu, Wang, de Vega and Beltrán2022).
For the within-modality generalization test, independent-samples t-tests revealed no significant differences between the L1 and L2 groups for any of the three emotions, with Bayesian t-tests suggesting anecdotal evidence for no group difference for disgust and neutrality, and moderate evidence for sadness (disgust: t(43) = 1.66, p = .105, BF 01 = 1.14; neutrality: t(43) = .28, p = .782, BF 01 = 2.81; sadness: t(43) = .68, p = .498, BF 01 = 3.29). A one-way repeated-measures Bayesian ANOVA revealed anecdotal evidence for an emotion effect (F(2, 42) = 3.32, p = .046, η2p = .14, BF incl = 1.78) for the L1 group. Pairwise post hoc comparisons showed anecdotal evidence for equivalence between disgust and sadness (t(21) = 1.16, p = .782, BF 10,U = 0.50) and between disgust and neutrality (t(21) = 1.42, p = .512, BF 10,U = 0.73), but moderate evidence for a difference between sadness and neutrality (t(21) = 2.66, p = .044, BF 10,U = 7.56). For the L2 group, evidence for an emotion effect was anecdotal (F(2, 44) = 4.44, p = .018, η2p = .17, BF incl = 1.22) with post hoc comparisons showing moderate evidence for equivalence between disgust and sadness (t(22) = 0.65, p = 1.000, BF 10,U = 0.32) and anecdotal evidence for a difference between disgust and neutrality (t(22) = 2.87, p = .027, BF 10,U = 2.87) and between sadness and neutrality (t(22) = 2.21, p = .114, BF 10,U = 1.80).
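The verbal labels attached to the Bayes factors can be reproduced with a simple lookup. The sketch below assumes the widely used cutoffs of 3, 10, 30 and 100; the text does not state which classification scheme was applied, so the thresholds, the handling of boundary values and the function name are all assumptions.

```python
def evidence_label(bf):
    """Label a Bayes factor expressed in favor of the hypothesis of interest.

    Assumed cutoffs: 1-3 anecdotal, 3-10 moderate, 10-30 strong,
    30-100 very strong, above 100 extreme. Values below 1 are inverted
    and read as evidence against the hypothesis.
    """
    if bf <= 0:
        raise ValueError("Bayes factors must be positive")
    favored = "for" if bf >= 1 else "against"
    strength = bf if bf >= 1 else 1.0 / bf  # fold BF < 1 onto the same scale
    if strength == 1:
        return "no evidence"
    for upper, label in [(3, "anecdotal"), (10, "moderate"),
                         (30, "strong"), (100, "very strong")]:
        if strength < upper:
            return f"{label} evidence {favored}"
    return f"extreme evidence {favored}"

# Values taken from the paragraph above.
print(evidence_label(7.56))  # moderate evidence for
print(evidence_label(0.32))  # moderate evidence against
print(evidence_label(1.78))  # anecdotal evidence for
```

Under these assumed cutoffs, a BF 10 of 0.32 corresponds to a BF 01 of about 3.1, which is why it is read as moderate evidence for equivalence.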
For the cross-modality generalization test, between-group comparisons suggested anecdotal evidence for better performance of the L1 group than the L2 group in the sadness condition (t(43) = 2.29, p = .027, BF 10 = 2.32), but no group differences for disgust trials (t(43) = .66, p = .516, BF 01 = 2.85) or neutral trials (t(43) = 1.23, p = .227, BF 01 = 1.86). The Bayesian ANOVA performed for the L1 group provided extreme evidence for an emotion effect (F(2, 42) = 9.69, p < .001, η2p = .32, BF incl = 170.28). Post hoc comparisons showed strong evidence for differences between disgust and sadness (t(21) = 3.07, p = .017, BF 10,U = 13.37) and between disgust and neutrality (t(21) = 3.57, p = .005, BF 10,U = 34.87), whereas anecdotal evidence was shown for equivalence between sadness and neutrality (t(21) = 0.62, p = 1.000, BF 10,U = 0.35). For the L2 group, the Bayesian ANOVA revealed anecdotal evidence against an emotion effect (F(2, 44) = 2.25, p = .130, η2p = .09, BF incl = 0.63).
3.1.2. Refreshing task
In the refreshing task on the second day, for the within-modality generalization test, independent-samples t-tests showed no significant between-group differences for any emotion, with Bayesian t-tests suggesting moderate evidence for no group difference for neutrality and anecdotal evidence for disgust and sadness (disgust: t(43) = 1.25, p = .218, BF 01 = 1.81; neutrality: t(43) = .29, p = .772, BF 01 = 3.28; sadness: t(43) = 1.55, p = .128, BF 01 = 1.30). For the L1 group, the Bayesian ANOVA showed no evidence for or against an emotion effect (F(2, 42) = 3.02, p = .075, η2p = .13, BF incl = 1.00), whereas for the L2 group, the analysis provided extreme evidence for an emotion effect (F(2, 44) = 16.87, p < .001, η2p = .43, BF incl = 2487.27). Post hoc comparisons showed extreme evidence for differences between disgust and sadness (t(22) = 4.59, p < .001, BF 10,U = 142.23) and between neutrality and sadness (t(22) = 4.25, p = .001, BF 10,U = 100.37), whereas there was moderate evidence for equivalence between disgust and neutrality (t(22) = 0.00, p = 1.000, BF 10,U = 0.29).
For the cross-modality generalization test, independent-samples t-tests revealed no significant group differences for any emotion, with Bayesian t-tests suggesting moderate evidence for no group difference for disgust (t(43) = .09, p = .927, BF 01 = 3.38) and neutrality (t(43) = .40, p = .689, BF 01 = 3.18), and anecdotal evidence for sadness (t(43) = 1.00, p = .323, BF 01 = 2.27). The Bayesian ANOVA provided anecdotal evidence for the absence of an emotion effect in the L1 group (F(2, 42) = 2.75, p = .076, η2p = .12, BF incl = 0.51) and moderate evidence for its absence in the L2 group (F(2, 44) = 1.28, p = .281, η2p = .06, BF incl = 0.29). The detailed statistical data of the tests are presented in Figure 2.
(A) Mean accuracy for the within-modality and cross-modality generalization tests for the L1 and L2 groups, separately for disgust, sadness and neutrality conditions. (B) Mean accuracy for the within-modality and cross-modality generalization tests of the refreshing task for the L1 and L2 groups, separately for disgust, sadness and neutrality conditions.

3.1.3. ERP behavioral task
Behavioral performance in the lexical decision task was evaluated by computing hit rates for target stimuli (vehicle names) and false alarm rates for nontarget stimuli (all other words).
Hit Rate: L1 speakers achieved a mean hit rate of .98 (SD = .03, n = 22), while L2 speakers achieved .96 (SD = .06, n = 23). An independent-samples t-test indicated no significant group difference, t(43) = 1.55, p = .131.
False Alarm Rate: L1 speakers produced a mean false alarm rate of .002 (SD = .004), whereas L2 speakers showed a higher rate of .008 (SD = .008). This difference was statistically significant, t(43) = −3.31, p = .002.
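The two rates defined above can be computed per participant as sketched below. The function name and the response pattern are invented for illustration; the counts of 64 target and 160 nontarget trials follow from the 224-trial design of the ERP session.

```python
def detection_rates(is_target, responded):
    """Compute hit and false alarm rates for a respond-to-target task.

    is_target : list of bools, True when the stimulus was a vehicle name
    responded : list of bools, True when the participant pressed the key
    """
    if len(is_target) != len(responded):
        raise ValueError("inputs must have the same length")
    n_targets = sum(is_target)
    n_nontargets = len(is_target) - n_targets
    hits = sum(t and r for t, r in zip(is_target, responded))
    false_alarms = sum((not t) and r for t, r in zip(is_target, responded))
    return hits / n_targets, false_alarms / n_nontargets

# Hypothetical participant: one missed target and two false alarms.
is_target = [True] * 64 + [False] * 160
responded = [True] * 63 + [False] * 1 + [True] * 2 + [False] * 158

hit_rate, fa_rate = detection_rates(is_target, responded)
print(hit_rate, fa_rate)  # 0.984375 0.0125
```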
3.1.4. ERP results
To compare the two language groups, a 2 (Learning: Learned, Unlearned) × 2 (Group: L1, L2) LMM was performed for testing the learning effects, and a 3 (Emotion: Disgust, Sadness, Neutrality) × 2 (Group: L1, L2) LMM for the emotion effects. Hypothesis-driven contrasts were set using the hypr package, with Unlearned, Neutrality and L2 serving as the reference levels for their respective factors. The model included random intercepts for Participants and Channels. The detailed statistical output is presented in Tables 5 and 6.
LMM results for learning effects (EPN, LPC)

Note: The p-values for contrasts showing significant differences were marked in bold.
LMM results for emotion effects (EPN, LPC)

Note: The p-values for contrasts showing significant differences were marked in bold.
3.1.5. Learning effects
EPN (200–320 ms). The analysis revealed a significant main effect of Learning, with unlearned pseudowords eliciting larger EPN amplitudes (Est. = 0.42, SE = 0.16, t = 2.59, p = .010). No main effect of Group was found (p = .929).
LPC (400–700 ms). There was a significant main effect of Group in the later time window, indicating larger LPC amplitudes for the L2 group than the L1 group (Est. = −1.99, SE = 0.79, t = −2.53, p = .012). There was no main effect of Learning (p = .097; see Figure 3 for waveforms, mean amplitudes and topographies).
(A) EPN waveforms at the parietal-occipital ROI (P7, P8, Pz, PO7, PO8, POz) and LPC waveforms at the central-parietal ROI (CP3, CP4, CPz, P3, Pz, P4) showing learning effects for L1 and L2 groups. Gray shadows in each waveform figure mark the interval of interest and the shaded ribbons around waveforms represent standard errors. (B) Mean amplitudes of the EPN and LPC components showing learning effects for L1 and L2 groups. (C) Topographical distributions of the EPN and LPC components for learned versus unlearned pairwise differences of the two groups.

3.1.6. Emotion effects
EPN (200–320 ms). A significant main effect of emotion was found, with larger EPN amplitudes for disgusting than for sad pseudowords (Est. = −0.57, SE = 0.21, t = −2.65, p = .008). However, no significant differences were found between the two sets of emotional pseudowords and neutral pseudowords (p = .252).
LPC (400–700 ms). A main effect of emotion was also observed in the later time window, with significant differences between emotional and neutral pseudowords: the LPC was enhanced for emotional pseudowords compared to neutral ones (Est. = −0.61, SE = 0.27, t = −2.27, p = .024). Moreover, a main effect of Group was observed, with the L2 group eliciting larger LPC amplitudes than the L1 group (Est. = −1.76, SE = 0.76, t = −2.31, p = .021). There was no significant difference in LPC amplitudes between disgusting and sad pseudowords (p = .184; see Figure 4 for waveforms, mean amplitudes and topographies).
(A) EPN waveforms at the parietal-occipital ROI (P7, P8, Pz, PO7, PO8, POz) and LPC waveforms at the central-parietal ROI (CP3, CP4, CPz, P3, Pz, P4) showing emotion effects (Disgust, Sadness, Neutrality) separately for the L1 and L2 groups. Gray shadows in each waveform figure mark the interval of interest and the shaded ribbons around waveforms represent standard errors. (B) Mean amplitudes of the EPN and LPC components showing emotion effects for L1 and L2 groups. (C) Topographical distributions of the EPN and LPC components for pairwise differences between emotional conditions of the two groups.

3.1.7. Source localization results
Source localization was performed to further explore the main effects of emotion and group (L1/L2). No significant differences were observed for the emotion effects in either the EPN or the LPC time window. There were, however, two clusters of differential activation for the main effect of group in the 400–700 ms time window (see Figure 5). The L2 group in general showed greater current densities than the L1 group in the left (t(43) = 3.15, p = .003) and right (t(43) = 3.57, p < .001) hemispheres. More specifically, these clusters covered the lateral orbitofrontal cortex, inferior temporal gyrus, fusiform gyrus, parahippocampal gyrus, isthmus cingulate gyrus, precuneus, lateral occipital cortex and lingual gyrus.
(A) The cortex demonstrating significant effects in the 400–700 ms time window. (B) Mean current densities for the L1 and L2 groups in the two clusters.

4. Discussion
The present study set out to examine how newly learned emotional meanings influence word processing in first- (L1) and second-language (L2) speakers. Behaviorally, both groups demonstrated successful learning and generalization of the novel emotion–word associations across tasks. While both native and L2 speakers encountered distinct behavioral hurdles in generalizing emotional meanings, their performance eventually converged, although L2 learners exhibited higher neural control demands. Consistent with our hypotheses, ERP findings revealed that while unlearned pseudowords triggered early novelty responses, emotional pseudowords elicited enhanced late activity and disgusting ones uniquely showed stronger early neural differentiation than sad ones despite no initial emotion-neutral differences. Source localization analyses further indicated that L2 learners recruited broader neural networks relative to L1 speakers. Overall, the results confirm that associative learning modifies both early and late ERP responses, and that while L1 and L2 speakers show parallel patterns in the encoding of emotional content, L2 speakers engage additional resources to achieve comparable performance.
5. Interpretation of behavioral results
Successful generalization to novel faces and sentences indicated genuine emotional associations rather than rote memorization. In the cross-modality test, anecdotal evidence suggested that L1 participants outperformed L2 learners in the sadness condition, supporting the view that L2 speakers may face challenges with emotionally salient negative words (Caldwell-Harris, Reference Caldwell-Harris2014; Pavlenko, Reference Pavlenko2012). For the L1 group, moderate evidence for higher within-modality generalization accuracy for neutral over sad words suggests cognitive load costs that disgust overcame via survival salience (Rozin & Haidt, Reference Rozin and Haidt2013). However, strong evidence emerged suggesting L1 speakers struggled to abstract sensory-visceral disgust into sentential contexts compared to more social sadness and plain neutrality (Rozin et al., Reference Rozin, Haidt, McCauley, Lewis, Haviland-Jones and Barrett2008). Conversely, anecdotal evidence suggested that L2 learners showed difficulties with new disgusted and sad faces compared to neutral ones, which may reflect a complex interplay between the emotional characteristics of the stimuli, their conceptual categorization (Barrett, Reference Barrett2006) and the reliance on specific cues to support generalization (Vigliocco et al., Reference Vigliocco, Meteyard, Andrews and Kousta2009). However, the cross-modality performance of the L2 group statistically leaned toward a null effect of emotion, suggesting the transition from embodied face-word associations to abstract sentential contexts was neurally and cognitively taxing, such that no particular emotion effectively facilitated generalization (Harris et al., Reference Harris, Ayçiçeği and Gleason2003).
In the refreshing task on the following day, as suggested by anecdotal to moderate evidence, high accuracy persisted and group differences vanished, indicating that consolidation narrowed the L1–L2 gap (Ferré et al., Reference Ferré, Guasch, Martín and Hinojosa2017). Within-group analyses revealed extreme evidence that L2 participants still showed lower accuracy for sad than disgusting and neutral trials with new faces, indicating that while salient disgust stabilized, socially complex sadness remained harder to retain in the L2 lexicon. Meanwhile, both groups showed evidence for the absence of an emotion effect during cross-modality generalization. This suggests that consolidation neutralized the L1 group’s struggle with sentential disgust abstraction, and for the L2 group, the comparable facilitation effects of emotions persisted.
Overall, late bilinguals can learn and maintain relatively stable emotional meanings for new L2 words, although challenges in the early stage, particularly with sadness, require reinforcement to overcome (Jończyk et al., Reference Jończyk, Naranowicz, Bel-Bahar, Jankowiak, Korpal, Bromberek-Dyzman and Thierry2025). This behavioral foundation sets the stage for examining whether underlying neural responses converge or utilize distinct processing routes.
6. Interpretation of the learning effect
The finding that unlearned pseudowords elicited more negative-going waveforms than learned pseudowords in the early time window indicates that lexical familiarity and meaningfulness play a major role in early word processing. EPNs in the 200–300 ms range reflect heightened perceptual processing of salient or meaningful visual inputs, and they are sensitive not only to affective salience but also to stimulus meaningfulness and novelty (Citron, Reference Citron2012; Kissler et al., Reference Kissler, Assadollahi and Herbert2009; Martín-Loeches, Reference Martín-Loeches2007; Schupp & Kirmse, Reference Schupp and Kirmse2021). In the current paradigm, unlearned pseudowords were the least familiar and thus the most contextually unexpected items, increasing early perceptual demands and producing stronger negativity relative to newly learned items that require less early perceptual amplification, as predicted by hypothesis (a). This interpretation aligns with previous research showing that posterior negativities attenuate as new word forms are consolidated into memory (Bakker et al., Reference Bakker, Takashima, van Hell, Janzen and McQueen2015). It is important to distinguish this early posterior effect from the classical P3a often associated with novelty detection in oddball paradigms (Polich, Reference Polich2007). While the P3a is a fronto-central positive-going component that reflects involuntary attention to perceptual deviance, our observed effect was a parietal-occipital negativity, suggesting that the novelty in the present design was not a simple perceptual surprise but rather a novelty of lexicality and meaningfulness. Therefore, the EPN and RP complex is a more appropriate framework than the P3a as it captures the transition from visual analysis to early form-meaning mapping (Citron, Reference Citron2012).
The group effect in the LPC (400–700 ms) window suggests that, even after learning, processing L2 pseudowords required more sustained elaborative resources. The LPC is often linked to postretrieval semantic evaluation and controlled processing (Bakker et al., Reference Bakker, Takashima, van Hell, Janzen and McQueen2015; Kissler et al., Reference Kissler, Assadollahi and Herbert2009). Larger LPC amplitudes in the L2 group thus reflect more effortful semantic retrieval and affective evaluation of the newly learned words. This supports Hypothesis (c): L2 speakers allocate more cognitive control or working memory to newly learned content. In ERP word-learning studies, a larger LPC for L2 or less familiar conditions has been attributed to less automatic access of meaning and greater reanalysis. For example, Cieślicka and Guerrero (Reference Cieślicka and Guerrero2023) and Jankowiak and Korpal (Reference Jankowiak and Korpal2018) found that for highly proficient Spanish–English bilinguals, LPC amplitudes were larger when words appeared in L2, suggesting greater processing load. Our source analyses buttress this interpretation: L2 learners showed higher current densities in broad brain regions (orbitofrontal cortex, temporal and occipital gyri, precuneus, etc.) known to support semantic memory, imagery and cognitive control. These distributed activations in L2 mirror the prediction of Hypothesis (d) that L2 would recruit a more extensive semantic or emotional network. In short, the learning effect here is twofold: a classic novelty effect in early attention (larger EPN elicited by unlearned than learned pseudowords) and a sustained semantic elaboration (larger LPC amplitudes for the L2 than the L1 group). Together, these patterns underscore that associative word learning quickly modulates early ERPs and that L2 processing remains more demanding even after fast learning.
7. Emotion and L1/L2 main effects
We observed multiple effects of emotional content on ERPs. First, in the EPN time window, disgusting pseudowords elicited more negative responses than sad pseudowords. This pattern was observed despite the fact that the faces used for learning these emotional meanings were matched for arousal. This indicates that learned disgusting meanings captured attention more strongly at an early perceptual stage than sad ones. Disgust is often considered a primal, survival-related emotion, with dedicated neural circuitry (e.g., insula activation) and potent sensory signals (Gu et al., Reference Gu, Liu, Wang, de Vega and Beltrán2022). The greater EPN for disgust may reflect its unique biological salience and the urgency of the sensory avoidance response it triggers. In contrast, sad pseudowords elicited more positive responses than their disgusting counterparts, suggesting that sadness, a more diffuse emotion primarily associated with loss and social rejection (Izard & Buechler, Reference Izard, Buechler, Plutchik and Kellerman1980; Lench et al., Reference Lench, Flores and Bench2011), required less immediate attentional prioritization. This emotion-specific EPN pattern is partly consistent with Gu et al. (Reference Gu, Liu, Wang, de Vega and Beltrán2022), who found that sad pseudowords elicited reduced EPN relative to disgusting and neutral ones. In that study, disgust and neutrality yielded similar EPNs, whereas sadness produced smaller negativity, echoing our results here. Previous ERP research reported that emotional words in general enhance the EPN (Kissler et al., Reference Kissler, Assadollahi and Herbert2009; Palazova et al., Reference Palazova, Mantwill, Sommer and Schacht2011). However, the comparable EPN amplitudes between emotional and neutral pseudowords suggest that the emotional automaticity of newly learned words may not be as robust as that of established emotional words. 
The emotional significance only emerged during the subsequent LPC time window, which reflects more effortful, sustained semantic and affective evaluation. In sum, the present findings suggest that disgusting stimuli engaged early attentional resources more strongly than sadness-related ones, consistent with the view that different negative emotions can leave distinct signatures in early perceptual processing (Kissler et al., Reference Kissler, Assadollahi and Herbert2009; Schacht & Sommer, Reference Schacht and Sommer2009). Meanwhile, the emotional dissociation between EPN and LPC suggests that while participants successfully learned the pseudowords as emotional, they had not yet achieved the automaticity of well-established emotional lexicons.
Second, at the LPC stage, we found a significant emotion effect: both disgusting and sad pseudowords evoked larger LPCs than their neutral counterparts. Existing empirical evidence suggests that emotional stimuli (regardless of valence) consistently enhance the late positivity in word and picture paradigms (Kissler et al., Reference Kissler, Assadollahi and Herbert2009). The LPC reflects sustained evaluative and integrative processing, indexing the depth of semantic elaboration and memory updating (Bakker et al., Reference Bakker, Takashima, van Hell, Janzen and McQueen2015), which corresponds to our results: newly learned emotional meanings engaged extra semantic/affective processing, whereas neutral words, carrying no affective tag, elicited smaller LPCs. This general emotional LPC effect underscores that once learning occurred, it triggered richer semantic-affective evaluation. In line with embodied semantic theories, emotional words may partially reenact bodily affective states, thereby enriching the evaluative process, reflected by the enhanced LPCs (Kissler et al., Reference Kissler, Assadollahi and Herbert2009; Niedenthal, Reference Niedenthal2007; Sharif & Mahmood, Reference Sharif and Mahmood2023). All in all, the enhanced LPC amplitudes for both categories of emotional pseudowords (disgust, sadness) can be viewed as an affective amplification of the evaluative mechanism observed in the L1/L2 learning differences. Whether driven by cognitive demand or emotional salience, larger LPCs mark more extensive postlexical semantic-affective integration.
Third, the LPC also showed a main effect of language. This mirrors the group effect already noted, but in the context of emotion analysis, it indicates that, across emotional conditions, stronger late positivities were elicited among L2 than L1 participants. Such a pattern suggests that L2 processing in general consumed more sustained evaluative resources rather than an emotion-specific enhancement. This replicates and extends prior findings that bilinguals process emotional language differently. Cieślicka and Guerrero (Reference Cieślicka and Guerrero2023) and Opitz and Degner (Reference Opitz and Degner2012) observed that the dominant language (L1) triggered larger EPNs while the nondominant (L2) triggered larger LPCs. Our study suggests a similar dissociation: although we did not see a language effect in EPN for emotion, the LPC enhancement for L2 is concordant with needing more cognitive resources when the words are in the weaker linguistic code. The source localization results further indicate that L2 speakers engaged additional neural circuits during emotional word evaluation. Taken together, the emotion and language main effects indicate that discrete emotions (especially disgust) modulate early attention, while general emotionality and L2 status modulate later semantic and affective processing.
8. Interpretation of the interaction tests
Crucially, the LMMs revealed no significant interaction between emotion and language, indicating that the L1 and L2 groups showed comparable patterns of emotion-related ERP modulations, with differences appearing primarily in the overall magnitude of the responses rather than the direction of the emotional effects. The lack of a detectable interaction suggests that the neural mechanisms of processing learned emotion meanings are functionally similar in L1 and L2, even if L2 processing is less automatic. According to the Bilingual Emotion Lexicon framework (Pavlenko, 2008), emotional detachment in L2 is often attributed to the context of acquisition. Words learned in formal settings via translation equivalents typically lack the rich, multisensory experiences that characterize L1 acquisition. However, our study employed an associative learning paradigm that paired novel words with facial expressions. The similar neural responses between the two groups suggest that such a grounded learning context allowed L2 speakers to establish strong conceptual-affective links, effectively bridging the gap between formal learning and the resonance usually reserved for naturalistic contexts (Ferré et al., 2025). This outcome also aligns with theories of neural assimilation in bilinguals: given sufficient learning and proficiency, second-language processing may recruit largely the same networks as the first language (Kroll & Bialystok, 2013). As Dang et al. (2023) observed, while L2 processing may require more sustained cognitive effort (in our case, the main effect of Group for the LPC), the fundamental semantic-affective evaluation remains consistent across languages.
Our results thus suggest that the reduced affective resonance in L2 is not absolute: even if embodied responses (e.g., skin conductance) are often weaker in L2 (Harris et al., 2003; Sharif & Mahmood, 2023), the cognitive evaluation of emotional content may converge as proficiency and learning increase.
9. Limitations and future directions
Several limitations of the present work should be acknowledged. First, our L2 participant group was heterogeneous in terms of their native languages. Cross-linguistic differences in emotion lexicons and cultural norms could affect how easily certain emotions are mapped onto new L2 words. Nevertheless, our results clearly demonstrated differential behavioral and electrophysiological responses to newly learned Chinese emotional words between native and L2 speakers. Future studies could recruit more homogeneous groups or directly compare speakers with different L1 backgrounds to further disentangle language-specific from general L2 learning effects. Additionally, the stimulus set included only two negative emotions. Other emotions such as fear, anger and happiness might yield different patterns. For instance, Briesemeister et al. (2014) showed that positive words can engage separate neural dynamics from negative ones, even in early components. Future work could expand to a wider range of discrete emotions to test the generalizability of our findings. From a methodological perspective, future studies could also incorporate real emotional words as positive controls, which would allow a direct comparison between newly learned emotional meanings and long-established affective representations in L2 vocabulary.
10. Conclusion
In conclusion, the present study shows that novel pseudowords rapidly acquire emotional meanings that both L1 and L2 speakers successfully generalize and retain across contexts. Behaviorally, L2 speakers were outperformed by L1 speakers on sadness, while within-group deficits appeared for both groups (sadness for L1 and disgust for L2) during within-modality tests. L1 speakers also uniquely struggled with cross-modal disgust abstraction. Most weaknesses diminished after consolidation, though sadness remained particularly vulnerable for L2 speakers. Electrophysiologically, unlearned items elicited early novelty-related responses, while emotional pseudowords evoked enhanced late positive activity. Notably, L2 speakers exhibited larger LPCs and broader source engagement. Disgust and sadness diverged at early stages, underscoring emotion-specific influences beyond valence. Together, these findings demonstrate that emotional L2 word learning is effective yet neurally demanding, extending embodied accounts of bilingual cognition and clarifying how emotional meaning shapes both early attention and later evaluative–semantic processing.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S1366728926101448.
Data availability statement
The data supporting the conclusions of this article will be made available by the authors upon request.
Acknowledgments
The authors would like to thank the members of the Center for Language and Cognitive Science Research of the School of Foreign Languages and the School of International Education at Dalian University of Technology.
Funding statement
This work was supported by the National Office of Philosophy and Social Sciences under the National Social Science Fund of China (22CYY019).
Competing interests
The authors have no competing interest to declare.

