The impact of orthography versus images on foreign language learning: Evidence from behavioral and neural markers

Mathew Cieśla; Efthymia C. Kapnoula; Maksym Pozdniakov; Justyna Gruszecka; Katarzyna Jankowiak

doi:10.1017/S1366728925100242

Published online by Cambridge University Press: 27 June 2025

Mathew Cieśla*: Affiliation:
Department of Psychology, Northumbria University (https://ror.org/049e6bc10), Newcastle Upon Tyne, UK; Faculty of English, Adam Mickiewicz University (https://ror.org/04g6bbq64), Poznań, Poland
Efthymia C. Kapnoula: Affiliation:
Basque Center on Cognition, Brain and Language (https://ror.org/01a28zg77), Donostia-San Sebastian, Spain; Ikerbasque – Basque Foundation for Science (https://ror.org/01cc3fy72), Bilbao, Spain
Maksym Pozdniakov: Affiliation:
Faculty of English, Adam Mickiewicz University (https://ror.org/04g6bbq64), Poznań, Poland
Justyna Gruszecka: Affiliation:
Faculty of English, Adam Mickiewicz University (https://ror.org/04g6bbq64), Poznań, Poland
Katarzyna Jankowiak: Affiliation:
Faculty of English, Adam Mickiewicz University (https://ror.org/04g6bbq64), Poznań, Poland
*: Corresponding author: Mathew Cieśla; Email: mat.ciesla@northumbria.ac.uk

Abstract

A central question in foreign language (LX) learning is how vocabulary acquisition is affected by using image versus orthographic referents. According to the picture superiority effect (PSE) and bilingual/dual coding theory (b/DCT), images should lead to better novel word encoding and retrieval. We tested this prediction using behavioral and event-related potential (ERP) measures. Thirty Polish native speakers learned 40 LX (artificial language) words using either image or L1/orthographic referents. After 24 hours, participants were tested using a translational priming paradigm in congruent and incongruent training-testing modalities. Behavioral results showed higher accuracy and faster responses for LX words learned and tested with images, in line with the PSE and b/DCT. ERP results revealed smaller Late Positive Complex (LPC) amplitudes for words preceded by image compared to lexical primes, likely reflecting less cognitively demanding lexical retrieval. These results provide converging evidence that visual referents provide a more salient modality for L2 learning.

Keywords

word learning; foreign language learning; bilingualism; ERP; learning modality

Information

Type: Research Article
Information: Bilingualism: Language and Cognition, First View, pp. 1–13

DOI: https://doi.org/10.1017/S1366728925100242
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices: Open data Open materials
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

With the constantly increasing number of foreign language (LX) learners worldwide, the significance of effective language learning and teaching has recently received much scholarly attention, both from a theoretical and an applied perspective. Among the different aspects of LX development, vocabulary learning stands as a fundamental pillar, encompassing grasping word meaning, spelling, and pronunciation, which collectively facilitate effective communication (Krepel et al., 2021; Wilkins, 1972). Yet, there is still much that remains unclear regarding the underlying mechanisms and corresponding neural markers of LX word learning, as well as how the speed and efficiency of this process might be modulated by learning modality (visual depiction vs. orthography).

Indeed, there is evidence that word learning outcomes are better when words are presented together with an image referent than without it (Leach & Samuel, 2007), and even more so when the image is of a familiar object than an unfamiliar one (Havas et al., 2018). A question that is particularly relevant for vocabulary teaching is whether and how using different input modalities to present the meaning of a novel word may affect learning. For example, the two main modalities traditionally employed in educational settings are orthography (i.e., the L1 translation equivalent of the novel word) and visual depiction (i.e., an image depicting the word’s referent); however, it is still unclear which of these modalities leads to better word learning. Therefore, in addition to its theoretical significance, directly comparing these two input modalities can also have important practical implications for foreign language teaching.

Although there are a few studies that have addressed this question (e.g., Emirmustafaoğlu & Gökmen, 2015; Liu et al., 2021), to our knowledge, there is no study that has directly compared these two training regimes (orthographic-referent-only vs. image-referent-only) by assessing word learning outcomes under congruent (matching learning and testing modalities) versus incongruent (mismatched) conditions. Furthermore, we aim to address this question using both behavioral and neural markers, which can provide a more comprehensive understanding of how well novel words are integrated. In summary, here we assess the efficiency and generalizability of using orthographic versus image-based training input in LX word learning using both behavioral (accuracy and reaction times) and neural markers of lexical integration, as reflected in event-related potential (ERP) patterns.

1.1. Orthography and image effects on L2 learning

It is well established that images are more easily remembered and recalled than words (Altarriba & Knickerbocker, 2011; Defeyter et al., 2009; Hockley & Bancroft, 2011; Stenberg, 2006; Stenberg et al., 1995). Yet, the underlying mechanisms that drive this picture superiority effect (PSE) are not fully understood. The dual coding theory (DCT; Paivio, 1971) and the bilingual dual coding theory (bDCT; Paivio & Desrochers, 1980) hypothesize that images are encoded into both the verbal and image systems, while words are encoded only into the verbal system. As a result, images provide a more robust method for retrieval, which results in faster and more accurate recall of image stimuli than words. To this end, both theories largely describe the unique, perceptual information and more direct semantic access that images provide, allowing for richer memory encoding and subsequently, more robust retrieval and recall.

In addition to studies looking at familiar word recognition, the PSE is also observed in studies of novel word learning. These studies suggest that using images in training promotes faster recognition and retrieval of newly learned words, as compared to using words (Liu et al., 2021) or iconic gestures (Morett, 2019). For example, Liu et al. (2021) taught 30 Japanese (L1) speakers novel Chinese (L2) words using pictures (CP; 苹果 [CH: apple] + picture of an apple), or L1 (Japanese) words (CJ; 苹果 [CH: apple] + りんご [JA: apple]). After the learning session, participants were tested in three recognition tasks. Here, participants were shown the new L2 (Chinese) word with three pictures (CP condition) or three L1 (Japanese) words (CJ condition) and had to decide which referent corresponded to the new L2 word. In all three recognition tasks, participants were faster to respond to L2 words learned with images than to those learned with L1 words. These results were interpreted in support of the PSE in L2 associative learning. Emirmustafaoğlu and Gökmen (2015) observed similar effects of the PSE in Turkish (L1) students of English (L2). For this study, 75 participants were trained on unknown L2 words using either images or their L1 word equivalent. In both conditions, the new L2 vocabulary item was read aloud by the experimenter to aid in retention. Participants were tested both immediately and one week after training, whereby they were presented with the referent (image or L1 word) and instructed to write down the corresponding L2 word. In both post-tests, participants who had learned the L2 items using images showed better recall performance than those who learned via L1 words, further supporting the PSE in L2 learning.

Moreover, given that images are thought to provide a more salient modality for word learning, one may predict that learning with images may also lead to better transfer of new items to orthography than the other way around. Preliminary evidence for this comes from work with familiar words. Stenberg et al. (1995) found evidence for this in four experiments. In all experiments, participants were shown familiar L1 words and an equal number of images and were instructed to remember as many items as possible. Specifically, participants were presented with isolated (either word or image) stimuli (Experiments 1–3), or stimulus pairs (picture plus its orthographic L1 label; Experiment 4) and were then tested using a recognition post-test (each item always presented in one modality on each test trial). In Experiments 1–3, participants were tested on the recognition of the trained items (20 images and 20 words) both congruently (i.e., trained and tested with an image) and incongruently (i.e., trained with an image but tested with the L1 word). Results showed that stimuli learned and tested with images were responded to more quickly and accurately than those learned and tested with words, supporting the PSE. However, the positive effects of images disappeared when items were learned with an image and tested with an L1 word. Crucially, in Experiment 4, participants were faster to respond to image stimuli than words in testing, despite being presented in both modalities in training, supporting the bilingual/dual coding theory (b/DCT). Overall, the results of Stenberg et al. (1995) align with the b/DCT and PSE, indicating that when images are used in training, both perceptual (e.g., color, shape) and conceptual (semantic) information is stored in memory, consequently leading to better recognition (Stenberg, 2006).

More generally, the findings of Stenberg et al. (1995) are also supported by transfer-appropriate processing (Craik & Lockhart, 1972; Weldon & Roediger, 1987), where it is not just the modality used at retrieval, but also the relationship (i.e., congruence) between initial encoding and retrieval that affects recognition performance. Given these interpretations, a similar generalization advantage for image-based over orthography-based training may also be expected in novel word learning. Alternatively, it is also possible that in the case of novel word learning, the encoding of new information may increase the cognitive resources required to complete the task, which may, in turn, change the way in which a given effect (e.g., PSE) plays into this process (for a similar case, where a facilitatory effect in familiar word retrieval becomes detrimental in novel word learning, see Baese-Berk et al., 2025). To our knowledge, no study has directly tested this. Addressing this gap in the literature was the main goal of this study.

1.2. Using event-related potentials (ERPs) to track novel word learning

The process of word learning has been previously examined with the use of ERPs, which can uncover neural changes accompanying word learning (see Jankowiak, 2021 for a review). To investigate knowledge consolidation during LX word learning, ERP research has focused on the integration of newly acquired words with existing lexical items in the mental lexicon (Borovsky et al., 2012), as assessed through, for instance, the translation priming paradigm (e.g., Bakker et al., 2015; McLaughlin et al., 2004). In this task, an existing (L1) word is followed by a newly learned translation equivalent (in LX), with the neural response to the second word reflecting the degree to which the LX word has been incorporated into the learner’s mental lexicon.

Two ERP components have been identified as indicators of neural changes during the learning process. Firstly, the N400 component, peaking at approximately 400 ms after stimulus onset, serves as an index of lexico-semantic access (Jankowiak & Rataj, 2017; Kutas & Federmeier, 2011). In the context of word learning, the N400 response reflects the successful integration of newly acquired words into the lexico-semantic memory. Studies employing the translation priming paradigm have observed attenuated N400 responses to newly learned words when preceded by their translation equivalents compared to unrelated words (e.g., Pu et al., 2016; Yum et al., 2014; Zhang et al., 2018). Secondly, the Late Positive Complex (LPC), observed at 600–800 ms after stimulus presentation, marks processes related to meaning integration and re-analysis (Aurnhammer et al., 2023; Kolk & Chwilla, 2007). In the context of language learning, the LPC has been associated with episodic memory retrieval (Rugg & Curran, 2007).

The use of ERPs has provided valuable insights regarding the trajectory of novel word learning. For example, despite the evidence that new word-forms can be fully functional immediately after learning (e.g., Kapnoula et al., 2015; Kapnoula & McMurray, 2016), there is also a substantial amount of work showing that word learning is significantly boosted by a consolidation period that strengthens neocortical connections, facilitating the integration of newly learned words into the mental lexicon (Davis & Gaskell, 2009; McClelland et al., 1995; for a review see Palma & Titone, 2021). Indeed, prior ERP research has examined the effect of time and/or off-line consolidation on the integration of newly acquired words within the mental lexicon. For instance, McLaughlin et al. (2004) observed lexical-driven post-learning N400 modulations as early as 14 hours following classroom instruction, while semantics-driven N400 effects were observed 63 hours after learning. In their study, lexical-driven N400 effects were identified by comparing ERP responses to real words versus pseudowords, reflecting learners’ sensitivity to word form. In contrast, semantics-driven effects were measured by comparing responses to target words preceded by either semantically related or unrelated primes, indexing emerging semantic integration. This suggests a relatively faster consolidation of word form compared to word meaning. To examine both lexical- and semantics-driven ERP changes, Bakker et al. (2015) implemented a word train-and-test regime that included a 24-hour consolidation period. Their findings revealed automatic lexico-semantic access to both existing and newly acquired words (reflected in the N400 response); however, the retrieval of newly acquired words proved more cognitively demanding relative to existing words (indicated by the LPC response). Such patterns point to a gradual consolidation of newly learned words into the mental lexicon, whereby newly learned words are automatically retrieved from lexico-semantic memory, yet their processing may not reach the same degree of automaticity as that of relatively frequent, well-known words within the 24-hour post-learning period.

Based on the above, it becomes clear that ERP components like the N400 and the LPC can offer insights into aspects of novel word retrieval and lexico-semantic access beyond those provided by behavioral measures. Thus, in the present study, we included ERP measures in our testing regime to address our primary goal of comparing picture-based and orthography-based novel word learning.

1.3. Using a constructed language to study naturalistic LX word learning

The majority of ERP studies conducted thus far have focused on language learning using either pseudowords (Batterink & Neville, 2011; McLaughlin et al., 2004; Zhang et al., 2020) or words from a language previously unknown to participants (Pu et al., 2016; Yum et al., 2014). The first approach (using pseudowords) may lead to results that are difficult to generalize to the real world due to participants being aware of the artificial nature of the materials. On the other hand, the second approach (using words from natural languages) might introduce potential bias to research on LX learning due to, for instance, accidental exposure to the natural language used. To address this issue, the utilization of an artificial, or constructed language (conlang) has emerged as a viable solution. Conlangs offer researchers the ability to introduce various psycholinguistic manipulations, such as controlling factors like linguistic distance or language-specific orthographic and phonotactic rules, while ensuring participants (a) treat them as real words and (b) have no prior exposure to the language (Hayakawa et al., 2020; McLaughlin, 1980; Weiss, 2020). Furthermore, a recent functional magnetic resonance imaging (fMRI) study comparing natural versus constructed language processing has shown identical neural processes for both, indicating that the conlang was treated as a natural language (Malik-Moraleda et al., 2025). These findings not only add to the benefits of using conlangs but also support the potential to generalize conlang findings to natural languages. Yet, so far, there has been little research conducted using conlang words as learning materials, with only a few exceptions (Bartolotti & Marian, 2017; García-Gámez & Macizo, 2022). The present study employs a conlang that we created to be typologically close to the native language of our participants (i.e., Polish), but is based on Czech orthographic and phonotactic rules. With Poland and Czechia being geographic neighbors, both languages belonging to the West Slavic language family, and having high mutual intelligibility (Golubović & Gooskens, 2015), pre-exposure to Czech itself would be highly likely. Yet, with the use of this conlang, we have minimized the linguistic distance from the participants’ L1 while simultaneously removing the likely issue of pre-exposure. This application speaks highly to the very concept of using conlangs within psycholinguistic research, providing opportunities for linguistic manipulation while also removing potentially debilitating confounds (Weiss, 2020).

1.4. The present study

The goal of the present study was to examine the effects of input modality on novel word learning. Specifically, we aimed at comparing the use of images versus orthography to present the referents of novel words. We were interested in examining input modality effects on word learning both during training and in testing, as well as the effect of training–testing modality congruency. To this end, we taught participants a set of novel conlang words using either the orthographic or image modality to present the referent and then tested learning on both modalities 24 hours later. We analyzed behavioral measures (accuracy and reaction times) to assess learning progress across the two days, and we employed ERPs to assess different aspects of the integration of the newly learned words into the mental lexicon on the second day.

According to PSE and b/DCT, images were expected to carry an advantage for both novel word encoding and retrieval compared to the orthographic modality. Therefore, during training, we expected participants to reach the established learning criterion faster for items assigned to the image condition. In regard to participants’ performance during testing, we expected that (a) using image primes would lead to faster recognition and (b) having trained with images would lead to better cross-modal generalization than the reverse (i.e., smaller congruency effect for items tested with a lexical prime). Finally, turning to our ERP results, we expected smaller N400 responses, as well as smaller LPC responses for items primed and/or trained with images (reflecting less cognitively demanding lexico-semantic access and more automatic retrieval from lexico-semantic memory, respectively).

2. Methods

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

2.1. Participants

An a priori power analysis was conducted using G*Power version 3.1.9.6 (Faul et al., 2009). This indicated that a minimum sample of 24 participants was required to achieve 90% power to detect a medium effect (f = 0.25) at α = 0.05, using a repeated measures analysis of variance (RM ANOVA). The original sample included 40 participants; however, four of them were excluded due to significant electroencephalography (EEG) noise, four for low accuracy in the f-4AFC task (i.e., < 20%), one for not completing both days of the experiment and one due to technical issues. This resulted in a final sample size of 30 participants (22 females, 5 males and 3 non-binary), with a mean age of 23.36 years (95% CI [22.01, 24.72]). They were all students or graduates of Adam Mickiewicz University, Poznań (Poland). The Edinburgh Handedness Inventory (Oldfield, 1971) indicated that 29 participants were right-handed and one was ambidextrous (right-hand preference: M = 79.55, 95% CI [73.68, 85.41]). All participants were native speakers of Polish (L1), with all reporting English as their second language (L2). The Language History Questionnaire 3.0 (Li et al., 2020) showed that they were more proficient (M = 93%, 95% CI [89.98, 95.72]), dominant (M = 57%, 95% CI [55.23, 59.22]) and immersed (M = 93%, 95% CI [91.76, 94.17]) in L1 than in L2 (proficiency: M = 82%, 95% CI [78.29, 85.67]; dominance: M = 50%, 95% CI [46.75, 52.27]; immersion: M = 67%, 95% CI [62.93, 71.27]). All participants had normal or corrected-to-normal vision and hearing and did not suffer from any language, neurological or attention disorder. For their participation, they received course credits or monetary compensation worth 50 PLN.

2.2. Materials

The stimuli consisted of 40 Polish (L1) words, 40 artificial (LX) words and 40 images taken from the multilingual picture databank (MultiPic; Duñabeitia et al., 2018). L1 and LX stimuli were all of 2–3 syllables (M = 2.4, 95% CI [2.29, 2.51]) and 3–8 letters (M = 5.56, 95% CI [5.27, 5.86]). L1 words were high frequency (M = 3.65, 95% CI [3.45, 3.83]), as determined using the SUBTLEX-PL corpus log₁₀ word frequency per million (lg.mln.freq; Mandera et al., 2015). Additional L1 stimuli parameters were determined using the Affective Norms for 4900 Polish Words Reload (ANPW_R) corpus (Imbir, 2016). Here, Polish words were rated using the Self-Assessment Manikin (SAM) scale from 1 (low) to 9 (high) for valence and arousal, and from 1 (high) to 9 (low) for concreteness (Imbir, 2015; Lang, 1980). As such, L1 stimuli were rated as low-neutral in valence (M = 4.17, 95% CI [3.44, 4.91]), low arousal (M = 2.54, 95% CI [3.44, 4.91]) and high concreteness (M = 1.42, 95% CI [1.17, 1.67]). Furthermore, Polish words with distinctive characters (e.g., Ł, Ż, etc.) were excluded.

LX stimuli were created using WordCreator Software (Trost, 2022), which builds artificial words using specific linguistic information such as character, di-, and trigram frequency, and user-specified settings (e.g., character count). Czech was chosen for the base language as it is one of the nearest linguistic relatives to Polish, with approximately 60% mutual intelligibility (Golubović & Gooskens, 2015). Linguistic distance is a well-known factor in determining L2 learning outcomes, with lower distance resulting in faster acquisition (Isphording & Otten, 2014; Lindgren & Muñoz, 2013). Each LX stimulus was randomly assigned to an L1 referent, and this process was repeated four times to create four different L1–LX assignment conditions. No L1–LX pairs were repeated between the four assignment conditions, ensuring that observed effects were not due to a specific L1–LX pairing. Additionally, an auditory stimulus was recorded for each of the 40 LX words in a sound booth by a Ukrainian native speaker to be utilized during the learning sessions. Audio files were recorded and edited for clarity and pitch using Audacity (Audacity Team, 2022).
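One way to obtain four assignment conditions that never share an L1–LX pair can be sketched as follows. This is only an illustration under stated assumptions, not the procedure the authors report: the function name `make_assignments`, the placeholder item labels, the fixed seed and the shuffle-then-rotate strategy are all hypothetical.

```python
import random


def make_assignments(l1_words, lx_words, n_conditions=4, seed=0):
    """Return n_conditions dicts mapping each L1 word to one LX word.

    Assumption (not from the article): the LX list is shuffled once and
    then rotated, so that no L1-LX pair appears in more than one condition.
    """
    if len(l1_words) != len(lx_words):
        raise ValueError("need equally many L1 and LX items")
    if n_conditions > len(lx_words):
        raise ValueError("more conditions than items")
    rng = random.Random(seed)
    shuffled = list(lx_words)
    rng.shuffle(shuffled)
    n = len(shuffled)
    # Condition k pairs L1 item i with LX item (i + k) mod n, so the
    # pairings of two different conditions can never coincide.
    return [
        {l1: shuffled[(i + k) % n] for i, l1 in enumerate(l1_words)}
        for k in range(n_conditions)
    ]


# Placeholder labels only; these are not the real stimuli.
l1 = [f"L1_{i:02d}" for i in range(40)]
lx = [f"LX_{i:02d}" for i in range(40)]
conditions = make_assignments(l1, lx)
```

Rotating a single shuffled list trades some randomness for a guarantee of non-overlap; independently reshuffling each condition and rejecting overlaps would be an equally valid alternative.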

2.3. Procedures

The procedure applied in the experiment was approved by the Ethics Committee for Research Involving Human Participants at Adam Mickiewicz University, Poznań (Resolution No. 29/2021/2022). Written informed consent was obtained from all participants involved before the experiment started. The experiment involved two sessions: a behavioral training session (Day 1) and an electrophysiological testing session (Day 2), which took place 24 hours after the first session. The experiment was carried out in the Psychophysiology of Language and Affect (PoLA) Laboratory at the Faculty of English, Adam Mickiewicz University, Poznań. In both sessions, participants were seated in a dimly lit and quiet booth, approximately 70 cm away from an LED monitor with a screen resolution of 1280×1024 pixels. PsychoPy was used to present the stimuli and collect the behavioral data (Peirce et al., 2019; for EEG data, see the EEG data recording section below).

2.4. Day 1: Training and collection of secondary measures

Participants began with the Edinburgh Handedness Inventory (Oldfield, 1971) to ensure all participants were right-handed, and the Language History Questionnaire 3.0 (Li et al., 2020) to measure language background. Next, participants completed a Stroop test measuring executive function, which was collected as part of piloting a different study and will not be used for the main analysis of this study. After completing the Stroop test, participants proceeded to the main training session, where they were told they would be learning a Czech dialect and that they should learn the new words to the best of their ability. The 40 novel words were grouped into 10 training sets of four items. The training was split into two phases: The first phase included an association task and a blocked four-alternative forced choice task (b-4AFC). During the first phase, items were blocked by training set. The second phase included a final 4AFC (f-4AFC) using all 40 items. Crucially, across all training tasks, the referent of each novel word was consistently presented either orthographically or in the form of an image; for any given participant, half of the items were assigned to the orthographic and half to the image referent condition. In addition, all of the items in each training set were assigned to the same referent condition (image or lexical), and the items of each set were always presented together in the 4AFC tasks.
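The grouping described above (10 sets of four items, each set assigned as a whole to one referent condition, with half of the items per condition) can be sketched as below. The function name `build_training_sets`, the item labels and the random allocation are assumptions made for illustration; the article does not state how sets were composed or how conditions were allocated to them.

```python
import random


def build_training_sets(words, set_size=4, seed=0):
    """Split words into training sets and give each set one referent condition.

    Assumption: sets are formed and conditions allocated at random.
    """
    if len(words) % set_size != 0:
        raise ValueError("word count must be a multiple of set_size")
    rng = random.Random(seed)
    pool = list(words)
    rng.shuffle(pool)
    sets = [pool[i:i + set_size] for i in range(0, len(pool), set_size)]
    if len(sets) % 2 != 0:
        raise ValueError("need an even number of sets to split conditions evenly")
    # Half of the sets use image referents, half use orthographic referents,
    # so every item within a set shares the same referent condition.
    modalities = ["image", "orthographic"] * (len(sets) // 2)
    rng.shuffle(modalities)
    return [{"items": s, "referent": m} for s, m in zip(sets, modalities)]


# Placeholder labels only; these are not the real stimuli.
training_sets = build_training_sets([f"LX_{i:02d}" for i in range(40)])
```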

During the association task, a fixation cross appeared (500 ms), followed by the L1/image referent (1000 ms), another fixation mark (500 ms), and finally the LX word (Figure 1). The presentation of the LX word was self-paced and was accompanied by a single presentation of an audio recording of the corresponding LX word. The use of auditory inputs was meant to boost learning outcomes, in accordance with Bakker et al. (2014), who found a facilitatory effect of auditory input on novel word learning after a 24-hour consolidation period. This was done once for each item in a training set (i.e., four association trials). After that, participants completed the b-4AFC task on the items of that training set.

Figure 1. Examples of association trials in which participants were familiarized with the new LX words on Day 1.

During the b-4AFC task, the referent (orthographic form of L1 word or image) was shown in the center of the screen with the four LX words of the set shown in the corners of the screen (Figure 2). Participants would then select the correct translation of the L1 referent by clicking on it using the mouse. A correct selection would turn the correct word green, while an incorrect selection would turn the correct translation green and the incorrectly chosen word red. In both cases, only the response-associated word(s) would remain on screen until the participant chose to continue. Each LX–referent pair was presented three times, and the locations of the LX words were randomized in each repetition. Once participants completed the association and b-4AFC tasks for one training set, they moved on to the association and b-4AFC tasks for the next set.

Figure 2. Examples of b-4AFC and f-4AFC trials employed on Day 1.

The learning session concluded with a final 4AFC (f-4AFC) including all 40 words. This task was identical to the b-4AFC but followed a learning-to-criterion approach; the number of loops was conditional upon the participant’s accuracy, and it required 100% response accuracy to be completed. If any incorrect selections were made, the loop (including all 40 items) was repeated. If 100% accuracy was obtained in the first loop, participants would be required to complete a second loop with 100% accuracy again. This two-loop, minimum requirement served two purposes. First, to ensure that the initial 100% accuracy was not due to chance (i.e., reducing the odds of any single answer being chosen correctly by accident) and secondly to ensure a minimum of six total L1–LX exposures (i.e., 1 [association task] + 3 [b-4AFC task] + 2 [f-4AFC task]). The f-4AFC was self-paced, and the presentation of the stimuli was randomized during each loop. An analysis accounting for the number of f-4AFC loops was conducted on the Day 2 response time data to test for any effects.
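The learning-to-criterion rule can be sketched as a simple loop. This reflects one reading of the description, namely that the task ends on the first fully correct loop that is at least the second loop overall; the names `run_f4afc` and `minimum_exposures` and the `max_loops` safety cap are hypothetical and not taken from the article.

```python
def run_f4afc(loop_is_perfect, min_loops=2, max_loops=50):
    """Return the number of f-4AFC loops needed to reach criterion.

    loop_is_perfect: callable taking the 1-based loop number and returning
    True when all 40 items were answered correctly in that loop.
    max_loops is an illustrative safety cap, not part of the reported design.
    """
    for loop in range(1, max_loops + 1):
        if loop_is_perfect(loop) and loop >= min_loops:
            return loop
    raise RuntimeError("criterion not reached within max_loops")


def minimum_exposures(association=1, blocked_reps=3, final_loops=2):
    """Minimum L1-LX exposures per item: 1 + 3 + 2 = 6."""
    return association + blocked_reps + final_loops


# A participant who is perfect from the first loop still completes two loops.
loops_fast_learner = run_f4afc(lambda loop: True)
# A participant who first reaches 100% on the fourth loop stops there.
loops_slow_learner = run_f4afc(lambda loop: loop >= 4)
```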

2.5. Day 2: Testing

The testing session was done using a standard translation priming paradigm. Participants were instructed to decide whether the LX word, learned 24 hours prior, was a translation equivalent of the L1 word/image referent preceding it (a binary translation recognition task). During this task we manipulated: (a) whether the LX word matched the preceding referent (translational match vs. translational mismatch) and (b) whether the modality in which the referent was presented (i.e., orthographic or image) was congruent or incongruent to how the LX word was learned in the Day 1 training.

Each trial began with a fixation cross (200 ms), followed by a visual prime that was either the orthographic form of an L1 word or an image (300 ms), another fixation cross (200 ms), and finally the orthographic form of the LX word, which was presented until participants made their decision by pressing one of the designated keys. The designation of each key to a “yes”/“no” response was counterbalanced between participants. Participants had 2,000 ms to respond. Each trial was separated by a 1,000 ms inter-trial interval. Every 120 trials, participants were given a self-paced break to rest. In total, the experiment comprised 480 trials (40 LX words × 2 translational match conditions × 2 modality congruency conditions × 3 repetitions). The testing session took approximately 40 minutes. Upon completion of the testing, participants were debriefed, which included informing them that the new vocabulary items they had learned were from a conlang.
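As a rough illustration of the design described above, the following sketch enumerates the full factorial trial list and derives an upper bound on pure trial time. The factor labels and variable names are illustrative, and the placement of breaks (none after the final trial) is an assumption rather than a detail reported in the text.

```python
from itertools import product

# Design factors as described in the text; labels are illustrative.
N_WORDS = 40
MATCH = ("match", "mismatch")
CONGRUENCY = ("congruent", "incongruent")
REPETITIONS = 3
BREAK_EVERY = 120

# One trial per combination of word, match condition, congruency condition and repetition.
trials = [
    {"word": w, "match": m, "congruency": c, "repetition": r}
    for w, m, c, r in product(range(N_WORDS), MATCH, CONGRUENCY, range(REPETITIONS))
]

# Assumed: a break follows every 120th trial except the last one.
break_points = list(range(BREAK_EVERY, len(trials), BREAK_EVERY))

# Longest possible trial: fixation + prime + fixation + response window + inter-trial interval (ms).
MAX_TRIAL_MS = 200 + 300 + 200 + 2000 + 1000
max_minutes = len(trials) * MAX_TRIAL_MS / 60000

print(len(trials), break_points, round(max_minutes, 1))
```

Under these assumptions the trial list has 480 entries with three breaks, and the trials alone take at most about 30 minutes, which is compatible with the reported session length of approximately 40 minutes once breaks are included.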

2.6. EEG data recording

EEG data were recorded at 2048 Hz from 64 Ag/AgCl electrodes placed at the standard extended 10–20 positions. The bipolar electrodes monitoring vertical (vEOG) and horizontal (hEOG) eye movements were placed above and below the left eye and next to the outer rims of both eyes, respectively. The EEG signals were recorded by ActiView (Biosemi B.V., Amsterdam) and amplified using an ActiveTwo AD-box (Biosemi B.V., Amsterdam).

2.7. Data analysis

2.7.1. Behavioral data analysis

Day 1 (training) and Day 2 (testing) accuracy and reaction time (RT) data were analyzed using a repeated measures analysis of variance (RM-ANOVA). Day 1 used prime modality (image vs. lexical) as the within-subjects factor. Day 2 used prime modality (image vs. lexical), translation match (match vs. mismatch) and training–testing congruency (congruent vs. incongruent) as within-subjects factors. Data were analyzed using SPSS and JASP (JASP Team, 2023).

2.7.2. EEG data analysis

BrainVision Analyzer 2.1 software (Brain Products, Germany) was used to analyze the data offline. Continuous EEG data were down-sampled to 500 Hz, referenced to the common average reference (Luck, 2014; Nunez & Srinivasan, 2006) and filtered offline (Butterworth zero phase filters) with a high-pass filter set at 0.1 Hz (slope 24 dB/octave) and a low-pass filter set at 20 Hz (slope 24 dB/octave). They were then segmented from 200 ms before critical word onset to 1,000 ms afterward and baseline-corrected relative to the signal between −200 and 0 ms before stimulus onset. Data were edited for artifacts by rejecting trials with flatlining events, voltage differences higher than 100 μV or voltage steps higher than 50 μV, which led to the removal of 1.77% of the trials. Ocular artifacts were corrected using the ocular artifact regression method by Gratton et al. (1983).
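The segmentation, baseline correction and rejection thresholds described above can be sketched in a few lines. This is a simplified illustration, not the procedure implemented in BrainVision Analyzer: it treats one epoch on one channel as a plain list of microvolt values, applies each criterion to the whole epoch rather than to sliding intervals, and uses a hypothetical flatline threshold (`flat_uv`) that is not reported in the text.

```python
SAMPLING_RATE_HZ = 500   # rate after down-sampling
EPOCH_START_MS = -200    # epochs run from -200 to 1,000 ms

def ms_to_index(t_ms):
    """Sample index of a latency within an epoch that starts at -200 ms."""
    return int((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)

def baseline_correct(epoch):
    """Subtract the mean of the -200 to 0 ms interval from every sample."""
    n_base = ms_to_index(0)
    baseline = sum(epoch[:n_base]) / n_base
    return [v - baseline for v in epoch]

def is_artifact(epoch, max_range_uv=100.0, max_step_uv=50.0, flat_uv=0.5):
    """Flag an epoch that is flat, spans too large a range, or jumps too far between samples.

    flat_uv is a hypothetical threshold chosen only for illustration.
    """
    value_range = max(epoch) - min(epoch)
    largest_step = max(abs(b - a) for a, b in zip(epoch, epoch[1:]))
    return (
        value_range > max_range_uv
        or largest_step > max_step_uv
        or value_range < flat_uv
    )
```

With a 500 Hz sampling rate, the 1,200 ms epoch corresponds to 600 samples, of which the first 100 form the baseline interval.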

We analyzed two ERP components previously reported in research on neural markers of foreign language learning: N400 and LPC (e.g., Bakker et al., 2015; García-Gámez & Macizo, 2022; Zhang et al., 2023). The analyses were performed within pre-defined time windows: 300–500 ms (N400) over the FC1, FCz, FC2 (fronto-central), C1, Cz, C2 (central), CP1, CPz and CP2 (centro-parietal) electrodes; 600–800 ms (LPC) over the C1, Cz, C2 (central), CP1, CPz, CP2 (centro-parietal), P1, Pz and P2 (parietal) electrodes.
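To make the windowed measure concrete, the sketch below computes a mean amplitude per electrode for each component. It assumes an averaged ERP stored as a dictionary from electrode name to a list of microvolt samples, an epoch starting at −200 ms sampled at 500 Hz, and a half-open sample window (end sample excluded); the data layout and function names are hypothetical.

```python
SAMPLING_RATE_HZ = 500
EPOCH_START_MS = -200

# Time windows and electrode sets as listed in the text.
WINDOWS_MS = {"N400": (300, 500), "LPC": (600, 800)}
ELECTRODES = {
    "N400": ["FC1", "FCz", "FC2", "C1", "Cz", "C2", "CP1", "CPz", "CP2"],
    "LPC": ["C1", "Cz", "C2", "CP1", "CPz", "CP2", "P1", "Pz", "P2"],
}

def window_indices(component):
    """Convert a component's time window into sample indices within the epoch."""
    def to_index(t_ms):
        return int((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)
    start_ms, end_ms = WINDOWS_MS[component]
    return to_index(start_ms), to_index(end_ms)

def mean_amplitudes(erp, component):
    """Return the mean amplitude per electrode within the component's window."""
    start, end = window_indices(component)
    return {
        ch: sum(erp[ch][start:end]) / (end - start)
        for ch in ELECTRODES[component]
    }
```

Keeping one value per electrode, rather than a single region average, preserves the anterior–posterior and laterality information that enters the ANOVA as within-subject factors.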

ERPs were time-locked to the onset of the critical LX word. Within the N400 and LPC time frames, mean ERP amplitudes were analyzed using repeated measures ANOVAs, with L1–LX Translation Match (Match vs. Mismatch), Prime Modality (Lexical vs. Image Prime) and Learning–Testing Congruency (Congruent vs. Incongruent) as within-subject factors. Anterior–posterior electrode position (N400: frontocentral vs. central vs. centro-parietal electrodes; LPC: central vs. centro-parietal vs. parietal electrodes) along with Laterality (left vs. midline vs. right electrodes) were included in the analyses as within-subject factors. The Greenhouse–Geisser correction was applied when the sphericity assumption was violated, as indicated by Mauchly’s tests. Pairwise comparisons were corrected for multiple comparisons with the Bonferroni correction.

3. Results

3.1. Behavioral results

3.1.1. Training (4AFC Tasks): Accuracy and reaction times (Day 1)

On average, participants needed three f-4AFC loops to reach 100% accuracy across all items (min = 2; max = 7). Therefore, each participant was exposed to each LX–referent pair between six and 11 times. To reiterate, the f-4AFC was used as a training-to-criterion tool in order to ensure that participants knew 100% of the new vocabulary items and to ensure a minimum of six exposures to these items. Participants’ average RT across all f-4AFC loops was 1,618.12 ms (SD = 320.17 ms). We addressed our main question of whether training modality affects learning outcomes by comparing LX items in the two training conditions in terms of the earliest f-4AFC loop in which the LX word was identified correctly, as well as accuracy and RTs in the f-4AFC trials. All three comparisons were conducted using a paired-samples t-test with the alpha adjusted to avoid Type 1 errors (α = .017).
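The exposure range and the adjusted alpha can be checked with simple arithmetic. In the sketch below, the function name is illustrative, and dividing .05 by the three comparisons is an assumption about how the reported α = .017 was obtained (the text does not name the correction).

```python
def total_exposures(f4afc_loops):
    """Exposures per LX-referent pair: 1 association trial + 3 b-4AFC trials + f-4AFC loops."""
    return 1 + 3 + f4afc_loops

# Assumed Bonferroni-style adjustment for the three Day 1 comparisons.
adjusted_alpha = 0.05 / 3

print(total_exposures(2), total_exposures(7), round(adjusted_alpha, 3))
```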

LX items paired with an image referent were correctly identified for the first time on an earlier f-4AFC loop compared to items in the orthographic condition (M_image = 1.05, SD = .12; M_ortho = 1.1, SD = .15), t(29) = 2.747, p = .010. For accuracy analyses, we used the empirical-logit-transformed proportion as the dependent variable after excluding each participant’s last loop (in which they all had, by definition, 100% accuracy across items). Accuracy was significantly higher when the LX word was paired with an image (M = 98.07%, SD = 3.11%) compared to when it was paired with an L1 word (M = 94.92%, SD = 5.63%), t(29) = 2.790, p = .009. For the RT analysis, only correct responses were used, resulting in 3% data loss. In addition, RT outliers were removed using ±2 SD from the mean (Berger & Kiefer, 2021), resulting in an additional 0.1% data loss. RTs for items in the image training condition were faster (M = 1,614.38 ms, SD = 321.01) compared to those in the orthographic condition (M = 1,622.09 ms, SD = 391.58), t(29) = 2.628, p = .014. Together, these results provide preliminary evidence for a learning advantage for novel words paired with images as referents over those paired with orthographically presented L1 translations.

3.1.2. Testing: Accuracy and reaction times (Day 2)

Participants performed the translation priming task without difficulties and responded in a prompt manner; average accuracy was 93.20% (SD = 5.80%), and average RT was 681.88 ms (SD = 106.75 ms). We analyzed accuracy and RTs using repeated-measures analysis of variance (RM ANOVA) to test the effects of Translation Match (match vs. mismatch), Prime Modality (image vs. orthography) and Learning–Testing Congruency (congruent vs. incongruent), as well as their interactions. An additional analysis was conducted on RTs using the between-subjects covariate of Exposure (number of f-4AFC loops) to test for any effects of the number of f-4AFC loops during training on Day 1.

For accuracy, we used the empirical-logit-transformed proportion as the dependent variable. The analysis revealed a main effect of Translation Match, F(1, 29) = 5.30, p = .029, η_p² = .154, pointing to higher accuracy for translational mismatches (M = 94.04%, SD = 6.87%) than translational matches (M = 92.36%, SD = 6.83%). There was also a main effect of Prime Modality, F(1, 29) = 20.40, p < .001, η_p² = .413, with higher accuracy for image (M = 93.95%, SD = 7.03%) than orthographic primes (M = 92.43%, SD = 5.57%). Finally, a main effect of Learning–Testing Congruency was also observed, F(1, 29) = 4.78, p = .037, η_p² = .141, with higher accuracy for learning–testing congruent items (M = 93.17%, SD = 6.75%) compared to incongruent items (M = 93.25%, SD = 5.72%). None of the interactions were significant.
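For readers unfamiliar with the transformation, a minimal sketch of one common form of the empirical logit is given below. The exact formula used in the analysis is not reported, so the 0.5 correction term and the function name should be read as assumptions.

```python
import math

def empirical_logit(n_correct, n_trials):
    """Log odds of a correct response, with 0.5 added to both counts.

    The added constant keeps the value finite when accuracy is 0% or 100%,
    which matters here because accuracy is close to ceiling.
    """
    return math.log((n_correct + 0.5) / (n_trials - n_correct + 0.5))

# A participant at 50% accuracy maps to 0; perfect accuracy stays finite.
print(empirical_logit(60, 120), round(empirical_logit(120, 120), 3))
```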

Only correct responses were used in the RT analysis, resulting in 6% data loss. Outliers were removed using ±2 SD from the mean and any RTs less than 200 ms, resulting in an additional 5% data loss (Berger & Kiefer, 2021). First, to test for any effects of the number of f-4AFC exposures on testing RTs, an RM ANOVA using Exposure as a between-subjects covariate was conducted. The analysis revealed no main effect of Exposure on RTs, F(1, 28) = 0.923, p = .345, η_p² = .032, and no effects on any within-subject factors or their interactions (all ps > .2). As such, the covariate of Exposure was removed from the final analysis. This analysis revealed a main effect of Translation Match, F(1, 29) = 58.27, p < .001, η_p² = .668, where participants were faster to respond to translational matches (M = 659.46, SD = 125.98) than translational mismatches (M = 705.76, SD = 125.99). The analysis showed a further main effect of Prime Modality, F(1, 29) = 18.12, p < .001, η_p² = .384, with shorter RTs for target words preceded by image primes (M = 677.30, SD = 127.90) than by lexical primes (M = 691.92, SD = 128.66). Finally, a main effect of Learning–Testing Congruency was observed, F(1, 29) = 18.02, p < .001, η_p² = .383, with faster RTs for learning–testing congruent items (M = 678.15, SD = 129.63) compared to incongruent items (M = 691.07, SD = 127.01). Additionally, a significant two-way interaction between Prime Modality (image vs. lexical) and Learning–Testing Congruency (congruent vs. incongruent) was observed, F(1, 29) = 34.16, p < .001, η_p² = .541 (Table 1).
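The RT trimming rule can be sketched as follows. The text does not specify the order of the two exclusion steps or the grouping over which the mean and SD were computed (e.g., per participant or per condition), so this sketch assumes a single list of correct-response RTs, applies the 200 ms floor first, and uses the sample standard deviation.

```python
from statistics import mean, stdev

def trim_rts(rts_ms, floor_ms=200, n_sd=2):
    """Drop RTs below the floor, then drop RTs more than n_sd SDs from the mean."""
    kept = [rt for rt in rts_ms if rt >= floor_ms]
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]

# Illustrative values only: one anticipatory response and one very slow response.
example = [150, 600, 620, 640, 660, 680, 700, 720, 740, 2000]
print(trim_rts(example))
```

In this made-up example, the 150 ms response is removed by the floor and the 2,000 ms response by the ±2 SD criterion, leaving the eight values between 600 and 740 ms.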

Table 1. Mean RTs (in milliseconds) and standard deviations by modality, translation and congruency (Day 2)

Note: N = 30.

Bonferroni-corrected pairwise t-tests were conducted to explore this interaction, which showed significantly faster RTs in trials with image/congruent primes (M = 658.01, SD = 127.20) relative to all other conditions. Specifically, RTs in trials with image/congruent primes were significantly faster compared to trials with image/incongruent primes (M = 696.59, SD = 126.72), t(29) = −7.22, p < .001, as well as compared to trials with both orthographic/congruent primes (M = 698.30, SD = 129.95), t(29) = −7.23, p < .001, and orthographic/incongruent primes (M = 685.55, SD = 128.13), t(29) = −6.00, p < .001. No other statistically significant differences were observed (all ps > .05), including orthographic primes in congruent conditions compared with incongruent conditions, t(29) = 2.39, p = .12. Figure 3 presents the observed two-way interaction with mean RTs for the Prime Modalities (image vs. lexical) across both Learning–Testing Congruency conditions (congruent vs. incongruent). Day 2 accuracy as well as RT means and standard deviations for all conditions can be seen in Table 2.
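The relationship between the four cell means and the marginal means reported for the main effects can be verified with a short calculation. The sketch assumes that each marginal mean is the unweighted average of its two cells, which holds approximately when cells contain similar numbers of trials; the dictionary layout and names are illustrative.

```python
# Cell means (ms) for prime modality x learning-testing congruency, as reported in the text.
cell_means = {
    ("image", "congruent"): 658.01,
    ("image", "incongruent"): 696.59,
    ("orthographic", "congruent"): 698.30,
    ("orthographic", "incongruent"): 685.55,
}

def marginal(factor_index, level):
    """Unweighted average of all cells sharing one level of one factor."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)

# Incongruent minus congruent RT within each prime modality.
congruency_cost = {
    prime: cell_means[(prime, "incongruent")] - cell_means[(prime, "congruent")]
    for prime in ("image", "orthographic")
}

print(round(marginal(0, "image"), 2), round(marginal(1, "incongruent"), 2))
print({k: round(v, 2) for k, v in congruency_cost.items()})
```

The averages reproduce the reported marginal means to within rounding, and the difference scores make the asymmetry visible: roughly a 39 ms cost of incongruency for image primes against a numerical difference of about 13 ms in the opposite direction for orthographic primes.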

Figure 3. Mean RTs for image- and orthography-primed items in the learning–testing congruent and incongruent conditions (Day 2).

Table 2. Accuracy, mean RTs (in milliseconds), and standard deviations by condition (Day 2)

Note: N = 30.

Firstly, these findings speak to the reliability of the training-to-criterion approach from training on Day 1 (see Exposure analysis). Despite the differing numbers of f-4AFC loops required for each participant during training, they had no effect on overall RTs in testing. Rather, the f-4AFC acted purely as a tool to ensure 100% accuracy on the learning of the new vocabulary items. Furthermore, these findings indicate that recall performance is better (i.e., faster) when training and testing are conducted in the same modality (see main Congruency effect). Crucially, in line with our predictions, our results indicate that training with an image leads to better generalization of learning. Specifically, training with orthography, compared to images, leads to a disadvantage when one is tested with images; however, the reverse is not observed. Thus, images appear to provide a more salient modality for L2 learning, such that training with images generalizes better to orthography than the other way around.

3.2. Event-related potentials

3.2.1. N400 (300–500 ms)

Within the N400 time window (300–500 ms), the analysis of variance (ANOVA) showed an interaction between Laterality and Learning–Testing Congruency, F(2, 58) = 6.31, p = .003, η_p² = .179. Post-hoc analyses further revealed that over right electrode positions (i.e., FC2, C2 and CP2), the learning–testing congruent condition (M = −1.22 μV, SE = .14) elicited larger N400 amplitudes relative to the incongruent condition (M = −1.02 μV, SE = .14), p = .006. No other main effects or interactions were significant.

This finding suggests that presenting a lexical item in the same modality as the one in which it was learned leads to more robust activation within the lexico-semantic memory. Figure 4 presents mean ERP amplitudes for learning–testing congruent and incongruent conditions, as observed over the right electrode sites. In contrast to our predictions, the N400 was not modulated by prime and/or training modality. Possible reasons are discussed in the Discussion.

Figure 4. Grand averages for congruent and incongruent learning–testing conditions in the N400 time frame (300–500 ms).

3.2.2. LPC (600–800 ms)

Within the LPC time window (600–800 ms), the analysis of variance (ANOVA) showed a main effect of Prime Modality, F(1, 29) = 5.32, p = .028, η_p² = .155, whereby larger LPC amplitudes were observed for words preceded by lexical (M = −.27 μV, SE = .13) than image (M = −.44 μV, SE = .14) primes. Figure 5 (left) presents mean ERP amplitudes for words preceded by images and lexical referents.

Figure 5. Grand averages in the LPC time frame (600–800 ms). Left – words preceded by image and lexical primes. Right – words preceded by image and lexical primes in the translation-match and translation-mismatch condition.

Furthermore, the analysis yielded an interaction between Laterality, Prime Modality and Translation Match, F(2, 58) = 5.81, p = .005, η_p² = .167. Post-hoc analyses further showed that over left (p = .041) and midline (p = .017) electrode positions (i.e., C1, CP1, P1, Cz, CPz and Pz), among translation-match trials, words preceded by lexical primes (M_{left electrodes} = −.14 μV, SE = .14; M_{midline electrodes} = −.31 μV, SE = .16) elicited larger LPC amplitudes compared to those preceded by image primes (M_{left electrodes} = −.41 μV, SE = .18; M_{midline electrodes} = −.65 μV, SE = .17). This effect was not observed for the translation-mismatch trials, ps > .05. No other main effects or interactions were significant.

Overall, this pattern is in line with our prediction that image primes would allow a more direct, and thus less demanding, activation of conceptual representations compared to lexical primes. Figure 5 (right) presents mean ERP amplitudes for words preceded by image and lexical primes among translation-match trials, as observed over the left and midline electrode sites.

4. Discussion

The goal of this study was to evaluate LX word learning outcomes of training with orthographic (i.e., L1 word) versus image-based input and to test lexical integration using both behavioral and neural markers. As outlined in the Present Study section, we expected to observe an advantage for both encoding and retrieval of novel words learned with an image. Indeed, our behavioral results revealed better cross-modal generalization of learning when training involved images. Specifically, when a new LX word was trained with orthography, testing in a different modality led to significantly worse performance, as compared to when it was learned and tested with orthography (i.e., congruency effect). In contrast, when a new LX word was learned using an image, there was no disadvantage when tested incongruently (i.e., lexically), indicating stronger transfer of the new LX information. Therefore, our behavioral results provide clear evidence for the advantage of image, compared to orthographic-based training input.

Moreover, our ERP results provide additional insights into the real-time dynamics of lexico-semantic activation and retrieval processes for newly learned words. Specifically, in regard to our main question, we observed smaller LPC amplitudes for items following an image prime than for items following a lexical prime. Given that LPC is associated with episodic memory retrieval (Rugg & Curran, 2007), we argue that this effect likely reflects reduced cognitive load when activating conceptual representations that follow image primes. Even though this effect was independent of the training modality, this pattern is in line with the idea that pairing new LX words with image referents (either during training or during retrieval) is advantageous compared to pairing them with L1 words. Below, we discuss our results and their theoretical and practical implications in more detail.

4.1. Efficiency and generalizability of learning: Behavioral results (Day 1 and Day 2)

During training on Day 1, participants’ performance was better in learning the LX words associated with images compared to those associated with L1 words. Specifically, we found a performance advantage for image-associated items in all three of the behavioral measures we analyzed (i.e., first f-4AFC loop in which the LX word was identified correctly, accuracy and RTs). A similar pattern emerged in participants’ performance in the translation priming task on Day 2, where response times for LX words learned with image primes were again faster than those learned with lexical primes. Behavioral results from both training (Day 1) and testing (Day 2) therefore indicate that training with images leads to more robust novel word learning. Our findings are thus in line with both the PSE and the b/DCT (Paivio, 1971; Paivio & Desrochers, 1980), according to which images promote faster recognition and retrieval compared to other referent forms (Defeyter et al., 2009; Hockley & Bancroft, 2011; Liu et al., 2021; Morett, 2019; Stenberg, 2006; Stenberg et al., 1995).

In addition to the training modality effect, we also observed a main effect of testing modality in the translation priming task (Day 2), pointing to better performance for items primed by an image. This facilitatory effect for image primes tentatively supports conceptual mediation of the revised hierarchical model of bilingual word processing (RHM; Kroll et al., 2010; Kroll & Stewart, 1994). The RHM proposes that, as a result of the weaker LX–Concept connection, when low-level bilinguals are lexically primed, they must rely on backward mediation via the L1 in order for semantic information to be obtained. Images, however, provide richer conceptual information, allowing for the more direct LX–Concept connection to be used, resulting in faster LX meaning retrieval. The present findings provide support for this assumption, suggesting that incorporating image-based training materials could enhance language learning efficiency, particularly for low-proficiency foreign language learners. Additionally, our results could inform the design of language assessments, pointing to the relevance of including visual elements to more accurately assess LX learners’ conceptual understanding and semantic retrieval abilities.

Furthermore, a main effect of training–testing congruency was found in RT patterns for the translation priming task (Day 2), showing that performance was better (i.e., faster) when testing followed the same modality as training (Altarriba & Knickerbocker, 2011). Crucially, we also observed an interaction between testing modality and training–testing congruency, pointing to an asymmetry; when orthographic primes were used, having been trained with images did not result in a disadvantage compared to training with orthography. Conversely, when testing was conducted using images, having been trained with orthography did result in a disadvantage compared to training with images. This pattern, similar to that reported by Stenberg et al. (1995), suggests that images provide a more salient modality for novel word learning and recognition. Specifically, images convey richer conceptual information as compared to orthography. As a result, training with images likely results in more robust encoding of lexico-semantic information, which, in turn, leads to better cross-modal transfer of learning. These asymmetrical results can be explained by transfer-appropriate processing, whereby it is both the modality of initial encoding and recall modality congruency that determines the efficacy of the PSE (Roediger et al., 1989; Weldon & Roediger, 1987). That is, when retrieval demands differ from initial encoding (e.g., incongruent-to-training conditions), the PSE can be reversed or eliminated entirely (Brundage & Barile-Spears, 2015; Rugg et al., 2015). Altogether, these findings suggest that incorporating images in language learning programs could enhance vocabulary acquisition and retention, as images seem to facilitate better transfer of knowledge across different modalities. However, the effects of transfer-appropriate processing (i.e., initial encoding and retrieval demands) should also be taken into consideration, as training–testing incongruence may result in diminished recall.

4.2. Neural markers of lexical integration: Event-related potentials (Day 2)

Within the N400 time window, we observed an effect of learning–testing congruency over right electrode sites (a Laterality × Learning–Testing Congruency interaction), whereby larger N400 amplitudes were elicited when participants were tested in the same modality in which they first learned the words (i.e., image or lexical prime). Such results align with the functional role of the N400 (Kutas & Federmeier, 2000, 2011), according to which the component reflects the amount of information that needs to be retrieved from the lexico-semantic memory network. Consequently, it appears that encountering a lexical item previously learned in the same modality triggers a heightened activation of information within the lexico-semantic memory. This suggests that learning and subsequently testing newly acquired words within the same modality results in a more robust storage and retrieval process, indicative of stronger memory representations of the items (Baggio & Hagoort, 2011; Jankowiak & Rataj, 2017). Such patterns seem crucial as far as refining educational strategies and optimizing learning environments are concerned. Specifically, by recognizing that testing in the same modality as learning enhances information retrieval and strengthens mental representations, educators can tailor instructional methods to leverage these findings. Furthermore, this may have implications beyond education, potentially improving cognitive rehabilitation therapies or interventions aimed at improving memory function in various contexts.

Although congruency effects were evident, there were no effects of modality (image or orthography) observed within the N400 window. The lack of this effect is likely the result of training-to-criterion, whereby participants were trained to 100% accuracy across all LX items on Day 1. This likely resulted in a ceiling effect, diminishing the variability in N400 amplitudes. Specifically, since participants were trained to such high accuracy, their high familiarity with the LX stimuli may have allowed them to engage similar conceptual access mechanisms regardless of whether the prime was orthographic or image-based.

Within the LPC time frame, the results yielded a main effect of prime modality, with larger LPC amplitudes for target words preceded by lexical compared to image primes. We interpret this as reflecting the different translation mechanisms engaged when participants performed the translation recognition task for L1–LX word pairs versus image–LX pairs. Specifically, since the critical items consistently involved LX lexical referents, when presented with lexical primes (L1 words), participants had to activate lexical level links between the two words (Palmer & Havelka, 2010), followed by the activation of the word concept (i.e., meaning). In contrast, image primes (lacking linguistic features) relied solely on conceptual activation (Francis et al., 2003). The smaller LPC responses to image primes might therefore indicate attenuated activation in memory networks when activating conceptual links required by image primes, compared to both lexical and conceptual links required by lexical primes (Stenberg, 2006). These findings also align with the word association hypothesis postulated within the revised hierarchical model (Kroll & Stewart, 1994), suggesting that when presented with an LX word, low-proficiency LX learners need to activate an L1 lexical representation to retrieve its meaning. Consequently, lexical representations of L1 and LX words are argued to be stronger in the LX–L1 direction relative to the L1–LX direction (Alvarez et al., 2003; Kroll et al., 2002; Talamas et al., 1999). In the lexical priming condition, our participants were presented with word pairs in the more demanding L1–LX direction, potentially requiring more cognitive resources to match the lexical and conceptual representations between the two lexical forms, as reflected in larger LPC amplitudes.

Crucially, in the LPC time window, alongside the main effect of prime modality, we also found an interaction between prime modality and translation match, which revealed that the observed effect of prime modality only manifested when the two items (i.e., L1–LX word pairs or image–LX pairs) included correct translations. This further reinforces our interpretation, suggesting that the less cognitively demanding activation of conceptual representations with image primes was evident only when both items shared the same meaning. Conversely, when incorrect translations were presented, the advantage of using images disappeared, and participants did not benefit from the activation of conceptual representations offered by images. Altogether, these results hold potential implications for language education strategies tailored to foreign language learners. By understanding the differential impact of lexical and image primes on translation mechanisms, educators can design more targeted and effective teaching methods. For instance, educators may incorporate visual aids strategically to facilitate conceptual understanding and reduce cognitive load during translation tasks. Similarly, for learners struggling with lexical translation, teaching interventions can focus on strengthening word form associations between L1 and LX, thereby enhancing overall translation proficiency.

4.3. Constructed languages in psycholinguistics research

Constructed languages are not a new development within psycholinguistic research, yet it is only in recent years that they have become more widely used (Goodall, 2023). Conlangs show great promise with regard to various manipulations and confound control, across several domains from within- to cross-language research, allowing for more understudied areas to be explored (Hayakawa et al., 2020; McLaughlin, 1980; Weiss, 2020). For example, accounting for factors such as incidental exposure in L2 learning has become more challenging as our world becomes more connected (Mohamed, 2018; Peters & Webb, 2018). Indeed, the conlang developed for the present study was constructed specifically for the purpose of maintaining minimum linguistic distance between the novel items and participants’ L1 (to increase ease of learning), while also removing the likely confound of pre-exposure. This would have been exceedingly difficult, if not impossible, to achieve using a natural language due to geographic and social factors related to this study’s L1 (Polish). This, in and of itself, speaks to the very benefits of using constructed languages in L2 research.

Yet, experimental benefits aside, questions have remained regarding the reliability and generalizability of studies that utilize constructed languages (Ettlinger et al., 2016; Madlener-Charpentier, 2018). The present study has recorded previously described effects across both behavioral and cognitive domains while using a constructed language, and it is not the first. Replicable effects have been observed in constructed languages across children’s language learning (Culbertson & Schuler, 2019), language experience (Hayakawa et al., 2020), marker and function words (Juola, 2018), morphological rules (Ferman et al., 2009) and even dialect effects in literacy (Williams et al., 2020, 2022). Indeed, a recent fMRI study comparing artificial languages (Esperanto, Klingon, etc.) to natural language showed identical neural processing across the brain’s language network (Malik-Moraleda et al., 2025). These findings hold strong implications for the reliability and generalizability of research conducted with constructed or artificial languages. Additionally, they confirm that a key element of any language, natural or otherwise, is providing a means for representation and expression for defining the world around us.

With this in mind, while conlangs show great promise within psycholinguistics research, their use should not be treated as a one-size-fits-all methodology. Rather, strong attention should be paid to the overall goals of the research, the question(s) it aims to answer and the stimuli developed with parameters that are suitable for answering those questions (Alzahrani, 2025; Ettlinger et al., 2016; Madlener-Charpentier, 2018; Tang & Baer-Henney, 2023). Natural language and language processing are complex phenomena, and the use of conlangs to further understand them should be done responsibly.

4.4. Limitations and future directions

One limitation of this study is that participants were tested only 24 hours after the learning session, providing limited information regarding LX retention and long-term memory storage. While the ERP data showed early signs of lexical integration, consistent with previous findings that neural indices of lexical and semantic processing can emerge rapidly following learning (e.g., McLaughlin et al., 2004; Mestres-Missé et al., 2007), it remains unclear whether these early effects persist or evolve over time. Extending the delay between training and testing could help examine the developmental trajectory of modality effects on lexico-semantic processing and reveal whether the benefits of image-based learning are sustained as reliance shifts from episodic to long-term memory systems (see Dumay & Gaskell, 2007).

Additionally, our study employed identical visual stimuli for both training and testing in the image-referent condition. While this approach controlled for visual variability and ensured that observed effects were due to learning modality rather than stimulus novelty, it also constrained our ability to assess image-based generalization. Future research could therefore introduce visually distinct but conceptually related images (e.g., different pictures of a duck) during testing to examine whether learners can transfer lexical knowledge to novel exemplars of the same referent and to track the trajectory of this generalization process (e.g., one may expect weaker generalization to novel exemplars immediately after learning and evidence for abstraction after consolidation; see Dumay & Gaskell, 2007; McMurray et al., 2016). Such an approach would help clarify whether image-based learning promotes flexible, conceptual representations or merely supports recognition of specific visual tokens. This issue has important implications for understanding the depth of encoding across modalities (e.g., Hockley & Bancroft, 2011) and the role of perceptual variability in category learning (e.g., Perry et al., 2014).

Finally, we aimed to minimize the linguistic distance between L1 and LX items to control for variability and ensure more consistent results. However, this approach limits the generalizability of our findings to other language pairs with greater linguistic differences. Languages with higher linguistic distance might require different cognitive and mnemonic strategies for successful word learning and may show divergent neural and behavioral patterns of acquisition and integration (e.g., Degani et al., 2018). Future studies should therefore systematically vary linguistic distance to better understand how language similarity interacts with training modality and cognitive demands during vocabulary acquisition.

5. Conclusion

The present study tested the effects of using image versus lexical referents in early-stage LX learning. Importantly, it is not only the first study to directly compare the two training modalities in terms of their cross-modal generalizability, but it is also the first to examine the neural underpinnings of these effects using ERPs. Overall, images led to more robust learning outcomes both immediately and after a 24-hour consolidation period. This was further evidenced in attenuated LPC responses, suggesting a greater reliance on conceptual information and, consequently, less cognitively demanding LX processing when a new lexical item is accompanied by an image referent. Furthermore, an effect of learning–testing congruency was observed, whereby recall accuracy and speed were better when items were tested in the same modality in which they were learned. This was corroborated by larger N400 amplitudes for congruent as compared to incongruent trials, demonstrating a richer lexico-semantic activation. These results provide support for the PSE and the b/DCT and further extend them into the earliest stages of LX learning. Finally, these findings hold not only strong scientific but also pedagogical implications, extending psycholinguistics research from the laboratory into the classroom.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/S1366728925100242.

Data availability statement

The data that support the findings will be available in the OSF repository at https://osf.io/q5v6e/ following a 6-month embargo from the date of publication to allow for commercialization of research findings.

Acknowledgments

The authors wish to thank the Polish-U.S. Fulbright Commission for making this study possible and Chris Parkinson for his invaluable assistance in experimental programming and debugging. We also wish to thank all of the participants for their time and contributions to this study. Support for this project was provided by the Polish-U.S. Fulbright Commission through an Independent Research Grant awarded to MC. Support for this project was provided by the School of Languages and Literatures, Adam Mickiewicz University, Poznań awarded to KJ. Support for this project was provided by the Spanish State Research Agency and European Regional Development Fund through Grant # PID2020-113348GB-I00 and PID2023-146423NB-I00 awarded to ECK. This work was supported by the Basque Government through the BERC 2022-2025 program and by the Spanish State Research Agency through the “Severo Ochoa” Programme for Centres/Units of Excellence in R&D CEX2020-001010/AEI/10.13039/501100011033.

Authors Contribution

MC: conceptualization; data curation; investigation; formal analysis; methodology; project administration; resources; software; supervision; validation; visualization; writing – original draft; writing – review & editing | EK: conceptualization; formal analysis; methodology; writing – original draft; writing – review & editing | MP: methodology; investigation; writing – review & editing | JG: investigation; writing – review & editing | KJ: conceptualization; data curation; funding acquisition; formal analysis; methodology; project administration; resources; software; supervision; validation; visualization; writing – original draft; writing – review & editing

Competing interests

The authors declare none.

Footnotes

This research article was awarded Open Data and Open Materials badges for transparent practices. See the Data Availability Statement for details.

References

Altarriba, J., & Knickerbocker, H. (2011). Acquiring second language vocabulary through the use of images and words. In McDonough, K., & Trofimovic, P. (Eds.), Applying priming methods to L2 learning, teaching and research: Insights from psycholinguistics (pp. 21–48). Amsterdam: John Benjamins Pub. Co. https://doi.org/10.1075/lllt.30.06alt.

Alvarez, R. P., Holcomb, P. J., & Grainger, J. (2003). Accessing word meaning in two languages: An event-related brain potential study of beginning bilinguals. Brain and Language, 87(2), 290–304. https://doi.org/10.1016/S0093-934X(03)00108-1.

Alzahrani, A. (2025). The acceptability and validity of AI-generated psycholinguistic stimuli. Heliyon, 11(2), e42083. https://doi.org/10.1016/j.heliyon.2025.e42083.

Audacity Team. (2022). Audacity(R): Free audio editor and recorder (3.3.3). Audacity Team. https://audacityteam.org

Aurnhammer, C., Delogu, F., Brouwer, H., & Crocker, M. W. (2023). The P600 as a continuous index of integration effort. Psychophysiology, 60(9), e14302. https://doi.org/10.1111/psyp.14302.

Baese-Berk, M. M., Kapnoula, E. C., & Samuel, A. G. (2025). The relationship of speech perception and speech production: It’s complicated. Psychonomic Bulletin & Review, 32(1), 226–242. https://doi.org/10.3758/s13423-024-02561-w.

Baggio, G., & Hagoort, P. (2011). The balance between memory and unification in semantics: A dynamic account of the N400. Language and Cognitive Processes, 26(9), 1338–1367. https://doi.org/10.1080/01690965.2010.542671.

Bakker, I., Takashima, A., Van Hell, J. G., Janzen, G., & McQueen, J. M. (2014). Competition from unseen or unheard novel words: Lexical consolidation across modalities. Journal of Memory and Language, 73, 116–130. https://doi.org/10.1016/j.jml.2014.03.002.

Bakker, I., Takashima, A., Van Hell, J. G., Janzen, G., & McQueen, J. M. (2015). Tracking lexical consolidation with ERPs: Lexical and semantic-priming effects on N400 and LPC responses to newly-learned words. Neuropsychologia, 79, 33–41. https://doi.org/10.1016/j.neuropsychologia.2015.10.020.

Bartolotti, J., & Marian, V. (2017). Bilinguals’ existing languages benefit vocabulary learning in a third language. Language Learning, 67(1), 110–140. https://doi.org/10.1111/lang.12200.

Batterink, L., & Neville, H. (2011). Implicit and explicit mechanisms of word learning in a narrative context: An event-related potential study. Journal of Cognitive Neuroscience, 23(11), 3181–3196. https://doi.org/10.1162/jocn_a_00013.

Berger, A., & Kiefer, M. (2021). Comparison of different response time outlier exclusion methods: A simulation study. Frontiers in Psychology, 12, 675558. https://doi.org/10.3389/fpsyg.2021.675558.

Borovsky, A., Elman, J. L., & Fernald, A. (2012). Knowing a lot for one’s age: Vocabulary skill and not age is associated with anticipatory incremental sentence interpretation in children and adults. Journal of Experimental Child Psychology, 112(4), 417–436. https://doi.org/10.1016/j.jecp.2012.01.005.

Brundage, T., & Barile-Spears, A. L. (2015). Effects of stimulus mode and mode congruency on degraded image identification for the picture superiority effect. Psi Chi Journal of Psychological Research, 20(1), 37–44.

Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11(6), 671–684. https://doi.org/10.1016/S0022-5371(72)80001-X.

Culbertson, J., & Schuler, K. (2019). Artificial language learning in children. Annual Review of Linguistics, 5(1), 353–373. https://doi.org/10.1146/annurev-linguistics-011718-012329.

Davis, M. H., & Gaskell, M. G. (2009). A complementary systems account of word learning: Neural and behavioural evidence. Philosophical Transactions of the Royal Society B: Biological Sciences, 364(1536), 3773–3800. https://doi.org/10.1098/rstb.2009.0111.

Defeyter, M. A., Russo, R., & McPartlin, P. L. (2009). The picture superiority effect in recognition memory: A developmental study using the response signal procedure. Cognitive Development, 24(3), 265–273. https://doi.org/10.1016/j.cogdev.2009.05.002.

Degani, T., Prior, A., & Hajajra, W. (2018). Cross-language semantic influences in different script bilinguals. Bilingualism: Language and Cognition, 21(4), 782–804.

Dumay, N., & Gaskell, M. G. (2007). Sleep-associated changes in the mental representation of spoken words. Psychological Science, 18(1), 35–39.

Duñabeitia, J. A., Crepaldi, D., Meyer, A. S., New, B., Pliatsikas, C., Smolka, E., & Brysbaert, M. (2018). MultiPic: A standardized set of 750 drawings with norms for six European languages. Quarterly Journal of Experimental Psychology, 71(4), 808–816. https://doi.org/10.1080/17470218.2017.1310261.

Emirmustafaoğlu, A., & Gökmen, D. U. (2015). The effects of picture vs. translation mediated instruction on L2 vocabulary learning. Procedia – Social and Behavioral Sciences, 199, 357–362. https://doi.org/10.1016/j.sbspro.2015.07.559.

Ettlinger, M., Morgan‐Short, K., Faretta‐Stutenberg, M., & Wong, P. C. M. (2016). The Relationship Between Artificial and Second Language Learning. Cognitive Science, 40(4), 822–847. https://doi.org/10.1111/cogs.12257.

Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. https://doi.org/10.3758/BRM.41.4.1149.

Ferman, S., Olshtain, E., Schechtman, E., & Karni, A. (2009). The acquisition of a linguistic skill by adults: Procedural and declarative memory interact in the learning of an artificial morphological rule. Journal of Neurolinguistics, 22(4), 384–412. https://doi.org/10.1016/j.jneuroling.2008.12.002.

Francis, W. S., Augustini, B. K., & Sáenz, S. P. (2003). Repetition priming in picture naming and translation depends on shared processes and their difficulty: Evidence from Spanish-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(6), 1283–1297. https://doi.org/10.1037/0278-7393.29.6.1283.

García-Gámez, A. B., & Macizo, P. (2022). Lexical and semantic training to acquire words in a foreign language: An electrophysiological study. Bilingualism: Language and Cognition, 25(5), 768–785. https://doi.org/10.1017/S1366728921000456.

Golubović, J., & Gooskens, C. (2015). Mutual intelligibility between west and south Slavic languages. Russian Linguistics, 39, 351–373. https://doi.org/10.1007/s11185-015-9150-9.

Goodall, G. (2023). Constructed languages. Annual Review of Linguistics, 9(1), 419–437. https://doi.org/10.1146/annurev-linguistics-030421-064707.

Gratton, G., Coles, M. G. H., & Donchin, E. (1983). A new method for off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55(4), 468–484. https://doi.org/10.1016/0013-4694(83)90135-9.

Havas, V., Taylor, J., Vaquero, L., De Diego-Balaguer, R., Rodríguez-Fornells, A., & Davis, M. H. (2018). Semantic and phonological schema influence spoken word learning and overnight consolidation. Quarterly Journal of Experimental Psychology, 71(6), 1469–1481. https://doi.org/10.1080/17470218.2017.1329325.

Hayakawa, S., Ning, S., & Marian, V. (2020). From Klingon to Colbertian: Using artificial languages to study word learning. Bilingualism: Language and Cognition, 23(1), 74–80. https://doi.org/10.1017/S1366728919000592.

Hockley, W. E., & Bancroft, T. (2011). Extensions of the picture superiority effect in associative recognition. Canadian Journal of Experimental Psychology / Revue Canadienne de Psychologie Expérimentale, 65(4), 236–244. https://doi.org/10.1037/a0023796.

Imbir, K. K. (2015). Affective norms for 1,586 polish words (ANPW): Duality-of-mind approach. Behavior Research Methods, 47(3), 860–870. https://doi.org/10.3758/s13428-014-0509-4.

Imbir, K. K. (2021). Corrigendum: Affective norms for 4900 polish words reload (ANPW_R): Assessments for valence, arousal, dominance, origin, significance, concreteness, Imageability and, age of acquisition. Frontiers in Psychology, 12, 707540. https://doi.org/10.3389/fpsyg.2021.707540.

Isphording, I. E., & Otten, S. (2014). Linguistic barriers in the destination language acquisition of immigrants. Journal of Economic Behavior & Organization, 105, 30–50. https://doi.org/10.1016/j.jebo.2014.03.027.

Jankowiak, K. (2021). Current trends in electrophysiological research on bilingual language processing. Language and Linguistics Compass, 15(8), e12436. https://doi.org/10.1111/lnc3.12436.

Jankowiak, K., & Rataj, K. (2017). The N400 as a window into lexico-semantic processing in bilingualism. Poznan Studies in Contemporary Linguistics, 53(1). https://doi.org/10.1515/psicl-2017-0006.

JASP Team. (2023). JASP (0.17.2.1). https://jasp-stats.org/

Juola, P. (2018). Authorship attribution, constructed languages, and the psycholinguistics of individual variation. Digital Scholarship in the Humanities, 33(2), 327–335. https://doi.org/10.1093/llc/fqx045.

Kapnoula, E. C., & McMurray, B. (2016). Newly learned word forms are abstract and integrated immediately after acquisition. Psychonomic Bulletin & Review, 23(2), 491–499. https://doi.org/10.3758/s13423-015-0897-1.

Kapnoula, E. C., Packard, S., Gupta, P., & McMurray, B. (2015). Immediate lexical integration of novel word forms. Cognition, 134, 85–99. https://doi.org/10.1016/j.cognition.2014.09.007.

Kolk, H., & Chwilla, D. (2007). Late positivities in unusual situations. Brain and Language, 100(3), 257–261. https://doi.org/10.1016/j.bandl.2006.07.006.

Krepel, A., De Bree, E. H., & De Jong, P. F. (2021). Does the availability of orthography support L2 word learning? Reading and Writing, 34(2), 467–496. https://doi.org/10.1007/s11145-020-10078-6.

Kroll, J., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174. https://doi.org/10.1006/jmla.1994.1008.

Kroll, J., Van Hell, J., Tokowicz, N., & Green, D. (2010). The revised hierarchical model: A critical review and assessment. Bilingualism: Language and Cognition, 13(3), 373–381. https://doi.org/10.1017/S136672891000009X.

Kroll, J. F., Michael, E., Tokowicz, N., & Dufour, R. (2002). The development of lexical fluency in a second language. Second Language Research, 18(2), 137–171. https://doi.org/10.1191/0267658302sr201oa.

Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4(12), 463–470. https://doi.org/10.1016/S1364-6613(00)01560-6.

Kutas, M., & Federmeier, K. D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event-related brain potential (ERP). Annual Review of Psychology, 62(1), 621–647. https://doi.org/10.1146/annurev.psych.093008.131123.

Lang, P. J. (1980). Behavioral treatment and bio-behavioral assessment: Computer applications. In Sidowski, J. B., Johnson, J. H., & Williams, T. A. (Eds.), Technology in Mental Health Care Delivery Systems, 119–137. Ablex. https://doi.org/10.1111/j.1469-8986.1993.tb03352.x.

Leach, L., & Samuel, A. (2007). Lexical configuration and lexical engagement: When adults learn new words. Cognitive Psychology, 55(4), 306–353. https://doi.org/10.1016/j.cogpsych.2007.01.001.

Li, P., Zhang, F., Yu, A., & Zhao, X. (2020). Language history questionnaire (LHQ3): An enhanced tool for assessing multilingual experience. Bilingualism: Language and Cognition, 23(5), 938–944. https://doi.org/10.1017/S1366728918001153.

Lindgren, E., & Muñoz, C. (2013). The influence of exposure, parents, and linguistic distance on young European learners’ foreign language comprehension. International Journal of Multilingualism, 10(1), 105–129. https://doi.org/10.1080/14790718.2012.679275.

Liu, X., Horinouchi, H., Yang, Y., Yan, Y., Ando, M., Obinna, U. J., Namba, S., & Kambara, T. (2021). Pictorial referents facilitate recognition and retrieval speeds of associations between novel words in a second language (L2) and referents. Frontiers in Communication, 6, 605009. https://doi.org/10.3389/fcomm.2021.605009.

Luck, S. J. (2014). An introduction to the event-related potential technique (2nd ed.). The MIT Press.

Madlener-Charpentier, K. (2018). Do findings from artificial language learning generalize to second language classrooms? In Tyler, A. E., Ortega, L., Uno, M., & In Park, H. (Eds.), Usage-inspired L2 instruction: Researched pedagogy (pp. 211–234). John Benjamins Publishing Company. https://doi.org/10.1075/lllt.49.10mad.

Malik-Moraleda, S., Taliaferro, M., Shannon, S., Jhingan, N., Swords, S., Peterson, D. J., Frommer, P., Okrand, M., Sams, J., Cardwell, R., Freeman, C., & Fedorenko, E. (2025). Constructed languages are processed by the same brain mechanisms as natural languages. Proceedings of the National Academy of Sciences, 122(12), e2313473122. https://doi.org/10.1073/pnas.2313473122.

Mandera, P., Keuleers, E., Wodniecka, Z., & Brysbaert, M. (2015). Subtlex-pl: Subtitle-based word frequency estimates for polish. Behavior Research Methods, 47(2), 471–483. https://doi.org/10.3758/s13428-014-0489-4.

McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419–457. https://doi.org/10.1037/0033-295X.102.3.419.

McLaughlin, B. (1980). On the use of miniature artificial languages in second-language research. Applied PsychoLinguistics, 1(4), 357–369. https://doi.org/10.1017/S0142716400001004.

McLaughlin, J., Osterhout, L., & Kim, A. (2004). Neural correlates of second-language word learning: Minimal instruction produces rapid change. Nature Neuroscience, 7(7), 703–704. https://doi.org/10.1038/nn1264.

McMurray, B., Kapnoula, E. C., & Gareth Gaskell, M. (2016). Learning and integration of new word-forms: Consolidation, pruning, and the emergence of automaticity. In Gaskell, M. G., & Mirković, J. (Eds.), Speech perception and spoken word recognition (pp. 126–152). Psychology Press.

Mestres-Missé, A., Rodriguez-Fornells, A., & Münte, T. F. (2007). Watching the brain during meaning acquisition. Cerebral Cortex, 17(8), 1858–1866.

Mohamed, A. A. (2018). Exposure frequency in L2 reading: An eye-movement perspective of incidental vocabulary learning. Studies in Second Language Acquisition, 40(2), 269–293. https://doi.org/10.1017/S0272263117000092.

Morett, L. M. (2019). The power of an image: Images, not glosses, enhance learning of concrete L2 words in beginning learners. Journal of Psycholinguistic Research, 48(3), 643–664.

Nunez, P. L., & Srinivasan, R. (2006). Electric fields of the brain. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195050387.001.0001.

Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9(1), 97–113. https://doi.org/10.1016/0028-3932(71)90067-4.

Paivio, A. (1971). Imagery and verbal processes. Psychology Press.

Paivio, A., & Desrochers, A. (1980). A dual-coding approach to bilingual memory. Canadian Journal of Psychology/Revue Canadienne de Psychologie, 34(4), 388–399. https://doi.org/10.1037/h0081101.

Palma, P., & Titone, D. (2021). Something old, something new: A review of the literature on sleep-related lexicalization of novel words in adults. Psychonomic Bulletin & Review, 28(1), 96–121. https://doi.org/10.3758/s13423-020-01809-5.

Palmer, S. D., & Havelka, J. (2010). Age of acquisition effects in vocabulary learning. Acta Psychologica, 135(3), 310–315. https://doi.org/10.1016/j.actpsy.2010.08.002.

Peirce, J., Gray, J. R., Simpson, S., MacAskill, M., Höchenberger, R., Sogo, H., Kastman, E., & Kindelov, J. K. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51, 195–203.

Perry, L. K., Samuelson, L. K., & Bursey, A. H. (2014). Does perceived affordance influence word learning? Evidence from contextually rich environments. Frontiers in Psychology, 5, 453.

Peters, E., & Webb, S. (2018). Incidental vocabulary acquisition through viewing L2 television and factors that affect learning. Studies in Second Language Acquisition, 40(3), 551–577.

Pu, H., Holcomb, P. J., & Midgley, K. J. (2016). Neural changes underlying early stages of L2 vocabulary acquisition. Journal of Neurolinguistics, 40, 55–65. https://doi.org/10.1016/j.jneuroling.2016.05.002.

Roediger, H. L., Weldon, M. S., & Challis, B. H. (1989). Explaining dissociations between implicit and explicit measures of retention: A processing account. In Varieties of memory and consciousness (pp. 39). Taylor & Francis Group.

Rugg, M. D., & Curran, T. (2007). Event-related potentials and recognition memory. Trends in Cognitive Sciences, 11(6), 251–257. https://doi.org/10.1016/j.tics.2007.04.004.

Rugg, M. D., Johnson, J. D., & Uncapher, M. R. (2015). Encoding and retrieval in episodic memory: Insights from fMRI. In Addis, D. R., Barense, M., & Duarte, A. (Eds.), The Wiley handbook on the cognitive neuroscience of memory (1st ed., pp. 84–107). Wiley. https://doi.org/10.1002/9781118332634.ch5.

Stenberg, G. (2006). Conceptual and perceptual factors in the picture superiority effect. European Journal of Cognitive Psychology, 18(6), 813–847. https://doi.org/10.1080/09541440500412361.

Stenberg, G., Radeborg, K., & Hedman, L. R. (1995). The picture superiority effect in a cross-modality recognition task. Memory & Cognition, 23(4), 425–441. https://doi.org/10.3758/BF03197244.

Talamas, A., Kroll, J. F., & Dufour, R. (1999). From form to meaning: Stages in the acquisition of second-language vocabulary. Bilingualism: Language and Cognition, 2(1), 45–58. https://doi.org/10.1017/S1366728999000140.

Tang, K., & Baer-Henney, D. (2023). Modelling L1 and the artificial language during artificial language learning. Laboratory Phonology, 14(1), 1–54. https://doi.org/10.16995/labphon.6460.

Trost, S. (2022). WordCreator (22.7.4). Stefan Trost Media.

Weiss, D. J. (2020). Introduction: The use of artificial languages in bilingualism research. Bilingualism: Language and Cognition, 23(1), 72–73. https://doi.org/10.1017/S1366728919000750.

Weldon, M. S., & Roediger, H. L. (1987). Altering retrieval demands reverses the picture superiority effect. Memory & Cognition, 15(4), 269–280. https://doi.org/10.3758/BF03197030.

Wilkins, D. A. (1972). Linguistics in language teaching. Cambridge: MIT Press.

Williams, G. P., Panayotov, N., & Kempe, V. (2020). How does dialect exposure affect learning to read and spell? An artificial orthography study. Journal of Experimental Psychology: General, 149(12), 2344–2375. https://doi.org/10.1037/xge0000778.

Williams, G. P., Panayotov, N., & Kempe, V. (2022). Exposure to dialect variation in an artificial language prior to literacy training impairs reading of words with competing variants but does not affect decoding skills. Journal of Experimental Psychology: Learning, Memory, and Cognition, 48(12), 1868–1904. https://doi.org/10.1037/xlm0001094.

Yum, Y. N., Midgley, K. J., Holcomb, P. J., & Grainger, J. (2014). An ERP study on initial second language vocabulary learning: Initial L2 vocabulary learning. Psychophysiology, 51(4), 364–373. https://doi.org/10.1111/psyp.12183.

Zhang, J., Huang, Y., Jiang, C., Xu, Y., Rao, H., & Xu, H. (2023). Dynamic brain responses to Russian word acquisition among Chinese adult learners: An event-related potential study. Human Brain Mapping, 44(9), 3717–3729. https://doi.org/10.1002/hbm.26307.

Zhang, Y., Chen, B., Tang, Y., Yao, P., & Lu, Y. (2018). Semantic similarity to known second language words impacts learning of new meanings. Frontiers in Psychology, 9, 2048. https://doi.org/10.3389/fpsyg.2018.02048.

Zhang, Y., Lu, Y., Liang, L., & Chen, B. (2020). The effect of semantic similarity on learning ambiguous words in a second language: An event-related potential study. Frontiers in Psychology, 11, 1633. https://doi.org/10.3389/fpsyg.2020.01633.

Figure 1. Examples of association trials in which participants were familiarized with the new LX words on Day 1.

Figure 2. Examples of b-4AFC and f-4AFC trials employed on Day 1.

Table 1. Mean RTs (in milliseconds) and standard deviations by modality, translation and congruency (Day 2)

Figure 3. Mean RTs for image- and orthography-primed items in the learning–testing congruent and incongruent conditions (Day 2).

Table 2. Accuracy, mean RTs (in milliseconds), and standard deviations by condition (Day 2)

Figure 4. Grand averages for congruent and incongruent learning–testing conditions in the N400 time frame (300–500 ms).
