Highlights
• Sound symbolism is the non-arbitrary relationship between speech sound and meaning.
• We investigated the possible advantage of bilingual speakers compared to monolingual speakers.
• Results confirmed the presence of sound symbolism in both monolinguals and bilinguals.
• No significant differences emerged between monolingual and bilingual participants.
• Sound symbolism is an innate mechanism independent of prior linguistic experience.
1. Introduction
Sound symbolism refers to a non-arbitrary correspondence between the sound and meaning of words (Kovic et al., 2010). A growing body of research suggests that listeners across diverse linguistic backgrounds are sensitive to these form-to-meaning correspondences (Nygaard et al., 2009; Ramachandran & Hubbard, 2001). Early studies in this domain explored sensory sound symbolism through behavioral paradigms examining the relationship between syllables and the size or shape of visual stimuli. In the pioneering “Mal-Mil experiment” (Sapir, 1929), participants associated the non-word “Mal” with larger objects and “Mil” with smaller ones. Similarly, Köhler (1929) found that individuals tend to match curved shapes with the non-word “maluma” and spiky shapes with “takete” – later adapted as “bouba” and “kiki”, respectively (Ramachandran & Hubbard, 2001). These findings confirm a systematic relationship between phonetic structure and meaning, supporting the concept of synesthetic sound symbolism (Hinton et al., 2006).
Beyond non-words, sound symbolism has also been examined within natural languages, but the possible relationship between bilingualism and sensitivity to sound symbolism remains mostly unexplored. Here, we aimed to explore whether bilinguals outperform monolinguals, which would support the idea that greater linguistic exposure enhances sound-meaning associations. Moreover, we also aimed to explore the role of word category, hypothesizing that nouns and verbs could be more easily associated with their meanings in unknown languages than adjectives, due to their greater concreteness, which may give rise to stronger sound-symbolic cues. In fact, several studies have demonstrated that psychoacoustic properties of word sounds can universally and non-arbitrarily correspond to their meanings (Dingemanse et al., 2015; Johansson et al., 2020; Monaghan et al., 2014; Preziosi & Coane, 2017). Cross-linguistic behavioral studies using forced-choice paradigms have shown that individuals can infer the meaning of sound-symbolic words in foreign languages at above-chance levels. Early research demonstrated that participants presented with antonym pairs in unfamiliar languages performed significantly above chance by relying solely on phonetic cues (Berlin, 2006; Brown et al., 1955; Klank et al., 1971; Tzeng et al., 2017). More recently, a modified paradigm (D’Anselmo et al., 2019) asked participants to select the correct translation of foreign words from three alternatives in their native language. Italian and Polish participants performed similarly in guessing the meanings of nouns, verbs, and adjectives from four non-Indo-European languages, showing that sound symbolism operates independently of the listener’s native language.
As specified above, a scarcely explored domain in sound symbolism research is bilingualism. On the one hand, knowledge of multiple languages may enhance sensitivity to linguistic cues, leading to better performance in guessing the meaning of unfamiliar words. Indeed, second language acquisition has been linked to improvements in memory (French & Jacquet, 2004), attention (Yang & Yang, 2016), relational skills (Chen & Padilla, 2019), and even neuroplasticity (Abutalebi et al., 2009; Li et al., 2014). In addition to these general cognitive advantages, bilinguals may also develop linguistic skills relevant to phonosymbolic processing. For example, they may be more sensitive to subtle phonetic distinctions between languages (Antoniou et al., 2015) and may also exhibit greater metalinguistic awareness, allowing them to reflect more effectively on linguistic structure and to abstract and transfer patterns across language systems (Bialystok, 2001). These factors provide a basis for expecting a bilingual advantage in sound-symbolic tasks. On the other hand, if sound symbolism is universal, its effects may be independent of linguistic background, implying no advantage for bilingual speakers.
Given that nearly half of the general population is bilingual, with even higher percentages in Europe (https://europa.eu/eurobarometer/surveys/detail/1049), research in this area is crucial to understand how multilingualism shapes cognition. Bilingualism and multilingualism are defined by different degrees of language proficiency and exposure (Cenoz, 2013; Grosjean & Li, 2013), and their effects can be independent of one another (Surrain & Luk, 2019). Linguistic structure encompasses phonology, morphology, semantics, syntax, and pragmatics (Kortmann, 2020). In the context of sound symbolism, the most relevant aspects are phonology (the sounds of a language) and morphology (the structure of meaning), which interact to create the sound-meaning associations observed in this phenomenon. This instinctive association between phonology and morphology emerges early in development. Research suggests that 12-month-old infants exhibit the bouba-kiki effect, whereas 4-month-old infants do not (Pejovic & Molnar, 2017). Importantly, this effect was similar in monolingual and bilingual infants. This has been interpreted as evidence that sound symbolism relies on universal, possibly innate, perceptual mechanisms rather than being a learned linguistic feature. Supporting this idea, Lockwood et al. (2016) and Perniss et al. (2010) proposed that sound symbolism may serve as an innate advantage in language. However, alternative theoretical accounts argue that these associations may arise from statistical learning, whereby learners detect non-arbitrary patterns through repeated exposure to consistent sound–meaning pairings (Monaghan et al., 2012), or from perceptual bootstrapping, in which early perceptual biases facilitate the mapping between sounds and meanings during vocabulary development (Imai & Kita, 2014). Similarly, Imai et al. (2008) showed that both children and adults are sensitive to sound-symbolic relations between novel words and actions, suggesting that such mappings are robust and may reflect universal cognitive mechanisms. Regarding second language acquisition, Nygaard et al. (2009) found that English speakers learning Japanese associated words with their correct English translations more quickly when the Japanese words contained sound-symbolic cues. These findings suggest that sound symbolism facilitates language learning, yet it remains unclear whether bilingualism enhances this ability.
Despite evidence supporting sound symbolism in language processing, no studies have systematically explored whether bilingualism confers an advantage in this domain. The present study examines whether bilingualism improves the ability to infer the meaning of unfamiliar words. To test this hypothesis, we adopted the same paradigm used by D’Anselmo et al. (2019). Participants heard nouns, verbs, and adjectives in four unfamiliar non-Indo-European languages (Finnish, Japanese, Tamil, and Swahili) and selected the correct translation from three written alternatives in their native language (L1) presented on the computer screen (chance level: 33%). These languages were chosen to ensure no prior exposure, as they belong to distinct linguistic families (Hammarström et al., 2024): Finnish (Uralic), Japanese (Japonic), Swahili (Atlantic-Congo), and Tamil (Dravidian). Additionally, Finnish and Swahili are syllable-timed languages, while Japanese and Tamil are mora-timed languages. Previous research (D’Anselmo et al., 2019) suggested that performance might be higher for syllable-timed languages, as Italian and Polish – the participants’ native languages – are syllable-timed as well. However, results unexpectedly showed above-chance performance for Finnish and Japanese, but not for Swahili and Tamil. The present study aims to replicate these findings in Italian monolingual and bilingual participants while testing whether bilingualism enhances sensitivity to sound symbolism. Two possible outcomes were considered: (a) bilinguals outperform monolinguals, supporting the idea that greater linguistic exposure enhances sound-meaning associations; (b) bilingual and monolingual performance is equivalent, reinforcing the hypothesis that sound symbolism is an innate, universal mechanism. Finally, we hypothesized that word category influences performance, with nouns and verbs being guessed more accurately than adjectives, as found in previous studies. This difference could stem from the greater concreteness of nouns and verbs, which may be more readily mapped onto sound-symbolic cues.
2. Materials and methods
2.1. Participants
A sample of 195 volunteers participated in the study, divided into bilingual and monolingual participants, including 155 females and 37 males (3 participants did not indicate their gender). The mean age of the whole sample was 35.69 ± 0.85 years, and the mean years of education were 17.24 ± 0.13. G*Power 3.1.9.4 (Faul et al., 2007) was used to determine a priori the minimum sample size required for a repeated-measures ANOVA with within-between interactions. Considering the interaction between word category and language reported in the previous study (D’Anselmo et al., 2019), we set the effect size at f = 0.20, power at 99%, and α at .05, obtaining a minimum of 76 participants for each group. Because the assignment to each group (monolingual vs bilingual) would be carried out a posteriori, we decided to recruit a larger sample than required to account for potentially unbalanced subsamples, exclusions due to technical issues, or possible drop-out. To ensure the participation of bilinguals, the Linguistic Center of the University of Chieti-Pescara was involved in recruitment, and the link for participation in the study was also shared via social media. Inclusion criteria were: age ranging from 18 to 64 years and a high level of proficiency in the Italian language. Exclusion criteria were: the presence of neurological and/or psychiatric diagnoses, hearing or speech impairments, and familiarity with Finnish, Japanese, Tamil, or Swahili (the languages used in the task; see the Stimuli and procedure section).
Participants who agreed to take part in the study completed the Bilingual Language Profile (BLP; Alonzo et al., 2012), a self-report questionnaire assessing language competencies for L1 and L2 separately and composed of four subscales:
1) Language use (a measure of how frequently and in what contexts each language is used, assessed by 5 questions, each with a score from 0 to 10),
2) Language history (exploring the timeline of language acquisition for both languages, assessed by 6 questions, each with a score ranging from 0 to 20),
3) Language attitude (the subjective feelings and beliefs about each language, including its perceived importance, assessed by 4 questions, each with a score ranging from 0 to 6), and
4) Language proficiency (assessing the subjective ability in each language for speaking, reading, writing, and listening, assessed by 4 questions, each with a score ranging from 0 to 6).
To ensure that each subscale received equal weight in the global language index, the score of each subscale was multiplied by an adjustment factor (1.09 for Language Use, 0.454 for Language History, and 2.27 for both Language Proficiency and Language Attitudes). Thus, each subscale ranged from 0 to 54.5, yielding a global language index from 0 to 218 for L1 and for L2 separately. To obtain the language dominance index, the L1 total score was subtracted from the L2 total score, resulting in a final index ranging from −218 to +218, where −218 indicates maximal dominance in L1 (native language), 0 indicates balanced bilingualism with no preference between L1 and L2 (second language), and +218 represents maximal dominance in L2. Since the original version of the BLP questionnaire does not provide cut-off points for defining bilingualism, and a score of 0 represents perfect bilingualism (Olson, 2023), participants with a BLP score ranging from −50 to +50 were classified as bilingual. Based on this criterion, the sample included 101 monolingual participants (76 females and 24 males) and 94 bilingual participants (79 females and 13 males). In the monolingual group, the self-reported L1 was Italian for 94 (93%) participants; the remaining seven participants reported Italian as a language with a high level of proficiency, but their BLP score was not sufficient to categorize them as bilinguals (their L1 was English for 3, French for 2, Spanish for 1, and Ukrainian for 1 participant). In the bilingual group, the self-reported L1 was Italian for 77 (82%) participants, and the remaining 17 participants reported Italian as L2 (with L1 being French for 7, English for 4, Slovenian for 4, Ukrainian for 1, and Croatian for 1 of them). The BLP does not distinguish between bilingual and multilingual participants (a distinction that is not crucial for the aims of the present study). Statistical comparisons revealed that the two groups did not significantly differ in age, gender, or years of education (see Table 1).
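To make the scoring scheme concrete, the weighting and classification procedure described above can be summarized in a short script. The following is a minimal sketch in Python: the adjustment factors and the ±50 cut-off follow the description above, whereas the function names, data layout, and example values are illustrative assumptions, not the scoring code actually used in the study.

```python
# Minimal sketch of the BLP scoring scheme described above (illustrative only).
WEIGHTS = {
    "use": 1.09,          # 5 items, 0-10 each  -> raw max 50  -> weighted max 54.5
    "history": 0.454,     # 6 items, 0-20 each  -> raw max 120 -> weighted max 54.5
    "proficiency": 2.27,  # 4 items, 0-6 each   -> raw max 24  -> weighted max 54.5
    "attitudes": 2.27,    # 4 items, 0-6 each   -> raw max 24  -> weighted max 54.5
}

def global_index(raw_scores: dict) -> float:
    """Weighted sum of the four subscales for one language (0-218)."""
    return sum(raw_scores[name] * weight for name, weight in WEIGHTS.items())

def dominance_index(l1_scores: dict, l2_scores: dict) -> float:
    """L2 global index minus L1 global index (-218 ... +218)."""
    return global_index(l2_scores) - global_index(l1_scores)

def classify(dominance: float, cutoff: float = 50.0) -> str:
    """Participants within +/-50 of perfect balance (0) are labelled bilingual."""
    return "bilingual" if -cutoff <= dominance <= cutoff else "monolingual"

# Hypothetical participant who is strongly dominant in L1
example = dominance_index(
    l1_scores={"use": 48, "history": 110, "proficiency": 23, "attitudes": 22},
    l2_scores={"use": 20, "history": 30, "proficiency": 12, "attitudes": 15},
)
print(round(example, 1), classify(example))  # a large negative index -> "monolingual"
```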
Table 1. Frequencies (N) and means (with standard errors) for the monolingual and bilingual subsamples, with statistical values

Informed consent was obtained from all participants prior to their inclusion in the study. The study was carried out in accordance with the principles of the Declaration of Helsinki and was approved by the Review Board of the Department of Psychological, Health and Territorial Sciences – University “G. d’Annunzio” of Chieti-Pescara (protocol number: IRBP/22004).
2.2. Stimuli and procedure
The stimuli and procedure were the same as those used in a previous study on cultural differences in sound symbolism (D’Anselmo et al., 2019). The words were selected primarily from the basic vocabulary of the Italian language and comprised 10 adjectives, 10 nouns, and 10 verbs (N = 30). For each target word, two distractors (wrong responses) were also selected. To avoid a potential bias due to word length, the mean number of letters of the correct options was compared to the mean number of letters of the two distractors by means of a t-test. Results confirmed the absence of significant differences in length between correct and incorrect response options (t(119) = −0.35, p = 0.723). The final set of 120 stimuli was then generated by translating each target word into Finnish, Japanese, Swahili, and Tamil (N = 30 × 4 = 120). The full list of stimuli and response options is provided in Appendix 1. Audio recordings were generated using Google Translate’s voice feature and subsequently edited using the GoldWave software (V.5.25; GoldWave Inc.) to normalize amplitude levels, ensuring that all stimuli were presented at a comparable volume.
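As an illustration, the length-matching check described above can be reproduced with a few lines of code. This is a minimal sketch assuming a long-format stimulus file with one row per item; the file name and column names are hypothetical, not those of the original materials.

```python
# Minimal sketch of the stimulus length check (hypothetical file and column names).
import pandas as pd
from scipy import stats

# One row per stimulus (120 rows): correct translation plus two distractors.
stimuli = pd.read_csv("stimuli.csv")  # assumed columns: correct, option1, option2

correct_len = stimuli["correct"].str.len()
distractor_len = (stimuli["option1"].str.len() + stimuli["option2"].str.len()) / 2

# Paired comparison across the 120 items (df = 119), as in the length check above.
res = stats.ttest_rel(correct_len, distractor_len)
print(f"t({len(stimuli) - 1}) = {res.statistic:.2f}, p = {res.pvalue:.3f}")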
The experiment was conducted online using the E-Prime Go software (Psychology Software Tools Inc., Pittsburgh, PA). Each trial consisted of a target word presented auditorily via headphones, followed by three response options displayed on the computer screen in Italian. One of the three options was the correct translation, while the remaining two served as distractors. The response options were balanced for mean number of letters (see Appendix 1).
Participants were instructed that they would hear a word in an unknown language and would then see three possible translations in Italian on the screen. They were informed that only one option was correct and that it would appear in a random position on the screen, and they were asked to select the correct translation as quickly as possible by clicking on the word. After each response, the mouse cursor was automatically reset to the center of the screen.
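For readers unfamiliar with this kind of forced-choice design, the logic of a single trial can be sketched as a simple console mock-up. This is not the actual E-Prime Go script: audio playback, timing, and mouse handling are omitted, and the example stimulus is hypothetical.

```python
# Schematic, console-based mock-up of one three-alternative trial (not the E-Prime Go script).
import random
from dataclasses import dataclass

@dataclass
class Trial:
    audio_file: str       # e.g. "japanese_noun_01.wav" (hypothetical file name)
    correct: str          # correct Italian translation
    distractors: tuple    # the two incorrect Italian alternatives

def run_trial(trial: Trial) -> bool:
    print(f"[playing audio: {trial.audio_file}]")   # target word heard via headphones
    options = [trial.correct, *trial.distractors]
    random.shuffle(options)                         # correct answer appears in a random position
    for i, word in enumerate(options, start=1):
        print(f"{i}) {word}")
    choice = int(input("Select the correct translation (1-3): "))
    return options[choice - 1] == trial.correct     # accuracy coded as correct/incorrect

# Hypothetical example trial (not taken from the actual stimulus list)
trial = Trial("japanese_noun_01.wav", correct="cane", distractors=("tavolo", "sole"))
print("correct" if run_trial(trial) else "wrong")
```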
2.3. Statistical analyses
The number of correctly recognized words was converted into a percentage for each condition. The mean accuracy was compared to the chance level (33%) by one-sample t-tests carried out across all participants, for the two groups separately, for the four languages, and for the three word categories. To account for multiple comparisons, a Bonferroni correction was applied, setting the significance thresholds at p = 0.025 for the two subsamples, p = 0.0125 for the four foreign languages, and p = 0.017 for the three word categories. A t-test for independent groups was conducted to directly compare monolingual and bilingual performance. Finally, to examine possible interactions among the factors, an analysis of variance (ANOVA) was carried out with Group (monolingual, bilingual) as a between-subject factor and Language (Finnish, Japanese, Swahili, Tamil) and Category (noun, verb, adjective) as within-subject factors. The percentage of correct responses was used as the dependent variable, and post hoc comparisons were carried out using the Duncan test when necessary. A post hoc power analysis for a within-between interaction ANOVA was carried out by means of the G*Power software, considering the effect size of the three-way interaction (Group, Language, and Category) and α = 0.05. Results revealed a power of 0.98, ensuring that the sample size was adequate.
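The comparisons against chance and the group comparison described above can be sketched as follows. This is a minimal illustration assuming a wide-format accuracy file with one row per participant; the file and column names are assumptions, and the mixed 2 × 4 × 3 ANOVA itself (run here in dedicated statistical software) is not reproduced.

```python
# Minimal sketch of the accuracy analyses (hypothetical data layout; illustrative only).
import pandas as pd
from scipy import stats

CHANCE = 33.0  # chance level for the three-alternative task, as defined above

# One row per participant with percentage-correct scores (assumed column names).
df = pd.read_csv("accuracy.csv")  # columns: group, overall, finnish, japanese, swahili, tamil

# One-sample t-tests against chance for the four languages, with Bonferroni threshold 0.05/4.
languages = ["finnish", "japanese", "swahili", "tamil"]
bonferroni_alpha = 0.05 / len(languages)  # = 0.0125, as reported above
for lang in languages:
    res = stats.ttest_1samp(df[lang], CHANCE)
    print(f"{lang}: t({len(df) - 1}) = {res.statistic:.2f}, p = {res.pvalue:.4f}, "
          f"significant = {res.pvalue < bonferroni_alpha}")

# Independent-samples t-test comparing monolinguals and bilinguals on overall accuracy.
mono = df.loc[df["group"] == "monolingual", "overall"]
bili = df.loc[df["group"] == "bilingual", "overall"]
res = stats.ttest_ind(mono, bili)
print(f"group comparison: t({len(mono) + len(bili) - 2}) = {res.statistic:.2f}, p = {res.pvalue:.2f}")
```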
3. Results
The mean accuracy across all participants was significantly above chance (M = 36.20%, SE = ±0.41%, 95% CI [35.40–37.00%]; t(194) = 7.87, p < 0.001), supporting the presence of sound symbolism.
Both bilinguals (M = 36.18%, SE = ±0.61%, 95% CI [34.98–37.38%]; t(93) = 5.26, p < 0.001) and monolinguals (M = 36.21%, SE = ±0.55%, 95% CI [35.12–37.30%]; t(100) = 5.86, p < 0.001) performed significantly above chance, although no significant difference emerged between the two groups (t(193) = −0.04, p = 0.97; 95% CI [−1.64%, 1.57%]), leading to their results being analyzed together. Concerning the language comparison, performance was above chance for Finnish (M = 36.65%, SE = ±0.74%, 95% CI [35.18–38.12%]; t(194) = 4.90, p < 0.001), Japanese (M = 38.62%, SE = ±0.74%, 95% CI [37.15–40.08%]; t(194) = 7.55, p < 0.001), and Swahili (M = 36.53%, SE = ±0.61%, 95% CI [35.32–37.74%]; t(194) = 5.77, p < 0.001), whereas performance was at chance level for Tamil (M = 32.99%, SE = ±0.60%, 95% CI [31.81–34.18%]; t(194) = −0.01, p = 0.99). Regarding word category comparisons, nouns and verbs were recognized significantly above chance (nouns: M = 37.67%, SE = ±0.65%, 95% CI [36.39–38.94%]; t(194) = 7.22, p < 0.001; verbs: M = 37.08%, SE = ±0.62%, 95% CI [35.85–38.31%]; t(194) = 6.53, p < 0.001), whereas adjectives did not significantly differ from chance (M = 33.85%, SE = ±0.53%, 95% CI [32.79–34.90%]; t(194) = 1.58, p = 0.115).
The ANOVA revealed that, as expected from the previous results, no significant effect of Group emerged, nor did Group interact with the other factors (see Table 2). Language had a significant main effect (F(3, 579) = 13.90, p < 0.001, ηp² = 0.07), with higher performance for Japanese and lower performance for Tamil compared to the other languages (all ps < 0.03). Category also had a significant main effect (F(2, 386) = 14.30, p < 0.001, ηp² = 0.07); indeed, adjectives were recognized less accurately than both nouns and verbs (ps < 0.001).
Table 2. Summary of the ANOVA results

Note: Degrees of freedom (df), F-values, p-values, and partial eta squared (ηp²) are reported.
A significant interaction between Language and Category was found (F(6, 1158) = 9.38, p < 0.001, ηp² = 0.05; see Figure 1). Post hoc comparisons showed no differences among nouns, verbs, and adjectives presented in Tamil, whereas nouns were better recognized than verbs and adjectives in Finnish and in Japanese (ps < 0.009), verbs were recognized more accurately than adjectives in Finnish (p = 0.008), and verbs were better recognized than both adjectives and nouns in Swahili (ps < 0.001). Adjectives were better recognized in Japanese than in Finnish (p = 0.002); nouns were better recognized in Finnish and Japanese than in both Swahili and Tamil (ps < 0.001); verbs were recognized with greater accuracy in Swahili than in the other languages and less accurately in Tamil than in the other languages (ps < 0.04; see Figure 1).

Figure 1. Interaction between Language and Category in the percentage of correctly recognized words. The boxplot displays the distribution of individual scores (dots), with the box representing the interquartile range (IQR), the dashed horizontal lines indicating the median, and the whiskers extending to 1.5 × IQR. The dashed line indicates chance-level accuracy (33%). Asterisks indicate significant differences among word categories within each language, while letters denote significant differences among languages within each word category (a > b > c). Figure created using Plotly Chart Studio (https://chart-studio.plotly.com/).
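A figure of this kind can be rebuilt with a few lines of Plotly. The sketch below assumes a long-format data frame with one accuracy value per participant, language, and word category; the file and column names are illustrative and not taken from the study materials.

```python
# Minimal sketch of a grouped boxplot in the style of Figure 1 (hypothetical data layout).
import pandas as pd
import plotly.express as px

# Assumed columns: participant, language, category, accuracy (percentage correct).
long_df = pd.read_csv("accuracy_long.csv")

fig = px.box(long_df, x="language", y="accuracy", color="category", points="all")
fig.add_hline(y=33, line_dash="dash", annotation_text="chance (33%)")  # chance-level reference
fig.update_layout(yaxis_title="Correct responses (%)")
fig.show()
```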
4. Discussion
This study examined whether bilingualism influences sound symbolism, investigating whether exposure to multiple languages facilitates the ability to infer the meaning of unknown words based on their phonetic properties. We compared monolingual and bilingual participants in a task in which they listened to target words in four foreign languages and were required to select the correct meaning from three alternative translations presented in their native language. Our findings confirmed the existence of sound symbolism, as participants performed above chance level, both when considering the whole sample and when analyzing monolinguals and bilinguals separately. However, no significant difference emerged between the two groups, suggesting that knowledge of an L2 does not enhance sensitivity to sound-meaning associations. This evidence supports the hypothesis that sound symbolism is a universal cognitive mechanism, independent of language learning and exposure. Previous research has suggested that the implicit association between sound and meaning may facilitate language acquisition (Imai et al., 2008; Lockwood et al., 2016; Nygaard et al., 2009). On this basis, it might be predicted that bilingual individuals – due to their broader linguistic experience – would exhibit enhanced performance in guessing the meanings of unknown words. Our results do not support this hypothesis, as they revealed no significant differences between monolingual and bilingual participants. However, the interpretation of this null effect requires caution. A lack of statistical significance does not necessarily indicate that the two groups are equivalent in their sensitivity to phonetic symbolism, since null hypothesis testing does not allow definitive conclusions about the absence of real differences. For instance, more sensitive indices might reveal group differences, and a comparison between multilingual and strictly monolingual participants could uncover differences that are not evident in the present study. Moreover, the extent to which the present results can be generalized to other language backgrounds remains open, especially since phonological proximity to the test languages could play a role, and the specific task used here (a forced-choice task) could also have affected the results. To sum up, the present results allow us to conclude that no bilingual advantage emerges in this paradigm, suggesting that sensitivity to sound symbolism does not depend on bilingual exposure. While this pattern is consistent with the idea of a broadly shared cognitive mechanism, further evidence, including broader cross-linguistic validation, is needed to definitively establish the universality of sound symbolism.
Similarly, the absence of a bilingual advantage in this domain was previously reported by Pejovic and Molnar (2017), who tested infants under one year of age on cross-modal correspondences between sound-symbolic non-words and rounded/angular shapes. Their findings revealed that the “intuition” for this correspondence was absent in preverbal infants (4-month-olds) but present in 12-month-olds, with no difference between monolingual (Basque) and bilingual (Spanish-Basque) infants. The authors suggested that sensitivity to sound symbolism serves as a referential cue in the early stages of word learning (Imai & Kita, 2014). On this basis, it could be hypothesized that bilingual adults, having accumulated greater linguistic knowledge, would show heightened sensitivity to the referential features of unknown words, possibly leading to higher accuracy in meaning inference. However, our results contradict this prediction, reinforcing the idea that sound symbolism is an innate and implicit ability, unaffected by bilingual language experience.
It is important to clarify that our study focused specifically on bilinguals, comparing them to monolinguals, without directly examining the potential impact of multilingualism. While our results indicate that bilinguals did not exhibit a clear advantage in inferring the meaning of unfamiliar words from their sounds, individuals who speak three or more languages might be more sensitive to sound–meaning associations due to their broader language experience. Future research should investigate whether multilingualism, beyond bilingualism, contributes to enhanced sensitivity in understanding the meaning of unfamiliar words based on their phonetic form. When analyzing the whole sample, participants performed above chance level for Finnish, Japanese, and Swahili, and for nouns and verbs, whereas performance for Tamil and for adjectives did not significantly differ from chance. The ANOVA confirmed these patterns, showing lower accuracy for Tamil compared to the other languages, and for adjectives compared to nouns and verbs. This pattern is particularly interesting considering that none of the participants had prior experience with these languages. The fact that performance was not above chance for Tamil and adjectives replicates the results of a previous study with Italian and Polish participants (D’Anselmo et al., 2019). However, one key difference emerged: while in D’Anselmo et al. (2019) performance for Swahili was not significantly above chance (although it was higher than 33%, it did not reach statistical significance), in our study performance for Swahili did exceed the chance level. Despite this minor difference, the overall consistency between studies reinforces the notion that sound symbolism is an automatic and implicit cognitive mechanism, independent of prior language experience.
The universality of this phenomenon aligns with findings by Blasi et al. (2016), who analyzed 100 words from more than 4,000 different languages and found that specific phonemes recur in words for certain concepts across unrelated languages (e.g., words denoting “smallness” frequently contain the high-front vowel /i/). Although most of the evidence concerning sound symbolism comes from antonym-matching paradigms (Brown et al., 1955), which focus on sensory adjectives (e.g., brightness/darkness), our study found that adjectives were the most challenging category to infer. This discrepancy may be due to differences in the experimental procedure: for instance, in the original study, Brown et al. (1955) used a two-alternative forced-choice paradigm (chance level = 50%), while our study (like D’Anselmo et al., 2019) used a three-alternative task (chance level = 33%). Previous studies have also tested whether pairs of words from different languages had the same or different meanings (Brackbill & Little, 1957). The replication of the D’Anselmo et al. (2019) findings in our study further supports the robustness of the three-alternative paradigm, while also demonstrating that bilingualism does not enhance performance in this domain.
It is well known that bilingualism influences brain plasticity and cognitive processing (Costa & Sebastián-Gallés, 2014), leading to the reorganization of linguistic areas in each hemisphere (Li et al., 2014). It has been shown that early bilingual exposure enhances the brain response to the fundamental frequency (F0) of vowels, a key speech feature used for language identification (Skoe et al., 2017). Moreover, sound-symbolic words are known to elicit greater activity in cross-modal sensory areas (Revill et al., 2014), including the superior temporal sulcus (STS), a brain region responsible for integrating iconic and arbitrary aspects of language (Kanero et al., 2014). It is important to note that the STS is also a key structure in bilingualism, with both anatomical changes and increased connectivity to frontal areas observed in bilinguals compared with monolinguals (Li et al., 2014). In their review, Sidhu and Pexman (2018) proposed that sound symbolism arises from five interacting factors, including neural substrates, statistical co-occurrence of phonetic features, evolutionary associations, and linguistic structures. Similarly, Vigliocco et al. (2014) highlighted the importance of iconicity in natural languages, particularly in non-Indo-European languages, where language is less arbitrary and more influenced by phonetic-meaning correspondences. This iconic nature of language has also been observed in Spanish (an Indo-European language), where onomatopoeias and interjections received the highest iconicity scores, followed by adjectives (Hinojosa et al., 2021). Neuroscientific research further supports this idea: iconic words with high emotional arousal activate the left amygdala (multimodal emotion processing), along with the left superior temporal gyrus (crucial for sound processing) and the left inferior frontal gyrus (semantic processing) (Aryani et al., 2019). Finally, a recent study on emotional words showed that negative concepts are more likely to be expressed using iconic words (Calvillo-Torres et al., 2024) and that specific phonemes predict affective ratings, with voiceless (/p/, /t/) and voiced plosives (/b/, /d/, /g/) linked to high-arousal words, while fricatives (/s/, /z/) and hissing consonants are associated with negative valence. All these findings point to common evolutionary roots of sound symbolism across languages. Although our behavioral paradigm does not directly measure these neural processes, these findings provide a plausible biological basis for sound-symbolic inference. Future research combining neuroimaging techniques with behavioral tasks – potentially including bilingual populations – could elucidate how neural mechanisms such as those in the STS and amygdala mediate sensitivity to sound symbolism.
Our findings are consistent with the view that sound symbolism plays a fundamental role in language acquisition. While both groups performed above chance, no significant differences were observed between them. In line with previous research, our results suggest that sound symbolism may reflect a universal and spontaneous linguistic mechanism, emerging early in infancy and remaining independent of additional language experience.
However, the present study has some limitations that should be acknowledged. The auditory stimuli were generated using the pronunciation function of Google Translate. While this method allowed us to exactly replicate the previous paradigm (D’Anselmo et al., 2019) and ensured the emotional neutrality of the stimuli, it may have reduced their perceptual naturalness. Additionally, the use of a three-alternative forced-choice task may have limited the detection of subtle group differences or individual variations in sensitivity. Future studies could address these issues by using stimuli recorded from natural voices, to determine whether stimulus naturalness influences sensitivity to sound symbolism, and by adopting tasks with graded confidence ratings or more sensitive measures of individual performance. Moreover, future research should further explore the universality of sound symbolism across cultures, linguistic families, and diverse bilingual experiences, including populations such as heritage speakers or habitual code-switchers. This would provide deeper insight into how sound symbolism facilitates spontaneous language comprehension.
Data availability statement
Data of the present study are available from the corresponding author on reasonable request.
Acknowledgements
The authors thank Ms Lucrezia Venditti and Ms Alessia Petaccia for their kind technical support in conducting the experiments.
Author contribution
All authors approved the final version of the manuscript. Specifically: AD: conceptualization, methodology, software, formal analysis, writing – original draft; GP: conceptualization, methodology, software, formal analysis, writing – original draft; TZ: subject recruitment, critical revision; MdA: subject recruitment, critical revision; VP: conceptualization, subject recruitment, experiment supervision, critical revision; RF: conceptualization, experiment supervision, data curation, critical revision; LT: conceptualization, experiment supervision, critical revision, project administration.
Funding statement
No specific funds were received for this study.
Competing interests
The authors declare no competing interests.
Declaration of artificial intelligence use
No artificial intelligence (AI) tools were used for this study.
Ethical approval and informed consent statements
Informed consent was obtained from all participants prior to their inclusion in the study. The study was carried out in accordance with the principles of the Declaration of Helsinki and was approved by the Review Board of the Department of Psychological, Health and Territorial Sciences – University “G. d’Annunzio” of Chieti-Pescara (protocol number: IRBP/22004).
Appendix 1
List of words used. First column: word category (verbs, nouns, and adjectives); second column: foreign language (Finnish, Japanese, Swahili, and Tamil); third column: phonetic transcription of the sound of the word. The column “Correct” indicates the correct Italian translation of the auditory word presented. The columns “Option 1” and “Option 2” indicate the two incorrect response alternatives in Italian.