Did you see that? False memories for emotional words in bilingual children

Abstract
When participants process a list of semantically strongly related words, the ones that were not presented may later be said, falsely, to have been on the list. This ‘false memory effect’ has been investigated by means of the DRM paradigm. We applied an emotional version of it to assess the false memory effect for emotional words in bilingual children with a minority language as L1 (their mother tongue) and a monolingual control group. We found that the higher emotionality of the words enhances memory distortion for both the bilingual and the monolingual children, in spite of the disadvantage related to vocabulary skills and of the socioeconomic status that acts on semantic processing independently from the condition of bilingualism. We conclude that bilingual children develop their semantic knowledge separately from their vocabulary skills and parallel to their monolingual peers, with a comparable role played by Arousal and Valence.


Introduction
Mariana is ten years old and she attends the sixth grade in Genoa. Her family moved to Italy, to Mariana's father's hometown, when she was just three years old, leaving Santiago de Chile, Mariana's mother's birthplace. The mother speaks to Mariana in Spanish, to maintain a bond with the child's mother tongue. Mariana has always been frightened by the idea of the hospital, by the doctors, with their white coats and serious faces, by the smell of disinfectant that invades the rooms, by the beds, the comings and goings of people, the syringes. The words doctor, sterile gown, cot, syringe have a priority meaning for Mariana in light of her fear.
The emotional component of the lexicon influences its processing. In the present study we examine how emotional word processing develops in the second language of proficient bilingual children and how it acts on their semantic elaboration: does the emotionality of the lexicon play the same role in one's second language, making semantic-based processing a priority, as it does in one's first language?

False memory
One way to pursue our aim is to measure VERBAL FALSE MEMORIES (Oliveira, Albuquerque & Saraiva, 2018; Seamon, Luo, Schlegel, Greene & Goldenberg, 2000). Suppose we presented Mariana with a list of words, all semantically related to the concept of hospital (e.g., "doctor, sterile gown, bed, syringe"), without ever mentioning the word "hospital" (often called the CRITICAL LURE). Suppose further that we then asked her to list the words she remembers from those we presented. It is likely that she would also mention the word "hospital": the lure, activated by the words actually seen, was not presented but, because it is related at the level of meaning, it is recalled by Mariana as if actually seen.
To give a definition of this phenomenon, VERBAL FALSE MEMORY occurs when participants "claim that certain words that have not actually been presented are as distinct as those that actually appear as equivalents" (Zhou, 2019).
The question could then be: why does this kind of memory distortion happen? The way in which our memory system works is adaptive and false memories fall within that system (Schacter, Guerin & St. Jacques, 2011), leading to an extensive and articulated understanding of events that may not be well-coded at first (Schacter, 2012). Having a flexible memory system enables us to imagine the future and to reconsider our past (Klein, Robertson & Delton, 2009). To complete these cognitive actions we need to be able to recombine episodic and semantic memory elements. This recombination can produce distortions in remembrance, at the expense of accuracy and in favour of a more global vision (Newman & Lindsay, 2009).

False memory in children
False memory has been studied not only in an adult population, but also over the course of life and, in particular, during childhood (e.g., Reyna & Lloyd, 1997; Bjorklund, 2000; Brainerd, Reyna & Forrest, 2002; Metzger, Warren, Shelton, Price, Reed & Williams, 2008; Otgaar & Smeets, 2010; Otgaar, Howe, Muris & Merckelbach, 2019). The analysis of this phenomenon in a developmental phase made it possible to clarify its onset and its relationship with performance in other aspects of a cognitive system that is still in formation (e.g., Brainerd, Reyna & Holliday, 2018). For a long time it was believed that younger children were more prone to the formation of false memories: the cause of this has been traced back to their lower ability to memorize specific details of events related to their lives (e.g., Binet, 1900; Bruck & Ceci, 1999; Quas, Qin, Schaaf & Goodman, 1997; Reyna, Mills, Estrada & Brainerd, 2007; Small, 1896; Stern, 1910). However, Brainerd et al. (2002) found that false memories increase proportionally with age while recalling or recognizing verbal material. The authors refer to this phenomenon in terms of "developmental reversals" (Brainerd, Holliday, Reyna, Yang & Toglia, 2010) and attribute it to the fact that the more children grow, the more they acquire the ability to "get the gist" of a series of items consistent in their semantics. Metzger et al. (2008) conducted three experiments using word lists to analyze developmental trends in accurate and false memory production for words. At first, they adapted their materials to be consistent with the vocabulary of 2nd and 8th graders, then they asked participants to generate their own lists. The authors found that both true and false memory increased with age. Furthermore, they found that young children did have access to their semantic storage, but they did not seem to be able to use it to improve their recall. Their results tend to support the developmental reversals thesis of Brainerd et al. (2010), by affirming that the more children grow, the more use they make of their semantic knowledge, which leads to better recall of the actually presented words, but also to more semantic-based misremembrance.

False memory in bilinguals: A key to access bilingual word form and meaning representations in adults and children
The modality of access to language in bilinguals is affected by various features (e.g., proficiency, age of acquisition and language dominance; Kastenbaum, Bedore, Peña, Sheng, Mavis, Sebastian-Vaytadden, Rangamani, Vallila-Rohter & Kiran, 2019; Montrul & Foote, 2012; Puig-Mayenco, Cunnings, Bayram, Miller, Tubau & Rothman, 2018) and it is still the subject of a wide debate. False memory can provide insights related to semantic processing in bilinguals and to the way in which they process different categories of stimuli, in comparison with monolinguals. The ease with which a word can be retrieved from memory depends on the goodness of fit between the signal and the stored representation, the memory strength of a word, and the number of words that partially match the speech signal and, as a result, compete for selection with the target word. This process is effortless for monolingual speakers, but may be more challenging for second language (L2) and bilingual speakers (Schmidtke, 2014). The analysis of this effort can shed light on some currently active debates regarding bilingual word form and meaning representations, both when we consider adults and children.
Van Heuven, Dijkstra and Grainger (1998) started from the theoretical framework of an implemented version of the Bilingual Interactive Activation (BIA) model to account for a series of results, obtained through progressive demasking and lexical decision tasks, in order to examine how Dutch-English bilinguals recognized target words belonging to one language and how this processing was affected by orthographic neighbourhood from the same language used in the task or from the other language of participants. Their results provided evidence for parallel activation of words in an integrated Dutch-English lexicon. Dong et al. (2005) proposed that the mental lexicon in bilinguals was represented through a shared and distributed model. They conducted two experiments to demonstrate their view. The first one was a semantic priming paradigm and led to the demonstration of the sharing of relations between concepts across translation equivalents. The second aimed at examining the details of meaning separation, and its results showed that bilinguals have the tendency to integrate meanings through translation equivalents, but the L1 words are preferably represented in the L1 conceptual system, while the L2 words tend to be maintained in the L2 conceptual one.
The organization of the mental lexicon in bilinguals and the way they store meanings has also been extensively studied in children. Kajcsa (2010) assessed it on a bilingual sample of children attending kindergarten through a picture-naming task. Children of the age of 3-4 years appear to be able to access their mental lexicon in a sophisticated way and they show the same pattern found in an adult population, with the only different tendency being for children to preferentially process nouns. Furthermore, bilingual children rely on the mental lexicon of both languages, linked to the same conceptual system.
Although today most literature seems to argue for the idea of an integrated semantic system for bilinguals' representation of meanings, this topic still occupies many researchers (e.g., Gathercole, 2020).
We think that false memory can help to better understand the above-mentioned aspects. Graves and Altarriba (2014) reviewed some of the main results related to false memories for verbal material in bilinguals. Anastasi, Rhodes, Marquez and Velino (2005) tested false memories for words in monolingual English and bilingual participants and found that the bilinguals showed a reduction in false alarms in comparison with the English monolinguals for English words, but did not show a significant false memory reduction for their native language; Kawasaki-Miyaji et al. found that a larger amount of false recognition occurred in Japanese (the participants' dominant language) than in English, but false memories were produced in both languages (Graves & Altarriba, 2014). In general, it was found that false memories occurred in the between-language conditions and this result is consistent with views that assume the existence of separate lexical representations for a word in each language, but a common conceptual-level representation for both languages.
Marmolejo, Diliberto-Macaluso and Altarriba (2009) presented Spanish-English bilinguals with 10 lists of words, aiming to examine the impact of switching language on false memory. Results showed a higher proportion of veridical and false recall in English, the dominant L2 language, than in Spanish, which was the participants' L1. Howe, Gagnon and Thouas (2008) examined false memories for words in bilingual children, within and between languages, through a recall task and through a recognition task; the children were aged from 6 to 11 years old. The authors found an increasing production of false memories as age increased, consistent with the previously described hypothesis of a larger gist elaboration during growth, as found in monolinguals. Furthermore, children of all the age groups exhibited a better recall of truly presented words in the within-language condition; as regards false recall, younger children showed fewer false memories in the between-language condition, while the older children and the adults showed the opposite pattern. Finally, as regards the recognition task, all the age groups showed a larger number of false memories in the between-language condition. As Riesthuis, Otgaar and Wang (2019) concluded, the results found in the literature regarding false memories in bilinguals' L2 are rather controversial and there is still a need to assess to what extent false and true memories are encoded in a similar way by bilinguals and by monolinguals.
Bilingualism: Language and Cognition

Deese-Roediger-McDermott (DRM) Paradigm
The DRM paradigm is the task that is most commonly used to measure false memory. Its name is an acronym of its three authors: it was first developed by Deese (1959) and then implemented by Roediger and McDermott (1995). Its most famous version consists of lists of words, all semantically related to a lure that represents the semantic nucleus or theme of each list and that is not presented. The task is divided into two different moments: at first, participants are exposed to a coding phase, during which they have to memorize as many words as possible from the lists. Secondly, usually after a period of latency, there is the experimental phase, when they have to remember the words they have previously memorized. There are two variants of the experimental phase: subjects can be required to simply recall the words they remember, or they may have to recognize them among other, not presented words, among which there are also the critical lures related to every presented list (i.e., word recognition, defined as a "complex process that requires the encoding of a signal and subsequent mapping of this information to representations in memory", Schmidtke, 2014). In both variants of the task people are expected to falsely recall or recognize the lures as well, on the basis of the spread of semantic activation that arises when the words are coded and that reinforces the overdetermined concept that ties them. Brainerd, Yang, Reyna, Howe, and Mills (2008b) assessed the semantic properties of the DRM lists, through the examination of 16 dimensions (i.e., familiarity, meaningfulness, concreteness, imagery, categorizability, number of attributes, pleasantness, arousal, dominance, valence, synonymy, antonymy, taxonomy, entity relations, introspective relations, and situational relations) and found through a factorial analysis that they are a valid index of semantics, extremely rich in meaning.
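The recognition variant described above can be illustrated with a minimal scoring sketch. It is not the scoring procedure of any cited study: the function name `score_recognition`, the item sets and the example words are hypothetical, and the only assumption is that each test word receives an "old"/"new" judgment.

```python
def score_recognition(judged_old, studied, critical_lures, unrelated):
    """Return the proportion of 'old' judgments for each item type.

    judged_old: set of test words the participant marked as already seen.
    studied, critical_lures, unrelated: lists of test words by type.
    """
    def rate(items):
        # Proportion of items of this type that were called "old".
        return sum(1 for word in items if word in judged_old) / len(items)

    return {
        "hit_rate": rate(studied),                    # true recognition
        "lure_false_recognition": rate(critical_lures),  # false memory index
        "unrelated_false_alarms": rate(unrelated),    # baseline guessing
    }


# Hypothetical test for one "hospital" list.
studied = ["doctor", "syringe", "bed", "gown"]
critical_lures = ["hospital"]      # never presented at study
unrelated = ["frog", "apple"]      # never presented, unrelated to the list
judged_old = {"doctor", "syringe", "bed", "hospital"}

scores = score_recognition(judged_old, studied, critical_lures, unrelated)
print(scores)
```

Keeping the unrelated false alarms separate from the lure rate makes it possible to tell semantically driven false recognition apart from a general tendency to answer "old".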

False memory as a semantic index
On the one hand, as we already mentioned, false memory can be understood as an inaccuracy and a sign of the fallibility of our mnestic system. On the other hand, however, it has been shown that it arises as a result of semantic-based processing of the information.
Semantics is the study of meanings in natural or artificial languages. In psychology it concerns "how human users of language come to be able to understand what utterances in a language mean. Necessarily, it is also concerned with the question of how the meanings of words are represented in the mind" (Sanford, 2006). Semantics mediates our ability to understand the relationship between things, as well as to analyze and categorize the world around us, giving our knowledge an order; the desire to better understand it has driven a large part of research and has opened an interdisciplinary debate, which involves various fields, from psychology to philosophy, linguistics and others.
Fuzzy Trace Theory (Reyna & Brainerd, 1995) accounts for false memory as a semantic index, postulating that recollection occurs on the basis of two different and independent forms of processing: VERBATIM retrieval concerns the elaboration of the surface characteristics of a trace to be memorized (e.g., the colour of a stimulus, its position among other elements); conversely, GIST retrieval refers to the extrapolation of the meaning conveyed by what is being processed: namely, its semantics. Both the verbatim and the gist form of elaboration act in the formation of false memory: while the former serves to select the actually presented material and reject the critical lures, the latter leads to processing the information on the basis of coherence and semantic similarity, and it acts in the unfolding of false memory, by promoting acceptance of the non-presented but associated critical lures. These two processes act in opposite but complementary ways, respectively explaining the level of accuracy and the level of false memory (Reyna, Corbin, Weldon & Brainerd, 2016).
Another theory that accounts for false memory production in terms of semantic processing is the associative activation theory (AAT; Howe, Wimmer, Gagnon & Plumpton, 2009): it arises from the spreading activation models and it postulates that the processing of one concept causes activation to spread to the related concept nodes in one's knowledge base. This activation leads to the awakening of other related concepts, and the "incorrectly" activated ones may later be falsely remembered as having already been seen or heard, which may account for false memory production. The stronger the relations between concepts and the faster activation spreads through one's knowledge base, the more likely false memory is to occur (Otgaar, Howe, Muris & Merckelbach, 2018).

Emotion boosts semantic processing
Both memory and the elements to be memorized are never totally aseptic. On the contrary, they are inseparable from the emotional component and imbued with it. This aspect is part of the very concept of semantics and in linguistics it is called "connotation": with this term we refer to the profound meaning of a term, imbued with beliefs and emotions (Riemer, 2010). Zhang, Gross and Hayne (2016) underlined that emotion is one of the factors that acts on the quality of our memories: the higher emotional connotation of a word represents a preferential channel in terms of vocabulary processing and it is able to boost the semantic activation. It has been shown that the degree of emotionality of words acts not only on true recollection, but also on memory distortion, being able to elicit a larger amount of false memories (Kaplan, Van Damme, Levine & Loftus, 2015). This fact goes in the direction of an enhanced gist-based processing for emotional words, which responds to a greater activation and urgency, the same found outside the laboratory, in a context of "everyday life", in which the "emotional information receives preferential processing, which facilitates adaptive strategies for survival" (Lee, Greening & Mather, 2015).
500 Martina Cangelosi et al.
What primarily acts on memory is the arousal of the elicited emotions: indeed, from a phylogenetic point of view, a high arousal event is more likely to be related to personal survival factors or reproductive success, compared to neutral stimuli, and it has a highly adaptive value. Not only the arousal, but also the valence associated with a stimulus acts on its recollection. In particular, it seems that experiences with a negative valence are remembered with a greater number of details compared to neutral and positive valence memories (Zhang et al., 2016).

Emotional words processing in bilinguals
Memorization of emotive words in bilinguals was first assessed by Anooshian and Hertel (1994), who compared recall of emotional and neutral words through a rating task, in both a first and a second language. These two authors hypothesized that the emotional words in a second language lack the emotional connotation and, consequently, are better recalled, acting similarly to the neutral ones. They tested a group of Spanish-English bilinguals and confirmed their hypothesis of a less emotionally charged lexicon in L2. Ayçiçeği and Harris (2004) started from the methodology of Anooshian and Hertel to demonstrate that, on the contrary, emotion also plays a role in the bilingual L2. In particular, the authors revised the materials used by Anooshian and Hertel, noticing that they only considered the arousal of the words (emotional vs neutral). Ayçiçeği and Harris implemented their set of emotional words considering both valence (positive vs negative) and arousal (emotional vs neutral). From their results, obtained by administering a recall and a recognition task to native Turkish bilinguals speaking English as L2, it appears that emotion is able to elicit an enhancement of memorization. They found this same pattern both in L1 and in L2; in particular, they found that negative words were better remembered than positive ones, supporting the proposal that bilingual speakers elaborate negative stimuli better than positive ones in their L2, which led to smaller differences between the two languages.
A similar result has been confirmed by Ferré, García, Fraga, Sánchez-Casas and Molero (2010), who administered a set of positive, negative and neutral stimuli to different groups of bilinguals. From their study it appeared that emotional words were better recalled than neutral ones and that valence and arousal act in interaction, playing a mutual influence on memory results. This pattern was found in the bilinguals' first and second language and regardless of language dominance, context of acquisition, age of acquisition and similarity between languages. In a further and similar study conducted in 2017, they demonstrated that language status plays an important role in emotional word processing. Chen, Lin, Chen, Lu and Guo (2015) also examined the elaboration of the emotional lexicon in both the languages spoken by Chinese-English bilinguals and found that in L1 it would be processed more rapidly than the neutral one, while in L2 the emotional word processing would rely more on semantics. Brase and Mani (2017) assessed the effects of learning context on the acquisition and processing of emotional words in bilinguals. From their results, they found that the context of acquisition varied depending on the language status of the L1 versus the L2 and on the requirement of the tasks. Their results have been replicated by Grabovac and Pléh (2014).

Emotional DRM lists
The DRM paradigm allows us to understand the influence that the emotional content of the lexicon has on its memorization, in both an adult and a child population (e.g., Brainerd et al., 2008b; Brainerd, Yang, Toglia, Reyna & Stahl, 2008c; Howe, 2007; Quas, Rush, Yim, Edelstein, Otgaar & Smeets, 2015; Baugerud, Howe, Magnussen & Melinder, 2016). Brainerd et al. (2008c) found that previously used DRM lists differed in both the arousal and the valence dimensions. On the basis of this finding, the authors validated the Cornell/Cortland Emotional Lists (CEL, Brainerd et al., 2008c). This is a pool of 32 lists, divided into 4 subsets: high arousal and negative valence lists, low arousal and negative valence lists, high arousal and positive valence lists, low arousal and positive valence lists. The administration of lists belonging to each of the four groups enables us to analyze the effect of arousal and valence both separately and interactively.
What emerges from this and other studies with the CEL lists (e.g., Brainerd, Stein, Silveira, Rohenkohl & Reyna, 2008a; Dehon, Larøi & Van der Linden, 2010; Brainerd et al., 2010; Howe, 2007; El Sharkawy, Groth, Vetter, Beraldi & Fast, 2008; and Howe, Candel, Otgaar, Malone & Wimmer, 2010) is that, on the one hand, high arousal increases the memorization of target words (the actually presented ones), while on the other hand it increases the number of false memories. As regards valence, negative valence generally has a greater impact on the memorization process (Brainerd et al., 2008a, 2010; Dehon et al., 2010; El Sharkawy et al., 2008; Howe, 2007; Howe et al., 2010), although a portion of these results could be influenced by the nature of the proposed task (recall or recognition task) (Dehon et al., 2010; El Sharkawy et al., 2008; Howe, 2007; Howe et al., 2010). To return to our example, it would be one thing to present Mariana with a list of words that are related to a concept that is emotionally relevant to her, such as "hospital". It would be another thing to present her with a list of words all semantically associated with the more neutral concept of "green" (e.g., frog, apple, salad, grass): she is expected to be more strongly activated by the words related to the hospital sphere and, therefore, to remember them better. However, this activation could be so strong as to make her more likely to misremember the lure "hospital" than the more neutral lure "green".

Overview
On the one hand, nearly all of the research that studied false memory in a bilingual population of children used standard DRM lists, without considering the emotional component, which also appears to be neglected in research on general processing of the emotional lexicon during childhood. The emotional content of the words could be a key to assess semantic processing, since the emotionality of a word activates the semantic network more strongly and consequently promotes gist-based processing, which would be responsible for false memory.
On the other hand, all of the experiments that intended to study the role of emotional word processing in bilinguals focused on correct memorization and not on memory distortions.
In light of these gaps we found in the literature, we would like to investigate whether the connotation of a word in a bilingual second language acts on semantic and mnestic processing similarly to how it would for monolingual peers in their native languages. For this purpose, we employed an emotional version of the DRM paradigm to test bilingual and monolingual children.

Aim of the study
The aim of the present study is to examine semantic processing of emotional words (divided according to their arousal and valence) in bilingual children speaking a minority language, as measured in their L2. On the basis of the literature, we can postulate that children of the age of our general sample (Mean age = 10.12) would have an emotional semantic processing pattern similar to that of adults: this means we expect children to produce more false memories for highly emotional words. Our explorative scope is to verify whether bilingual children show the same pattern as their monolingual peers, with a false memory enhancement for more emotionally charged lures. We can argue that this result would imply that bilingual semantics is as developed as that of monolinguals.
As regards accuracy rates in the recognition task, we expect similar performances between the two groups, in line with literature showing that bilingualism does not impact on memory performance per se (e.g., Bonifacci, Giombini, Bellocchi & Contento, 2011; Bonifacci, Lombardo, Pedrinazzi, Terracina & Palladino, 2020; Engel de Abreu, 2011; Feng, 2008). Finally, we would like to consider some control measures that we think could act on connotative semantic performance.

Participants
A sample of 129 children from two school levels, IV and VI grades, took part in the study (see Table 1). It was recruited at a comprehensive Institute of Pavia, a town in Northern Italy. Teachers report a high degree of bilingualism and a massive presence of immigrant children in all the classes of the Institute, which covers schooling from kindergarten to first grade secondary school.
Seven children were excluded for learning disabilities (diagnosed by the Public Health Service) or for scoring more than two standard deviations below the expected mean in the vocabulary and reading test: for this purpose, we administered a word reading task and a non-word reading task (DDE-2, 1995), where children had to read aloud, respectively, lists of content words and lists of non-words that sound similar to Italian words; fluency and accuracy were taken into account. Fluency was assessed according to the reading time of each list, measured in syllables per second: the final fluency score consists of the sum of the three word list scores and, separately, of the three non-word list scores. Accuracy consisted of the total number of errors made by the children (each word read wrongly can contain at most one error; in light of this, each word can only be read totally correctly or totally wrongly): errors regard pronunciation of each word and non-word.
Thus, 122 children have been included in the analyses.
Nearly half of the children were bilingual (n = 56), speaking a minority language (30% of the sample speaking Arabic, 10% speaking Chinese, 27% speaking South American Spanish, 7% speaking Albanian, 5% speaking Romanian, 3% speaking Polish, 4% speaking Brazilian Portuguese, 1% speaking Turkish and 13% speaking other dialects) in their family context as L1 and attending an Italian-language school, so that they were fluent in Italian, their L2. They had not received previous schooling in a language different from Italian. 11 children were born in a country different from Italy (19.6% of the bilingual sample): the remaining 80.4% was made up of second generation immigrant children, attending all their schooling in Italy, from kindergarten to date; as regards the first generation immigrant children, they all started primary school in Italy. The whole bilingual sample was made up of balanced bilinguals, highly proficient in Italian, in accordance with the average expected for their age, as demonstrated by vocabulary, comprehension and reading scores. Furthermore, we administered two subscales of the Italian version of the parental questionnaire ALDeQ (Bonifacci, Mari, Gabbianelli, Ferraguti, Montanari, Burani & Porrelli, 2016): Scale A, "Early milestones", aimed at collecting information regarding the early developmental steps, such as first words/sentences, whereas Scale B, "Child's current L1 and L2 abilities", aimed at assessing the child's level in each of the two languages spoken. The maximum score that can be obtained in each of the two scales is 18 points. Bilinguals scored 13.51 on Scale A and 8.56 on Scale B.
The monolingual group was made up of children with Italian as L1, with comparable age and schooling (mean age of the monolingual group: 10.20; mean age of the bilingual group: 10.04; p = 0.394). Minority languages are the ones spoken by the minority of the population of a territory and by those with lower socioeconomic status; a broad debate on the relation between bilingual cognition and socioeconomic status (SES; Morton & Harper, 2007; Nair, Biedermann & Nickels, 2017) stressed the necessity to always take into account the role of SES when speaking about bilingualism and considering the associated distinction of minority vs. majority languages.
In light of this, we referred to the Hollingshead questionnaire Four factor Index of Social Status (1975) in order to calculate the socioeconomic status (SES) of each child, intended as an average of the educational level + profession of the mother and the educational level + profession of the father.We referred to the index and values proposed by the author: it was significantly different between the two groups, as can be seen in Table 2. Therefore, SES was treated as a covariate in the statistical analyses.
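As a rough illustration of the SES computation just described, the sketch below averages a per-parent score built from education and occupation. The weights (3 for education, 5 for occupation) and the scale ranges (education 1 to 7, occupation 1 to 9) follow the commonly cited Hollingshead scoring and are an assumption here, since the text does not state them; the function names and the example values are hypothetical.

```python
EDUCATION_WEIGHT = 3   # assumed weight, not stated in the text
OCCUPATION_WEIGHT = 5  # assumed weight, not stated in the text


def parent_status(education, occupation):
    """Status score of one parent from education (1-7) and occupation (1-9) codes."""
    return EDUCATION_WEIGHT * education + OCCUPATION_WEIGHT * occupation


def family_ses(parents):
    """Average the status scores of the parents for whom codes are available.

    parents: list of (education, occupation) tuples, e.g. mother and father.
    """
    scores = [parent_status(edu, occ) for edu, occ in parents]
    return sum(scores) / len(scores)


# Hypothetical child: mother (education 4, occupation 5), father (education 6, occupation 7).
ses = family_ses([(4, 5), (6, 7)])
print(ses)
```

Averaging over the available parents, rather than summing, keeps single-parent and two-parent households on the same scale.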

Vocabulary
In order to assess receptive vocabulary skills, we proposed a verbal meaning task (PMA, 1957) in which participants had to correctly mark the synonym of a given word, choosing between four alternatives. They had 8 minutes to complete the task, composed of 30 trials, as in its original version; scoring consists in counting the number of correct responses. We administered a fluency task (BVN 5-11, 2005) to assess productive vocabulary: children had to name as many words as possible belonging to a given category (i.e., "fruits": the child had to name as many fruits as possible); they had one minute for each of four categories, as in the test manual. We chose to use this task instead of other similar tasks like the Peabody Picture Vocabulary Test (PPVT-III, Dunn & Dunn, 1997) in order to avoid an excess of overlap (collinearity) between our main semantic DRM measure and the vocabulary ones: in fact, PPVT is given as a semantic measure (e.g., Catani, Mesulam, Jakobsen, Malik, Martersteck, Wieneke, Thompson, Thiebaut De Schotten, Dell'Acqua, Weintraub & Rogalski, 2013), while categorical fluency is more often treated as a measure that lies on the edge between productive vocabulary and executive functioning (e.g., Varvara, Varuzza, Sorrentino, Vicari & Menghini, 2014).

Working Memory
Direct and Reverse Span (BVN 5-11, 2005) were administered to assess working memory: in the former task the experimenter read aloud blocks composed of increasing sequences of numbers; at the end of each sequence the participant had to repeat the numbers in the same order. The latter task was similar to the previous one, but at the end of the sequence, the participant had to repeat the numbers in the reverse order. The tasks ended when participants failed to recall two sequences of the same block. We chose an oral working memory task to measure this component as detached from reading and writing skills.
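The stopping rule of the span tasks can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how many sequences a block contains or how the final score is computed, so the sketch assumes two sequences per block and takes the span to be the longest sequence repeated correctly. The function `span_score`, the digit sequences and the simulated responders are hypothetical.

```python
def span_score(blocks, respond, reverse=False):
    """Administer blocks of digit sequences and return the longest length passed.

    blocks: list of blocks, each a list of digit sequences of equal length.
    respond: callable that takes a sequence and returns the participant's answer.
    reverse: if True, the correct answer is the sequence in reverse order.
    """
    best = 0
    for block in blocks:
        failures = 0
        for sequence in block:
            expected = sequence[::-1] if reverse else sequence
            if respond(sequence) == expected:
                best = max(best, len(sequence))
            else:
                failures += 1
        # Stopping rule from the text: two failed sequences in the same block.
        if failures >= 2:
            break
    return best


# Hypothetical material: two sequences per block, lengths 2 to 5.
blocks = [
    [[3, 8], [6, 1]],
    [[5, 2, 9], [4, 7, 1]],
    [[8, 3, 6, 2], [9, 1, 5, 7]],
    [[2, 7, 4, 9, 1], [6, 3, 8, 5, 2]],
]

# Simulated participants: correct up to a fixed length, empty answer beyond it.
forward_span = span_score(blocks, lambda s: list(s) if len(s) <= 3 else [])
reverse_span = span_score(blocks, lambda s: s[::-1] if len(s) <= 2 else [], reverse=True)
print(forward_span, reverse_span)
```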

Experimental Task
As regards the main task, participants were exposed to 12 emotional DRM lists, presented in Italian, L2 for the bilinguals and L1 for the monolinguals (see Supplementary Materials, Table S1). The materials were adapted and validated from the original CEL lists (Brainerd et al., 2008c), following the same testing procedure used by the Authors on a sample of comparable age. In order to adapt and validate the Cornell/Cortland Emotional Lists (CEL, 2008) to Italian, we conducted a pilot study on 25 monolingual children (Mean age = 9.8), different from the ones who took part in our main study. Arousal and valence were assessed on a dichotomous scale (respectively, high/low and negative/positive), and backward associative strength (BAS) on a 7-point Likert scale (from 1 = not associated at all to 7 = very associated). We checked the frequencies of the included words against the CoLFIS Lexical Database (Bertinetto, Burani, Laudanna, Marconi, Ratti, Rolando & Thornton, 2005); in addition, we asked children to rate their familiarity with the words, from 0 = not at all to 4 = very well. We included only words that the children rated from 2 to 4 for familiarity and that were of medium to high frequency. We compared our results to the original ones and found a correspondence of 82% as regards valence and of 62% as regards arousal. We reassigned the lists to the four Arousal * Valence conditions on the basis of the pilot results.
Twelve lists were chosen from the 32 Cornell/Cortland Emotional Lists (CEL, Brainerd et al., 2008c) adapted to Italian. Every list presented in the coding phase was made up of 10 of the 15 original words, ordered from the one with the strongest BAS to the one with the weakest BAS (Backward Associative Strength, which is the degree of association of each word of a list with its corresponding lure, as measured on a 7-point Likert scale). The 5 words with the lowest BAS values, out of the 15 words of each list, were thus eliminated. The 12 lists comprised 3 from each of the four Arousal * Valence combinations: High Arousal * Negative Valence, Low Arousal * Negative Valence, High Arousal * Positive Valence, Low Arousal * Positive Valence.
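The list-trimming step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `trim_list_by_bas`, the word labels and the BAS ratings are hypothetical and are not taken from the study materials.

```python
def trim_list_by_bas(words_with_bas, keep=10):
    """Return the `keep` words with the highest BAS, strongest first.

    `words_with_bas` is a sequence of (word, bas) pairs, where `bas` is the
    mean rating on the 1-7 scale. Names and data format are hypothetical.
    """
    ranked = sorted(words_with_bas, key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:keep]]


# Hypothetical 15-word list with made-up BAS ratings between 1.4 and 7.0.
example = [(f"word{i:02d}", 7 - i * 0.4) for i in range(15)]
study_list = trim_list_by_bas(example)
print(len(study_list), study_list[0], study_list[-1])
```

With these made-up ratings, the five weakest associates are dropped and the remaining ten are presented from the strongest to the weakest.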
A recognition task was administered for each Arousal * Valence combination. Each was made up of 21 words: 9 target words (3 from each presented list), the 3 non-presented critical lures corresponding to the presented lists, 1 related distractor for each presented list (taken from the 12th position of the 15-word lists) and 6 unrelated distractors, two from each of three unpresented lists belonging to the same Arousal * Valence combination.
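As a check on the composition just described (9 + 3 + 3 + 6 = 21 items), the following Python sketch assembles one recognition test. The data layout, the helper names and the choice of which three studied words serve as targets (`target_positions`) are assumptions made for illustration; the text does not specify which positions were sampled, nor which two words were taken from each unpresented list.

```python
from collections import Counter


def build_recognition_test(presented, unpresented, target_positions=(0, 4, 7)):
    """Assemble the recognition items for one Arousal * Valence combination.

    Each list is a dict with a "lure" and 15 "words" ordered by BAS.
    `target_positions` (0-based, within the 10 studied words) is hypothetical.
    """
    items = []
    for lst in presented:
        items += [(lst["words"][p], "target") for p in target_positions]
        items.append((lst["lure"], "critical_lure"))
        # 12th word of the 15-word list: related to the lure but never studied.
        items.append((lst["words"][11], "related_distractor"))
    for lst in unpresented:
        # Taking the first two words is an arbitrary, illustrative choice.
        items += [(w, "unrelated_distractor") for w in lst["words"][:2]]
    return items


def make_list(tag):
    """Build a placeholder 15-word list; real lists come from the adapted CEL."""
    return {"lure": f"{tag}_lure", "words": [f"{tag}_w{i}" for i in range(1, 16)]}


test_items = build_recognition_test(
    [make_list(t) for t in "ABC"], [make_list(t) for t in "DEF"]
)
counts = Counter(kind for _, kind in test_items)
print(len(test_items), dict(counts))
```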

Procedure
Participants took part in two meetings, the first collective and the second individual, both taking place inside the school during the time scheduled for educational activities. The individual meetings lasted about 45 minutes per participant. The collective meeting lasted about 20 minutes, involved an entire class at a time, and was devoted to the administration of the vocabulary test: the activity was carried out following the methods and times established by the standardized battery. The individual meeting was in turn divided into two parts: during the first part the children carried out the working memory test, the reading test and the categorical fluency test in paper form, following the methods and times established by the respective standardized batteries. Breaks were scheduled between one task and the next in order to avoid fatigue. In the second part, the DRM task was presented on a MacBook Air laptop computer with a 13-inch screen, and it was run on OpenSesame 2.0 (Mathôt, Schreij & Theeuwes, 2011) in both visual and auditory modality. Stimuli were written in the font "mono", 46 px, at a resolution of 1024 px × 768 px, with a mean visual angle of approximately 70°. The audio files were in .wav format, created with the text-to-speech program TextAloud 3.0.108 (https://textaloud.it.uptodown.com/windows). The procedure was the same as that adopted and described by Brainerd et al. (2010): each child was instructed to focus on the screen and try to memorize as many of the presented words as possible. Twelve CEL lists adapted to Italian were run. There were four coding sessions, one for each Arousal * Valence combination: after the first coding session there was a 30-sec interval in which children were involved in an easy mathematical task. Following the math distractor task, the first recognition task was presented, in which the child had to press the space bar every time s/he thought s/he recognized the word as one of those previously memorized. This procedure was repeated for each of the four sessions, from the coding to the recognition phase: between one block and the next there were breaks of two minutes during which the experimenters asked the children short questions to assess their understanding and asked them for feedback regarding perceived difficulty and fatigue.
The order of the two testing phases (the collective and the individual one) was randomized between classes, while the order of the two individual parts (the paper one and the DRM task) was randomized between participants, as was the order of the DRM sessions.
We constituted an overall Vocabulary index from the PMA and the Categorical Fluency Task, and an overall Working memory index from the Direct and Reverse Span Tasks, as Lecce, Zocchi, Pagnin, Palladino & Taumoepeau (2010) did, in light of the significant correlation within each pair of variables (PMA and fluency task: r = .39, p < .001; Direct Span and Reverse Span: r = .36, p < .001) and in order to avoid multicollinearity in our model.
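The text does not state how the two composite indices (Vocabulary from PMA and fluency, Working memory from Direct and Reverse Span) were computed. One common choice, shown here purely as an assumption, is to standardize each measure across children and average the resulting z-scores. The function names and the scores below are hypothetical.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize raw scores across children (sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_index(*measures):
    """Average the z-scored measures child by child.

    Assumption: equal weighting of the component tests; the weighting
    actually used in the study is not reported in this section.
    """
    standardized = [zscores(m) for m in measures]
    return [mean(child) for child in zip(*standardized)]


# Hypothetical raw scores for four children.
pma = [12, 18, 25, 20]
fluency = [30, 41, 52, 45]
vocabulary_index = composite_index(pma, fluency)
print([round(v, 2) for v in vocabulary_index])
```

Averaging z-scores keeps the two tests on a common scale, so that neither dominates the index simply because of its raw score range.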

Signal detection index
Signal detection theory is widely used in the literature on the DRM paradigm (Rotello, 2017): d' is the index used to separate memory performance from noise (that is, response bias). Our 'hits' were the correctly recognized words, while our 'false alarms' were the falsely recognized critical lures. The d' indexes obtained (z(H) − z(FA)) were the ones considered in the analyses, so as to work on a clean measure (Fischer & Milfont, 2010), where z(H) was the normalized number of Hits, which in our case were the correctly recognized words (accuracy index), and z(FA) was the normalized number of False Alarms, which in our case corresponded to the number of false memories. Here we report the d' measures for the Total Sample (Mean d' = −2.72e-16, i.e., approximately 0; SD = 1.007), for the Bilingual Sample (Mean d' = −0.026, SD = 1.0136) and for the Monolingual Sample (Mean d' = 0.021, SD = 1.004): t tests run on these values showed that the means are comparable (difference between the monolingual and the bilingual d' means: p = .614), as will be discussed in the following sections.
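The index z(H) − z(FA) can be read in two ways, and the Python sketch below shows both without claiming to reproduce the authors' exact computation. `d_prime_rates` is the textbook signal detection form, which applies the inverse normal CDF to hit and false-alarm rates. `d_prime_standardized` follows the wording "normalized number" literally, z-scoring counts across participants before subtracting; this reading would be consistent with the near-zero sample mean reported above, but that is an inference. All names and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def d_prime_rates(hit_rate, fa_rate):
    """Textbook d': inverse-normal transform of the two rates, then subtract.

    Rates of exactly 0 or 1 must be corrected beforehand, because the
    inverse normal CDF is undefined at those values.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def d_prime_standardized(hits, false_alarms):
    """Alternative reading: z-score counts across participants, then subtract."""
    def zs(xs):
        m, s = mean(xs), stdev(xs)
        return [(x - m) / s for x in xs]

    return [h - f for h, f in zip(zs(hits), zs(false_alarms))]


# Hypothetical participant: 90% hits on targets, 10% false alarms on lures.
print(round(d_prime_rates(0.9, 0.1), 3))

# Hypothetical counts for three participants.
print(d_prime_standardized([7, 8, 9], [3, 2, 1]))
```

Under either reading, a lower d' for a given condition reflects relatively more false recognition of critical lures compared with correct recognition of studied words.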

Results
Statistical analyses are reported below. The core of the research was to examine semantic processing of emotional words (divided according to their valence and arousal) in minority language bilingual children, compared with their monolingual peers. For this purpose, we ran a Linear Mixed Model (LMM) in RStudio 1.0.153 (RStudio Team, 2020); an LMM allows random effects to be included in addition to the fixed effects, in order to account for individual variability in the parameter estimates. As can be observed in Table 3, our dependent variable was d', while bilingualism, valence and arousal constituted our fixed effects; SES, vocabulary and working memory were used as covariates, in order to take into account a possible role of these variables on d' values.
Subjects and words were our random intercepts (see Table 4).
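For readers who prefer a formula, the model described above can be written schematically as follows. This is a sketch under assumptions: the symbols are introduced here for illustration, and only the Valence * Arousal interaction reported in the Results is shown; the full set of terms actually entered is the one given in Table 3.

```latex
d'_{ij} = \beta_0 + \beta_1\,\mathrm{Bilingualism}_i + \beta_2\,\mathrm{Valence}_j
        + \beta_3\,\mathrm{Arousal}_j + \beta_4\,(\mathrm{Valence}_j \times \mathrm{Arousal}_j)
        + \beta_5\,\mathrm{SES}_i + \beta_6\,\mathrm{Vocabulary}_i + \beta_7\,\mathrm{WM}_i
        + u_i + w_j + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
       w_j \sim \mathcal{N}(0, \sigma_w^2),\quad
       \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here i indexes subjects and j indexes words, u_i and w_j are the two random intercepts, and the β coefficients are the fixed effects and covariates.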
First, we report the results regarding the emotional semantic processing pattern in the general sample of both bilinguals and monolinguals. As postulated, arousal had a statistically significant main effect on d', F (1, 114.18) = 40.547, p < .001: high arousal lures (Mean = −0.274) were more prone to be recognized as already seen than low arousal lures (Mean = 0.263).
Valence did not show a statistically significant main effect, F (1, 113.96) = 0.194, p = .884. However, the Valence * Arousal interaction was statistically significant, F (1, 114.19) = 13.250, p < .001, as can be observed in Fig. 1: false memory production was affected by the Arousal * Valence interaction. To disentangle the interaction effect, we tested differences by using post-hoc multiple comparisons with Tukey HSD corrected p-values. In the low arousal condition, positive valence stimuli (Mean = 0.433) showed significantly higher d' than negative valence stimuli (Mean = 0.093; t(222) = −2.758, p = .006), while in the high arousal condition there was a difference approaching significance in the opposite direction (Mean pos = −0.3899, Mean neg = −0.1575; t(222) = 1.887, p = .061), with positive stimuli showing lower d' than negative stimuli.
As regards the accuracy rates, bilingualism, as expected, did not significantly affect accuracy scores (p = .549).
As regards the role played by the control measures, in the Tables below it is possible to observe their mean values and the correlations between variables (see Tables 5 and 6).
In more detail, socioeconomic status significantly affected d' mean values, F (1, 111.02) = 14.753, p < .001. This implies that participants with higher SES had higher accuracy rates in the DRM task (see Figure 2).
Vocabulary did not show significant effects on d' values, F (1, 111.02) = 0.063, p = .803. Working memory showed a trend toward statistical significance, F (1, 111.02) = 3.493, p = .064: high scores in the two memory tasks marginally increased the accuracy rates in the DRM task (see Supplementary Materials, Table S2 for the analysis conducted separately on the two groups).

Discussion
In this study we focused on how semantics develops in the second language when it is spoken by minority language bilingual children, who learn it mainly within school and the wider social context. We aimed in particular to understand how the emotional connotation of the lexicon (its positive or negative valence and its activation, or arousal) affects the quality of its memory, by investigating the production of false memories: indeed, false memory has proved to be a valid index of semantic activation, as it is the result of a gist-based elaboration of the proposed material (Brainerd et al., 2008b).
We meant to explore a domain that appears to be a gap in the literature: some authors (e.g., Ayçiçeği & Harris, 2009; see also Eilola, Jelena & Sharma, 2007 for contrasting data) examined bilingual memory processing of emotionally charged verbal materials focusing on true recall or recognition of actually presented words, while the authors who examined bilingual false memories did not consider the role played by the emotional connotation.
Going through our aims, first of all we meant to replicate the results found in the literature on monolinguals, hypothesizing that semantic false memory would be larger for more emotional lures than for less emotionally charged ones (e.g., Brainerd et al., 2008c; Dehon et al., 2010; Brainerd et al., 2010; Howe, 2007; El Sharkawy et al., 2008 and Howe et al., 2010). Our results showed that the arousal dimension significantly affected the production of false memories: high arousal words were easily recognized and generated a greater number of false memories, which is consistent with the aforementioned literature.
On the other hand, valence did not significantly affect semantic elaboration, but the Valence * Arousal interaction did: these two emotional dimensions certainly cannot be considered as a single variable that acts indiscriminately on semantics and, through the latter, on processing, but they seem to have a mutual and determinant influence on it.
Our explorative aim was to verify how the bilingual children elaborate emotionally charged words in their L2 compared to their monolingual peers. In particular, we meant to compare the two groups of children, relative to their pattern of false memories for lures divided according to their arousal and valence (as in Brainerd et al., 2010).
Bilingual children seemed to show the same pattern of results as regards how the emotional component of the elaborated lexicon acts on memory distortion. However, because the analysis was carried out on the global sample, it is difficult to draw firm inferences, and it is only possible to make hypotheses by observing a global pattern that goes beyond the "bilingualism" factor.
In particular, arousal significantly affected false memory production in the total group: higher arousal lures were more prone to be falsely recognized as already seen. We also tested the main effect of arousal on the bilingual group alone and found it to be significant (p < .001). This result does not seem to support the hypothesis according to which bilingual people might experience their L2 as less emotionally charged (Ayçiçeği & Harris, 2009) and, therefore, might be less prone to let the emotional component of the words distort the elaboration of the lexicon.
Furthermore, there were no differences as regards valence and the Valence * Arousal interaction: when considered separately in this group, too, valence did not significantly affect false memories, but its interaction with arousal did. These results lead us to consider the semantic ability of the minority language bilingual children as equally developed compared to that of their monolingual peers. In particular, it emerged that this dimension is developed enough to be affected by the connotational aspect of the presented lexicon, influencing memory performance and the elaboration of verbal material as much as it does in monolingual children.
We expected to find similar accuracy rates between monolingual and bilingual children in correct recognition, in line with the literature showing that bilingualism does not act on memory performance (Bonifacci et al., 2011; Engel de Abreu, 2011; Feng, 2008): indeed, our results showed no statistically significant differences between the two groups as regards the correct recognition of the actually presented words.
These results have been considered in light of our control variables, which proved to be intrinsically related with both the concept of semantics and of bilingualism.
In particular, it emerged that vocabulary skills did not affect memorization and correct performance in the recognition task. The correlation found between vocabulary and bilingualism was coherent with the literature (Bialystok, 2009), confirming that the bilingual children in this age group (Mean age of the monolingual group: 10.20; Mean age of the bilingual group: 10.04) seem to have less developed vocabulary knowledge than their monolingual peers when it is measured in isolation in each of their two spoken languages. These results lead us to consider semantic development as detached from the vocabulary component, going in the direction of the literature that considers the bilingual semantic capacity as language-independent and as a shared system between the two languages spoken (e.g., Dong et al., 2005; Francis, 2005; Navracsics, 2002; van Heuven et al., 1998).
Working memory proved to have a weight in the production of false memories: better results in the working memory tasks seem to be associated with better accuracy rates in the DRM recognition phases. Although for the combined index this is only a trend towards significance, the result is interesting because it provides evidence regarding the different mechanisms underlying the production of false memories. Even if the role of semantic processing as the primary architect of this type of false memory is established, a greater understanding of how much is due to semantics and how much, instead, is due to working memory can be useful to make the DRM task increasingly accurate: good working memory support is required to retain the already repeated items and to pre-represent the ones that have yet to be produced. As Baddeley, Allen and Hitch (2011) postulated in their latest refinement of the working memory model, working memory interacts with long term memory: this interaction can explain much about the final working memory performance. In light of these considerations, we think that the relation between working memory and DRM recognition deserves to be further explored.
Socioeconomic status significantly affected false memory: higher levels of SES corresponded to a higher number of false memories and, therefore, to higher semantic levels. This result goes in the direction of the literature that highlighted SES as a predictor of lexical, semantic and vocabulary knowledge (e.g., Gatt, Baldacchino & Dodd, 2020; Maguire, Schneider, Middleton, Ralph, Lopez, Ackerman & Alyson, 2018), of the relation between linguistic abilities and school performance (e.g., von Stumm, Rimfeld, Dale & Plomin, 2020) and of working memory abilities (e.g., Tine, 2014). It is important to notice that the SES variable in our study was negatively correlated with bilingualism (−.39, p < .001), which means that the bilingual children included in our sample had a lower socioeconomic status compared to their monolingual peers. These results lead us to propose that, although minority language bilingualism is often associated with a condition of lower socioeconomic status, these two variables can be considered in light of the different influence they seem to have on semantics: if coming from a family with high socioeconomic status seems to positively influence semantic competence, the same cannot be said of the condition of bilingualism.
The main limitation of the study concerns the inability to account for the specificity of all the linguistic and cultural backgrounds of the bilingual sample, which was grouped under the broad definition of minority language bilingualism but actually involved children from a large number of countries and nationalities, very different from one another, and extremely heterogeneous linguistic realities. The mediation carried out by teachers and experimenters did not solve the impossibility of adapting the materials to all the linguistic realities present within the sample.
Moreover, although this aspect could go beyond the scope of this research, in interpreting the results of the study one cannot avoid considering the role played by the culture of origin and the homogeneity of the bilingual sample considered here as a mediator of emotional attribution. Although these children were exposed early to their L2 at an immersive level, most of them had experiences that differed from the ones expressed by the Italian culture (that of their L2); this aspect can have a weight in the subjective emotional interpretation that the children gave to the proposed words. One further aspect that can be mentioned regarding the influence of cultural differences on memory distortion is schemas, defined by Webb, Turney and Dennis (2016) as "useful memory tools that allow information to be categorized according to a common concept or theme", adaptable associative networks of knowledge extracted over multiple similar experiences (Robin & Moscovitch, 2017). Schemas consist of a collection of "underlying personal beliefs, social pressures, biases and heuristics, and cultural assumptions" which, taken together, permeate our mental organization and constitute the basis for our daily experience and a key to interpreting the world (Hendricks, 2021). In light of this implicit process, it is clear that not all schemas are universally shared and that they strongly depend on the environment in which one is immersed. These differences play a role in memory distortions, and the priority and relevance that one assigns in one's elaboration largely depend on them (Webb et al., 2016).
To date, we are working on extending the project to different bilingual samples (i.e., of majority language), as well as different age groups (seniors aged 65 and over). It would also be interesting to consider the reaction times associated with the recognition task, evaluating how they change on the basis of arousal and valence. Another possibility would be to extend the study with the joint use of techniques such as eye tracking and galvanic skin response (GSR), the former aimed at detecting the direction of the gaze, the number of saccadic responses and the duration of the fixations while the DRM lists are presented and recognized, the latter at detecting physiological activation in relation to the emotional nature of the presented stimuli.

Conclusions
From our results it appears that children from 9 to 11 years produce false memories, as an expression of semantic processing, similarly to adults. Equally, emotion plays a similar role in the aforementioned semantic processing, with high-arousing lures being falsely recognized more than low-arousing lures. Minority language bilingual children seem to develop the connotative dimension of semantics in their L2 equally to their monolingual peers. This similarity has also been shown as regards accuracy rates in the recognition task, as evidence that bilingualism does not affect memory performance per se. Semantic processing seems to be detached from vocabulary knowledge, which is usually negatively affected by bilingualism in this age period. To conclude, an important role in false memory production seems to be played by socioeconomic status, which is negatively correlated with bilingualism: this aspect strengthens the need to consider bilingualism in its entirety, linked to its social context and to its various features.
The debate about bilingual semantics is still open and it still fascinates many researchers. Semantics is a sort of bridge between two of the dimensions identified by Bialystok (2009) as crucially sensitive for understanding bilingualism: vocabulary and memory. A greater understanding of bilingual semantics in relation to these two dimensions can help us to disambiguate the place it has in relation to language, to better clarify how dependent it is on the development of the lexicon and how much it affects its memorization.
Although the present study cannot and does not aim to exhaust a debate so broad, complex and multifaceted, it nevertheless opens the way for a new look at these fundamental themes. Here, we aim to consider them not within an abstract void, but as inseparably imbued with emotional colours, which permeate, shape and determine their movements and mutual relationships.

Fig. 1. d' mean values and SD in High and Low Arousal and Negative and Positive Valence conditions.

Table 2. Socioeconomic status between groups

Table 3. Mixed Model

Table 5. Between groups means and standard deviation in the control measures with significance degrees