Connecting perception and production in early Catalan–Spanish bilingual children: language dominance and quality of input effects

Abstract
This study investigates perception and production of the Catalan mid-vowel /e/-/ɛ/ contrast by two groups of 4.5-year-old Catalan–Spanish bilingual children, differing in language dominance. Perception was assessed with an XAB discrimination task involving familiar words and non-words. Production accuracy was measured using a familiar-word elicitation task. Overall, Catalan-dominant bilingual children outperformed Spanish-dominant bilinguals, the latter showing high variability in production accuracy, while being slightly above chance level in perception. No correlation between perception and production performance could be established in this group. The effect of language dominance alone could not explain the Spanish-dominant participants’ performance, but quality of Catalan input (native vs. accented speech) was identified as an important factor behind familiar-word production and the inaccurate representation of the target contrast in the lexicon of the bilinguals’ less-dominant language. More fine-grained measurements of experience-related factors are needed for a full understanding of the acquisition of challenging contrasts in bilingual contexts.


Introduction
One of the challenges in bilingual first language acquisition is the building of contrastive categories corresponding to each of the bilingual's input languages. Vowel contrasts are especially challenging for the learner as a complete overlap between the two vowel systems is unlikely to be found. While some contrasts might be common to both languages and show a very similar distribution, many others reflect a different division of the acoustic space, involving not only misalignments, but also a different number of categories, some of them partially overlapping. For Catalan and Spanish bilingual learners, these two languages' vowel systems differ in several ways. Catalan, but not Spanish, has vowel reduction and the repertoire for unstressed syllables (i.e., [ə], [u], [i] in Central Catalan) differs from the one in stressed syllabic positions (i.e., [a], [ɛ], [e], [i], [ɔ], [o], [u]) (Prieto, 2004). Besides the presence of vowel reduction only in Catalan, another main difference between Catalan and Spanish vowel systems concerns the mid-vowels in the front and back areas of the vowel space. In stressed positions, whereas Spanish has just two mid-vowel categories (one front and one back), Catalan has four mid-vowel categories, two front and two back contrastive vowels. Thus, acquiring Catalan-specific mid-vowels can be challenging for Spanish-Catalan bilinguals, as these vowels are likely to be assimilated to the single category found in Spanish. For the /e/-/ɛ/ front mid-vowel pair this might involve assimilation to the Spanish /e/ category. Extensive research on this phenomenon has been carried out in the last two decades, covering simultaneous bilinguals, as well as early and late sequential bilingual adults and their skills in perceiving or producing this contrast. Research in infancy and early childhood has specifically been focused on the perception of this challenging contrast.
The current study addresses both the perception and production of the front mid-vowel Catalan contrast in bilingual children, in an effort to gain a better understanding of its encoding in the lexicon when differences in initial linguistic experience are present and regular exposure to one of the ambient languages, Catalan in this case, is not taking place immediately after birth (as in simultaneous bilinguals), or its presence in home environments is substantially reduced. As the focus of the study is a Catalan-only contrast, we are specifically interested in exploring possible differences in perception and production between bilingual children differing in the amount of language exposure to Catalan in their early years. A short overview of the previous literature regarding this challenging contrast for Spanish-Catalan bilinguals is first addressed, briefly summarizing the main findings in perception or production studies with adult populations before focusing on infant and child data. Results from the limited number of studies exploring the link between perception and production in different populations of early bilingual children and early L2 learners will be also summarized. This review aims at highlighting critical factors behind the results that have so far been obtained regarding the perception-production link. In some of the studies, this connection appears to be weak, either as a result of methodological aspects, or as an effect of experience-dependent factors that are likely to modulate the otherwise expected connection between perception and production skills. By exploring this connection in a sample of young bilingual children, differing in Catalan input quantity, but also quality in their early years of language exposure, we expect to gain a better understanding of how these input factors might alter the perception-production link.
Acquisition of the front mid-vowel contrast: adult and child data
As reported in previous research, many Spanish-Catalan bilingual adults present difficulties in the perception and the production of Catalan-specific vowel contrasts, despite having received early and continuous exposure to Catalan, and despite their regular use of this language. This effect has been extensively studied in early sequential bilinguals, who were not simultaneously exposed from birth to both languages, but who began regular exposure to Catalan as their L2 in their pre-school years. These studies have revealed that Spanish-dominant bilinguals tend to be less accurate in the discrimination (Bosch, Costa & Sebastián-Gallés, 2000), as well as in the production of the /e/-/ɛ/ contrast (Bosch & Ramon-Casas, 2011; Lleó, Cortés & Benet, 2008), also showing more variability in performance (Amengual, 2016a; Bosch & Ramon-Casas, 2011) than Catalan-dominant bilingual groups. Overall, these results confirm that despite many years of exposure to Catalan, many Spanish-dominant bilingual adults continue to have difficulties with this challenging contrast, a fact that reveals the complex interplay between age of onset and amount of exposure in the acquisition of these contrastive vowels.
It is no surprise, then, that similar difficulties have been observed in Catalan-Spanish bilingual young children. Research has shown that by age 12 months infants growing up in Catalan-Spanish bilingual homes (that is, regularly exposed to both languages) were able to categorize and discriminate the Catalan /e/-/ɛ/ contrast (Bosch & Sebastián-Gallés, 2003). However, just a few months later bilingual toddlers failed to detect mispronounced familiar words involving /e/ and /ɛ/ vowels in a word recognition task and it was not until age 3 years that Catalan-dominant bilingual children (but not their Spanish-dominant peers) began to show sensitivity to this phonological contrast (Ramon-Casas, Swingley, Sebastián-Gallés & Bosch, 2009). Hence, the presence of a dominant language in bilingual family contexts, a multifaceted factor likely to result from early onset, predominant use by the main caretaker and infants' more frequent and regular exposure to that language, may negatively affect children's sensitivity to contrasts in the less dominant language and their encoding in the phonological representation of known words (see also García-Sierra, Ramírez-Esparza & Kuhl, 2016, showing early effects of amount of language exposure on bilinguals' brain responses to native contrasts). Regarding production, language dominance effects have also been found in studies that examined vowel accuracy in 3- to 5-year-old Spanish-Catalan bilingual children raised in neighborhoods differing in the extent of Catalan and Spanish language use: those bilingual children who had been more exposed to Spanish were less accurate in the production of the Catalan /ɛ/ mid-vowel than those children more exposed to Catalan (Cortés, Lleó & Benet, 2009).
Differences were also evidenced in the acoustic analyses of the first two formant frequencies of the target vowels produced by Catalan-dominant bilingual children, whose F1 and F2 values were more distant in the acoustic space than those produced by their Spanish-dominant peers, especially regarding the height dimension (F1), a key factor contributing to the /e/-/ɛ/ distinction (Carrera-Sabaté & Fernández-Planas, 2005).
Taken together, these studies point to a language dominance factor likely to affect young bilinguals' processing of challenging contrasts in their less dominant language, a factor that continues to affect their processing abilities until adulthood. A Catalan-dominant exposure would favor acquisition of the front mid-vowel contrast, in line with models on native phonetic category learning that emphasize the building of categories from exposure to statistical distributions of vowel exemplars in the input languages (Kuhl et al., 2008; Werker, Yeung & Yoshida, 2012). These models, however, do not easily extend to early sequential bilinguals, so the question remains as to whether a slightly later onset of exposure or a less predominant presence of Catalan in the infants' environment would severely preclude establishing the front mid-vowel contrast. While admitting that amount of exposure can initially affect or delay the encoding of this challenging contrast, it seems reasonable to consider that with continued experience with Catalan this contrast should eventually be established. However, successful encoding of this contrast does not seem to be guaranteed as suggested by previous research (Ramon-Casas et al., 2009). An additional dimension needs to be considered here and it is related to input quality. This factor is present in bilingual contexts and can account for individual variability in the processing of some particularly challenging properties of the less dominant language. In fact, recent work on L2 learning has also pointed out the crucial role of input quality as an often missing dimension to explain inaccurate representations in otherwise early L2 learners with high use of their L2 language (see Flege, 2019, for adult studies).
Studies addressing lexical and grammatical acquisition in young bilingual children and L2 learners have already shown the negative impact of exposure to non-native input (Paradis, 2011; Place & Hoff, 2011), while also revealing no effects when the input is provided by proficient L2-speakers (Hoff, Core & Shanks, 2020; Paradis, Rusk, Sorenson Duncan & Govindarajan, 2017; Sorenson Duncan & Paradis, 2020). In the phonological domain, however, non-native speech input (i.e., accented speech) is very often provided by otherwise fluent and proficient L2 speakers who are likely to produce phonological inconsistencies and more variability in their speech.
Accented speech usually contains ambiguous phonetic tokens, either by falling between two otherwise distinct categories or by being inconsistently produced in the lexicon. This type of accented production can lead to the building of phonemic categories that will differ from the more standard categories arising from exposure to native, non-accented input (see, for instance, Stoehr, Benders, Van Hell & Fikkert, 2019, showing how non-native production of Dutch VOT by sequential German (L1)-Dutch (L2) bilingual mothers was affecting the quality of native VOT learning in their Dutch-German bilingual children). Combined exposure to native and accented input can also lead to more variable representations of certain vowel contrasts, thus affecting children's lexical productions which have been found to be more variable in high frequency words (Levy & Hanulíková, 2019). This experience-related factor is common in the Spanish-Catalan bilingual context, where proficient L2 Catalan speakers who are fluent in and regularly use their L2 language (Catalan), inaccurately or inconsistently produce some of its phonological features, those that conflict with properties of their L1 Spanish (see Amengual, 2016a; Bosch & Ramon-Casas, 2011; Cortés et al., 2009; Mora & Nadeu, 2012; Recasens, 1991).
To sum up, quantity and quality of Catalan exposure are two factors that may contribute to less successful speech learning of Catalan-specific vowel properties, affecting their discrimination and phonological representation in the lexicon. The connection between perception and production skills has scarcely been explored in Spanish-Catalan bilinguals or early L2 learners of Catalan. In what follows, a brief overview of the literature specifically addressing the perception-production link will be synthesized, highlighting data from child studies that are relevant for our own research.
The link between perception and production
Links between perception and production in L1 acquisition have been established both in infants (e.g., Kuhl & Meltzoff, 1996; Kuhl et al., 2008; Vihman, 2014) and in adults (Casserly & Pisoni, 2010, for an early review). Early L2 learning and bilingual language acquisition are especially relevant domains where the dynamics in the coupling of perceptual skills and accuracy in production along the learning process are worth exploring. A full understanding of the complex speech perception/production interaction at different levels of language learning and use has not yet been reached. In this context, the relative weight of different factors affecting the perception-production link in young learners, focusing on bilingual populations with different language experience in their early years, is central to the present research.
Studies focusing on highly proficient early bilingual adults have shown a correlation between perception and production abilities, although correlations are not always very large, modulated by individual variation (see, for instance, Flege, MacKay & Meador, 1999). A few studies have also been developed on Spanish-Catalan bilingual adults differing in language dominance, in which the link between perception and production is modulated by this factor (Amengual, 2016a;Amengual & Chamorro, 2015). Crucially for our research, Amengual (2016a) found that Catalan-dominant bilinguals performed much better than Spanish-dominant bilinguals both in perception and in production of front and back mid-vowel contrasts in Catalan words, Spanish-dominant bilinguals presenting higher variability levels in both tasks, some participants attaining native accuracy while others performed at chance. To explain the lack of relation between perception and production in the non-dominant language of early bilingual adults, these studies point to language dominance, but also to input factors, as determinants of the observed variability.
Regarding younger populations of early L2 learners and young bilingual participants, the literature remains scarce, with some studies supporting the perception-production connection in the acquisition of language-specific (L2) contrasts. McCarthy, Mahon, Rosen and Evans (2014) explored Sylheti-English early sequential bilingual children in their perception and production of the English (L2) voicing contrast in bilabial and velar plosives presented in familiar words. This is a challenging VOT contrast for L1 Sylheti speakers, as their voiceless plosives fall into the region of the English voiced ones, so a different boundary needs to be established in each language. Participants were tested twice, first at 52 months of age (with 7 months of accumulated exposure to English) and one year later, at around 5 years of age. Results indicated that at Time 2, after starting school (that is, after regular exposure to L2 English; i.e., full-time education in English, provided by native English speakers), bilinguals eventually matched their monolingual peers in both perception and production measures. Data suggest gradual gains in phonological specification of non-native (L2) contrasts resulting from continued exposure to English. As the authors state: "…an accumulation of English experience, in full-time education with no Sylheti support, (…) was required in order for the bilingual children to acquire the L2 phonemic categories" (McCarthy et al., 2014, p. 1977). The authors acknowledge the complex nature of the link between perception and production in individual performance, but their group data favor an interpretation of the close connection between gains in perception and improved accuracy in production, in a context of increasing exposure to native, non-accented English input.
Also addressing a population of early L2 learners, Netelenbos and Li (2013) explored the perception and production of L2 French VOT contrasts in plosives in Canadian English-speaking children acquiring French through immersion school programs. These are challenging contrasts involving a different voiced-voiceless boundary in each language. Results revealed inconsistent acquisition of French VOT values for /b/, with no gains in spite of increasing French exposure. Results confirmed the perception-production link as revealed by children's parallel patterns of performance in both domains. Critically, the L2 exposure in this study was often provided by non-native French speakers at school settings. Both McCarthy et al. (2014) and Netelenbos and Li (2013) suggest that not only quantity of L2 input, but also its quality (native-likeness) is important to reach high levels in perception and production, affecting both processes at the same time. This study is especially relevant for our research as it shows the limited impact that L2 exposure can have on the acquisition of an L2 challenging contrast when L2 language experience is not native-like.
To our knowledge there is only one study that could not find a connection between perception and production abilities in early bilingual children. Darcy and Krüger (2012) tested Turkish-German bilingual children (mean age at test 11 years; onset of exposure to L2 German around 3 years of age) in the discrimination of German vowel contrasts not present in Turkish. They used non-words in perception and familiar words in production containing the same vowel contrasts. The results showed an "apparent dissociation" between perception and production, with an advantage of the latter over the former: bilinguals were worse than German-monolinguals in the perception task, but they were indistinguishable from their monolingual peers in the production test, although their phonetic realizations presented higher variability than the monolinguals'. Darcy and Krüger (2012) argued that the production advantage in their data was likely to be the consequence of having used a production task based on lexical items whereas the perception task was tapping at a phonetic category level, thus, constraining an adequate comparison between perception and production abilities in their research. In fact, differences between sound categorization in lexical versus non-lexical tasks have already been documented both in adults (Amengual, 2016b;Darcy, Dekydtspotter, Sprouse, Glover, Kaden, McGuire & Scott, 2012) and children (Walley & Flege, 1999).
Overall, the relationship between perception and production performance, inherently complex, seems to be supported. However, research failing to report a clear link, even if limited, is relevant and informative about the factors that might have precluded the expected connection. In some cases, the problem seems to lie in differences in task demands. In other cases, failing to obtain a significant correlation in perception-production performance might be the result of large individual differences, that are not always addressed when research focuses on group data. It must be emphasized that individual variability might not only be the result of differences in age of onset and amount of exposure to L2 contrasts but, crucially, it can also result from input quality factors, often more difficult to be identified, measured and experimentally controlled. Specifically, in socio-linguistic bilingual contexts where both languages are extensively in contact and broadly used by adult speakers having a different L1, the ability to accurately perceive and produce an L2 contrast is going to be affected by the degree of variability and the presence of accented-speech in the input the learner is exposed to. This dimension can modulate the way the perception-production link is established and can constrain attaining native or near-native performance in L2 speech contrasts, those belonging to the less dominant language. From this perspective, it is relevant to explore how bilingual children, differing in language dominance and also likely to differ in the quality of their speech input, perceive and produce a challenging language-specific L2 contrast. Such research can offer an additional perspective on the way the perception-production link is established in these specific language contexts, thus extending our understanding of factors that modulate their coupling in speech learning. This is a central goal in our research.

The current study
We explored the connection between perception and production of the /e/-/ɛ/ Catalan-specific contrast in a sample of early bilingual children, clearly differing in language dominance (L1 Catalan or Spanish, considering L1 as the predominant language in the family context). Participants were attending schools where Catalan was the main language (i.e., the language used for classroom activities and communication, following immersion program guidelines implemented in public schools, from kindergarten up to elementary school grades), thus, having had daily intensive exposure to Catalan from at least age 3 years, or even earlier for those having attended day-care or nursery centers before entering kindergarten. Regarding perception, participants were tested with a task involving both familiar and nonsense words, in order to check for possible differences in participants' ability to apply their phonological categories to unknown words. Production accuracy of the target vowels was measured using a familiar word elicitation task. According to previous studies involving Spanish-Catalan bilingual toddlers, but also adults (e.g., Amengual, 2016a; Ramon-Casas et al., 2009), we expected an effect of language dominance, with Spanish-dominant bilingual children being overall less accurate than their Catalan-dominant peers in the perception and production tasks involving the challenging front mid-vowel contrast.
As for the perception-production link, we could expect a connection to be found, with discrimination performance aligned with production performance, even if overall performance was found to be lower in the Spanish-dominant group compared to the Catalan-dominant group. However, the effect of language dominance and the likelihood of higher exposure to accented speech in the Spanish-dominant group (in spite of having extended Catalan exposure in educative settings), could lead to higher individual variability in performance, possibly limiting the expected connection between perception and production skills. These results would then be in line with Amengual's (2016a) study, in which adult participants' high variability in performance obscured the expected perception-production link.

Participants
The participants in the study were 4- to 5-year-old children born to Catalan- or Spanish-speaking families and attending second year kindergarten in schools in Barcelona. A short questionnaire was specifically designed to set up two clearly different groups that differed in language dominance: a Catalan-dominant bilingual group and a Spanish-dominant bilingual group. Language dominance was determined by the language spoken by the main caregiver (usually the mother), the person regularly interacting with the infant and spending longer time with him/her during the first year of life. The questionnaire also explored the languages used by different agents in the child's environment so as to obtain a clearer picture of the participants' linguistic background and also reach a tentative estimate of exposure to an accented speech input. Specifically, the questionnaire included items assessing age of onset of L2 exposure, languages spoken by close relatives, friends or neighbors, languages spoken at nursery or day-care centers if attended before age 3, when kindergarten begins, languages at kindergarten and indirect exposure to languages through the media. Participants were all bilingual at the age of testing and exposure to their L2 was not occasional, but regular. Exposure to their L2 (or less-dominant language) could have been provided by any other family member besides the mother, who provided the L1, or other relatives or family friends, and also by educators at the centers before entering school and other children in their environment. Participants who did not have this regular, even if unbalanced, exposure to both Catalan and Spanish before entering kindergarten by age 3 years were not included in the study.
Children (N = 36) were recruited from four different public schools in the Barcelona metropolitan area, in which Catalan was the main language within the classroom for teaching and communication purposes (see Artigal, 1997, on the Catalan immersion program). All participants had, thus, been regularly and continuously exposed to Catalan (at least for 5 days a week since they entered kindergarten). This was important to ensure the bilingual status of the participants, especially the ones in the Spanish-dominant group, as the tasks focused on the encoding of a Catalan-specific contrast. The final sample comprised 18 4- to 5-year-old participants in each language-dominance group (mean age Catalan-dominant 4;4 (years; months); mean age Spanish-dominant 4;7; with no significant age differences between groups [t(34) = 1.02; p = .31]). In the Catalan-dominant group there were 10 girls and 8 boys, and in the Spanish-dominant group 10 boys and 8 girls. Additionally, 8 children were tested but excluded from the final sample due to not reaching the bilingual criterion for inclusion (4) or incomplete data collection (4).

Experiment 1: Perception of the mid-front vowel contrast
In this experiment the perception of the Catalan /e/-/ɛ/ contrast in familiar and non-words was tested, comparing Spanish-dominant and Catalan-dominant bilingual children's performance. According to previous studies with slightly younger Catalan-Spanish bilingual children (Ramon-Casas et al., 2009) an effect of language dominance was expected, and better results in the Catalan-dominant group were predicted, especially in the perception of the target contrast in familiar words. Greater variability in the perception task by the Spanish-dominant group, more likely to be exposed to accented speech, could also be expected, although this prediction remained tentative in the absence of direct measures on this experience-related factor. Accuracy in the perception involving non-words could be limited in both groups, but possibly worse in Spanish-dominant bilingual participants for whom the acquisition of the Catalan contrast could be fragile and unstable. Following studies both on adults (Amengual, 2016b) and children (Darcy & Krüger, 2012; Walley & Flege, 1999) such difficulties with sound discrimination in non-words, regardless of language-dominance, would be explained in terms of the overall complexity of sound categorization in non-lexical tasks, especially for young children.

Materials and procedure
Perception of the Catalan contrast /e/-/ɛ/ in familiar and non-words was analyzed, using a discrimination task (XAB) adapted from Brasileiro (2009), where X represents the standard Catalan pronunciation of the word, and A and B were two versions of the same item, only one of them being correct, that is, phonologically similar to the X token previously heard. All the tokens were recorded in the same session and produced by two highly proficient Catalan-Spanish bilingual women. Tokens produced by speaker 1 were always correctly pronounced and were linked to X. Tokens recorded by speaker 2 were phonologically similar and dissimilar versions of speaker 1 tokens, and were randomly associated to A and B.
Different types of words, some of them previously used in other studies (see Ramon-Casas et al., 2009), were used. The "dissimilar" pronunciations were created by replacing /e/ with /ɛ/ and vice versa. The stimuli were naturally produced by a trained speaker. These control and target words were selected as they are all familiar to 4-year-olds and present in their vocabulary. Non-words were /dedi/ and /dɛdi/, similar to those previously used in a vowel categorization task with young infants (Bosch & Sebastián-Gallés, 2003). The dissimilar tokens, as in target words, were created by replacing the critical vowel (/e/ to /ɛ/ and vice versa). Finally, two additional non-words (/dodi/ and /dudi/) were used in the training phase. The dissimilar tokens were created by interchanging /o/ and /u/ vowels. These vowels belong to both Spanish and Catalan systems and research has shown they can be distinguished from 12 months of age.
The XAB discrimination task was administered to each child individually in a quiet room of their school. The task was run on a laptop and children were instructed to wear headphones. The experiment consisted of different trials in which static cartoon images of one adult (a female teacher) and two similar-looking children were displayed on the screen (see Figure 1). The teacher occupied a central position on the upper part of the screen and the two children, only differing in the color of their hair, were at the bottom, to the left and right side of the teacher. Participants were told that the teacher (X) would produce a word, and the two boys (A and B) would repeat that word, but only one of them would produce it accurately, similar to the teacher's production. They had to indicate either by pointing (left or right) or by a button press (left or right button of a response box) which boy had produced the word as the teacher did. All tokens were presented in isolation and through headphones.
A 1500 ms train-whistle-like sound was presented in between X and AB tokens. This masking sound was intended to favor children's responses based on the activation of the phonological representation of the target vowels in familiar words rather than a direct acoustic comparison.
Figure 1. Layout of the XAB task: Teacher (central image, X-word); child A (left, A-word) and child B (right, B-word). The images of each child moved in synchrony with the audio that was played to identify who the speaker was.
The experiment began with two training trials, to familiarize children with the procedure and to ensure they had understood the instructions. Only in this phase did participants receive feedback. The training could be repeated if children had not demonstrated to have understood the task. Immediately after the training, the testing phase began, involving 36 trials divided into two similar blocks of 18 trials. After the first block, a 10-second clip was shown on the screen. In each block the three control and four familiar target words appeared twice, presented in a quasi-random order, always followed by the four non-words at the end of the block. Different orders were created and location of A and B characters was reversed between blocks. The testing session lasted around 20 minutes.
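The block structure of the testing phase can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions that the text does not specify: the item labels are placeholders (the actual word lists are not reproduced here), the quasi-random constraint is assumed to be "no immediate repetition of the same word", and the reversal of the A and B characters between blocks is modeled as a simple left/right swap.

```python
import random

# Placeholder labels (assumption): the actual stimulus words are not listed here.
CONTROL_WORDS = ["control_1", "control_2", "control_3"]
TARGET_WORDS = ["target_1", "target_2", "target_3", "target_4"]
NONWORD_TRIALS = ["nonword_trial_1", "nonword_trial_2",
                  "nonword_trial_3", "nonword_trial_4"]


def shuffle_without_adjacent_repeats(items, rng):
    """Reshuffle until no item immediately follows itself (assumed constraint)."""
    while True:
        order = list(items)
        rng.shuffle(order)
        if all(a != b for a, b in zip(order, order[1:])):
            return order


def build_block(rng, a_side):
    """One block: 14 word trials in quasi-random order, then 4 non-word trials."""
    word_trials = shuffle_without_adjacent_repeats(
        (CONTROL_WORDS + TARGET_WORDS) * 2, rng)
    b_side = "right" if a_side == "left" else "left"
    return [{"item": item, "A": a_side, "B": b_side}
            for item in word_trials + NONWORD_TRIALS]


def build_session(seed=0):
    """Two blocks of 18 trials; characters A and B swap sides between blocks."""
    rng = random.Random(seed)
    return [build_block(rng, a_side="left"), build_block(rng, a_side="right")]


session = build_session()
print(sum(len(block) for block in session))  # 36 trials in total
```

Passing a different `seed` yields a different presentation order, in the spirit of the "different orders" mentioned above.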

Results
Correct responses (i.e., correctly indicating which character, A or B, had produced the word in a similar way as the teacher) were identified and a percentage was obtained for each category (control words, familiar target words and non-words, see Figure 2). Measures were normally distributed and no outliers (values beyond ±3 SD from the mean of the distribution) were identified. Percentages of correct responses were then compared to chance level (50%) as an indication of children's ability to detect the correct word form and discriminate the target contrast. The percentages of correct answers in control words were 96% in the Catalan-dominant group (SD = 5) and 95% in the Spanish-dominant group (SD = 6), both percentages being clearly greater than chance level (Catalan-dominant: t […]).
A repeated measures ANOVA with type of word (target, control and non-words) as a within factor, and language group (Catalan-dominant and Spanish-dominant) as a between factor, showed a significant main effect of type of word [F(2, 68) = 80.77; p < .01; ηp² = .90], a significant between-group difference [F(1, 34) = 22.88; p < .01; ηp² = .40], and a significant type of word x group interaction [F(2, 68) = 6.33; p < .01; ηp² = .39]. Post-hoc t-tests revealed […] (Figure 3). It is important to note that regarding the perception of the target contrast in non-words (/dedi/ versus /dɛdi/), although measures from both groups were not statistically different, the Spanish-dominant group performed at chance level while the Catalan-dominant group was slightly better, marginally differing from chance level (p = .07).
As predicted, results in Experiment 1 revealed overall better accuracy in the perception of the target contrast (/e/-/ɛ/) by Catalan-dominant bilingual participants. Mean percentages of correct responses to target words were above chance level in both groups, but between-group differences were significant, favoring the Catalan-dominant group. Variability was present in both groups, and is likely to be linked to the task design and its demands at the tested age. Regarding non-words, no group reached mean values above chance level and the between-group comparison yielded a non-significant difference, confirming young children's difficulties in tasks where a response based on sublexical information is requested and the presence of a masking noise clearly interferes with the short-term memory processes needed to produce the correct answer.
Experiment 2: Production of the mid-front vowel contrast
In Experiment 2, participants' production abilities of the target contrast in familiar words were explored, both in auditory (target-like productions) and acoustic (spectral analyses) terms. In line with previous studies with children (Lleó et al., 2008), we expected better production accuracy of the Catalan /ɛ/-words by the Catalan-dominant group, revealed both in the auditory and acoustic analyses. The Spanish-dominant bilingual group, possibly more affected by experience-related factors on the target contrast, was expected to show higher levels of individual variability (paralleling adult data in Amengual, 2016a).

Materials and procedure
The word elicitation task required children to produce familiar Catalan words containing the target vowels, prompted by a set of children's books and pictures. Words were elicited through picture naming and through a conversation between the child and one experimenter, who asked simple questions to facilitate word elicitation. The recording sessions took place in a quiet room at the participants' school, so that children were in a familiar place associated with the use of Catalan. A Sony ECM-CS10 unidirectional lapel microphone, connected to a portable MZ-RH10 Sony recorder, was used. Each child was recorded individually in a session lasting from 45 to 60 minutes. A total of 756 words that contained /e/ and a total of 777 words that contained /ɛ/ (with participants contributing between 9 and 28 words to each category) were elicited and subsequently analyzed. Recordings in which the production was not clear enough for a reliable analysis (e.g., when too much noise was present or when the recording was not intelligible) were excluded. As in Bosch and Ramon-Casas (2011) and in Lleó et al. (2008), both auditory and acoustic analyses were performed. For the auditory analysis, the recordings were transcribed (perception-based) by two native Catalan speakers and classified as target-like or non-target-like, based on correct or incorrect target vowel production. From this auditory analysis, the percentages of correct responses for /e/-type and /ɛ/-type words were obtained. For the acoustic analysis, we conducted a spectral inspection of the target vowels only for the target words that were classified as being correctly pronounced. Using Praat 4.2 (Boersma, 2001), the center of the steady-state period of each target vowel was extracted and analyzed in order to minimize possible co-articulatory effects of adjacent consonants. F1 and F2 values were identified and reported on the Bark scale.
Following Amengual (2016a), Euclidean distances were calculated so that individual variation in the data could be analyzed. Note that each Euclidean distance represents the distance, in Bark, between the two vowel categories in the F1-F2 space, showing how robustly the contrast is represented in each participant. A low value would be indicative of category merging, while a large distance would mean no overlap in the production of these mid-vowel categories.
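As an illustration of this measure, the sketch below converts formant values from Hz to Bark and computes the Euclidean distance between the mean positions of the two categories for one speaker. Several points are assumptions rather than details taken from the study: the Hz-to-Bark conversion uses Traunmüller's (1990) approximation, the distance is taken between category means, and the formant values are invented for the example.

```python
import math

def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark (Traunmüller, 1990, approximation).

    Assumption: the source does not state which conversion formula was used.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def category_mean(tokens_hz):
    """Mean (F1, F2) in Bark over a list of (F1_Hz, F2_Hz) tokens."""
    f1 = [hz_to_bark(f1_hz) for f1_hz, _ in tokens_hz]
    f2 = [hz_to_bark(f2_hz) for _, f2_hz in tokens_hz]
    return sum(f1) / len(f1), sum(f2) / len(f2)

def euclidean_distance(tokens_a_hz, tokens_b_hz):
    """Distance in Bark between two category means in the F1-F2 plane."""
    f1_a, f2_a = category_mean(tokens_a_hz)
    f1_b, f2_b = category_mean(tokens_b_hz)
    return math.hypot(f1_a - f1_b, f2_a - f2_b)

# Invented formant values (Hz) for one hypothetical speaker.
close_mid_tokens = [(480.0, 2400.0), (500.0, 2350.0), (470.0, 2450.0)]  # /e/
open_mid_tokens = [(680.0, 2200.0), (700.0, 2150.0), (660.0, 2250.0)]   # /ɛ/

distance = euclidean_distance(close_mid_tokens, open_mid_tokens)
print(f"Euclidean distance: {distance:.2f} Bark")
```

Under these assumptions, a speaker who merges the two categories would yield a distance near zero, whereas a well-separated contrast would yield a clearly positive value.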

Results
All the obtained measures were normally distributed and no outliers (values beyond ±3 SD from the mean of the distribution) were identified. Regarding the auditory analysis, the percentage of target-like production in /e/-words, as judged by the two native Catalan judges, was 100% for all the participants. However, large differences between groups were obtained in the percentage of /ɛ/-word target-like production [Catalan-dominant: M = 91.05%, SD = 7.32%; 382 of 418 produced words were correctly produced; Spanish-dominant: M = 41.33%, SD = 33.32%; only 158 of 359 produced words were correctly produced; t(34) = 6.18; p < .01; Cohen's d = 2.06; see Figure 4].
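The reported effect size and test statistic can be checked from the group means and standard deviations alone. The sketch below assumes an equal-variance (pooled) independent-samples t-test, which is consistent with the reported 34 degrees of freedom, and group sizes of 18 participants each, as given in the correlation analyses.

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

def t_from_d(d, n1, n2):
    """Independent-samples t statistic implied by d under the pooled-variance test."""
    return d / math.sqrt(1.0 / n1 + 1.0 / n2)

# Reported /ɛ/-word accuracy: Catalan-dominant vs. Spanish-dominant (n = 18 each).
d = cohens_d(91.05, 7.32, 18, 41.33, 33.32, 18)
t = t_from_d(d, 18, 18)
print(round(d, 2), round(t, 2))  # 2.06 6.18
```

Both values agree with the reported statistics. Given the large difference between the two standard deviations, a Welch-type test would also be a defensible choice, although that is not the analysis reported here.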
The results of this experiment, as predicted, revealed that Catalan-dominant bilingual participants were considerably more accurate in the production of the /ɛ/ mid front vowel than their Spanish-dominant peers. Although individual variability was expected in the Spanish-dominant group, the range was maximal, with some participants correctly producing many target-like /ɛ/-words, while a few others did not produce a single /ɛ/-word correctly. This pattern of performance was totally absent in the Catalan-dominant group, whose results were close to ceiling.
Correlation analyses between perception and production data
As the two groups under study significantly differed in both skills, the analysis of the link between perception and production data was undertaken by correlating the measures for each group separately, based on the measures from the /e/-/ɛ/ discrimination in familiar words and the spontaneous production of /ɛ/-type words (see Table 1 for a summary of the measures by group).
The analysis using percentage of correct responses from the perception task (target words) and percentage of correct productions of /ɛ/-type words for each group yielded no evidence of a significant positive correlation (Catalan-dominant n = 18; r = .38; p = .11; Spanish-dominant n = 18; r = .08; p = .75). A visual inspection of the results (see Figure 6) shows that the Spanish-dominant group presents higher variability than the Catalan-dominant group, especially in the production of the Catalan-specific /ɛ/ vowel, and overall lower scores in the perception task, also involving a considerable range of variability. The Catalan-dominant group attained overall higher results and performed well in both perception and production tasks, so that near-ceiling effects constrained the evidence of a correlation between these two skills. Note, however, that an analysis grouping together all bilingual participants yielded a highly significant correlation between their perception and production accuracy of /ɛ/-type words (r (36) = .45; p < .01). Percentage of correct responses in the perception of non-words and percentage of correct productions of /ɛ/-type familiar words for each group did not yield any evidence of a significant positive correlation (Catalan-dominant r (18) = .07; p = .75; Spanish-dominant r (18) = .05; p = .81). None of the analyses using other production measures, such as formant frequency values or Euclidean distances, and the percentage of correct responses in the perception experiment for target, control and non-words, reached significance (all p > .05).
Figure 6. Correct responses in target perception (y-axis) plotted against correct responses in /ɛ/-type word production scores (x-axis) for the Spanish-dominant (B-Spa, filled grey squares) and the Catalan-dominant (B-Cat, empty circles) groups. The dashed line represents chance level, which was 50% for the perception task.
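To see why a correlation of r = .38 does not reach significance with 18 participants while r = .45 does with 36, the significance test for a Pearson correlation can be sketched as below. This is a minimal standard-library illustration, not the software used in the study: the two-tailed p-value is obtained by numerically integrating the Student t density with Simpson's rule, and small discrepancies from the reported p-values are to be expected because the published r values are rounded.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_value, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_value)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

def pearson_test(r, n):
    """Return (t, p) for testing a Pearson correlation r from n pairs against zero."""
    df = n - 2
    t_value = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t_value, two_tailed_p(t_value, df)

for label, r, n in [("Catalan-dominant", 0.38, 18),
                    ("Spanish-dominant", 0.08, 18),
                    ("All participants", 0.45, 36)]:
    t_value, p = pearson_test(r, n)
    print(f"{label}: t({n - 2}) = {t_value:.2f}, p = {p:.3f}")
```

The comparison makes the role of sample size explicit: the same test applied to the pooled sample has twice as many observations and a wider range of scores, which is one reason the pooled correlation reaches significance while the within-group correlations do not.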
In order to further explore the factors contributing to the variability observed in the Spanish-dominant bilingual group, we looked for markers of exposure to native Catalan (i.e., non-accented Catalan speech) in the answers to the questionnaire we used to classify participants based on the language dominance criteria. This was the best approach we could adopt, as we did not have access to recordings from the people in the child's environment using Catalan regularly so as to estimate the level of exposure to accented speech in a reliable manner. Therefore, after considering different options, we targeted a variable from the questionnaire that could offer a good estimate of the quality of the Catalan input at home. This variable was the language (L1) of the grandparents (Catalan or Spanish), as it has been shown that Catalan produced by older generations can be considered more native-like in terms of phonological characteristics, especially affecting the production of the contrast under study (Mora & Nadeu, 2012; Recasens, 1993). So, if grandparents were native speakers of Catalan, we assumed that the Catalan produced by their offspring (in this case the fathers of the participants forming the Spanish-dominant group) would probably also be non-accented. Two subgroups were identified based on this variable and a non-parametric analysis was applied to compare their performance on the perception and production tasks. The analyses only revealed subgroup differences in accuracy of /ɛ/-word production [U(13, 5) = 4.50, z = -2.79, p = .005]. Specifically, those Spanish-dominant bilingual participants who were likely to have been more exposed to non-accented Catalan in the home environment were better in /ɛ/-word production (mean of correct /ɛ/-word production 76.8%, SD = 18.4; range: 53-100) than those in the other subgroup (mean of correct /ɛ/-word production 27.7%, SD = 27.0; range: 0-78).
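The subgroup comparison can be illustrated with the normal approximation to the Mann-Whitney U statistic. The sketch below is a rough check under stated assumptions, not a reconstruction of the original analysis: it uses the variance formula without a tie correction, so it gives a z of about -2.76 rather than the reported -2.79. The half-integer U value suggests tied ranks in the data, and a tie-corrected variance would plausibly account for the small difference, but that is an inference rather than something stated in the text.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic (no tie correction)."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u

def two_tailed_p_from_z(z):
    """Two-tailed p-value for a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Reported values: U = 4.50 for subgroups of 13 and 5 participants.
z = mann_whitney_z(4.5, 13, 5)
p = two_tailed_p_from_z(z)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With subgroups this small (13 and 5 participants), an exact test would normally be preferred over the normal approximation, which is one more reason to treat the result as tentative.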
To sum up, the first analyses exploring the perception-production link in each language group (i.e., Catalan-dominant and Spanish-dominant groups) did not reveal a clear connection between these two skills. On the one hand, the reduced variability and near-ceiling performance of the Catalan-dominant group and, on the other, the high level of dispersion of production data over perception performance in the Spanish-dominant group, limited the possibility of observing any significant correlation. Subsequent analyses on the performance of the Spanish-dominant bilingual group, taking into account the estimated measure of Catalan input quality, suggested the importance of this experience-related factor in modulating the perception-production link. Although the measure was not able to neatly separate the two subgroups of Spanish-dominant participants in terms of accuracy in production, it revealed that the best producers and, crucially, the worst ones in our sample belonged to different subgroups based on this tentative estimation of quality of input in home environments. The quality of Catalan input seems to be an important dimension in explaining the wide range of variability detected in the Spanish-dominant group. This interpretation, however, is preliminary and should be taken with caution until more direct and reliable measures of exposure to accented speech in this population can be obtained.

Discussion and conclusion
The present study examined the perception and production of the Catalan front mid-vowel /e/-/ɛ/ contrast by 4- to 5-year-old early bilinguals who differed in language dominance (L1 Catalan or Spanish). To our knowledge, this is the first study in which the same sample of early Spanish-Catalan bilingual children is tested in both perception and production and the possible link between their performance in these two skills is explored. Data from an XAB perceptual task requiring the comparison between correct and mispronounced tokens of the same stimuli, and production data from a word elicitation task, revealed that the Catalan-dominant group outperformed the Spanish-dominant group in both tasks. The perceptual task included familiar word and non-word conditions. The Catalan-dominant group was overall significantly better than the Spanish-dominant group in the familiar word condition, in which mispronounced tokens involved a change in the target vowels under study. For the non-word condition, both groups failed to accurately detect the target vowel change, even though the Catalan-dominant group's performance approached statistical significance. Regarding production, group differences were found relative to the critical /ɛ/-type word pronunciations, with Spanish-dominant bilinguals making more errors (pronouncing /e/ in /ɛ/-type words) and showing a relatively smaller acoustic distance between these mid-vowel categories, but crucially no overlap, which would have indicated a single rather than two contrastive categories. The analysis of the perception-production link, based on the connection between correct responses to the perceptual task and percentage of correct productions of /ɛ/-type words in each bilingual group, failed to reveal a significant correlation between these two skills.
As already mentioned, for the Catalan-dominant group, the limited variability in their accuracy scores and the high performance in both tasks reduced the statistical power of the correlation analysis between the variables under study. For the group of Spanish-dominant bilinguals, high variability in the production of /ɛ/-type words was found, but it could not be linked to their (also variable) level of sensitivity in the discrimination task. While it could be argued that this variability might be related to individual differences in their language skills or to differences in children's vocabulary size, we have no indication that the participants in the Spanish-dominant group differed from the ones in the Catalan-dominant group in their oral skills; in the absence of more direct testing, such differences could have been detected in the conversation session used to elicit target word production. The apparent disconnection between perception and production skills is likely to be related to inconsistencies in the phonological representation of the target contrast in their lexicon of Catalan words, in spite of children's capacity to distinctly produce the contrastive /e/-/ɛ/ vowels. Discussing the relative weight of the two main factors so far considered (i.e., amount of exposure leading to language dominance and non-native input quality) is thus needed so as to better understand the performance of the Spanish-dominant group and the lack of relationship found between their perception and production results.
To begin with, the role of language dominance in predicting accuracy both in perception and production is discussed. Although the concept of language dominance remains somewhat controversial, it is nevertheless useful when applied to young learners in bilingual environments whose experience with each of the ambient languages is rather unbalanced, often involving a later age of onset and a more limited amount of exposure to one of these languages. Age of onset and amount of exposure significantly affect the performance of the two subgroups of bilinguals in our study. Only Catalan-dominant bilinguals, continuously exposed to this language from birth, could accurately perceive and produce the target contrast, reaching near-ceiling performance, whereas Spanish-dominant participants still lagged behind their Catalan-dominant peers in spite of continued exposure to Catalan, initially limited at home but more extended after entering school under Catalan immersion programs in the kindergarten years. However, amount of exposure as a single factor cannot satisfactorily explain the broad range of variability and the absence of a clear connection between perception and production performance.
If more exposure were needed to compensate for the late onset and less dominant presence of Catalan in the children's family environment, one could expect, still at that age, lower attainment levels, but a close perception-production link in individual performance. One could also expect that this Spanish-dominant bilingual group would eventually catch up with their Catalan-dominant bilingual peers, with gradual gains, both in perception and production, as seen in McCarthy et al.'s (2014) study. However, this is not exactly the pattern of results we obtained (we can identify "good perceivers" in this group that were not "good producers", and vice versa), and the catching-up expectation is far from realistic if one takes into account data from adult studies (e.g., Amengual, 2016a; Bosch & Ramon-Casas, 2011; Bosch et al., 2000). Variability in this group clearly suggests an inconsistent representation of the target vowels in known words, which seems to be blocking the possibility of showing a connection between perception and production abilities. When exposure to Catalan has not been regular and extensive before age 2-3, the perception and production of challenging contrasts in known words can be initially compromised, but accumulation of exposure to the L2 does not guarantee an eventual attainment of native-like phonological performance for the challenging L2 contrast under study, even after several years of experience with this L2 and its regular use on a daily basis. This phenomenon could be easily captured in a production task such as the one we adopted in the current study, where participants could produce different words, many of them high-frequency lexical items, revealing inconsistencies and errors in their phonological representation. Variability in performance cannot be solely attributed to the quantity of input factor; input quality should also be taken into account.
As already mentioned, quality of input is a dimension that can help explain the building and maintenance of inaccurate representations as well as the high levels of variability in production, in spite of extended exposure to the language (see Flege, 2019; also Stoehr et al., 2019 on the effects of non-native maternal input on children's performance). A closer look at individual data from our participants reveals the extremely variable performance of the Spanish-dominant group, with very few of them reaching good accuracy (above 80% correct) in the production of this contrast (see Figure 6). Only 20% of the Spanish-dominant participants showed production of /ɛ/-type words similar to Catalan-dominant participants' performance. We suggest that what seems to be an erratic performance in many of the participants in this group can be better understood by considering the input quality factor. When the input to the learner presents inconsistencies at the phonological level, the representation of the corresponding categories needs to be modified in order to accommodate this non-native variability (see, for example, Durrant, Delle Luche, Cattani & Floccia, 2015), leading to differences in categorization and increasing the likelihood of more variable productions. The representation of this contrast at the lexical level will depend on the particular quality of Catalan input that the learner receives. The sociolinguistic context where this research has been developed represents an extensive language contact context where Spanish-accented Catalan is present in many areas and neighborhoods in which Spanish is also extensively used (Lleó, Benet & Cortés, 2007; Lleó et al., 2008; Mora & Nadeu, 2012). This type of accented input, much more likely to be experienced by Spanish-dominant bilinguals in their home and social environment, could become a key factor in explaining the performance differences between the bilingual groups in this research.
Note that Catalan-Spanish bilingual contexts are rather different from the sociolinguistic contexts in which other bilingual studies have been developed. For example, McCarthy et al. (2014) showed that through regular exposure to non-accented L2 English and limited use of L1 (Sylheti), restricted to the family environment, young immigrant populations in the UK improved their abilities in perception and production of L2 contrasts. In a different study, Netelenbos and Li (2013) explored Canadian-English speaking children acquiring French from school immersion programs, a totally different sociolinguistic context from the one described in the previous study. They found a clear perception-production link, with the same reduced gains in L2 speech learning. They attributed these results to L2 exposure being limited to school settings and English being the dominant language outside school. In both these studies it seems that L2 input quality was controlled and, even though the levels of attainment differed, the perception-production link could be established.
The sociolinguistic and educational context in these studies is, thus, rather different from ours, especially because our Spanish-dominant participants are growing up in a bilingual environment where both Spanish and Catalan are present in everyday life, their L1 is not limited to the family environment and it does not have a lower social status. The extended use of the two languages (Catalan and Spanish) in our community and the phonological and lexical proximity between them is also a relevant factor that must be considered, possibly contributing to the presence of accented speech (here Catalan produced by L1 Spanish speakers, usually late bilinguals). In the present study, the need to explore the quality of Catalan input was motivated by the observation of the unexpectedly large variability in production and the perception-production disconnection in the Spanish-dominant bilingual group. We could only tentatively estimate the likelihood of exposure to Catalan accented speech from the language background questionnaire that gathered information regarding the different sources of L1 and L2 input provided to the child participants. Based on the grandparents' L1, we estimated the presence/absence of accented speech in home environments and we found evidence that those participants whose grandparents were L1 Catalan speakers tended to be better in the production of the /e/-/ɛ/ Catalan contrast. Note that Catalan provided by grandparents can be considered native-like, as the influence of Spanish over Catalan (in terms of phonological interference) has been especially observed in younger but not in older generations (e.g., Recasens, 1993). Although tentative, this result offers a first indication that input quality is connected with production data.
It was effective in separating participants at the extremes of the continuum: those failing to correctly produce any /ɛ/-words and those with production accuracy falling within the range of the Catalan-dominant bilingual group. However, the whole range of variability still remains to be explained. The nature of this variable does not allow us to find a robust explanation for the large variability obtained in the production abilities of the Spanish-dominant group, but it nevertheless points to the relevance of the input quality factor as one dimension contributing to account for variability and persistent difficulties in Spanish-dominant bilingual children in encoding the challenging L2 Catalan contrasts.
We cannot conclude without addressing some of the methodological limitations of the study that might have also played a role in the results we have obtained. The perception task involved a limited number of familiar word items on which the measures were based. A task including more items could have been more adequate to explore the discrimination of the target contrast in words of different length and different cognate status, possibly becoming more informative about bilinguals' capacity to encode the target contrast. Including more items in the perceptual task might have revealed a greater dispersion pattern, possibly making our data more comparable with adult performance in Amengual's (2016a) study involving a rather similar population of bilinguals. The age of the participants was crucial for us to opt for a shorter task containing a limited number of items to be tested. The task also included non-words, which have proved to be difficult for young children, especially with the inclusion of the masking noise we applied so as to tap into the sublexical representation level. However, the masking noise and the temporal distance between the model and the target stimuli inevitably taxed the short-term memory processes needed in this task and data were not as informative as expected. The performance of both groups was affected by this design and we missed an opportunity to reveal a more stable and accurate representation of the contrast in the Catalan-dominant group. All in all, future research should take these limitations into account to improve the design of the task to test perception skills in preschoolers. Concerning the production task, some methodological problems were also present. The number of recorded productions from each child participant was variable, some of them contributing a limited number of words, which might have affected the values when the percentage of correct items was computed.
This is a limitation typically found when obtaining spontaneous samples of speech from young children, but here it might have negatively affected the results of the perception-production link that we were exploring. Future research should find better ways to avoid these limitations.
To sum up, regular and predominant language exposure during the first two years of life is one of the critical factors that influences not only perception and production, but also the dynamics that can be established between these two language skills relative to the phonological specification of the target contrast. Sequential exposure seems to modulate the connection between perception and production skills, leading to alternative strategies in word encoding and production for specific contrasts. Both the quantity of exposure and the quality of the input relative to the specific contrastive sounds in the less dominant language, in the bilingual contexts under study, are factors that can hinder accurate (native-like) performance involving the target contrast. The quality of the input seems to affect individual variability in production but, in order to better explore the influence of this experience-related factor in perception as well as in production, direct measures of the input to the young learner need to be implemented. Future studies will also have to consider the relative influence that the nature of the lexical items used to explore children's phonological representations (e.g., cognate versus non-cognate words; frequent vs. infrequent lexical items) may have on the individual variability in perception and in production, and whether and to what extent this dimension also modulates the properties of the accented speech input (see Bosch & Ramon-Casas, 2011). In the phonological domain, further research is needed to reach a better understanding of the complex interplay among input quantity, input quality and specific language properties as factors affecting variability in perception and production skills in early bilingual children.