Expecting the unexpected: Code-switching as a facilitatory cue in online sentence processing

Abstract
Despite its prominent use among bilinguals, psycholinguistic studies have reported code-switch processing costs (e.g., Meuter & Allport, 1999). This paradox may partly be due to the focus on the code-switch itself instead of its potential subsequent benefits. Motivated by corpus studies on CS patterns and sociopragmatic functions of CS, we asked whether bilinguals use code-switches as a cue to the lexical characteristics of upcoming speech. We report a visual world study testing whether code-switching facilitates the anticipation of lower-frequency words. Results confirm that US Spanish–English bilinguals (n = 30) use minority (Spanish) to majority (English) language code-switches in real-time language processing as a cue that a less frequent word would ensue, as indexed by increased looks at images representing lower- vs. higher-frequency words in the code-switched condition, prior to the target word onset. These results highlight the need to further integrate sociolinguistic and corpus observations into the experimental study of code-switching.

In parallel, experimental research on bilingual language control has capitalized on the use of cued language switching tasks (e.g., Meuter & Allport, 1999; Costa & Santesteban, 2004; Gollan & Ferreira, 2009). These tasks are primarily focused on production, often lack sentential context, or include language switches that are not representative of the discourse-supported code-switches found in bilingual speech (e.g., Schotter, Li & Gollan, 2019). The goal of these studies is to test the limits of bilingual language control in terms of switch costs, mixing costs, and their relationship to domain-general cognitive processes such as inhibition or increased attention.
While providing important insights into bilingual sentence processing and language control, these research directions do not readily incorporate the sociopragmatic motivations for CS (Myers-Scotton & Jake, 2001) or the processing benefits that CS may provide to the bilingual comprehender. Its frequency and functional distribution in bilingual discourse suggest that CS affords processing benefits which override purported processing costs. The current study experimentally tests one such processing benefit of CS: alerting to and aiding prediction of upcoming unexpected or less predictable information, operationalized as low-frequency words in a neutral sentential context. We discuss several theoretical frameworks which can account for this processing benefit of code-switching. Finally, we call for establishing a new direction in code-switching research which focuses on the often beneficial effects of code-switching on the processing of subsequent structures, rather than solely focusing on switch costs at the switch site.
Background
CS is associated with a myriad of functions, some of which are identity expression (Velasquez, 2010), situational marking, (re)negotiating social relations (Myers-Scotton, 1993), face-saving (Bentahila, 1983), discourse organization (Auer, 1988), emphasis (Gumperz, 1982), and introducing indirect speech (Albirini, 2011). More recently, Myslín and Levy (2015) tested a proposal that CS serves an information-distribution function for organizing discourse, using statistical modeling of a bilingual corpus. The authors collected a corpus of conversations among five Czech-English bilinguals living in an English-speaking community, including two older L1 Czech-L2 English speakers, as well as three younger heritage Czech speakers. The corpus totalled three hours and was analyzed in terms of intonational units. From this corpus, the authors extracted utterances spoken by two older Czech-dominant bilinguals which either contained a final-word code-switch (Czech to English) or did not contain switches (unilingual Czech). The Czech-English code-switch direction was chosen as it was predominant in the corpus overall, regardless of the language dominance of the speakers (601 Czech-English vs. 24 English-Czech code-switches [Myslín & Levy, 2015]).
To analyze what information-theoretic constructs could affect the switch vs. non-switch status of the intonational units, the authors calculated a range of factors. These factors included participant constellation (presence/absence of a younger bilingual), various lexical accessibility factors (relative frequency ratio [the extent to which the frequencies of language equivalents within each language differ], word length, imageability, concreteness, part of speech), lexical contextual factors (trigger presence [proper nouns, cognates, or phonologically nonintegrated loanwords in the vicinity of the final word], lexical cohesion), syntactic contextual factors (collocational strength, syntactic dependency distance), and predictability of meaning in context. Importantly, predictability of meaning in context was calculated using a modified Shannon guessing game performed by another set of Czech-English bilinguals. Participants were provided with the intonational units up until the final word and were asked to guess the meaning of the final word in the unit. The predictability of meaning was calculated as the percentage of correct guesses. These factors were included as predictors in a logistic regression model and a model selection procedure was implemented. The authors found that predictability was the second most explanatory variable for the code-switch behavior, after part of speech, such that these Czech-dominant Czech-English bilinguals code-switched into English (presumably the more salient or marked language) at more unpredictable words. The authors cite audience design (Clark & Murphy, 1982) as the cause of this behavior: bilinguals are taking into account their interlocutor's language knowledge and choosing the more salient code to highlight and promote intelligibility of particularly informative portions of speech.
Both older non-heritage and younger heritage bilinguals under this explanation would have an understanding of which is the more salient code while listening to older bilinguals. Myslín and Levy (2015) take the more salient code to mean the less used or the less dominant code of the speakers themselves, i.e., English. Nevertheless, the Czech-English code-switches were prevalent in the discourse of bilinguals of different language dominance, including three English-dominant younger bilinguals (Myslín & Levy, 2015). Therefore, the status of the more salient code is not necessarily tied to a speaker's own language dominance, but to the community-wide designation of a specific language as more salient. Alternatively, the status of the more salient language could be related to the lower frequency of one code in a specific language situation, e.g., the frequency of English in a predominantly Czech language context. In majority-minority language situations, switching from the minority language into the majority language, i.e., the language of power, is the most common switch direction (e.g., Nicaraguan English Creole to Spanish in Nicaragua: Blokzijl, Deuchar & Parafita Couto, 2017; Bhatt, 2013, as cited in Blokzijl et al., 2017; Spanish to English in the US: Blokzijl et al., 2017; Poplack, 2000; Zentella, 1997).
Another recent statistical corpus-modeling study examined the effect of word surprisal and word entropy, among other factors, on the CS behavior in a Chinese-English written text corpus (Calvillo, Fang, Cole & Reitter, 2020). The corpus consisted of the online-forum discourse of Chinese-English bilinguals who had been studying in the US for several years. The researchers translated the Chinese-English sentences to Chinese and paired them with structurally similar non-switched Chinese sentences from the corpus. They devised a core logistic regression model to account for the likely factors shaping the CS behavior, containing the following characteristics of the first CS word or the equivalent word in the paired non-code-switched sentence: word frequency, word length, sentence length, part of speech, dependency relation, dependency distance, and location in sentence. They then tested whether adding word surprisal and word entropy separately improved the model, using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The authors found that word surprisal, defined as the negative log-probability of a word given a certain number of previous words, significantly improved the fit of the model. Therefore, a measure similar to the predictability of meaning in context used in Myslín and Levy (2015), was also shown to shape CS behavior. Importantly, the word frequency factor was trending even after adding the correlated factor of word surprisal to the model, such that less frequent words tended to be a more likely code-switch site. The authors discuss audience design as one explanation for the obtained effects. They also propose that the code-switching behavior could be due to the fact that retrieving words with higher surprisal and lower frequency results in an increase of cognitive effort, thus releasing inhibition over L2 and resulting in a code-switch.
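To make the surprisal measure concrete, the following minimal sketch estimates it from bigram counts over a toy corpus. The corpus, the add-one smoothing, and the function name bigram_surprisal are assumptions made purely for illustration; they are not the language model used by Calvillo et al. (2020).

```python
import math
from collections import Counter

def bigram_surprisal(corpus, prev_word, word):
    """Surprisal in bits: -log2 P(word | prev_word), with add-one smoothing."""
    vocab = {tok for sent in corpus for tok in sent}
    bigrams = Counter()
    context = Counter()
    for sent in corpus:
        for a, b in zip(sent, sent[1:]):
            bigrams[(a, b)] += 1   # how often b follows a
            context[a] += 1        # how often a is followed by anything
    prob = (bigrams[(prev_word, word)] + 1) / (context[prev_word] + len(vocab))
    return -math.log2(prob)

# Hypothetical toy corpus (not taken from any study).
toy_corpus = [
    ["find", "the", "dog"],
    ["find", "the", "dog"],
    ["find", "the", "stroller"],
]

frequent = bigram_surprisal(toy_corpus, "the", "dog")
rare = bigram_surprisal(toy_corpus, "the", "stroller")
```

In this toy setting the rarer continuation ("stroller") receives the higher surprisal, which is the sense in which lower-frequency words tend to be less predictable in a neutral context.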
Other than or in addition to audience design, potential bilingual listeners' ability to use switches as cues to more informative or less predictable material could stem from their sensitivity to code-switching distribution in bilingual discourse, as predicted by theories linking production and comprehension (e.g., P-chain framework, Dell & Chang, 2014; Production-Distribution-Comprehension, PDC, MacDonald, 2013; Interactive Alignment, Pickering & Garrod, 2004). Thus, if speakers' choices to code-switch are driven by information-theoretic and sociopragmatic functions, listeners could statistically learn the pairing between a code-switch and the ensuing less expected information through exposure to the distribution of code-switches produced within the bilingual community. A similar concept has been demonstrated for grammatical patterns and code-switching: more commonly attested code-switches are also easier to process (e.g., more common switches into the progressive vs. past participle in auxiliary-verb phrases, or more common switches after a masculine vs. a feminine gender determiner in Spanish-English code-switching; e.g., Guzzardo Tamargo, Valdés Kroff & Dussias, 2016; Valdés Kroff, 2016; Valdés Kroff, Dussias, Gerfen, Perrotti & Bajo, 2017). Thus, the code-switching patterns, possibly stemming from sociopragmatic functions and/or difficulties in word retrieval, could in turn affect a bilingual listener's prediction while comprehending speech. In our case, because code-switches tend to occur at points of greater surprisal or less expected information, code-switches could potentially serve as a facilitatory cue to promote the prediction of less expected, i.e., low-frequency, items for the listener.
Relatedly, sociolinguists have determined that a recurrent sociopragmatic motivation for CS to the more marked language is speaking about emotional, information-rich taboo topics (Bentahila, 1983; Tomić, 2015; Tomić & Valdés Kroff, forthcoming). Bilinguals seem to switch to the language of power (i.e., typically the majority language) prior to or on negative, taboo concepts, in order to possibly ease their own or listener's processing of the concept. We illustrate this function in Example 1.
1) waħed lli ʕandu la diarrhée tajSwb šwija
'someone who has diarrhea can take a bit of it' (Arabic-French, Bentahila, 1983, p. 236)
Socially illicit taboo concepts are relatively infrequent in discourse, compared to socially neutral words, and are informative to the listener (∼0.3% to 0.5% taboo word rate in spoken discourse; McEnery, 2006; Mehl & Pennebaker, 2003). The pattern of minority to majority language code-switches preceding taboo words also supports the hypothesis that CS may offer processing benefits by signaling and thus aiding the prediction of highly informative or more unpredictable portions of upcoming speech. The CS function of cueing listeners to more informative speech aligns it with other salient speech events which can be used as discourse-organizational markers, such as disfluencies. Disfluencies, or irregularities in fluent speech, including "uh", "um", or pauses, occur when referring to new vs. given information (Arnold, Wasow, Ginstrom & Losongco, 2000; Barr, 2001). Experimental research has shown that this distribution regularity helps monolinguals predict unexpected, new (Arnold, Fagnano & Tanenhaus, 2003; Arnold, Tanenhaus, Altmann & Fagnano, 2004; Arnold, Kam & Tanenhaus, 2007) or low-frequency words (Bosker, Quené, Sanders & de Jong, 2014). In these visual world studies, disfluent instructions cause listeners to start looking at the unexpected item faster, shortly prior to or following the onset of the target word. This predictive benefit that disfluencies afford could be due to listeners attributing them to the difficulty speakers might be experiencing retrieving less accessible language material, such as the difficulty when retrieving discourse-new concepts or lower-frequency concepts. Another explanation invoked in Arnold et al. (2004) is the aforementioned statistical learning of co-occurrences between disfluencies and particular types of language use (MacDonald, 2013).
CS is a highly regularized linguistic behavior, in terms of both the structural and pragmatic rules it conforms to (e.g., Myers-Scotton, 1993), and as such does not represent disfluent speech in most cases. Nevertheless, it represents a salient speech event, similar to disfluencies, which can easily become meaningful to comprehenders. Therefore, we suggest that despite its observed processing costs, CS may serve a similar, important discourse function of signaling specific types of upcoming linguistic information to bilingual comprehenders. Listeners should thus be able to predict upcoming less expected information based on the presence of a code-switch.

Current study
To test whether Spanish-English bilinguals use CS as a facilitative cue in sentence processing, we employed the visual world paradigm with eye-tracking (Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995) using two-picture displays. We operationalized predictability as lexical frequency such that, on experimental trials, one image represented a low-frequency word and the other a high-frequency word. Lexical frequency, contextual predictability of meaning, and word surprisal are different, but correlated measures (Calvillo et al., 2020). In the maximally simplified and neutral context we chose as an initial test of our hypothesis, the less frequent word would necessarily be the less predictable item.
Audio instructions asked participants to select a target image. The instructions were either in unilingual Spanish (i.e., the minority language) or code-switched into English (i.e., the majority language). We chose the Spanish-English CS direction for several reasons. Czech-English L1-L2 bilinguals have been found to use switches from the heritage, minority language to the language of the majority to mark more informative portions of speech to other bilinguals, including younger English-dominant heritage Czech speakers (Myslín & Levy, 2015). Presuming that a similar sociopragmatic function of code-switching to less predictable items exists within the Spanish-English bilingual community in the US, the switch direction would be from Spanish to English, regardless of the language dominance of the individuals. English as the majority language in the US corresponds to the language of power and switches to the language of power are in general more frequent (Bhatt, 2013, as cited in Blokzijl et al., 2017; Blokzijl et al., 2017; Myslín & Levy, 2015), including in the Spanish-English bilingual community in the US (Moreno, Federmeier & Kutas, 2002; Herring, Deuchar, Couto & Quintanilla, 2010; Valdés Kroff, 2016; Valdés Kroff, Guzzardo Tamargo & Dussias, 2018; cf. Blokzijl, Deuchar & Parafita Couto, 2017). Consequently, we only focus on this one switch direction as the more sociolinguistically representative and ecologically valid switch direction. Switches in the opposite direction would not provide an adequate test for our hypothesis, as switching from English into Spanish could have introduced confounding variables. For example, the less frequent English-Spanish code-switches could have been unexpected and difficult to process, affecting the prediction and processing of the post-switch language material. Additionally, English to Spanish code-switches are likely associated with different sociopragmatic functions altogether.
Crucially, in our study code-switches occurred before the naming of the target object. If bilinguals indeed interpret a code-switch as a signal to upcoming unexpected information, then the proportion of fixations to lower frequency items should be higher on code-switch trials as compared to Spanish trials BEFORE the onset of target words.

Instructions and audio recordings
We constructed two carrier phrases, each with a unilingual Spanish and Spanish-English code-switched variant. Code-switches preceded the name of the target object by three words, one content and two function ones, to avoid any immediate switch costs affecting the results. The code-switch occurred after an article, at the noun, which is a well-documented, frequent code-switch site (Valdés Kroff, 2016). Carrier phrases (Example 2) and target names were recorded by a balanced Puerto Rican Spanish-English speaker, a trained audiologist.
2) a. Encuentra/Elige el dibujo de un/una/Ø __________
b. Encuentra/Elige el drawing of a/an/Ø __________
"Find/Select the drawing of a/an/Ø __________"
Picture names were recorded in isolation with declarative, falling intonation. Carrier phrases were recorded in combination with a stand-in noun, and subsequently cut, to ensure that the intonation and article pronunciation were as natural as possible. The onset of the code-switch was briefly delayed compared to the comparable Spanish-only point (mean difference = 22 ms). The delay was the product of natural pronunciation prolongation and corroborates experimental findings from Fricke, Kroll, and Dussias (2016). Using a Spanish-English bilingual corpus, the authors found that speech rate is reliably prolonged prior to code-switches, which in turn aids the processing of code-switches as demonstrated by an experimental study. We decided to leave the delay so as not to make the processing of the switches more difficult (Shen, Gahl & Johnson, 2020) or introduce potential confounds. We lay out potential consequences of this design feature on our results in the Discussion section. The time frame from the onset of the code-switch to target word onset was longer in the Spanish-only analogue (Mean = 955 ms) than in the CS conditions (Mean = 833 ms), presumably due to an additional syllable in the Spanish equivalent for "drawing" and the additional syllable in trials with feminine determiners. Carrier phrases were scaled to an average intensity of 70 dB, and nouns were scaled to an average intensity of 66 dB using Praat (Boersma & Weenink, 2018) to ensure volume uniformity and a natural volume decline at the end of sentences. Carrier phrases and target nouns were concatenated without a pause, to mimic the way they were naturally pronounced by the speaker when recording carrier phrases in combination with stand-in nouns.
We chose experimental images to minimize the frequency differences between Spanish and English language equivalents for the picture names (Table 1), to avoid one language equivalent being more accessible than the other.
A two-tailed paired t-test showed no significant difference in frequency between English and Spanish picture name counterparts (t[63] = −1.527, p = .131, mean difference = −0.109).
In parallel, we devised experimental items, i.e., image pairs, to maximize frequency differences between pair members. One-tailed paired t-tests indicated a significant frequency difference between the high- and low-frequency experimental pair members (Spanish: t[33] = 21.778, p < .001, mean of differences = 3.595; English: t[33] = 33.993, p < .001, mean of differences = 3.536). We also matched experimental pairs (15 feminine) for the gender of the Spanish translation equivalent to prevent participants from using grammatical gender as a predictive cue (Valdés Kroff et al., 2017). Twelve English-Spanish cognates were included in the experimental trials, due to low availability of appropriate images. We paired cognates with each other to control for possible cognate effects, resulting in 6 experimental cognate-pairs.
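The paired comparisons reported above rest on the mean and standard error of the within-pair frequency differences. The sketch below computes the paired t statistic and its degrees of freedom for a handful of invented log-frequency values; the numbers and the function name paired_t are hypothetical, and the p-value (which requires the t distribution) is deliberately left out.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1

# Hypothetical log-frequency values for four image pairs (illustration only).
high = [5.1, 4.8, 5.4, 5.0]
low = [1.6, 1.2, 1.9, 1.3]

t_stat, df = paired_t(high, low)
```

A large positive t indicates that the high-frequency member of each pair is consistently more frequent than its low-frequency partner, which is the property the item selection aimed for.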
We created four lists of experimental audio instructions, with one audio experimental item appearing in one of four conditions across lists: Spanish, Low Frequency; CS, Low Frequency; Spanish, High Frequency; or CS, High Frequency targets. This process resulted in eight audio trials per condition and ensured that a participant sees an experimental trial in only one of the four conditions. Nevertheless, as described in greater detail below, we analyzed the looks to images regardless of target/distractor status in a critical time window prior to when listeners began to process the phonological information of the named target item in the audio instructions. Consequently, participants saw sixteen images pertaining to 4 conditions (e.g., a low and a high frequency image in the Spanish unilingual condition, regardless of the frequency of the target). Trial order was pseudorandomized to ensure no more than three experimental pairs appeared in a row. Each list had 64 filler trials, drawn from the same database. Filler pair members were similar in frequency to each other. The fillers were the same across lists, but their order and image position were randomized in presentation.
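The list construction described above follows a Latin-square logic, which can be sketched as follows. The item count of 32 is an assumption inferred from eight trials in each of four conditions, and the rotation scheme and names (CONDITIONS, build_lists) are illustrative rather than the procedure actually used to build the lists.

```python
CONDITIONS = [
    ("Spanish", "Low"),
    ("CS", "Low"),
    ("Spanish", "High"),
    ("CS", "High"),
]

def build_lists(n_items=32):
    """Rotate items through conditions so each list contains every item once."""
    lists = []
    for list_idx in range(len(CONDITIONS)):
        trials = []
        for item in range(n_items):
            # Shift the condition assignment by one step for each new list.
            condition = CONDITIONS[(item + list_idx) % len(CONDITIONS)]
            trials.append((item, condition))
        lists.append(trials)
    return lists

lists = build_lists()
```

Under this rotation every list holds eight items per condition, and every item appears in each of the four conditions exactly once across the four lists.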
The experiment was programmed in Experiment Builder (SR Research, 2011). Images were presented on a white background. Eight possible picture locations were arranged in an ellipse on the screen (Figure 1). To avoid overlap between the looks to the target vs. distractor items, images were never in adjacent positions while remaining unpredictable as to their location from one display to the next.
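The display constraint described above (eight slots on an ellipse, with the two images never in adjacent slots) can be sketched in a few lines. The screen centre, the ellipse radii, and both function names are hypothetical values chosen for illustration; the actual Experiment Builder layout is not specified here.

```python
import math
import random

def ellipse_positions(n=8, cx=960, cy=540, rx=700, ry=400):
    """Return n (x, y) slot centres evenly spaced by angle on an ellipse."""
    return [
        (cx + rx * math.cos(2 * math.pi * k / n),
         cy + ry * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]

def pick_nonadjacent(n=8, rng=random):
    """Pick two distinct slot indices that are not neighbours on the ring."""
    while True:
        a, b = rng.sample(range(n), 2)
        gap = abs(a - b)
        if min(gap, n - gap) > 1:  # ring distance; slot n-1 neighbours slot 0
            return a, b

slots = ellipse_positions()
first, second = pick_nonadjacent()
```

Drawing the pair at random on every trial keeps image location unpredictable, while the ring-distance check enforces the non-adjacency requirement.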

Participants
Thirty Spanish-English bilingual participants (4 male), age range 18-32 (M = 20.83, SD = 3.53), were recruited at the University of Florida and compensated in course credit or cash. All participants reported having begun learning both English and Spanish before puberty (Spanish age of acquisition [AoA] Mean = 0.67, SD = 2.35; English AoA Mean = 3.67, SD = 2.48).
Participants completed a Language History Questionnaire (LHQ) and adapted standardized grammar tests from the Michigan English Language Institute College English Test (MELICET) and the Diplomas of Spanish as a Foreign Language (DELE; Table 2). The order of the main experiment and tests was counterbalanced, as well as the language order of proficiency measures. The LHQ responses and proficiency scores can be found in the Open Science Framework repository: https://osf.io/azcn4/.
For the Spanish words "carriola" and "recogedor", the CELEX frequency count was 0, so a comparable frequency metric was taken from EsPal (Duchon, Perea, Sebastián-Gallés, Martí & Carreiras, 2013).

Aleksandra Tomić and Jorge R. Valdés Kroff
Twenty-eight participants reported CS in the LHQ, whereas two participants responded with "Not sure". All participants completed questions on the frequency of use and exposure to CS in speaking and writing. The mean response to frequency of oral CS use was 4.1, SD = 0.85, and the mean response to aural exposure to CS was 3.87, SD = 0.97, with 1 indicating "Never" and 5 "Always" (Table 2).
Our sample reveals greater overall proficiency in English than Spanish as well as a greater amount of daily exposure to English (Table 2), reflecting the participants' likely status as heritage speakers (Prada Pérez & Hernández, 2017).

Procedure
Participants were given instructions to listen to audio recordings and use the mouse to click on the named image. The instructions were presented in Spanish-English code-switched speech, to promote a bilingual language context. Eye movements were recorded from the right eye (viewing was binocular) using an SR Research EyeLink 1000 Plus desk-mounted eye-tracker. Participants' heads were stabilized using a chin rest, and they were seated approximately 70 cm from a 24-inch LED BenQ monitor. Participants completed a 9-point calibration and validation test. Calibration was deemed successful if average error was at or below 0.5 degrees. Participants completed eight practice items before the main experiment.

Data analysis
Importantly, we are interested in processing BEFORE the onset of the target word. We selected a target time period for eye-movement analysis that was 200 ms before and after target onset (Figure 2). Planning and launching an eye-movement takes approximately 150-200 ms (Allopenna, Magnuson & Tanenhaus, 1998; Travis, 1936; cf. Altmann, 2011), so the time region of 200 ms before and after the target onset provides an approximate time window into predictive processes that occur prior to participants' ability to process the onset of the target word. Looks to images, aggregated by frequency regardless of target/distractor status, as in the final analysis (Figure 2, panel A), or by target or distractor selection (Figure 2, panels B and C, respectively) corroborate this interpretation, as we see the same preferential looking patterns in our chosen time slot in all panels. These similar processing patterns suggest that the participants did not yet process the target item onset during the critical time slot and were instead being driven by the presence or absence of CS.
We excluded incorrect-response trials (1.35% data loss), unrelated looks, and time spent in blinks and saccades (32.27% data loss). This resulted in 33.62% overall data loss. The data loss of 1.35% due to incorrect responses suggests that participants were performing at ceiling, despite the presence of low-frequency items. The data loss predominantly stemmed from blinks, saccades, and unrelated looks (32.27%). We note that a greater number of saccades, blinks, and looks outside of our target items may have occurred because we were interested in predictive processing before the onset of the yet-to-be-named target item. Additionally, unincluded samples due to blinks, saccades, and unrelated looks were mostly interspersed throughout the region of interest. Only ∼4.58% of the entire data set corresponded to entire trial losses due to blinks, saccades, and unrelated looks. The full data set and preprocessing code can be found in the Open Science Framework repository: https://osf.io/azcn4/.
The Time variable in the eye-tracking data was binned into 20 ms bins. The dependent variable was the Proportion of fixations towards items aggregated by condition and time bins. The independent variables were Language Context (Spanish, Code-switched), Frequency of the Fixated Image (High, Low), and Dominance (continuous). Importantly, the data were analyzed as looks to higher or lower frequency items regardless of their target/distractor status. This was done to maximize the number of data points for analysis, and because our research question investigates predictive processing prior to target processing. Dominance was operationalized as the ratio between the DELE and MELICET scores, with a higher ratio meaning more relative Spanish dominance. Due to a procedure error, the Dominance data was not available for one participant. Their Dominance score was substituted with the Dominance mean. Dominance and Proportion of Looks were standardized by z-scoring.
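The preprocessing steps described above (20 ms binning, fixation proportions, the DELE/MELICET dominance ratio, and z-scoring) can be illustrated with a minimal sketch. The sample format, the choice of denominator (all looks to either image within a condition and bin), and all function names are assumptions; the actual preprocessing code is the one deposited in the OSF repository.

```python
from collections import defaultdict
from statistics import mean, stdev

def bin_time(t_ms, width=20):
    """Map a timestamp (ms relative to target onset) to the start of its bin."""
    return (t_ms // width) * width  # floor division also handles negative times

def fixation_proportions(samples, width=20):
    """samples: iterable of (condition, t_ms, image), image in {'high', 'low'}.
    Returns {(condition, bin_start, image): share of looks in that bin}."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for condition, t_ms, image in samples:
        b = bin_time(t_ms, width)
        counts[(condition, b, image)] += 1
        totals[(condition, b)] += 1
    return {key: n / totals[key[:2]] for key, n in counts.items()}

def dominance(dele_score, melicet_score):
    """Higher ratio = more relative Spanish dominance."""
    return dele_score / melicet_score

def z_score(values):
    """Standardize values using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical eye-tracking samples (illustration only).
toy_samples = [
    ("CS", -190, "low"),
    ("CS", -185, "low"),
    ("CS", -181, "high"),
    ("Spanish", 5, "high"),
]
proportions = fixation_proportions(toy_samples)
```

In the toy data, the three code-switched samples fall into the bin starting at −200 ms, giving a proportion of 2/3 for the low-frequency image in that bin.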

Growth curve analysis
We performed a Growth Curve Analysis, with time transformed with orthogonal polynomials of 1st, 2nd, and 3rd order (OT1, OT2, OT3, respectively; Mirman, 2014). This analysis would have modeled the curvature of the time series data. Nevertheless, for the critical time window, a full model with OT1, OT2, and OT3 as main effects and their interactions or a reduced model individually including OT1 and OT2 or OT3 as main effects and their interactions did not provide a significant improvement over a model that includes only OT1 as a main effect and its interactions with the other predictors. Model comparisons were performed using the base R anova function and the AIC index. Therefore, we only report the more parsimonious model.
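For readers unfamiliar with orthogonal polynomial time terms, the sketch below builds OT1 to OT3 by Gram-Schmidt orthogonalization of the powers of the bin index. It assumes 20 bins (a 400 ms window in 20 ms steps) and is a plain-Python stand-in for R's poly function; the resulting columns should match such terms up to sign, but this is not the analysis code itself.

```python
import math

def orthogonal_polynomials(n_bins, degree=3):
    """Return `degree` mutually orthogonal, unit-length columns over n_bins steps."""
    x = [float(i) for i in range(n_bins)]
    basis = [[1.0] * n_bins]  # constant column, used only for orthogonalization
    for d in range(1, degree + 1):
        col = [xi ** d for xi in x]
        for b in basis:
            # Remove the component of col that lies along an earlier column.
            scale = sum(c * bi for c, bi in zip(col, b)) / sum(bi * bi for bi in b)
            col = [c - scale * bi for c, bi in zip(col, b)]
        norm = math.sqrt(sum(c * c for c in col))
        basis.append([c / norm for c in col])
    return basis[1:]  # drop the constant column

# Assumed: 20 time bins in the critical window.
ot1, ot2, ot3 = orthogonal_polynomials(20)
```

Because the terms are uncorrelated with one another and centered on zero, the linear term can be kept on its own, as in the reported model, without its estimate depending on whether the quadratic and cubic terms are included.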
Proportion of looks to items in each bin was fit to a linear mixed-effects model using the lme4 package (Bates, Maechler, Bolker & Walker, 2015) in R (R Core Team, 2017). The model included Language (contrast coded: Spanish −0.5, Code-switched +0.5), Frequency (contrast coded: High −0.5, Low +0.5), Dominance (continuous), and the linear orthogonal polynomial term, OT1 (continuous), as well as their interactions. Random intercept for Subject and Language, Frequency, and orthogonal polynomial term OT1 slopes by Subject were included in the model as random effects. We report significant main effects and interactions below. Full model output (Table A1) and model fit graph (Figure A1) are in Appendix A. The analysis code can be found in the Open Science Framework repository: https://osf.io/azcn4/.
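One consequence of the ±0.5 contrast coding is worth spelling out: in a simple balanced 2 × 2 design with no other predictors, the Language x Frequency coefficient equals the difference of differences between cell means. The sketch below demonstrates this with invented cell means; in the full mixed model, which also contains Dominance and OT1, the correspondence is only approximate, so this is offered as an interpretive aid rather than a reproduction of the analysis.

```python
LANG = {"Spanish": -0.5, "CS": 0.5}
FREQ = {"High": -0.5, "Low": 0.5}

def difference_of_differences(m):
    """(CS Low - CS High) - (Spanish Low - Spanish High)."""
    return (m[("CS", "Low")] - m[("CS", "High")]) - (
        m[("Spanish", "Low")] - m[("Spanish", "High")]
    )

def coded_interaction(m):
    """Same quantity expressed through the contrast codes.
    Each product LANG * FREQ is +0.25 or -0.25, so multiplying by 4 gives +1 or -1."""
    return sum(4 * LANG[lang] * FREQ[freq] * value for (lang, freq), value in m.items())

# Hypothetical cell means of fixation proportions (illustration only).
cell_means = {
    ("Spanish", "High"): 0.55,
    ("Spanish", "Low"): 0.45,
    ("CS", "High"): 0.40,
    ("CS", "Low"): 0.60,
}

explicit = difference_of_differences(cell_means)
coded = coded_interaction(cell_means)
```

With these invented means, a high-frequency preference in the Spanish condition and a low-frequency preference in the CS condition combine into a positive interaction of 0.3 under both formulations.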
The main effect of OT1 was significant, b = 0.286, SE = 0.094, t = 3.027, such that overall looks increased over time. The Language x Frequency interaction proved significant, b = 0.490, SE = 0.047, t = 10.337, such that Low-frequency items were fixated more in the CS conditions compared to the Spanish conditions. For the crucial interaction of Language x Frequency, we report partial effects means, standard errors, and confidence intervals from the model equivalent to the one reported, yet without contrast coding, in Table 3. The partial effects table was produced using the effects package (Fox, 2003; Fox & Weisberg, 2019).
Additionally, the interaction of Language, Dominance, and Frequency was significant, b = 0.285, SE = 0.047, t = 6.016, such that the participants with higher relative Spanish dominance looked at the Low-frequency items more in the CS condition compared to the Spanish condition. Upon visual inspection (Figure 3), this effect seemed to stem from the more Spanish-dominant speakers showing the frequency bias, or the tendency to look at the more familiar, higher-frequency item (e.g., Dahan, Magnuson & Tanenhaus, 2001) in the Spanish condition. This high-frequency bias in more Spanish-dominant participants was more dramatically reversed to the low-frequency item preference in the CS condition, compared to less Spanish-dominant participants who did not exhibit the high-frequency bias in the Spanish condition. Although dominance was introduced as a continuous measure in our model, we visually present the results by categorically splitting the participants via mean split (Figure 3).
The interaction of Language, Frequency, and the linear time term was significant, b = 0.89, SE = 0.211, t = 4.212, such that looks towards the Low-frequency item increased over time in the CS condition compared to the Spanish condition. The interaction of Language, Dominance, Frequency, and the linear time term was significant as well, b = 0.822, SE = 0.212, t = 3.889, such that the participants with higher relative Spanish dominance looked more over time at the Low-frequency item in the CS condition compared to the Spanish condition.

Discussion
We investigated whether bilingual listeners interpret a code-switch as signaling upcoming less expected or less predictable content, operationalized as lexical frequency. The results point to a global increase of looks to low-frequency items in CS instructions in a 400 ms critical time window prior to when participants process the onset of target items. This result is especially revealing given the robustly documented frequency bias in visual world studies, a tendency to look at higher frequency, more familiar items (e.g., Dahan, Magnuson & Tanenhaus, 2001), which we replicated in the unilingual Spanish condition. The results corroborate our hypothesis that CS plays a role in signaling upcoming unexpected information, even in our simple operationalization of unexpectancy as lexical frequency and in the neutral sentential contexts that participants heard. This hypothesis was driven by the distribution of code-switches in bilingual discourse, which suggests that code-switches have a role in information distribution (Myslín & Levy, 2015), such that more informative speech occurs after/on the switch from the less to more salient language, which we operationalized in the context of bilingual speakers in the US as the switch from the minority language (i.e., Spanish) into the majority language (i.e., English). The hypothesis was additionally motivated by the fact that code-switches are salient linguistic events which can affect bilingual comprehenders' prediction of upcoming speech, similar to the effect of disfluencies on prediction.
Higher-order interactions highlight that dominance further plays a role in prediction in terms of lexical frequency. Nevertheless, dominance only affected whether the frequency bias, or the tendency to look at more frequent items, was apparent in Spanish-only instructions. The fact that both more English-dominant and more Spanish-dominant bilinguals look more towards the lower-frequency item in the CS condition suggests that all bilinguals in the language community, regardless of their personal language dominance, have developed sensitivity to the salience of the majority language in a minority language context, or sensitivity to the sociopragmatic, information-distribution function of minority to majority language code-switches.
Several theories could explain our findings. As proposed by Myslín and Levy (2015), speakers could be designing their utterances with the audience in mind (Clark & Murphy, 1982). This would entail speakers using more extensive coding, either the switch itself or the situationally more marked language, to highlight less predictable language material and prevent miscommunication. Similarly, listeners could also be using the established community-wide sociopragmatic, information-distribution function associated with code-switches while processing. In addition, listeners could interpret a speaker's switch as indicative of struggling with lexical access in one language, similar to the effects of disfluency leading to anticipation of discourse-new or lower-frequency content in monolingual speakers (e.g., Arnold et al., 2007). That CS can affect ease of processing in comprehension is also in line with models that posit a tight link between production and comprehension (e.g., Dell & Chang, 2014; MacDonald, 2013; Pickering & Garrod, 2004). The PDC model, for example, stipulates that production pressures shape language distribution, which in turn shapes ease of comprehension. Switching to the majority language could thus be a bilingual resource strategy to aid lexical retrieval (e.g., Gollan & Ferreira, 2009) or may be the natural product of planning "harder" items later in the utterance (Johns & Steuck, 2021). These potential production strategies could affect distribution patterns in bilingual discourse, which subsequently affect the ease of language comprehension.
Table 2. Proficiency and language use profile for participants (n = 30). LHQ values represent self-reported ratings of proficiency on a scale of 1 (no proficiency) to 10 (highly proficient). MELICET and DELE scores are calculated out of 50. Aural CS Exposure and Oral CS Use were on a scale from 1 ("Never") to 5 ("Always").
The result that bilinguals could use code-switches predictively irrespective of language dominance suggests that both more and less Spanish-dominant speakers are exposed to this code-switching pattern or function and are able to make use of it during online processing. Importantly, the above explanations are not mutually exclusive. Code-switching could have started out as a strategy to ease production, shaping distribution, which resulted in the statistical learning of the co-occurrence of code-switches and lower-frequency or unpredictable words. Subsequently, code-switching could have become a relatively conscious sociopragmatic strategy to aid listeners' prediction of more difficult language material. Future studies should employ neurophysiological paradigms and modulate the perceived animacy of the speaker, e.g., whether the listener believes the sentences are spoken by a human or by an artificially generated voice, as well as the listener's awareness of the speaker's code-switching behavior. These modulations would help ascertain to what extent the listeners' ability to use code-switches predictively is due to attribution of code-switches to human speakers' state of mind, due to awareness of their sociopragmatic strategies involving CS, or the product of (statistical) learning. Participants' predictive behavior not changing when listening to an artificial voice would provide evidence for the latter explanation. Nevertheless, the likelihood that pragmatic rules and state-of-mind considerations initially guided the predictive behavior could not be ruled out in such a paradigm.
A few less likely explanations for our results are nonetheless worth mentioning. Potentially, the slight natural prolongation (∼22 ms) prior to the CS mentioned in the Instructions and audio recordings section is responsible for or contributes to the CS effect on prediction. Previous studies demonstrate that the prolongations prior to code-switches aid their processing (Fricke et al., 2016) and that artificially removing phonetic cues can interfere with CS processing (Shen et al., 2020). Therefore, we did not alter the natural pronunciation of the CS in our recordings. It is likely that the prolongation aided the processing of the CS, yet did not extend its influence to the target word. Nevertheless, future studies could vary the pre-CS prolongation placement, length, or intonation to ascertain the contribution of phonetic or prosodic cues in signaling the upcoming lower-frequency items. Another less likely explanation for the CS effect is that it might represent a delayed novelty bias (Horstmann & Herwig, 2016), which may be observed as a slight "bump" in the Spanish condition towards the low-frequency item at the point equivalent to the approximate onset of the CS (Figure 2, panel A). The novelty effect, possibly delayed due to CS processing, could indeed be playing a role in the low-frequency preference at the target word onset in the CS condition. Nevertheless, we believe that this possibility does not take away from the fact that the CS swayed prediction towards the low-frequency word at the onset of the target word. Moreover, the increase in looking at lower-frequency items is much larger at the target onset in the CS condition than it is at the beginning of the point equivalent to the CS onset in the Spanish condition (Figure 2, panel A). It is thus likely that these are two different effects.
This study represents a first proof-of-concept test of experimentally investigating the potential benefits of code-switching to sentence processing and moves the focus away from examining switch costs at the code-switch site itself (e.g., Valdés Kroff et al., 2018; see also Gullifer & Titone, 2019, for the effects of code-switching on downstream lexical access). We have argued that the English to Spanish switch direction is the sociolinguistically less-preferred switch direction found in the Spanish-English bilingual community under study (Beatty-Martínez & Dussias, 2017; Blokzijl et al., 2017; Valdés Kroff, 2016). Nevertheless, future studies could build upon this work by manipulating population and switch characteristics, by including bilinguals from other communities, the English-Spanish code-switch direction, English-Spanish bilinguals, and bilinguals of varying proficiencies in Spanish and English. These manipulations could help disentangle the influence of language dominance, age of acquisition, and/or the salience of the switch language on the strength of the anticipatory process we report here, or even whether a code-switch in a certain direction/population is interpreted as a predictive cue at all.
In our study, the expectedness of items is operationalized using lexical frequency. In spite of the relatively simple unpredictability manipulation of lexical frequency in neutral sentential contexts, we found a strong interaction of unpredictability and language context. Frequency is, nevertheless, correlated with word length, another source of lexical access difficulty. Future studies could assess the effect of CS on prediction in terms of both frequency and word length. CS is likely to aid the prediction of items which are difficult to process in general.

Conclusions
Our primary goal in this study was to account for the discrepancy between well-documented switch costs and the ubiquity of code-switching in bilingual speech, and to bridge the gap between sociolinguistic and psycholinguistic research on CS. The psycholinguistic focus on switch costs may be undervaluing the sociopragmatic functions of CS, which could result in processing benefits for subsequent language structures. Here, we experimentally probed one such function: discourse organization in terms of information distribution.
The results provide support for the prior findings on the information distribution of CS (Myslín & Levy, 2015) in the realm of comprehension. They confirm our hypothesis that CS provides experimentally detectable processing benefits in anticipation of unexpected information, much like other salient cues, such as disfluencies in monolingual studies (Arnold et al., 2003, 2004, 2007). Interestingly, the production of both disfluencies and code-switches is associated with production costs, yet both carry potential for comprehension benefits. Here, we began with a simple operationalization of unexpectancy as lexical frequency. However, this function of switching from the minority language to the language of power could extend to other non-salient/salient information contrasts. Future studies could thus probe the role of CS in online processing of given vs. new information, lexically complex vs. simple words, and emotionally neutral vs. taboo information (Tomić & Valdés Kroff, in prep). We hope that this endeavor will further open the scientific conversation on the roles of CS in language processing and continue to bring psycholinguistic research closer into alignment with sociolinguistic approaches (Myers-Scotton, 2006).