A multivariate account of particle alternation after bare-form try in native varieties of English

This multifactorial study reviews the determinants of particle alternation after uninflected try in varieties where English is native. The effects of a number of previously discussed and novel predictors are probed in data from well-known corpora. The results confirm the inclinations of North American varieties (try to) in contrast with those of the Australasian, British and Irish varieties (try and in speech but try to in writing). The previously reported general effects of the tense of try, mode and horror aequi are also corroborated. As regards the effect of register, the study contributes the finding that following Latin-based infinitives favor try to in most varieties, especially in writing. The article discusses the status of the substantiated effects with respect to the notions of conventionalization and entrenchment: crucially, the higher degree of conventionalization of try to in North American varieties (a) makes the use of this variant less conditional on the sequential need to license euphony and (b) neutralizes the general contextual/register distinction for the alternation. From a usage-based viewpoint, the findings suggest that the higher frequency of a multiword sequence in a specific variety, and the higher degree of activation in the language users’ minds, can make it less contingent on general probabilistic constraints.


Introduction
Infinitival markers in English show variation for only a limited set of verbs. For instance, help or assist might feature either to or zero as a marker of infinitival subordination (cf. Lohmann 2011; Levshina 2018). In the case of try, the variation is between to, and or zero. 2 This multifactorial study focuses on the probabilistic factors determining particle alternation (to vs and) after bare-form try in six varieties where English is a native language (ENL varieties): American, Australian, British, Canadian, Irish and New Zealand English.
A number of recent studies have dealt with the diachronic evolution of this alternation and its dialectal variation by focusing primarily on differences between the North American varieties and British English. Historically, both alternatives seem to have evolved hand in hand, and both underwent reanalysis (Tottie 2012; Ross 2013). On the one hand, try and has its origin in plain coordination, at a time when try had a different range of meanings ('test the strength of' or 'prove'). On the other hand, try to did not convey its contemporary meaning ('attempt to') until the seventeenth century; before then, it more likely conveyed a more compositional/purposive 3 interpretation in line with the original meaning of try: 'practice in order to' (cf. Brook & Tagliamonte 2016: 302-4 for an overview including examples).
Although the current study is synchronic and does not draw on questions such as the chronology of both shifts (cf. Vosberg 2006; Tottie 2012; Ross 2013; Brook & Tagliamonte 2016: 304), it is important to keep in mind that both variants have arisen as a result of the reanalysis and grammaticalization of a frequent collocation: both involve semantic bleaching of the verb try and show phonetic reduction (try and > [ˈtrɑɪən] or [ˈtrɑɪn]; try to > [ˈtrɑɪ ɾə]; cf. Pullum 1990: 221; Ross 2013: 118). Therefore, it should be possible to identify different 'cross-dialectal linguistic patterns' associated with each variant that might account for an alternation that 'remains in flux' (Brook & Tagliamonte 2016: 304).
The 'try bare + particle + V inf' sequence allows two possible constructional pathways for language users. When the initial verb complies with what Carden & Pesetski (1977) termed the bare form condition (i.e. both verbs must be uninflected), 4 speakers might either opt for the 'V-to-V inf' construction (cf. Lorenz & Tizón-Couto 2020: 80-2 for an overview) or the 'try-type pseudocoordination' construction (i.e. try and; Quirk et al. 1985: 978-9; Pullum 1990: 220-2; De Vos 2005; Ross 2014). 5 This is in line with a usage-based perspective on lexical choice, namely that words and sequences are stored and processed in an activation network (cf. Bybee 1995) and '[l]exical access can be seen as a competition process whereby a particular item is selected from a set of alternative expressions' (Diessel 2019: 200). At the theoretical level, this study is inspired by usage-based models of language assuming 'that grammatical knowledge is experience-based and partially probabilistic' (Szmrecsanyi et al. 2016: 110; cf. Bresnan 2007). More precisely, probabilistic models uphold that 'language users implicitly learn the probabilistic effects of constraints on variation by constantly (re-)assessing input of spoken and written discourses they engage in throughout their lifetimes - and crucially, this input is likely to differ across different speech communities and varieties of English' (Szmrecsanyi et al. 2016: 133-4). The usage-based, probabilistic stance taken here is also inspired by integrative socio-cognitive theories such as the Entrenchment-and-Conventionalization model (Schmid 2015), which assumes that 'grammar or language emerges from and is constantly updated by the repetition of situated usage events' (Schmid 2015: 8). Within Schmid's model, repetition contributes to (a) the routinization and schematization of linguistic expressions (i.e. entrenchment) and (b) to their conventionalization 'by facilitating diffusion across speaker groups, genres, and registers' (Schmid 2015: 20).

2 As shown by Maia (2012) and Ross (2013), the 'zero' alternative (e.g. I will try finish the project) is rare and, thus, not considered in this study.

3 In purposive to-infinitives (e.g. He only works to make money) the matrix verb does not refer to the 'possible world' status of the proposition of the to-infinitive (cf. Morita 2012: 37); rather, it denotes an action performed by the subject with the purpose of achieving the situation expressed by the to-infinitive (cf. Schmidtke-Bode 2009: 1; Rudnicka 2019: 87-8).

4 Although some regional exceptions have been pointed out by Ross (2013: 124-5) for northeastern Canada or South African English (e.g. Noeleen tries and find answers and solutions), particle variation after try is generally only possible when both verbs are uninflected (cf. Pullum 1990).

5 The most frequent verbs in the first slot of the 'pseudocoordination' construction are go, try and come. Pullum (1990: 221-2) distinguishes pseudocoordination with go and come ('the pseudocoordinate complement construction with basic motion verbs') from pseudocoordination with try and other verbs such as remember, mind or be sure ('the pseudocoordinate complement or hendiadys construction'). The main formal difference lies in the fact that pseudocoordination headed by come/go allows for inflection in both V slots (He goes and gets it / He went and got it), while try-type pseudocoordination requires the bare form or bare stem condition (*He tries and does it / *He tried and did it), just like serial verb constructions (Come have a drink / Go get the nurse; cf. Flach 2015, 2017a, 2017b).
At the methodological level, this article is inspired by (a) the increasing interest in multivariate quantitative models of language use (see Levshina 2016: 236 for an overview) and (b) the quest for multivariate statistical methods that accurately estimate hypothesized effects but do not rely on dichotomous decisions made on the basis of p-values or automatic variable selection (cf. Larsson et al. 2021; Tizón-Couto & Lorenz 2021).
Assuming that '[g]rammatical variation is sensitive to multiple and sometimes conflicting probabilistic constraints' which 'may remain invisible unless analyzed quantitatively' (Szmrecsanyi et al. 2016: 112), this multifactorial study looks into the language-internal and language-external factors that come into play in this particular alternation (try to/and V inf). From a usage-based perspective, it addresses the following research questions:
- Do the language-internal or 'cotextual' (e.g. tense or horror aequi) and language-external or 'contextual' factors (e.g. mode) of variation previously attested for American, Canadian and British English hold across other ENL varieties such as Irish, Australian or New Zealand English?
- Are there any other language-internal factors, e.g. connected to the following infinitive, that play a role in the alternation but have 'remained invisible'? Could characteristics of the ensuing infinitive (e.g. its stress pattern, its initial sound, its Latinate or Germanic roots) have an effect on the alternation?
- What is the magnitude of the effect of horror aequi on this alternation? How does the degree of conventionalization of one or the other variant interact with horror aequi in each variety? Can the 'conserving effect' (Bybee 2006: 715) of frequency make a variant more resistant to the hold of horror aequi? In other words, is horror aequi more often disregarded by language users when the disfavored variant (try to) is highly conventionalized in a particular variety (e.g. US) or mode (e.g. speech)?
- Do ENL varieties share a core probabilistic representation of the alternation (and its associated constraints and preferences)?
The article is structured as follows: section 2 discusses the factors that have been previously claimed to determine the alternation. Section 3 introduces the data sources employed, the predictor variables analyzed and the statistical method implemented. The results are discussed in section 4, which is divided into three smaller case studies. Section 5 connects the results with the research questions laid out, and section 6 provides conclusions and some suggestions for further research.

Previous accounts of the alternation
The alternation between try to and try and 'has been at different times, attributed to syntax, semantics, formality, medium, region, tense and subsequent verb' (Brook & Tagliamonte 2016: 302). This section focuses on the language-internal and -external factors that have been hitherto suggested as instrumental as regards the alternation.

Semantics
The earliest studies (1920s to early 2000s) contain some claims of semantic distinctions between the two variants: some authors claimed meanings of 'greater urgency' (Wood 1965: 241) or a vaguer/more uncertain goal (Pishwa 2006: 282) for try and (cf. Hommerberg & Tottie 2007: 47 or Ross 2013 for an overview). In contrast, the most comprehensive studies carried out in the last thirty years consider the two variants synonymous (cf. Lind 1983; Faarlund & Trudgill 1999; Gries & Stefanowitsch 2004; Hommerberg & Tottie 2007; Maia 2012; Tottie 2012; Ross 2013, 2014; Brook & Tagliamonte 2016). Lind (1983: 562), for instance, suggests that the subtle shades of meaning pointed out by previous authors stem from differences in stress or intonation, but not directly from the choice of particle. Gries & Stefanowitsch (2004: 122-3) also show that there are no clearly distinctive collexemes for either construction in the ICE-GB corpus of British English.

Variety
The most recent corpus studies have connected the try to variant with the American and Canadian varieties but the try and variant with British English (cf. Hommerberg & Tottie 2007: 48). Moreover, the chances of finding try and in British speech have been reported as significantly higher in comparison with British writing (cf. Biber et al. 1999; Hommerberg & Tottie 2007: 48; Brook & Tagliamonte 2016: 305). Although authors like Ross (2013: 128) have highlighted dialectal variation as a promising area of research for this particular alternation, no detailed quantitative analysis has been carried out beyond corpora of American, British and Canadian English.

Horror aequi
Lind (1983: 561) and Biber et al. (1999: 738-9) point out the ability of the try and variant to support euphony when the verb try is preceded by to (e.g. to try to V inf). The general avoidance of repetition in this context is a clear example of the effect of horror aequi, namely the 'widespread tendency to avoid the use of formally identical and adjacent grammatical elements or structures' (Rohdenburg 2003: 236). Under that label, this tendency has also been reported to have an effect on the choice of the particle following bare-form try (Hommerberg & Tottie 2007: 56-7). In fact, horror aequi has been claimed to have caused the emergence of the try and V inf variant in English (cf. Rohdenburg 2003; Vosberg 2006). Nonetheless, this proposal has been refuted by Tottie (2012), who illustrates that try and is in fact the older construction with the meaning 'attempt to', and Ross (2013: 113-22), who shows that both variants have progressed hand in hand since 1500.

Tense
The Oxford English Dictionary (OED; 1989) (s.v. and, conj.) suggests that try and is more frequent when try is in either the infinitive or imperative form. Hommerberg & Tottie's (2007) corpus study confirms this suggestion and reports that try to is more usual when try is in the past or present tense. This conforms to the development proposed by Ross (2013) for the bare form condition: from its origin to the early 1800s, 'the try and construction had been strictly limited to non-finite, non-factive contexts (infinitives, and presumably imperatives)' (Ross 2013: 119). By analogy with infinitives and imperatives, try and began to be used in the other uninflected form (i.e. the bare present tense) in the mid 1800s and was not described by linguists until the 1960s (Ross 2013: 120-1). Further evidence comes from the fact that instances of the sequence try and be in the present tense are only found since the 1900s (Ross 2013: 121).

Following infinitive
The effect of the following infinitive on the alternation has also been previously investigated. Lind (1983: 562) reports a preference for try to be over try and be in a corpus of British fiction (12 vs 4 instances), which she explains on grounds of sonority and the 'semantically empty' character of be. For British English, Hommerberg & Tottie (2007: 55) report a preference for try and remember (8/9 instances) in the CobuildDirect Corpus, and Gries & Stefanowitsch (2004: 122) report that, although try and get and try to make have a significant collostructional attraction, the two particles are largely transposable in the ICE-GB corpus. For both Canadian and British English, Brook & Tagliamonte (2016: 313-14) report that do and be are more 'resistant' to the try and variant; they connect this result with the trajectory proposed by Ross (2013), i.e. try and seems to have retained 'the syntactic/semantic restrictions on its earlier use' ('test/examine and'), which 'kept it from combining with static and nontransitive verbs' such as be (Brook & Tagliamonte 2016: 321).

Data and methods
The main datasets for this study come from four different well-known corpora, namely the International Corpus of English (ICE), the Global Web-Based English Corpus (GloWbE; Davies 2013), the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA; Davies 2008-). The ICE data were extracted from five components sampled in countries where English is the first language of the majority of speakers: Australia, Canada, Great Britain, Ireland and New Zealand. 6 So that 'mode' could be included in the analysis, only ICE components providing a complete spoken and written sample (approximately 600,000 words of spoken material and 400,000 of written material) were included. This means that only Canadian English is represented as a North American variety in this dataset. Table 1 provides a summary of the distribution of the two variants across corpus and mode in the ICE corpora investigated.
The GloWbE data were extracted from six of its components: Australia, Canada, Great Britain, Ireland, New Zealand and United States. 7 Table 2 provides a summary of the two variants across the six different corpus components. In order to further examine the distribution of the alternation in specific varieties, data were also extracted from the BNC and COCA. Table 3 provides a summary of the distribution of the two variants across mode for these two corpora.

6 A useful feature of ICE is that all its components follow the same design and annotation scheme, which makes them particularly appropriate for establishing comparisons between varieties. All ICE components used belong to first generation ICE corpora, which were sampled in the early 1990s.

7 Size is the most obvious strength of the GloWbE corpus, but it also has clear limitations: practicability bias (i.e. it is genre specific and overrepresents easily obtainable texts: websites and blogs) and a much lower accuracy in comparison with, for instance, the ICE corpora: speakers from one variety might have posted on a website classified in GloWbE as another variety, or two (or more) websites might have reposted the same words from a speaker/writer and thus yield 'double-hits' that could result in the overrepresentation of certain features.

DAVID TIZÓN-COUTO
The dataset analyzed for ICE includes every token of try to/and V inf in the five subcorpora. Minimum frequency for each of the two strings was set at 5 for the automatic search in the three mega corpora. The GloWbE dataset was not manually sifted for potential 'double-hits', as this task would have been practically unattainable; the results must be considered with this caveat in mind.

Predictor variables
The dependent variable in the statistical models fitted is labeled 'construction' (try and V inf vs try to V inf). The independent variables and their respective levels are listed below:
1. 'mode': written, spoken
2. 'corpus/variety': Australia, Canada, Great Britain, Ireland, New Zealand, United States
3. 'tense': present/past, infinitive, imperative
4. 'to_before': yes, no
5. 'and_before': yes, no
6. 'freq.V2': overall frequency of following infinitive, derived from the BNC corpus. 8
7. 'latin': Latin-based collocate: yes (e.g. explain), no (e.g. get)
8. 'fol_sound': first sound of following infinitive: consonant vs vowel/glide
9. 'stress': stress pattern of the following infinitive: first (first syllable/monosyllabic) vs other

Independent variable number 3 takes up the four possible forms of bare-form try in English (see (1) to (4)). The low frequency of tokens in the past tense has motivated the grouping with present tense into one variable level; theoretically, this combination does not raise an issue because both the present and past tenses have been reported to favor the try to variant (see section 2.5).
(1) Present: Join us as we try to find out if the forest and the trees will be around for future generations (ICE-CAN; S2B-037#129:5:A)
(2) Past (preceded by did): What I did try and address was the impact the reality faith has in the structures of our common life (ICE-GB; S1B-028:1:B)
(3) Infinitive:
    (a) We can try to generate ideas and focus on ways in which the environmental crisis can be used to create a better world (ICE-AUS; S2B-021(A))
    (b) It's only when you bring somebody in and start to try to teach them to be able to take on the project that you realise all the complications and implications that go with the particular work (ICE-IRE; S2A-031$A)
(4) Imperative: Try and keep salt water away from reels (not always an easy task). (ICE-NZ; W2D-018#96:1)

Variables 4 and 5 were employed in order to assess the effect of horror aequi on the alternation; the objective was to test whether a preceding to or and would have an effect on the outcome. 9 Variables 6 to 9 ('freq.V2', 'latin', 'fol_sound' and 'stress') assess the effect of the features of the following infinitive. First, the overall frequency of the infinitive, which is derived from the BNC corpus, was employed in order to control for the collocational preferences after each particle.
Second, the Latinate origin of the following infinitive is employed as an indirect measure of register, 'based on the assumption that words of Latinate origin tend to indicate a more elevated or formal register than words with Germanic roots' (Lorenz 2020: 256). The same method applied in Lorenz (2020) was implemented: following infinitives were coded as Latin-based if they contain a Latinate affix (re-, ad-, col/con/com-, dis-, de-, im/in-, per-, pro-, trans-, sub-, -ain, -age, -ate, -efy/ify, -ise/ize, -ion, -ish, -ure).
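The affix-based coding described above can be illustrated with a minimal sketch. It is an assumption-laden approximation rather than the procedure actually used in the study: the function name `is_latinate` is hypothetical, and the matching is purely orthographic, so it will overgenerate without manual checking.

```python
# Affix lists copied from the coding scheme described in the text.
LATINATE_PREFIXES = ("re", "ad", "col", "con", "com", "dis", "de",
                     "im", "in", "per", "pro", "trans", "sub")
LATINATE_SUFFIXES = ("ain", "age", "ate", "efy", "ify", "ise", "ize",
                     "ion", "ish", "ure")


def is_latinate(verb: str) -> bool:
    """Hypothetical helper: flag a verb as Latin-based by spelling alone.

    A verb counts as Latinate if it starts with one of the listed prefixes
    or ends with one of the listed suffixes (and is longer than the affix).
    """
    v = verb.lower()
    has_prefix = any(v.startswith(p) and len(v) > len(p) for p in LATINATE_PREFIXES)
    has_suffix = any(v.endswith(s) and len(v) > len(s) for s in LATINATE_SUFFIXES)
    return has_prefix or has_suffix


if __name__ == "__main__":
    for verb in ("explain", "get", "persuade", "find"):
        print(verb, is_latinate(verb))
```

Because the check only looks at spelling, Germanic items such as read or wish would also be flagged; a sketch of this kind can only pre-sort candidates for manual verification.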
Third, the first sound of the ensuing infinitive was also coded as either a consonant or a vowel. The motivation behind this 'fol_sound' variable is that English has been suggested to have a preferred consonant-vowel (CV) syllable structure. For instance, Rohdenburg & Schlüter (2000) have shown that in Early Modern English there was an effect of this 'optimal syllable structure' on the choice of my vs mine: the latter is preferred before vowels in order to respect the CV alternation. Likewise, infinitives beginning with a vowel should be more likely in their bare form (and not preceded by to) after help (cf. Lohmann 2011). For the present alternation, a vowel or a glide should avoid a consonant cluster after try and, while a consonant should generally be more likely after try to.
Last, the stress pattern of the following infinitive was coded as either 'first' syllable (including monosyllabic) or 'other'. The potential effect of the principle of rhythmic alternation (Schlüter 2005, 2009) is behind the inclusion of this 'stress' variable. Apparently, 'a sequence of stressed syllables is to be avoided, as well as a sequence of lapses, i.e. unstressed syllables' (Lohmann 2011: 505). This tendency for contrast has been shown to cause the omission of grammatical elements, for instance, in the case of dares vs dares to: in historically recent English, the particle is more likely to be included before a stressed syllable to avoid a clash (Schlüter 2005: 206-9). For the alternation in focus, the variant try to should more typically precede a stressed syllable; in contrast, the variant try and, which is often reduced (cf. Pullum 1990; Ross 2013) and might be conjoined into one stressed syllable in speech as [ˈtrɑɪ(ə)n], 10 should be more frequent before an unstressed syllable to avoid a clash.
All of the above listed variables were analyzed in the ICE dataset. Variables 1, 3, 5 and 6 were not coded for the GloWbE dataset, while 3, 5 and 6 were not coded for the BNC and COCA. 'Tense' was not coded in the datasets from the megacorpora, as it would have implied the manual classification of 320,000 tokens. The inclusion of 'and_before' in the statistical models could not be considered for GloWbE, BNC or COCA due to the low number of attested examples. As an illustration, figure 1 offers the distribution in GloWbE of the two alternants with (a) a preceding and, (b) neither a previous and nor a previous to ('no') or (c) a previous to. The figure shows that the proportion of tokens of try and preceded by and does not exceed 0.5 percent in any of the six ENL varieties in focus.
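The coding of the two horror aequi predictors ('to_before' and 'and_before') can be sketched as follows. This is only an illustration under stated assumptions: the function name `code_tokens` is hypothetical, 'preceded by' is taken to mean the immediately preceding word, and the word after the particle is not checked for being an infinitive, so real corpus hits would still need manual filtering.

```python
import re


def code_tokens(sentence: str) -> list:
    """Hypothetical helper: find 'try to/and + word' and code the preceding item."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    rows = []
    for i in range(len(tokens) - 2):
        if tokens[i] == "try" and tokens[i + 1] in ("to", "and"):
            # Assumption: only the word directly before 'try' is inspected.
            prev = tokens[i - 1] if i > 0 else ""
            rows.append({
                "construction": "try " + tokens[i + 1],
                "verb": tokens[i + 2],
                "to_before": "yes" if prev == "to" else "no",
                "and_before": "yes" if prev == "and" else "no",
            })
    return rows


if __name__ == "__main__":
    # Fragments of examples (3b) and (2) from the text.
    print(code_tokens("It's only when you bring somebody in and start to try to teach them"))
    print(code_tokens("What I did try and address was the impact"))
```

Applied to the fragment of example (3b), the sketch codes one token of try to followed by teach with 'to_before' = yes; the fragment of example (2) yields one token of try and with both predictors set to no.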

Statistical analysis
In order to explore the effect of the factors described, logistic regression models were fitted on each dataset in R (2021). 11 The maximal deductive model, i.e. containing all variables [...]

Figure 1. Distribution of try to/and by ENL variety and previous item in GloWbE

In both cases the model produced a singular fit, which means that the random effects structure is too complex to be supported by the data. Most speakers produce only one token, and the maximum number of tokens produced by a single participant is eight, so the data should be fairly representative and unproblematic as regards the 'speaker' variable. However, not including 'verb' as a random factor could compromise the validity of the results, especially given that preferences as regards the following verb have been central in the previous literature. For this reason, a Bayesian mixed-effects logistic regression model 12 with a weakly informative prior distribution 13 was fitted on each dataset to favor comparability of the results. Bayesian modeling makes it possible to fit 'complex models with a large number of random variance components' (Sorensen et al. 2016: 7); 14 the Bayesian method implemented with the help of R package brms (Bürkner 2018) allowed for the inclusion of a random intercept for 'verb' in the models fitted for the ICE (section 4.1), GloWbE (section 4.2), BNC and COCA (section 4.3) datasets. Two- and three-way interaction terms were gradually added to the maximal deductive model for each dataset (including the random intercept for 'verb'), and the expected log pointwise predictive density (ELPD) of the resulting models was compared by means of Pareto Smoothed Importance Sampling Leave One Out (PSIS-LOO) Cross-Validation (Vehtari et al. 2015, 2017). The model with the highest ELPD_LOO and the lowest LOOIC was selected to report on each dataset. 15

11 The datasets and R-scripts used in the present study are published as Tizón-Couto (2022).

12 Levshina (2022) lays out some of the epistemological and practical advantages of using Bayesian models. In contrast with frequentist statistics, which involve null hypothesis testing and a predetermined threshold for significance (p-value), 'Bayesian inference allows us to estimate the probability of the research hypothesis given the data' (Levshina 2022: 226).

13 Cauchy priors were used, centered around zero and with the scaling parameter of 2.5. Non-informative priors of this kind are required to specify a Bayesian model but acknowledge 'ignorance' as regards the expected range of values (Levshina 2022: 227).

14 According to Sorensen et al. (2016: 177), one of the advantages of Bayesian modeling concerns variance components (random effects): 'Fitting a large number of random effects in non-Bayesian settings requires a large amount of data. Often, the data-set is too small to fit reliable distributions of random effects (Bates et al. 2015). However, if a researcher is interested in differences between individual subjects or items (random intercepts and random slopes) or relationships between differences (correlations between variance components), Bayesian modeling can be used even if there is not enough data for inferential statistics. The resulting posterior distributions might have high variance but they still allow for calculating probabilities of true parameter values of variance components.'

15 The LOO Information Criterion is an estimator of the relative quality of Bayesian statistical models for a given set of data, i.e. a measure of how good a model is at predicting future data assuming that the future data come from the same distribution as the observed data (cf. Vehtari et al. 2015, 2017).

[...] after an experiment/corpus study. In all plots presenting modeled results, CIs are reflected as the whiskers projected from each estimate.

17 The brm() model for the ICE dataset was specified as follows:
construction ~ corpus*mode + tense + and_before + to_before*mode + latin*corpus + stress*fol_sound + log.freq + (1|verb)
The C-value is 0.815; the four Markov chains (at 4,000 iterations) converge, R-hat is 1.00, and PSIS Leave One Out Cross Validation (Vehtari et al. 2015, 2017) suggests that the model is a good fit: all Pareto k estimates are good (k < 0.5).
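The selection step described in the statistical analysis (keep the candidate model with the highest ELPD_LOO, which is also the one with the lowest LOOIC) can be sketched in a few lines. The sketch relies on the standard relation LOOIC = -2 × ELPD_LOO; the function name, the model labels and the ELPD values are invented for illustration and are not results from the study.

```python
def select_by_elpd(elpd_by_model: dict):
    """Hypothetical helper: pick the model with the highest ELPD_LOO.

    Returns the name of the preferred model and a LOOIC table,
    using LOOIC = -2 * ELPD_LOO.
    """
    looic = {name: -2.0 * elpd for name, elpd in elpd_by_model.items()}
    best = max(elpd_by_model, key=elpd_by_model.get)
    return best, looic


if __name__ == "__main__":
    # Invented ELPD_LOO values for three nested candidate models.
    candidates = {
        "main_effects": -1510.0,
        "plus_corpus_x_mode": -1495.5,
        "plus_three_way": -1497.0,
    }
    best, looic = select_by_elpd(candidates)
    print(best, looic[best])
```

Since LOOIC is a rescaling of ELPD_LOO, the two criteria always agree on the ranking; in practice the uncertainty of the ELPD differences would also be inspected before a more complex model is preferred.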

[...] with a higher probability of try and, while a positive coefficient suggests a higher likelihood of the try to variant. Figure 2 suggests that the predictors 'corpus', 'mode', 'tense', 'to_before' and 'latin' have an effect on the outcome. As regards 'corpus', Ireland and (especially) Canada are more likely to feature the to variant than Australia (reference level) and GB, while New Zealand clearly favors the and variant. With regard to 'mode' and 'tense', the estimates in figure 2 show that the written register and a tensed try clearly favor the to variant. Horror aequi only triggers the choice of the most euphonic variant when to precedes the bigram, but not when and does. Last, an ensuing Latinate infinitive favors try to. The modeled effects, and their interactions, can be explored independently for more detail by means of conditional effects plots derived from the Bayesian model (cf. Bürkner 2018). Figure 3 plots the interaction between the predictors 'corpus' and 'mode'. On the one hand, the differences between the five varieties are much sharper in the spoken mode: try and is more likely in Australian, British and, especially, New Zealander speech. Irish English holds the middle ground. On the other hand, every variety is notably more likely to feature try to in the written mode. Figure 4 displays the effect of 'tense' in each variety ('corpus'). We see that a tensed try (i.e. present or past tense) favors the to variant, even in varieties where try and is most likely in speech (New Zealand and Australia). In contrast, an infinitive or imperative try both favor the and variant, even in varieties where try to is the default choice (Canada). Figure 5 plots the interaction between 'mode' and horror aequi ('to_before'). A preceding to makes try and more likely in both the spoken and the written mode; however, there are no statistical differences within each mode.
When the bigram 'try + particle' is not preceded by to, there is a clear distributional difference of the variants across modes; however, this distinction is softened when the bigram is preceded by to. Figure 6 illustrates the interaction between 'corpus' and 'latin'. The overall effect of a following Latinate infinitive suggested by the model is much weaker in the British and Irish subcorpora, and it is actually reversed in the Canadian variety. Figure 7 offers further detail as regards the effect of 'latin': it combines the marginal effects of 'latin', 'mode' and 'stress' at specific levels of the random effect 'verb' (cf. Lüdecke 2018). The four groups shown correspond to the top ten verbs from the four possible combinations for 'verb': 'latin' = yes + 'stress' = other (upper panel), 'latin' = yes + 'stress' = first, 'latin' = no + 'stress' = other, 'latin' = no + 'stress' = first (lower panel). We see that Latin-based infinitives are more likely to be preceded by try to (model estimates are located at 82.5% in writing and 50% in speaking) than non-Latinate ones (62-75% in writing and 15-25% in speaking). This trend seems to be slightly reinforced when the Latinate infinitive receives 'stress' after the first syllable: note that the intervals become narrower in the top panel. Non-Latinate items receiving stress after the first syllable (third panel from the top) feature wider intervals in writing, which suggests that these allow for wider alternation. The divide between the two modes suggested by the model is most clearly observable in high-frequency monosyllabic non-Latinate infinitives: the intervals (for spoken vs written) only barely overlap for items like find, be or stop in the lower panel. Summing up the results from the ICE dataset, first there is a clear effect of 'tense' for all varieties, such that present and past typically favor a following to but imperative and infinitive favor and.
Second, there is a strong effect of 'mode': written English favors try to, while spoken English allows for wider variation across varieties. Third, the effect of a previous to on the choice of particle softens the general distinction between the two modes for the alternation. Last, the effect of 'latin' suggests a general trend that try to is more likely when the following infinitive includes a Latinate affix.

Figure 8 reports the modeled effects obtained with the Bayesian method for the GloWbE dataset. The reference levels for the independent categorical variables are as follows: 'corpus' = Australia, 'to_before' = no, 'latin' = no, 'stress' = first, 'fol_sound' = consonant. 18

Figure 7. Conditional effects plot at specific levels of the random effect 'verb'. Plotted probabilities take into account the modeled effects for 'latin', 'mode' and 'stress' (ICE dataset)

18 The brm() model for the GloWbE dataset was specified as follows:
construction ~ variety*to_before + latin*stress*fol_sound + (1|verb)
The C-value is 0.804, the four Markov chains (at 4,000 iterations) converge, R-hat is 1.00, and PSIS Leave One Out Cross Validation (on 500 posterior samples) suggests that the model is a good fit: all Pareto k estimates are OK (k < 0.7).

DAVID TIZÓN-COUTO Figure 8 suggests that the predictors 'variety', 'to_before' and 'latin' have an effect on the outcome. As regards 'variety', Canada and US are much more likely to feature the to variant than the other four varieties in focus. Horror aequi ('to_before') very clearly favors try and and, as was also the case in the ICE dataset, an ensuing Latinate infinitive favors try to. Interactions between these factors are better understood when analyzed individually. Figure 9 clarifies the interaction between 'variety' and 'to_before' in the GloWbE data. Overall, the effect of horror aequi emerges larger than in the ICE dataset. Crucially, the much higher rate of try to in Canada and US cannot offset the strong effect of horror aequi, but seems to make the sequential repetition of to more tolerable to speakers of the two North American varieties: the distance ( 0.45) between the estimates for 'to_before' = no and 'to_before' = yes for AU, GB, IE and NZ becomes smaller in CA ( 0.35) and even smaller in US ( 0.3). Figure 10 illustrates the three-way interaction specified in the model between 'latin', 'stress' and 'fol_sound'. This complex interaction was not recommended by ELPD model comparison for the ICE dataset, but the larger GloWbE dataset does provide some insight into the combined effects of the morphophonemic features of the following infinitive on the alternation. We see that Latinate infinitives are overall more likely to be preceded by try to than most monosyllabic non-Latinate items (e.g. get, make, find, do, etc.). Interestingly, infinitives receiving stress after the first syllable and beginning with a vowel (e.g. avoid, understand, achieve, escape, apply, enjoy) follow a similar trend and favor try to. Last, the distinction between Latinate and non-Latinate ensuing infinitives does not arise when these are polysyllabic and receive stress beyond the first syllable ('other').
To sum up the results from GloWbE, first there is a clear distinction between the North American varieties, which strongly favor try to, and the rest. Second, there is a strong effect of a previous to on the choice of particle, which is only slightly reduced in varieties where try to shows near complete dominance in terms of frequency (i.e. it is highly conventionalized). Finally, the effect of 'latin' replicates the trend already observed in the ICE dataset, namely that try to is more likely to precede Latinate infinitives. The three-way interaction specified in the model to explore the effect of specific features of the following infinitive reveals a noteworthy trend that non-Latinate infinitives beginning with a vowel and receiving stress beyond the first syllable behave similarly to Latinate items as regards the alternation. Besides, the distinction between Latinate and non-Latinate seems to be neutralized for polysyllabic infinitives receiving stress after the first syllable (e.g. persuade vs understand).

BNC and COCA
Figure 11 reports the modeled effects obtained with the Bayesian method for the BNC and COCA datasets. The reference levels for the independent categorical variables are as follows: 'to_before' = no, 'mode' = spoken, 'latin' = no, 'fol_sound' = consonant, 'stress' = first. 19 Figure 11 suggests a number of differences between the two datasets: try to is the dominant variant in COCA, while the alternation is more balanced in the BNC. Horror aequi has a fairly robust effect in both datasets, i.e. it favors try and, but the effect is stronger in the BNC. There is a distinct effect of 'mode' in the BNC, such that try to is preferred in writing, but this effect emerges much more weakly in the case of COCA: try to is the default variant in American English regardless of mode. This is more clearly illustrated in the right panel of figure 12, which offers a joint visualization from both models of the interaction between horror aequi ('to_before') and 'mode'.
Figure 12 illustrates one further important difference between the two varieties: although the effect of horror aequi is fairly similar in terms of probabilities in the written mode for both datasets, its influence is not as powerful in the spoken American variety (when compared to the larger difference in probabilities observed in spoken British English). Note that, in the right panel, both SP(oken) estimates (for 'to_before' = no and 'to_before' = yes) are much closer, and the estimate for 'SP' ranks higher than the estimate of 'WR' when horror aequi is in play (i.e. 'to_before' = yes).
19 The brm() models for the BNC and COCA datasets were specified as follows: BNC: construction ~ to_before * mode + latin + fol_sound + stress + (1 | verb); COCA: construction ~ to_before * mode + latin * mode + stress * fol_sound * mode + (1 | verb). The C-values are 0.851 (BNC) and 0.711 (COCA). The four Markov chains (at 4,000 iterations) converge and R-hat is 1.00 in both cases; PSIS Leave One Out Cross Validation suggests that both models are a good fit: all Pareto k estimates are good (k < 0.5, BNC) and OK (k < 0.7 on 500 posterior samples, COCA).
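The model notes in this section judge fit by the Pareto k estimates from PSIS leave-one-out cross-validation, with k < 0.5 read as good and k < 0.7 as OK. The sketch below applies those two cut-offs to a list of k values. It is an illustration in Python rather than the R routine used in the study; the label "problematic" for k of 0.7 or more, the function names and the toy values are assumptions introduced here.

```python
from collections import Counter


def pareto_k_label(k):
    """Label one Pareto k estimate with the cut-offs quoted in the text."""
    if k < 0.5:
        return "good"
    if k < 0.7:
        return "ok"
    # Assumption: the text names only 'good' and 'OK'; this label is ours.
    return "problematic"


def summarize_pareto_k(k_values):
    """Count the labels and report whether every estimate is below 0.7."""
    counts = Counter(pareto_k_label(k) for k in k_values)
    return {
        "good": counts["good"],
        "ok": counts["ok"],
        "problematic": counts["problematic"],
        "acceptable": counts["problematic"] == 0,
    }


# Hypothetical k estimates for five observations (not from the study).
toy_k = [0.12, 0.48, 0.55, 0.69, 0.31]
print(summarize_pareto_k(toy_k))
```

On this reading, a model passes the check when no observation falls in the last category, which is the situation reported for the GloWbE, BNC and COCA models.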
The effects connected to the properties of the following infinitive, i.e. 'latin', 'stress' and 'fol_sound', form a more complex net. In order to clarify how they interact in each dataset, figures 13 and 14 combine the marginal effects of these three predictors with 'mode' (cf. Lüdecke 2018). Some of the distinctions observed in these two figures are quite subtle. The effects of 'latin', 'stress' and 'fol_sound' do not materialize in BNC-written. All estimates in the right panels of figure 13 (almost) equally approach the 100 percent line; we see a ceiling effect caused by try to being the default option in BNC-written. However, the relevant effects of 'latin', 'stress' and 'fol_sound' in American writing (see figure 11) are perceptible in figure 14. The effect of 'latin' is illustrated in the lower panels: the four estimates for Latinate items ('latin' = yes) in COCA-written rank above the estimates for Latinate infinitives in COCA-spoken. This  figure  14 combines the three effects and, accordingly, exceeds a 95 percent probability of try to. Last, the upper right panel shows that the effects of 'stress' and 'fol_sound' also apply to non-Latinate items: the 'stress' = other estimates rank higher than the 'stress' = first estimates, and the 'stress' = other estimate combined with the effect of 'fol_sound' = vow/gl ranks the highest. It seems then that verbs such as forget, become, correct or support and (more clearly) achieve, avoid, escape or understand favor a preceding try to in American writing. This trend dovetails with the results observed in GloWbE as regards the interplay between 'latin', 'stress' and 'fol_sound' (see figure  10, right panel).
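The ceiling effect mentioned for BNC-written can be made concrete with a little arithmetic: in a logistic model, the same shift on the log-odds scale produces a smaller change in probability the closer the baseline is to 100 percent. The sketch below uses hypothetical baseline probabilities and a hypothetical shift of 1 log-odds unit; none of these numbers are estimates from the models reported here.

```python
import math


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def probability_gain(baseline_prob, logit_shift):
    """Change in probability when a fixed log-odds shift is added."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    base_logit = math.log(baseline_prob / (1.0 - baseline_prob))
    return inv_logit(base_logit + logit_shift) - baseline_prob


# Hypothetical baselines for 'try to' and a hypothetical predictor effect.
for baseline in (0.50, 0.90, 0.97):
    gain = probability_gain(baseline, 1.0)
    print(f"baseline {baseline:.2f}: gain {gain:.3f}")
```

A predictor of identical strength therefore leaves little visible trace in a plot of probabilities once the default variant is already close to categorical.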
As for the spoken mode (left panels in both figures), there is extensive alternation in BNC-spoken (40-70 percent probability of try to; figure 13) and a much more limited span of variation in COCA-spoken (87-93 percent probability of try to; figure 14). The effects of 'latin', 'stress' and 'fol_sound' are essentially offset in COCA-spoken, but the effect of 'stress' shows up as a fairly clear trend in BNC-spoken (figure 13, left panels): estimates for items stressed after the first syllable ('other') reach the 65 percent line for non-Latinate items beginning with a vowel (understand, avoid, agree), and they stretch beyond 70 percent for Latinate items beginning with a vowel (ensure, establish, improve).
Summing up the results for BNC and COCA, we see that try to is the dominant variant in the American English data, but there is a much more fluid alternation in the British data. The effect of horror aequi is very much alive in both datasets, and interacts differently with the alternation depending on 'mode': try to is consistently preferred regardless of mode in COCA when horror aequi is not a factor, and it is only slightly dispreferred in both writing and speaking when horror aequi comes into the picture. 20 Crucially, American speakers seem to monitor the alternation for horror aequi more closely in writing, which connects well with the large effect of 'to_before' observed in the US and Canada components of GloWbE (i.e. written materials from websites and blogs; see figure 9). Try to is also strongly favored in BNC-written, in spite of horror aequi, and try and is most clearly favored in BNC-spoken, especially when the following infinitive is non-Latinate, begins with a consonant, and is monosyllabic or stressed on the first syllable (get, make, find).

Discussion
This section discusses how the findings speak to the research questions laid out in section 1.

RQ1. Do the cotextual and contextual factors of variation previously attested for American, Canadian and British English hold across other ENL varieties?
The determining factors that have been most widely agreed on in the previous literature for American, British and Canadian English (namely register, variety, horror aequi and tense) also stand across other ENL varieties.
The results yield fairly strong empirical evidence as regards the contextual (register/stylistic) choices motivating the use of each variant: try to is favored in writing and before Latinate items, while try and emerges in speaking and before high-frequency monosyllabic non-Latinate items, and it acts as a buffer against horror aequi. 21 The three case studies presented (ICE, GloWbE, BNC vs COCA) also permit some fine-tuning of the varietal preferences reported in previous studies: it does seem to be the case that (a) try to is the (almost) default choice in North American varieties and (b) try and is highly conventionalized in British speech. However, the three snapshots of variation offered here suggest that labeling try and as a distinguishing feature of British English would be an oversimplification: the results attest wide cotextual and contextual competition between the two variants in ENL varieties where try and is conventionalized in speech (i.e. the British, Irish and both Australasian varieties).
20 The makeup of the two corpora might also account for these differences: the spoken section of COCA contains mostly public/broadcast speech, whereas the BNC includes a large portion of spontaneous conversation. It might be the case that the COCA-spoken data is closer to writing in terms of register.
21 Alternatively, it could be argued that the results do not reflect the work of horror aequi but rather the fact that try and is restricted to non-factive environments. To-infinitive clauses provide such an environment: He wants to try and read a book (cf. Ross 2013: 117-20).

The effect of horror aequi, which is discussed in more detail in RQ3 below, applies generally across varieties. This syntagmatic constraint regularly offsets the (contextual) preference for try to in written English or, to put it differently, counters the link between informality/colloquialness and the try and variant.
Last, the effect of tense is also corroborated in the first case study (ICE): infinitive/imperative try allows ampler particle alternation, while present/past forms favor to. The finding that the try and variant disfavors bare indicatives (present) but favors imperatives is in line with usage-based accounts of related constructions which are also subject to the bare stem condition: serial verb constructions such as go/come V inf (e.g. Everyday I go get the paper, Come see what you have done) show a strong distributional preference for imperatives, as well as a significant skew against bare indicatives (Flach 2017a: §5, 2017b). This suggests that the morphological constraint set by the bare stem condition might be connected to a semantic constraint, namely non-assertive (i.e. actions that are not yet actualized, but merely potential) and hortative-mandative contexts (cf. Flach 2015, 2017a, 2017b). In terms of language change, the fact that tensed bare forms of try and are disfavored in contemporary data attests to the development described in Ross (2013), namely that present tense try and had to slowly find its way into try-type pseudocoordination. The substantial difference reported for 'tense' suggests that a more advanced shift allowing for the conventionalization of inflected forms of try before and would still be far on the horizon. 22
RQ2. Are there any other language-internal factors, e.g. connected to the following infinitive, that play a role in the alternation? Could characteristics of the ensuing infinitive (e.g. its stress pattern, its initial sound, its Latinate or Germanic roots) have an effect on the alternation?
One of the contributions of this study is the detection of effects connected to the morphophonemic features of the ensuing infinitive. Firstly, the variable employed to identify Latin-based items suggests that the stylistic difference between the two alternants is sharper than previously reported. This finding, together with the results for 'mode', suggests that the main functional (and pragmatic) distinction between the two variants might be that try and specializes as a marker of a generally informal register, except in contexts where it might be used to mend a local repetition of to.
Secondly, there is an effect of the stress pattern of the ensuing infinitive (most clearly in GloWbE and COCA), such that polysyllabic items receiving stress after the first syllable (often beginning with a vowel; e.g. understand or avoid) are also more likely to be preceded by try to. This finding does not match the preliminary hypothesis that the principle of rhythmic alternation (Schlüter 2005, 2009) could account for the choice of particle. 23 This result does not at all imply that rhythmic accounts of morphosyntactic alternations are devoid of explanatory power. Rather, it indicates that the alternation in focus is probably not the most suitable environment to test the principle of rhythmic alternation: both to and and are unstressed syllables and have the potential to be syllabic 24 (when that can buffer a stress clash). Besides, even in the case of extreme reduction of and, /n/ is in the group of high-sonority consonants that can serve as syllable nucleus (and thus avoid a clash). With this rationale in mind, two possible alternative accounts are briefly discussed here as regards the observed effect of the 'stress' pattern of the following infinitive.
22 No instances of the sequence 'tries/tried and + V inf ' were found in any of the ICE components investigated. However, one clear instance of tried and V [past] was spotted in the BNC (i) and ambiguous instances (which could be interpreted as simple coordination) can also be easily found in GloWbE (ii):
(i) When he tried and saw the sky covered with rushing clouds, the lawn that had become a hay-field, the cedar's wheeling branches, the gun levelled, there would come an explosion in his memory like the firing of that shot-gun, a redness in front of his eyes with splintered edges, then black-out. [Fiction]
(ii) 'On the graphics side, that weather was difficult,' cos those guys were saying, 'No we can't push anymore effects or polygons, that place is already full.' But again we've got very, very smart people that tried and made it work'. [South Africa: General]
A first potential account appeals to the taxonomic/paradigmatic interference of the contracted variant 'tryna' (from trying to; cf. Lorenz & Tizón-Couto 2017) in tokens where the sequence try and occurs before an unstressed syllable. Language users might then (intuitively) sidestep sequences such as try and avoid/explain/understand/become since these are associated with a reduced variant of a closely related construction (V-to-V inf ) that marks informality (cf. Lorenz & Tizón-Couto 2017, 2020). This first account then speculates that such sequences might be dispreferred on the basis of the colloquialness evoked by a paradigmatically associated contraction. A second potential explanation could be that the sequence 'try and + unstressed syllable' does not offer a transparent cue to the word boundary of the infinitive, and this complicates lexical access for the hearer. This account assumes that strong syllables most typically mark the onset of lexical words (Cutler & Carter 1987; Cutler & Norris 1988). As Cutler (1992: 424) suggests, 'listeners treat any strong syllable as if it were highly likely to be word-initial' and, at the other end, 'speakers' clear speech strategies are indeed based on capacities commanded by listeners'. In the case of try and, a reduced/coalesced [ˈtrɑɪ(ə)n] plus an unstressed syllable would imply that one suprasegmental beat runs across two word boundaries, the second of which marks the beginning of a lexical item. As a means to strive for clarity, language users might then avoid sequences such as try and avoid [ˈtrɑɪnəˈvoɪd], try and explain [ˈtrɑɪnɪksˈpleɪn] or try and understand [ˈtrɑɪnʌndəˈstænd], which include three different words but only a single intermediate stress mark that does not correspond to either of the two word boundaries. What contradicts this conjecture is that we have not seen a truly solid effect of 'stress' in either British (BNC) or American speech (COCA), where sequential preferences such as those proposed by Cutler (1992) should be more conspicuous: the trend only applies to the written mode in the larger corpora investigated. In favor of this account, however, it might not be unreasonable to assume that, just as language users can control for horror aequi in writing, they might likewise be able to (more or less consciously) identify a sequence that is phonologically dispreferred and avoid it in writing.
23 This hypothesis predicted that try + and might be felt by hearers as a single strong syllable in pronunciation because of frequent coalescence in speech: [ˈtrɑɪ(ə)n]. As a result of this process, a following weak syllable would be more likely after try and to avoid a clash between two strong syllables. In contrast, because the /t/ sound at the boundary between the two words in try to makes it more difficult to join them into a single beat, the combination should be less likely to be perceived as one strong syllable. This would predict a following weak syllable to be more acceptable after try to.
24 See Ladefoged & Johnson (2015: 253-7) for syllabic or variably syllabic consonants.
It seems difficult, at this point, to advance beyond these two speculative accounts as regards the effect of the stress pattern (and the initial sound) of the following infinitive on this alternation, but its role merits further investigation (also in other alternating multiword sequences).
RQ3. What is the magnitude of the effect of horror aequi on this alternation? How does the degree of conventionalization of one or the other variant interact with horror aequi in each variety? Can the 'conserving effect' (Bybee 2006: 715) of frequency make a variant more resistant to the hold of horror aequi?
We have observed an effect of horror aequi in each variety investigated. However, the estimated effect does not emerge as robustly in North American varieties as it does in the rest of the ENL varieties. A relevant finding concerning the American variety (COCA) is that the central distinction for this alternation between the spoken and written mode (i.e. the 'contextual' distinction between the two variants) does not hold unless horror aequi is in play. Furthermore, the general trend that try and is more likely in the oral than in the written mode of the other varieties (e.g. ICE and BNC) is actually reversed in American English (COCA) when horror aequi comes into play. Taken together, these findings suggest (a) that the constraint set by horror aequi on the alternation is partly softened in American English due to the high degree of conventionalization of try to and (b) that try and is apparently not identified as a marker of informality in American English, but rather as a means to avoid horror aequi (especially in writing). The interpretation that try to is not identified, either, as a marker of register in North American varieties is supported by the lack of an effect of Latinate collocates observed in ICE-Canada (which includes speech; figure 3) and COCA-spoken (figure 14, left panels).
All in all, we see that the general effect of horror aequi is partly conditioned by the overall frequency of a variant, which is here associated with the concepts of entrenchment and conventionalization (Schmid 2015). In this vein, it might be argued that the degree of varietal conventionalization of a variant, and the 'cotextual' entrenchment of this sequence (possibly as a chunk; cf. Schmid 2015: 15) 25 in the minds of the speakers of a particular variety, also come into play as probabilistic factors in the alternation.
RQ4. Do ENL varieties share a core probabilistic representation of the alternation (and its associated constraints and preferences)?
This study has identified and substantiated a number of cross-varietal linguistic patterns associated with each variant. In other words, there are a number of language-internal and external constraints that distinctly apply to the alternation regardless of variety. Most of the cross-varietal factors observed appear to be based on general cognitive preferences and mechanisms (e.g. euphony, clarity at word boundaries), which would explain why they hold generally. An alternative possibility would be that some of these cross-varietal factors (e.g. register) are actually conventions which are deeply rooted in the language and have left a mark on all varieties (e.g. the effects of Latinate infinitives and mode on the alternation). Regardless of whether they are best defined as either general cognitive mechanisms or deep-seated conventions, they can be safely labeled as probabilistic factors.
Crucially, the cross-varietal patterns observed can be modified according to variety-specific conventions, which may override or reinforce them to some extent: the effects of cross-varietal constraints and preferences might be modulated (i.e. partly amplified or overridden) on the basis of the degree of conventionalization of a variant in a specific variety. Thus, the higher degree of 'contextual' entrenchment 26 and conventionalization of the try and variant in spoken Australian, British, Irish and New Zealand English most likely strengthens the perception of try to as the more formal variant to be used in writing. Similarly, the higher degree of 'cotextual' entrenchment and conventionalization of try to in American English might be responsible for (at least partly) counterweighing the otherwise consistent cross-varietal effects of horror aequi and register. This trend is in line with a usage-based perspective: when a multiword sequence is more automatized and entrenched in the language users' memory, cotextual and contextual factors that disfavor it might go more easily unnoticed (or be more easily disregarded).
From a usage-based perspective, words and sequences are stored and processed in an activation network, and lexical access can be seen as a competition between alternative expressions (cf. Bybee 1995; Diessel 2019, among others). Notably, frequency has been shown to have an effect on lexical access, such that 'frequent items are more easily accessed, or activated, than infrequent ones' (Diessel 2019: 201). All in all, the results obtained suggest that the unconscious decision-making process of language use must be determined by (probabilistic) language-internal and language-external factors that apply across varieties, but also by the repeated usage events experienced by speakers of different varieties.

Conclusion
This article has tried to hone the description of a fairly enigmatic alternation in native varieties of English. From a multifactorial usage-based perspective, this study has reviewed the determinants of particle alternation after bare-form (i.e. uninflected) try in varieties where English is native.
The study has demonstrated that general probabilistic factors compete in this specific alternation: e.g. horror aequi offsets the general predisposition of bare-form try to be followed by to in written English. Thus, although there is a cross-varietal connotation of informality generally attached to try and (in 'contextual' competition with try to), the tolerable/practical function of and within the 'to try + particle + V inf ' sequence in writing (i.e. rescuing the sequence from triggering horror aequi) might not allow for an outright contextual link in the overall network between 'informality' and try and.
The study has illustrated that varieties can show fairly concurrent variant patterns, but also that general probabilistic factors might be lessened by the higher degree of conventionalization of a variant in a particular variety. One of the things we learn from revisiting this alternation is that the high frequency of a variant cannot shield it from online constraints (horror aequi) and/or stylistic preferences (register). However, the higher level of activation in the network (i.e. the minds of the language users of a particular variety) can make the sequence more resilient to certain general constraints: in American English, a highly conventionalized alternant such as try to is less persistently tied to the need to license euphony (horror aequi) and less prone to be interpreted as a marker of register. This suggests a worthwhile connection between the cotextual automatization/entrenchment of a multiword sequence and its ability to (at least partly) override probabilistic syntagmatic and contextual constraints or inclinations in language use.
The study has isolated some relatively novel factors connected to the morphophonemic features of the following infinitive which were previously undiscussed: there is a clear effect of (non-)Latinate infinitives, while the effects of the stress pattern and the initial sound of the infinitive deserve further validation. The findings for 'stress' and 'following sound' run counter to the initial hypotheses that an infinitive with first-syllable stress might favor try and (rhythmic alternation) or that an infinitive beginning with a vowel/glide might be dispreferred after try to (optimal syllable structure), but they suggest that clarity at word boundaries might play a role in the realization of multiword sequences allowing alternation.
In an ideal research cycle, corpus studies on usage data like the one presented here represent an inductive stage to generate or refine hypotheses on the underlying mental representations of linguistic expressions. Experimental methods are most useful at deductive stages, where specific assumptions can be tested with a narrow focus on a particular variable and control over confounding factors. Experimental methods involving production (e.g. sentence shadowing) or perception (e.g. word monitoring) should prove helpful in order to further examine some of the threads laid out here as regards the status of alternating multiword sequences. For instance, provided that sequential relations between lexemes and constructions are forward looking (Diessel 2019: 15), one might wonder whether hearers will be able to predict that particle to is significantly more likely to follow a tensed, yet bare/uninflected, form of try. Likewise, if the effect of horror aequi is as solid as suggested by the results obtained from the corpora, hearers of most varieties should also be able to predict that try and is generally more likely after a preceding to. Testing such probabilistic trends experimentally would provide information as to whether users (unconsciously) take note of, and can anticipate, the cotextual and contextual preferences of alternating multiword sequences. If combinatory preferences of this kind can be shown to be part of language users' stored knowledge of multiword variants, this would bring their mental representations closer to the account traditionally offered for lexical variants (vs syntactic ones).