
Probabilistic indigenization effects at the lexis–syntax interface



Szmrecsanyi et al. (2016) define probabilistic indigenization as the process whereby probabilistic constraints shape variation patterns in different ways, which eventually leads to more heterogeneity in the constraints governing syntactic variation across different varieties of English. The present study extends our knowledge of the heterogeneity of probabilistic grammars by sketching a corpus-based variationist method for calculating the similarity between varieties, drawing inspiration from the comparative sociolinguistics literature. Based on linguistic material from the International Corpus of English, we ascertain the degree of regional variability of five probabilistic constraints on the genitive, dative, particle placement and subject pronoun omission alternations across three varieties of English, namely British, Indian and Singapore English. Our results indicate that, of the four alternations under study, the genitive alternation is the most homogeneous from a regional perspective, followed, in increasing order of heterogeneity, by the subject pronoun omission, dative and particle placement alternations. On the basis of these findings, we evaluate the claim in the literature that the extent of probabilistic indigenization is proportional to the lexical specificity of the syntactic phenomenon under study, a hypothesis that is borne out by our data.




Generous financial support from the following institutions is gratefully acknowledged: Regional Government of Galicia (grants no. ED431B 2017/12 and ED431D 2017/09); Spanish Ministry of Innovation, Science and Universities (grants no. FFI2017-86884-P, FFI2014-52188-P and BES-2015-071233); European Regional Development Fund; and the Research Foundation Flanders (grant no. G.0C59.13N). We would further like to express our gratitude to the editors and copy-editors of English Language and Linguistics, and three anonymous reviewers for their helpful suggestions. Thanks are also due to Benedikt Szmrecsanyi and Daniela Pettersson-Traba for their valuable comments on earlier versions of this article. The usual disclaimers apply.



Davies, Mark. 2013. Corpus of Global Web-Based English: 1.9 Billion Words from Speakers in 20 Countries (GloWbE). https// (accessed 11 April 2018).
ICE-GB: International Corpus of English – The British Component. (accessed 11 April 2018).
ICE-IND: International Corpus of English – The Indian Component. (accessed 11 April 2018).
ICE-SIN: International Corpus of English – The Singaporean Component. (accessed 11 April 2018).
Bates, Douglas, Maechler, Martin, Bolker, Ben & Walker, Steve. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1), 1–48.
Bernaisch, Tobias, Gries, Stefan Th. & Mukherjee, Joybrato. 2014. The dative alternation in South Asian English(es): Modelling predictors and predicting prototypes. English World-Wide 35(1), 7–31.
Bresnan, Joan. 2007. Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In Featherston, Sam & Sternefeld, Wolfgang (eds.), Roots: Linguistics in search of its evidential base, 75–96. Berlin: Mouton de Gruyter.
Bresnan, Joan, Cueni, Anna, Nikitina, Tatiana & Baayen, R. Harald. 2007. Predicting the dative alternation. In Bouma, Gerlof, Krämer, Irene & Zwarts, Joost (eds.), Cognitive foundations of interpretation, 69–94. Amsterdam: Royal Netherlands Academy of Science.
Bresnan, Joan & Hay, Jennifer. 2008. Gradient grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua 118(2), 245–59.
Bresnan, Joan & Ford, Marilyn. 2010. Predicting syntax: Processing dative constructions in American and Australian varieties of English. Language 86(1), 168–213.
De Cuypere, Ludovic & Verbeke, Saartje. 2013. Dative alternation in Indian English: A corpus-based analysis. World Englishes 32(2), 169–84.
Gelman, Andrew. 2008. Scaling regression inputs by two standard deviations. Statistics in Medicine 27(15), 2865–73.
Gerwin, Johanna & Röthlisberger, Melanie. To appear. Dialectal ditransitive patterns in British English: Weighing sociolinguistic factors against language-internal constraints. In Röthlisberger, Melanie, Zehentner, Eva & Colleman, Timothy (eds.), Ditransitive constructions in Germanic languages: Diachronic and synchronic aspects (Studies in Germanic Linguistics). Amsterdam and Philadelphia: John Benjamins.
Grafmiller, Jason. 2014. Variation in English genitives across modality and genres. English Language and Linguistics 18(3), 471–96.
Grafmiller, Jason & Szmrecsanyi, Benedikt. 2018. Mapping out particle placement in Englishes around the world: A case study in comparative sociolinguistic analysis. Language Variation and Change 30(3), 385–412.
Gries, Stefan Th. 2003. Multifactorial analysis in corpus linguistics: A study of particle placement. New York: Continuum.
Hajjem, Ahlem, Bellavance, François & Larocque, Denis. 2014. Mixed-effects random forest for clustered data. Journal of Statistical Computation and Simulation 84(6), 1313–28.
Harrell, Frank E. Jr. 2014. Hmisc: Harrell miscellaneous. R Package Version 3.14-6. (accessed 17 September 2018).
Heller, Benedikt. 2018. Stability and fluidity in syntactic variation world-wide: The genitive alternation across varieties of English. PhD dissertation, KU Leuven.
Heller, Benedikt, Szmrecsanyi, Benedikt & Grafmiller, Jason. 2017. Stability and fluidity in syntactic variation world-wide: The genitive alternation across varieties of English. Journal of English Linguistics 45(1), 3–27.
Hinrichs, Lars & Szmrecsanyi, Benedikt. 2007. Recent changes in the function and frequency of Standard English genitive constructions: A multivariate analysis of tagged corpora. English Language and Linguistics 11(3), 437–74.
Hoffmann, Thomas. 2014. The cognitive evolution of Englishes: The role of constructions in the Dynamic Model. In Buschfeld, Sarah, Hoffmann, Thomas, Huber, Magnus & Kautzsch, Alexander (eds.), The evolution of Englishes: The Dynamic Model and beyond, 160–80. Amsterdam and Philadelphia: John Benjamins.
Hosmer, David W. & Lemeshow, Stanley. 2000. Applied logistic regression. New York: Wiley.
Hothorn, Torsten, Buehlmann, Peter, Dudoit, Sandrine, Molinaro, Annette & Van Der Laan, Mark. 2006. Survival ensembles. Biostatistics 7(3), 355–73.
Hundt, Marianne, Röthlisberger, Melanie & Seoane, Elena. To appear. Predicting voice alternation across academic Englishes. Corpus Linguistics and Linguistic Theory.
Kachru, Yamuna. 2006. Hindi. Amsterdam and Philadelphia: John Benjamins.
Labov, William. 1972. Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Leimgruber, Jakob R. E. 2013. Singapore English: Structure, variation, and usage. Cambridge and New York: Cambridge University Press.
Levin, Beth. 1993. English verb classes and alternations: A preliminary investigation. Chicago: University of Chicago Press.
Levshina, Natalia. 2015. How to do linguistics with R: Data exploration and statistical analysis. Amsterdam and Philadelphia: John Benjamins.
Li, Charles N. & Thompson, Sandra A. 1989. Mandarin Chinese: A functional reference grammar. Berkeley, Los Angeles and London: University of California Press.
Mesthrie, Rajend & Bhatt, Rakesh M. 2008. World Englishes: The study of new linguistic varieties. Cambridge and New York: Cambridge University Press.
Mukherjee, Joybrato. 2007. Steady states in the evolution of New Englishes: Present-day Indian English as an equilibrium. Journal of English Linguistics 35(2), 157–87.
Mukherjee, Joybrato & Hoffmann, Sebastian. 2006. Describing verb-complementational profiles of New Englishes: A pilot study of Indian English. English World-Wide 27(2), 147–73.
Poplack, Shana & Tagliamonte, Sali A. 2001. African American English in the diaspora. Oxford: Blackwell.
R Core Team. 2017. R: A Language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Rosenbach, Anette. 2002. Genitive variation in English: Conceptual factors in synchronic and diachronic studies. Berlin: Mouton de Gruyter.
Rosenbach, Anette. 2003. Aspects of iconicity and economy in the choice between the s-genitive and the of-genitive in English. In Rohdenburg, Günter & Mondorf, Britta (eds.), Determinants of grammatical variation in English, 379–411. Berlin: Mouton de Gruyter.
Rosenbach, Anette. 2014. English genitive variation – The state of the art. English Language and Linguistics 18(2), 215–62.
Rosenbach, Anette. 2017. Constraints in contact: Animacy in English and Afrikaans genitive variation – A cross-linguistic perspective. Glossa: A Journal of General Linguistics 2(1), 72. 1–21.
Röthlisberger, Melanie. 2018. Regional variation in probabilistic grammars: A multifactorial study of the English dative alternation. PhD dissertation, KU Leuven.
Röthlisberger, Melanie, Grafmiller, Jason & Szmrecsanyi, Benedikt. 2017. Cognitive indigenization effects in the English dative alternation. Cognitive Linguistics 28(4), 673–710.
Schneider, Edgar W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79(2), 233–81.
Schneider, Edgar W. 2007. Postcolonial English: Varieties around the world. Cambridge: Cambridge University Press.
Sharma, Devyani. 2012. Indian English. In Kortmann, Bernd & Lunkenheimer, Kerstin (eds.), The Mouton world atlas of variation in English, 523–30. Berlin and Boston: Mouton de Gruyter.
Speiser, Jaime Lynn, Wolf, Bethany J., Chung, Dongjun, Karvellas, Constantine J., Koch, David G. & Durkalski, Valerie L. 2019. BiMM forest: A random forest method for modeling clustered and longitudinal binary outcomes. Chemometrics and Intelligent Laboratory Systems 185. doi: 10.1016/j.chemolab.2019.01.002
Stefanowitsch, Anatol & Gries, Stefan Th. 2003. Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics 8(2), 209–43.
Strobl, Carolin, Boulesteix, Anne-Laure, Zeileis, Achim & Hothorn, Torsten. 2007. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 8(25). (accessed 17 September 2018).
Strobl, Carolin, Boulesteix, Anne-Laure, Kneib, Thomas, Augustin, Thomas & Zeileis, Achim. 2008. Conditional variable importance for random forests. BMC Bioinformatics 9(307). (accessed 17 September 2018).
Szmrecsanyi, Benedikt, Grafmiller, Jason, Heller, Benedikt & Röthlisberger, Melanie. 2016. Around the world in three alternations: Modeling syntactic variation in varieties of English. English World-Wide 37(2), 109–37.
Szmrecsanyi, Benedikt, Grafmiller, Jason, Bresnan, Joan, Rosenbach, Anette, Tagliamonte, Sali A. & Todd, Simon. 2017. Spoken syntax in a comparative perspective: The dative and genitive alternation in varieties of English. Glossa: A Journal of General Linguistics 2(1), 86. 1–27.
Szmrecsanyi, Benedikt, Grafmiller, Jason & Rosseel, Laura. MS. Variation-based distance and similarity modeling: A case study in World Englishes. Unpublished manuscript.
Szmrecsanyi, Benedikt & Hinrichs, Lars. 2008. Probabilistic determinants of genitive variation in spoken and written English: A multivariate comparison across time, space, and genres. In Nevalainen, Terttu, Taavitsainen, Irma, Pahta, Päivi & Korhonen, Minna (eds.), The dynamics of linguistic variation: Corpus evidence on English past and present, 291–309. Amsterdam: John Benjamins.
Tagliamonte, Sali A. 2002. Comparative sociolinguistics. In Chambers, J. K. & Schilling, Natalie (eds.), The handbook of language variation and change, 729–63. Oxford: Wiley-Blackwell.
Tagliamonte, Sali A. & Baayen, R. Harald. 2012. Models, forests and trees of York English: Was/were variation as a case study for statistical practice. Language Variation and Change 24(2), 135–78.
Tamaredo, Iván. 2018. Processing grammatical structures: Morphosyntactic complexity and efficiency in varieties of English around the world, with special reference to pronoun omission. PhD dissertation, University of Santiago de Compostela.
Torres Cacoullos, Rena & Travis, Catherine E. 2014. Prosody, priming and particular constructions: The patterning of English first-person singular subject expression in conversation. Journal of Pragmatics 63, 19–34.
Wolk, Christoph, Bresnan, Joan, Rosenbach, Anette & Szmrecsanyi, Benedikt. 2013. Dative and genitive variability in Late Modern English: Exploring cross-constructional variation and change. Diachronica 30(3), 382–419.

