1. Introduction
Two directions of nasalisation – regressive (leftward) and progressive (rightward) – are attested in Paraguayan Guaraní (henceforth PG; ISO: gug), a Tupí–Guaraní language spoken by around 6.5 million people, primarily in Paraguay. Segments preceding a stressed nasal vowel are anticipatorily nasalised: this is visible in (1a), where nasality spreads leftward from the root-final stressed vowel /ẽ/ in the root ‘arrive’. Nasalisation of a suffix or enclitic is also attested, triggered by the same root-final stressed vowel: the enclitic meaning ‘until’ surfaces with an initial nasal in (1a), as opposed to the initial voiceless stop of the same morpheme in an oral context like (1b).Footnote 1
Drawing on data like that in (1), some previous work has claimed that PG represents a case of bidirectional harmony (Lunt Reference Lunt and Anderson1973; Goldsmith Reference Goldsmith1976). Others have noted, however, that progressive nasalisation seems morpheme-specific (Lapierre & Michael Reference Lapierre and Michael2018; Estigarribia Reference Estigarribia2021): I investigate these conflicting assessments here. This work also addresses the interactions of etymological origin and application of nasalisation. Spanish-origin roots are extremely frequent in PG and, due to differences in phonotactic constraints between Spanish and PG, provide an interesting lens through which to view PG nasalisation patterns. Regressive nasal harmony can be triggered by a nasal consonant within a Spanish-origin root (Russell Reference Russell2022a); however, the question of whether Spanish-origin roots may trigger progressive nasalisation as well has not yet been addressed in the literature.
In this article, I tackle the question of whether regressive and progressive nasalisation processes in PG are in fact distinct, and if so, what different mechanisms underlie the two types of nasalisation. I examine the two directions of nasalisation and their interactions with root etymological origin through a corpus study of actual application rates, presenting the first quantitative study of nasalisation rate in PG. I find that direction does in fact affect application of nasalisation processes, and that nasalisation rate is influenced by different factors for each direction of nasalisation. Variation in the regressive nasalisation triggered by a root is influenced by factors including the morphosyntactic status of the affix and the token frequency of the root. Crucially, the process of regressive nasal harmony has been extended to Spanish-origin items, to the point that etymological origin of the root is not a significant predictor of application of nasal harmony. However, root etymological origin does greatly impact progressive nasalisation rate. I propose that regressive nasalisation is best analysed as (productive) nasal harmony, while progressive nasalisation represents a case of morpheme-specific phonologically conditioned suppletive allomorphy.
This article is structured as follows: in §2, I provide background on PG and its morphophonology. I then present an overview of nasality and nasalisation in PG in §3, including the two nasalisation processes in question: regressive nasal harmony and morpheme-specific progressive nasalisation. I introduce their interactions with Spanish-origin items in §4. Next, I present and briefly discuss the findings of a quantitative corpus study of nasalisation in PG in §5. I connect these findings to the larger picture of PG nasalisation and variation in vowel harmony cross-linguistically in §6, then briefly conclude in §7. Data presented here, unless otherwise cited, come from two sources: elicitation data collected remotely with two native speakers of PG (Gómez et al. Reference Gómez, Ovelar, Bossi, Dabkowski, Drummond, Grabowski, Jarvis, Khuu, Michael and Russell2020), and a corpus of sociolinguistic interviews conducted in Paraguay (Bittar Prieto Reference Bittar Prieto2021).
2. Background
PG is one of the most widely spoken Indigenous languages of the Americas: indeed, Paraguay is the only American nation where an Indigenous language has survived as a majority language spoken by the non-Indigenous population (Estigarribia Reference Estigarribia2020). Spanish colonisers arrived in what is now Paraguay in the early 1500s, but never had a large administrative presence and did not enforce the exclusive use of Spanish. The few Spanish men that lived in the region generally married ethnically Guaraní women, who raised their children speaking Guaraní (Morínigo Reference Morínigo1959). The use of the Guaraní language has been a strong symbol of national identity for centuries, particularly as a way for Paraguayans to differentiate themselves from their neighbours in times of conflict (Estigarribia Reference Estigarribia2015). Today, PG is an official language of Paraguay, along with Spanish (Constitución Política de la República de Paraguay, article 140). It has been present in all schools in the country, either as the primary language of instruction or as a separate subject, since 1994.
2.1. PG morphophonology
PG is an agglutinative language; many different morphemes are expressed as prefixes or suffixes within the verbal complex (Ayala Reference Ayala1996; Hamidzadeh & Russell Reference Hamidzadeh, Russell, Weber and Chen2014; Zubizarreta Reference Zubizarreta2022). The full verbal complex template is shown in (2):
This order of elements in the verbal complex interacts with morphophonological processes in the language. The root and its prefixes form a prosodic word, which is the unit of nasalisation and stress assignment (Russell Reference Russell, Elkins, Hayes, Jo and Siah2022b), while suffixes constitute separate prosodic words of their own (Dąbkowski Reference Dabkowski, Jurgec, Duncan, Elfner, Kang, Kochetov, O’Neill, Ozburn, Rice, Sanders, Schertz, Shaftoe and Sullivan2022).
2.1.1. Agreement
PG makes use of two sets of agreement morphology, which I term ‘Set A’ and ‘Set B’ here. In Table 1, I present the full independent form of each subject pronoun, followed by the corresponding Set A and B forms. Set A agreement has been previously analysed as true phi-agreement, as opposed to the Set B clitics (Woolford Reference Woolford, Legendre, Putnam, de Swart and Zaroukian2016; Zubizarreta & Pancheva Reference Zubizarreta and Pancheva2017).
Table 1 PG pronoun paradigm.

Set A agreement is used when agreement is controlled by the transitive subject (as in (3a)) or active intransitive subject (as in (3b)).
Set B forms are used when agreement is controlled by the transitive object (as in (4a)) or stative intransitive subject (as in (4b)). The system of argument cross-reference in PG is a case of hierarchical agreement: subject and object cross-referencing morphemes compete for a single slot, and the winner is the form higher on the person hierarchy (Velázquez-Castillo Reference Velázquez-Castillo1991; Nichols Reference Nichols1992; Siewierska Reference Siewierska1996). Additionally, Set B forms are used in possessive contexts, as in (4c).
2.1.2. Stress
Primary stress is systematically assigned to the final syllable of the prosodic word. A root and its prefixes constitute a single prosodic word: a root-final syllable receives primary stress, as in (5), with a few lexical exceptions (Gregores & Suárez Reference Gregores and Suárez1967).
Suffixes in PG fall into one of two domains, classified by whether the suffix can bear stress (Mistieri Reference Mistieri2013). This is a property of the suffix itself: for instance, the totalitative suffix in (6a) is stress-bearing, while the future suffix in (6b) is not. As exemplified by the final syllable of the root ‘walk’ in (6a), the final syllable of an embedded prosodic word is realised with secondary stress, even when this results in a surface stress clash.
The ability to bear stress is not strictly conditioned by semantics; rather, whether a suffix bears stress is attributable chiefly to its historical origin. Stress-bearing suffixes can generally be reconstructed as having coda consonants in Proto-Tupí–Guaraní, while non-stress-bearing suffixes ended in open syllables (Jensen Reference Jensen, Derbyshire and Pullum1998; Russell Reference Russell, Marianne, Reisinger and Underhill2023). All stress-bearing suffixes linearly precede all non-stress-bearing suffixes and enclitics, and primary stress is assigned to the final syllable within the domain of stress-bearing suffixes.
3. Nasality and nasalisation in Paraguayan Guaraní
3.1. Nasality
PG distinguishes six vowel qualities, as shown in Figure 1. Vowel length is not contrastive. All oral vowels have nasal counterparts.

Figure 1 Paraguayan Guaraní phonemic vowel inventory.
Vowel nasality is contrastive only if the vowel is stressed (Barratt Reference Barratt1981; Walker Reference Walker1999). Typically, the vowel which receives the primary stress within a lexical item is the root-final vowel (Adelaar Reference Adelaar1994; Cabral & Rodrigues Reference Cabral and Rodrigues2011; Mistieri Reference Mistieri2013). Preceding a stressed root-final nasal vowel, all previous vowels within the word also predictably surface as nasal. I interpret this generalisation as evidence that only a vowel which receives stress may be specified phonologically as either oral or nasal: all other vowels within the phonological word are underlyingly unspecified for nasality, and predictably surface as either nasal or oral due to the influence of the stressed vowel. Therefore, the minimal pair [tuˈpa] ‘bed’ vs. [tũˈpã] ‘God’ exists in the language, but there are no possible counterparts like *[tũˈpa] or *[tuˈpã], in which nasality is contrastive on an unstressed, non–root-final vowel.
The inventory of consonants in PG, including those of ‘mixed’ articulation, is represented in Table 2. Notably, there are no voiced stops in the inventory. However, segments of ‘mixed’ articulation, in which a consonant begins as a nasal and ends as a voiced stop, are present in PG. The status of these ‘mixed’ segments has long been debated in the Tupí–Guaraní literature. Analyses vary between arguing that they are the surface result of the pre-nasalisation of oral stops (Gregores & Suárez Reference Gregores and Suárez1967; Rose Reference Rose2008; Daviet Reference Daviet2016) versus the post-oralisation of nasal consonants (Piggott Reference Piggott1992; Cardoso Reference Cardoso2009; Lapierre & Michael Reference Lapierre and Michael2018; Estigarribia Reference Estigarribia2021). I take the latter position, that these ‘mixed’ segments are post-oralised allophones of nasal consonants. The surface form of an underlying nasal consonant is determined by its immediate phonological environment: before an oral vowel, a nasal stop surfaces as its partially oralised allophone, as the result of shielding (Stanton Reference Stanton2017). Before a nasal vowel, a nasal consonant surfaces faithfully as nasal, while an underlying oral approximant surfaces as nasalised. Consonants alternate as in Table 3.
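The post-oralisation analysis adopted above can be sketched as a small lookup, offered only as an illustration. The mapping below is a hypothetical, partial one (/m/ to [mb], /n/ to [nd]) that I assume for the example; the function names are likewise invented, and the full set of alternations is the one given in Table 3.

```python
import unicodedata

# Hypothetical, partial mapping for illustration only; the article's
# Table 3 is the authoritative list of alternations.
POST_ORALISED = {"m": "mb", "n": "nd"}

COMBINING_TILDE = "\u0303"


def is_nasal_vowel(vowel: str) -> bool:
    """True if the vowel symbol carries a tilde (checked after NFD decomposition)."""
    return COMBINING_TILDE in unicodedata.normalize("NFD", vowel)


def realise_nasal(consonant: str, following_vowel: str) -> str:
    """Surface form of an underlying nasal consonant before a given vowel."""
    if is_nasal_vowel(following_vowel):
        # Before a nasal vowel, the nasal consonant surfaces faithfully.
        return consonant
    # Before an oral vowel, shielding yields the partially oralised allophone.
    return POST_ORALISED.get(consonant, consonant)


print(realise_nasal("m", "a"))       # underlying /ma/
print(realise_nasal("m", "\u00e3"))  # underlying /mã/
```

Decomposing to NFD before checking for the tilde lets the same test cover vowels without a precomposed nasal character, such as nasal /ɨ/.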
Table 2 Paraguayan Guaraní consonant inventory, including phonemes and mixed-articulation allophones.

Table 3 Allophonic consonant alternations before oral vs. nasal vowels.

3.2. Nasalisation
Both regressive (leftward) and progressive (rightward) spreads of nasalisation are attested in PG. I argue that these two types of nasalisation differ in important ways. In this study, I investigate what conditions variation in nasalisation, and how that variation is connected to the direction of nasalisation.
3.2.1. Regressive nasal harmony
Many languages of the Tupí–Guaraní family, including PG, exhibit long-distance regressive nasal harmony systems (Lapierre & Michael Reference Lapierre and Michael2018; Miranda Reference Miranda2018; Baraúna Reference Baraúna2020; Miranda & Picanço Reference Miranda and Picanço2020). The harmony system in PG has been the subject of extensive description and analysis in the theoretical literature for decades (Gregores & Suárez Reference Gregores and Suárez1967; Goldsmith Reference Goldsmith1976; Piggott Reference Piggott1992; Steriade Reference Steriade, Huffman and Krakow1993; Beckman Reference Beckman1998; Walker Reference Walker1999, Reference Walker2000; Kaiser Reference Kaiser2008). Nasality spreads leftwards from a phonemic nasal vowel, and the domain of nasal harmony is the root and its prefixes (Lapierre & Michael Reference Lapierre and Michael2018). The effects of nasal harmony are clear from the juxtaposition of (7a) and (7b). The two sentences differ in whether the root – ‘know’ in (7a) vs. ‘hug’ in (7b) – contains a phonemic nasal vowel, which in turn results in different surface forms for all the other morphemes within the word.
The example in (7b) demonstrates that the presence of a phonemic nasal vowel results in nasalisation of segments to its left within the word. In PG, a phonemic nasal consonant additionally triggers regressive nasalisation, as in (8). I assume that the underlying form of the root ‘listen’ is /h-enu/, with a nasal consonant /n/ and oral vowel /u/: the underlying nasal consonant surfaces as post-oralised before the oral vowel, and spreads its nasality leftward. As a result, prefixes before a root containing a nasal consonant surface as nasalised, as in (8). All prefixes show predictable effects of regressive nasal harmony.
The voiceless obstruents /p/, /t/, /k/, /h/ and /
/ are transparent to regressive nasal harmony, as visible in the roots ‘cut’ (9a) and ‘hit’ (9b), while all other segments in the inventory show surface effects of nasalisation.Footnote 2 No consonants in the inventory block the spread of nasality.
Just as a nasal consonant in a root triggers nasalisation of segments to its left, so does a nasal consonant in a prefix, like the initial segment of the causative prefix in (10a). However, a nasal vowel or consonant in a suffix does not trigger the nasalisation of material earlier in the word. For instance, the frustrative suffix contains a final nasal vowel /ã/, which triggers nasalisation within the suffix itself but not beyond, as in (10b).Footnote 3
The presence of a suffix with an initial alveolar flap and following nasal vowel exceptionally results in the spreading of nasality one segment to the left:Footnote 4 in (11), for instance, the final vowel /o/ of the root ‘wife’ surfaces as phonetically nasal, but does not trigger nasalisation of segments further to the left. I assume that this surface form is articulatorily, rather than phonologically, driven.
In summary, nasality spreads from a phonemic nasal vowel or consonant within the phonological word in PG. This nasal harmony process applies from right to left, as it affects segments to the left of the trigger. Regressive nasal harmony targets all segments except voiceless obstruents, which are transparent to harmony.
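The summary above can be restated as a short procedure, given here as a minimal sketch rather than a full analysis. It assumes that the input is the harmony domain only (the root and its prefixes, already segmented), that the set of transparent segments is the illustrative subset /p t k h/, and that the function name and the example string are hypothetical.

```python
# Illustrative subset of the voiceless obstruents; not an exhaustive inventory.
TRANSPARENT = {"p", "t", "k", "h"}


def harmony_targets(segments: list[str], trigger: int) -> list[int]:
    """Return indices of segments that undergo regressive nasal harmony.

    `segments` is the prosodic word (prefixes + root) as a list of segments;
    `trigger` is the index of the phonemic nasal vowel or consonant.
    Every segment to the left of the trigger is a target unless it is
    transparent; no segment blocks spreading, so the scan never stops early.
    """
    return [i for i in range(trigger) if segments[i] not in TRANSPARENT]


# Hypothetical five-segment string with a final nasal vowel as trigger.
word = ["o", "p", "a", "r", "\u00e3"]
print(harmony_targets(word, trigger=4))
```

Because suffixes fall outside the prosodic word, they are simply not passed to the function; the sketch also treats harmony as categorical and so does not model the variation examined in §5.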
3.2.2. Progressive nasalisation
Nasalisation in PG also has a limited rightward spread. Following a stressed nasal vowel, the locative enclitic /=pe/ may undergo nasalisation to surface as [=mẽ], as illustrated in (12).
While regressive nasalisation spreads throughout roots and all prefixes, progressive nasalisation occurs only with a very select group of targets (Lunt Reference Lunt and Anderson1973; Humbert & Piggott Reference Humbert, Piggott, Booij and van de Weijer1997). In his grammar of PG, Estigarribia (Reference Estigarribia2020) lists 11 suffixes and enclitics affected by progressive nasalisation: each morpheme has one allomorph following an oral vowel, and a different (nasal) allomorph following a nasal vowel; these are shown in Table 4. I find that only five of these 11 suffixes productively undergo nasalisation for the speakers with whom I worked, listed as ‘productive’ in Table 4.Footnote 5 The productive morphemes include the totalitative (total), which indicates application and/or exhaustion of a given event to the whole object or subject (Estigarribia Reference Estigarribia2020); the locative (loc); an enclitic indicating ‘until’; the incipient (incip), representing the immediate future; and the nominal past (n.pst), which restricts the temporal interpretation of a noun to the past, similar to English ‘former’ (Tonhauser Reference Tonhauser2007). The unproductive morphemes include the collective (coll), indicating plurality of countable items; the collective plural (coll.pl), indicating a place of abundance; various passive nominalisers; and the enclitic meaning ‘towards’.
Table 4 Allomorphs of target suffixes in Paraguayan Guaraní.

While voiceless stops are transparent to regressive nasal harmony, progressive nasalisation exclusively targets morphemes with initial voiceless stops.Footnote 6 Some suffixes and enclitics undergo total nasalisation – for example, loc [pe] ~ [mẽ] – while the initial voiceless stop of others pre-nasalises – for example, total [pa] ~ [mba]. This distinction is attributable to historical origin (Russell Reference Russell, Marianne, Reisinger and Underhill2023): suffixes which pre-nasalise have earlier origins as roots,Footnote 7 and the pre-nasalisation of voiceless-stop-initial roots was a productive process in Proto-Tupí–Guaraní (Estigarribia Reference Estigarribia2021).
While regressive nasalisation may be triggered by either a phonemic nasal vowel or a consonant, the same is not true for progressive nasalisation. If a nasal consonant is followed by an oral vowel, and therefore surfaces as post-oralised, as in the root /h-enu/ ‘listen’ in (13), nasality cannot spread rightwards, as shown in (13b). Therefore, only stressed nasal vowels should be able to trigger progressive nasalisation in PG.
Progressive nasalisation involves the rightward spread of nasality. The trigger is a phonemic nasal vowel, and the target is a suffix-initial syllable beginning with a voiceless stop belonging to one of the five productive suffixes listed in Table 4. The surface effect of progressive nasalisation differs based on the suffix: some suffixes show the effects of full nasalisation, while others pre-nasalise.
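Viewed as allomorph selection, the pattern just summarised amounts to a lookup keyed on the morpheme and on the phonological context. The sketch below is illustrative only: it includes just the two alternations cited in the text (loc [pe] ~ [mẽ] and total [pa] ~ [mba]), the dictionary and function names are hypothetical, and selection is treated as categorical even though actual application is variable.

```python
# Oral and nasal allomorphs for two of the productive morphemes, as cited
# in the text; the remaining entries of Table 4 are deliberately omitted.
ALLOMORPHS = {
    "loc": ("pe", "m\u1ebd"),   # total nasalisation: [pe] ~ [mẽ]
    "total": ("pa", "mba"),     # pre-nasalisation:   [pa] ~ [mba]
}


def select_allomorph(gloss: str, after_stressed_nasal_vowel: bool) -> str:
    """Pick the oral or nasal allomorph of a listed suffix or enclitic.

    Only a stressed nasal vowel conditions the nasal allomorph; a
    post-oralised nasal consonant followed by an oral vowel does not.
    """
    oral, nasal = ALLOMORPHS[gloss]
    return nasal if after_stressed_nasal_vowel else oral


print(select_allomorph("loc", after_stressed_nasal_vowel=True))
print(select_allomorph("total", after_stressed_nasal_vowel=False))
```

Listing the two surface forms per morpheme, rather than deriving one from the other, reflects the morpheme-specific character of the alternation: the nasal form of one suffix is not predictable from that of another.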
3.3. Interim summary
I have described two processes of nasalisation in PG. In Table 5, I compare several properties of these processes. Regressive harmony targets all preceding segments except for voiceless obstruents, which are transparent. Progressive nasalisation, on the other hand, exclusively targets syllables with initial voiceless stops. Additionally, regressive harmony appears to be fully productive, while progressive nasalisation is limited to a very small number of particular suffixes and enclitics, with morpheme-specific effects.
Table 5 Comparing directions of nasalisation in Paraguayan Guaraní.

I now turn to the interactions of these two nasalisation processes with roots of Spanish etymological origin. Russell (Reference Russell2022a) finds that the presence of a nasal consonant in a Spanish-origin root can trigger the nasalisation of prefixes. The observation that regressive nasal harmony also applies to some extent to Spanish-origin roots provides support for the claim that it is in fact a productive process in the language. However, the interactions of progressive nasalisation and Spanish-origin roots have gone unreported in prior literature; I seek to fill that gap here. Gaining insight into these interactions is crucial to an understanding of the differences between regressive and progressive nasalisation in PG.
4. Spanish-origin items
Paraguay has been cited as a case of a country in which stable bilingualism is the norm (Romaine Reference Romaine1995; Trudgill Reference Trudgill1995). According to the 2012 census of Paraguay, 77% of Paraguayans speak Guaraní (DGEEC 2012). Of that 77%, 8% do not speak Spanish. Additionally, 73% of Paraguayans speak Spanish, of which 5% do not speak Guaraní. Since the majority of the population is bilingual in PG and Spanish, there is no clear division between the two languages in everyday life. The term Jopará, which means ‘mixture’ in PG, refers to the colloquial variety which involves frequent language mixing between PG and Spanish (Lustig Reference Lustig2010; Estigarribia Reference Estigarribia2015). The intricacies of Jopará have been documented and described in many publications, including Morínigo (Reference Morínigo1959); Melià (Reference Melià1974); Boidin (Reference Boidin, Dietrich and Symeonidis2006); Bakker et al. (Reference Bakker, Gómez Rendón, Hekking, Stolz, Bakker and Palomo2008); Palacios Alcaine (Reference Palacios Alcaine and Alcaine2008); Dietrich (Reference Dietrich2010); Lustig (Reference Lustig2010); Cardona (Reference Cardona2008), among others. The use of morphemes from both Spanish and PG is extremely common at all levels of formality of language use in Paraguay, but the proportion contributed from each language varies considerably (Estigarribia Reference Estigarribia2015), and sociolinguistic work has focused on the comparative uses of PG and Spanish in different spheres (Gynan Reference Gynan1998, Reference Gynan2011; Choi Reference Choi2003, Reference Choi2004, Reference Choi2005; Zajiícová Reference Zajiícová2009). 
Studies of language attitudes in Paraguay have found that the population generally holds positive attitudes towards both languages, albeit for different reasons: people report a sense of pride and identity related to the use of PG, and connect the use of Spanish to economic value (Choi Reference Choi2003; Gynan Reference Gynan2011). Language use in Paraguay is fundamentally multilingual, and speakers recruit their knowledge of both languages in everyday speech. In this study, I focus specifically on the use of lexical items of PG and Spanish origin in PG.
4.1. Phonotactic adaptation
Individual Spanish-origin lexical items display various repairs of violations of PG phonotactics. Pinta & Smith (Reference Pinta, Smith, Estigarribia and Pinta2017) propose five lexical strata in PG, based on phonological repairs of loanwords from Spanish, as reproduced in Table 6. They identify four different properties: the presence of a nasal coda (N Codas), any coda consonant (Codas), non-final stress (Non-final stress) and complex onsets (#CC). The more properties are repaired, the more native-like a stratum is, and conversely, the more properties are tolerated, the more foreign-like a stratum is. Pinta and Smith do not consider behaviour in contexts of nasalisation in their analysis of lexical strata.
Table 6 Lexical strata in Paraguayan Guaraní; reproduced from Pinta & Smith (Reference Pinta, Smith, Estigarribia and Pinta2017: 306).

I provide an example of a lexical item from each stratum in Table 7; examples are taken from Pinta & Smith’s (Reference Pinta, Smith, Estigarribia and Pinta2017) discussion of their proposed strata. In the nativ(ised) stratum, which includes native Guaraní items as well as fully nativised loans, all syllables are open, stress is final and no complex onsets occur, as visible from the adaptation of culantro ‘coriander’ to [kuɾãˈtũ]. In the mostly nativised stratum, all syllables are open and stress is final, but onset consonant clusters are tolerated, as in the initial syllable of ‘Pluto’. In the partially nativised stratum, all syllables are open, but non-final stress is tolerated, and so are onset consonant clusters, as evidenced by the adaptation of ‘London’ with the onset cluster in the second syllable intact, but the final coda /s/ deleted. In the barely nativised stratum, non-nasal codas are tolerated, as are non-final stress patterns and onset consonant clusters; however, nasal codas are repaired, as visible from the adaptation of the nucleus and coda of ‘department store’ as a nasal vowel. Finally, in the unadapted stratum, non-final stress and onset consonant clusters are tolerated, and codas of all kinds are tolerated, including nasal ones, as in the first syllable of ‘salad’.
Table 7 Examples of adaptations sorted by stratum in Paraguayan Guaraní.

The stratum into which a loan falls is partially a reflection of the time depth of the loan, à la Itô & Mester (Reference Itô, Mester and Tsujimura1999), but is mainly dependent on the loan’s phonotactic similarity to PG in its unadapted form. For instance, adaptation of the Spanish word papá ‘father’ into PG necessitates no overt repair of PG phonotactics, and would thereby be classified as Level 1 (‘nativ(ised)’). I acknowledge the limits of this system of lexical strata, and use it here simply as a proxy measure for quantifying the well-formedness of a Spanish-origin lexical item vis-à-vis PG phonotactics.
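Used as a proxy measure in this way, the stratum of an item can be read off from which of the four properties its surface form retains. The following sketch assumes that the tolerated properties are nested as in the prose description above (nasal codas imply the unadapted stratum, any other coda the barely nativised stratum, and so on), and that the strata are numbered 1 to 5 from most to least native-like; only Level 1 is numbered explicitly in the text, and the function and dictionary names are hypothetical.

```python
# Stratum labels; numbering beyond Level 1 is an assumption of this sketch.
STRATA = {
    1: "nativ(ised)",
    2: "mostly nativised",
    3: "partially nativised",
    4: "barely nativised",
    5: "unadapted",
}


def stratum(nasal_coda: bool, any_coda: bool,
            non_final_stress: bool, complex_onset: bool) -> int:
    """Assign the most native-like stratum compatible with a surface form.

    Each argument states whether the form retains the property in question.
    The checks run from the least to the most readily tolerated property.
    """
    if nasal_coda:
        return 5
    if any_coda:
        return 4
    if non_final_stress:
        return 3
    if complex_onset:
        return 2
    return 1


# A form with open syllables, final stress and no clusters needs no repair.
level = stratum(nasal_coda=False, any_coda=False,
                non_final_stress=False, complex_onset=False)
print(level, STRATA[level])
```

A form such as papá, which requires no overt repair, therefore falls into Level 1 regardless of its etymological origin.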
4.2. Nasality
Paraguayan Spanish has three phonemic nasal consonants (bilabial /m/, alveolar /n/ and palatal /ɲ/), but no phonemic nasal vowels (Cassano Reference Cassano1971). It is not the case that vowels are never pronounced with phonetic nasalisation in Spanish; however, nasalisation is not a target of speech production in Spanish vowels, and nasalisation is not a phonologically active feature of vowels in the language (Solé Reference Solé1992). Spanish borrowings contain surface phonotactic environments that never occur natively in PG, since phonemic nasal consonants in PG always surface as partially oralised allophones before oral vowels. An underlying sequence /ma/ must be realised either as [mba] or as [mã] on the surface in PG due to phonotactic constraints in the language; however, lexical items of Spanish origin are not subject to the same phonotactic constraints. In some older borrowings, this mismatch in phonotactics is in fact repaired, as shown in Table 8. Some sequences of a vowel and nasal coda consonant in Spanish borrowings are reinterpreted in PG as phonemic nasal vowels. In words like ‘heart’, ‘pants’ and ‘soap’, a word-final VN sequence in Spanish is reinterpreted as a stressed nasal vowel in PG, and these words obey nasal harmony within the root. The word for ‘pillow’ demonstrates the outcome when a nasal consonant in a Spanish-origin root does not appear within the same syllable as the stressed vowel: this /m/ spreads its nasality leftwards, and surfaces as post-oralised [mb], exactly as expected given the phonotactic constraints of PG.
Table 8 Adaptations of selected Spanish-origin items with nasal consonants into Paraguayan Guaraní.

However, most Spanish-origin items are pronounced roughly as they are in Spanish: nasal–oral sequences are not repaired, and these items do not exhibit root-internal nasal harmony. The difference in strategy between adaptations to morpheme-internal nasal harmony and apparent non-adaptations is likely related to the diachronic expansion of PG–Spanish bilingualism in Paraguay. Spanish influence on PG took place gradually over a long period of time until quite recently, when urbanisation accelerated the rate at which Guaraní speakers acquire and use Spanish (Zajiícová Reference Zajiícová2009; Fernández Barrera Reference Fernández Barrera2015). Since the majority of users of PG today are bilingual with Spanish, they seem to be more tolerant of violations of PG phonotactics by Spanish-origin items (Pinta Reference Pinta2013; Pinta & Smith Reference Pinta, Smith, Estigarribia and Pinta2017). Earlier borrowings, therefore, underwent nativisation more than recent borrowings, which maintain the phonology of Spanish more faithfully.
4.3. Nasalisation
The presence of a nasal consonant in a Spanish-origin root triggers nasalisation of prefixes (Thun Reference Thun2005; Russell Reference Russell2022a), even when the vowels and consonants in between are oral, as illustrated in (14). Examples below are presented using a four-line method, in which the top line is the assumed underlying representation, with Spanish-origin roots (in Spanish orthography) italicised, and underlying nasal consonants bolded. In the second line – the surface IPA representation – the targets of nasalisation are underlined.
The presence of a nasal consonant may also trigger nasalisation in suffixes, even when the vowels and consonants in between are oral, as in (15).
However, importantly, both consultants in an elicitation setting often expressed uncertainty regarding the acceptability of the combination of a Spanish-origin root and suffix nasalisation, as illustrated in (16).
Spanish-origin roots may participate in nasalisation of prefixes and suffixes, but not in root-internal harmony, with the exception of the few nativised borrowings. The interactions of Spanish-origin items with regressive nasal harmony have previously been analysed as the innovation of a novel system of consonant harmony, in which a nasal consonant in a Spanish-origin root is in correspondence with consonants in prefixes, resulting in nasalisation of those prefixal consonants (Russell Reference Russell2022a). Interactions of Spanish-origin roots and progressive nasalisation are still unclear at this point. Although both consultants produce and accept some forms in which a nasal consonant in a Spanish-origin root triggers progressive nasalisation in an elicitation setting, they also verbally express uncertainty about the acceptability of similar forms.
5. Quantifying variation
In the elicitation context in which data were collected, variation in actual application of nasalisation abounded. In separate elicitation sessions, a consultant provided two different PG forms for the same English sentence: one in which no prefixes nasalised, in (17a), and one in which only the prefix closest to the root nasalised, in (17b). The same consultant also accepted a third form, shown in (17c), in which all prefixes were pronounced as nasalised, in the second session.
The quantitative analysis of variation is not informative in an elicitation context, because the frequency of forms is determined by the question-and-answer format of the elicitation interaction and may not reflect the distribution of forms in spontaneous language use. I, therefore, now turn to corpus data from sociolinguistic interviews to examine variation in nasalisation in PG and its interactions with direction of nasalisation and root etymological origin.
5.1. Methods
The data presented here come from a corpus of sociolinguistic interviews with native PG speakers in Paraguay (Bittar Prieto Reference Bittar Prieto2021). Twenty-six interviews were conducted; each interview lasted roughly an hour. Half of the interview data was collected in the urban Bañado Sur neighbourhood of Asunción in 2015. These conversational interviews were conducted by Israel Pedrozo, a resident of Bañado Sur and native speaker of PG and Paraguayan Spanish. The age of the participants in the urban community ranged from 18 to 68 (mean = 43.9, sd = 18.7). The other half of the data was collected in the rural community of San Juan Nepomuceno, roughly 200 kilometres from Asunción. The interviews in this community were conducted between October 2019 and January 2020 by Antonio Zena, a native of San Juan Nepomuceno. The ages of the participants in the rural area ranged from 18 to 73 (mean = 45.6, sd = 15.7). For each interview, all utterances were transcribed in PG and translated to Spanish by the interviewers themselves as well as by Josefina Bittar, a linguist and heritage speaker of PG.
All potential sites of nasalisation – both instances in which nasalisation of an affix actually occurred and those in which it could have but did not – were added to a data frame. The relevant variable context is the combination of an affix affected by nasalisation and a nasal root: either a PG-origin root containing a phonemic nasal segment, or a Spanish-origin root containing a nasal consonant. Data collection was limited to cases of certain affixes, specifically those affixes in which consonants clearly alternate between oral and nasal, listed in Table 9. Nasalisation of affixes that comprise only a single vowel, or a single vowel and an obstruent, was not included in the data set, due to the limitations of measuring nasality in these contexts without data about nasal airflow. Additionally, tokens were excluded when it was not possible to ascertain the etymological origin of the root (e.g., mamá ‘mother’).
Table 9 Targets of nasalisation included in the data set.

Though the nominal past tense suffix kwe had been found to productively nasalise in the elicitation context, there were no tokens of its nasal form in the corpus, so it was excluded from the data set. The corpus was also checked for the six target suffixes which had not been found to productively nasalise in the elicitation context (cf. Table 4): the collective; the collective plural tɨ; the three passive nominalisers; and the enclitic ‘towards’. The corpus included two tokens of the nasal allomorph of the collective, both following the lexical item mĩtã ‘child’. These tokens were not included in the final data set. The corpus included no tokens of the collective plural or any of the passive nominalisers, and no tokens of the nasal allomorph of ‘towards’.
The total number of tokens included in the data set was 3,641. Each token, defined as an affix that is a potential undergoer of nasalisation, was coded for the dependent variable – viz., whether nasalisation of an affix occurred – as well as a number of linguistic and social factors. Social factors included gender (self-identified as male or female), age and community affiliation (rural or urban). Linguistic factors included direction of nasalisation (regressive or progressive), morpheme identity of the target affix, etymological origin of the root (PG or Spanish), morphological category (adjective, noun or verb) and log-transformed root token frequency within the corpus.Footnote 8 Gender, age and community affiliation (rural or urban) were included in order to investigate whether social factors have any significant impact on application of nasalisation. Older people and rural communities are often associated with a form of PG which is considered more ‘pure’, while young people and urban communities may be associated with bilingualism and language mixing (Rubin Reference Rubin1968; Gómez-Rendón Reference Gómez-Rendón, Matras and Sakel2007). The direction of nasalisation was identified as either regressive (leftward) or progressive (rightward). I have argued that the two types of nasalisation differ in important ways, and therefore hypothesise that they will show significantly different patterns. Several characteristics of the root were identified and included in the analysis: etymological origin (PG or Spanish), morphological category and log-transformed token frequency.
Tokens were also coded for three additional factors which are relevant only for Spanish-origin items: phonotactic well-formedness of the root with respect to PG phonotactics (measured from 1 to 5 as per Pinta & Smith’s classification), distance (in segments) between the trigger and target, and whether or not the nasal consonant trigger appears in the stressed syllable. Phonotactic well-formedness (lexical stratum) is relevant only for Spanish-origin items, as all PG-origin items in the corpus belong to the native stratum. If the phonological similarity of a Spanish-origin root to native PG items is a factor in predicting its rate of nasalising affixes, lexical stratum is expected to have a significant effect. The distance between the trigger and target was measured in terms of number of intervening segments, in order to investigate if distance has a significant impact on application of nasalisation. Since oral consonants and vowels may intervene between a trigger in a Spanish-origin root and a target, this distance ranges in the corpus from zero to ten segments. In every case of a PG-origin item triggering nasalisation, though, the distance is zero. Finally, stress was included as a factor in order to assess if the relationship between stress and nasality in PG holds for Spanish-origin items as well: namely, that nasality is contrastive only on a stressed vowel, as described in §3.1. Stress was coded as binary: Yes, if the nasal consonant trigger in a root appears within the stressed syllable, and No if the nasal consonant trigger is in an unstressed syllable. Because of PG-internal phonotactics, regressive nasal harmony can be triggered by either a nasal consonant or a stressed nasal vowel. If a nasal consonant in a Spanish-origin root behaves like a phonemic nasal consonant in a PG-origin root, it is predicted to be able to trigger regressive nasal harmony. 
However, progressive nasalisation may only be triggered by a stressed nasal vowel, never by a phonemic nasal consonant (cf. (13)). Therefore, every trigger of suffix nasalisation is predicted to be a nasal vowel in a stressed syllable. This factor was included in order to evaluate whether a root in which a nasal consonant trigger appears within a stressed syllable nasalises suffixes at a significantly higher rate than those in which the trigger of nasalisation appears in an unstressed syllable.
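The coding scheme described above can be pictured as one record per token. The following is a minimal sketch only: the class, field and function names are hypothetical, the use of the natural logarithm for the frequency transform is an assumption (the text says only "log-transformed"), and the position-based distance helper assumes that trigger and target are identified by segment index.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """One potential site of affix nasalisation (hypothetical field names)."""
    speaker: str
    root: str
    affix: str
    direction: str   # "regressive" or "progressive"
    origin: str      # "PG" or "Spanish"
    nasalised: bool  # dependent variable: did the affix nasalise?
    # Factors coded only for Spanish-origin roots; None for PG-origin roots.
    stratum: Optional[int] = None            # 1-5 phonotactic well-formedness
    distance: Optional[int] = None           # intervening segments
    trigger_stressed: Optional[bool] = None  # nasal consonant in stressed syllable?


def log_root_frequency(tokens):
    """Map each root to the log of its token count (natural log is an assumption)."""
    counts = Counter(t.root for t in tokens)
    return {root: math.log(n) for root, n in counts.items()}


def intervening_segments(trigger_index: int, target_index: int) -> int:
    """Count the segments strictly between trigger and target positions."""
    if trigger_index == target_index:
        raise ValueError("trigger and target must be distinct segments")
    return abs(trigger_index - target_index) - 1
```

Under this sketch, adjacent trigger and target yield a distance of zero, matching the value reported for every PG-origin trigger.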
5.2. Findings
As indicated in Table 10, the corpus includes more tokens of regressive nasal harmony than progressive nasalisation. Tokens of PG-origin triggers of nasal harmony slightly outnumber Spanish-origin triggers. However, Spanish-origin roots appear in contexts of progressive nasalisation more frequently than PG-origin roots. This is likely attributable to the high frequency of suffixes attaching to proper nouns like place names, which are often of Spanish origin. The rate of nasalisation triggered by PG-origin roots is over 90% for both directions. With Spanish-origin roots, however, we see a drastic difference in nasalisation rate between regressive harmony, where nasalisation occurs 77.6% of the time, and progressive nasalisation, where it occurs with a mere 5.7% of the tokens. It is clear that a more in-depth investigation of the difference in nasalisation rates along the axis of direction, as well as by etymological origin of the root, is necessary.
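Rates of the kind summarised here are simple proportions within each direction-by-origin cell. As an illustrative sketch (the function name and the tuple layout are assumptions, and the example values in the usage note are placeholders rather than corpus data), they could be computed as follows:

```python
from collections import defaultdict


def nasalisation_rates(tokens):
    """Return {(direction, origin): share of tokens realised as nasal}.

    `tokens` is an iterable of (direction, origin, nasalised) tuples.
    """
    totals = defaultdict(int)
    nasal = defaultdict(int)
    for direction, origin, nasalised in tokens:
        totals[(direction, origin)] += 1
        nasal[(direction, origin)] += int(nasalised)
    return {cell: nasal[cell] / totals[cell] for cell in totals}
```

Calling `nasalisation_rates` on a list such as `[("regressive", "PG", True), ("progressive", "Spanish", False)]` returns one proportion per cell that actually occurs in the input.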
Table 10 Distribution of tokens by direction and etymological origin.

The nasalisation rates for specific affixes, shown in Table 11, prove to be particularly enlightening.Footnote 9 A PG-origin root triggers nasalisation of most prefixes over 90% of the time, with three notable exceptions, namely the three Set B morphemes (i.e., the forms used when the transitive object or stative intransitive subject controls agreement). The lower nasalisation rate of these morphemes may reflect their distinct morphological status, as they have been analysed as proclitics (Woolford Reference Woolford, Legendre, Putnam, de Swart and Zaroukian2016; Zubizarreta & Pancheva Reference Zubizarreta and Pancheva2017). The rate of prefix nasalisation triggered by Spanish-origin roots is generally slightly lower, but consistently above 85%. With the three Set B morphemes, however, nasalisation rates are again much lower. When it comes to suffixes and enclitics, we see differences between individual morphemes. The large gap between the relatively high rates of nasalisation following a PG-origin root and the quite low rates of nasalisation following a Spanish-origin root is apparent.
Table 11 Distribution of tokens by affix and root etymological origin.

5.2.1. Regressive nasal harmony
The total number of tokens included in the data set of regressive harmony was 2,747 (1,575 tokens preceding roots of PG origin, plus 1,172 tokens preceding roots of Spanish origin). The variation in regressive nasal harmony was statistically modelled using mixed-effects logistic regression.Footnote 10 The model included seven fixed effects – gender, age, community affiliation, target affix identity, root etymological origin, morphological category and frequency – and a by-speaker random intercept. The three factors specified in §5.1 that are relevant only for Spanish-origin items (lexical stratum of the root, distance between trigger and target and whether or not the nasal consonant trigger appears in the stressed syllable) were not included in this model due to the asymmetry in the data. The model also included an interaction term between the target of harmony and the etymological origin of the root. None of the social factors significantly improved model fit at the threshold of $p<0.05$. Additionally, morphological category had no significant impact on model fit. Though root origin was not significant as a main effect, it is significant in its interactions with target identity. The non-significant predictors were removed, with the exception of those involved in significant interactions, resulting in the final model in Table 12. The marginal $r^2$ of this model is 0.348, representing the variance explained solely by the fixed effects, and the conditional $r^2$ is 0.363, representing the variance explained by the entire model. In the model summary, a positive estimate indicates that the given factor level favours the nasal variant compared to the baseline, while a negative estimate indicates that the level favours the oral variant. Plots are provided in Figures B1 and B2 in Appendix B.
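For orientation, a mixed-effects logistic regression with a by-speaker random intercept can be written schematically as below. The notation is an assumption introduced here, not the article's own: $y_{ij}=1$ if token $i$ from speaker $j$ is nasalised, $\mathbf{x}_{ij}$ codes the fixed effects (including the target-by-origin interaction), and $u_j$ is the speaker intercept, taken to be normally distributed as is standard for such models.

```latex
\[
\operatorname{logit}\Pr(y_{ij}=1)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
\]
```

On this formulation, a positive component of $\boldsymbol{\beta}$ raises the log-odds of the nasal variant relative to the baseline level, and a negative component lowers them.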
Table 12 Summary of model of regressive harmony.

Set B morphemes nasalise at a significantly lower rate than other morphemes. Root token frequency is significant in the model of regressive nasal harmony: the more frequent a root within the corpus, the higher the predicted rate of nasalisation. Etymological origin of the root is significant only in its interaction with two of the three Set B morphemes.
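Estimates in a logistic model are expressed in log-odds, so reading them as nasalisation rates requires the inverse logit. The sketch below shows that conversion for a speaker with a random intercept of zero; the function names are hypothetical, and any numbers passed in would be illustrative values rather than the estimates reported in Table 12.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def predicted_probability(intercept: float, *effects: float) -> float:
    """Probability of the nasal variant given an intercept and summed effects.

    Assumes the speaker's random intercept is zero (an 'average' speaker).
    """
    return inv_logit(intercept + sum(effects))
```

A negative effect, such as one for a morpheme class that disfavours nasalisation, lowers the predicted probability relative to the baseline, while a positive effect raises it.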
Variation in regressive nasal harmony triggered by only PG-origin items was modelled using mixed-effects logistic regression to isolate the factors relevant for those items. The total number of tokens included in this data set was 1,575. This model included six fixed effects – gender, age, community affiliation, target identity, morphological category and root token frequency – as well as a by-speaker random intercept. Again, none of the social factors significantly improved model fit at the threshold of $p<0.05$. Morphological category and root frequency also had no significant impact on model fit. The non-significant predictors were removed, resulting in the final model in Table 13. The marginal $r^2$ of this model is 0.233, and the conditional $r^2$ is 0.240. The only significant predictor was the target identity of the morpheme: specifically, Set B morphemes nasalise at a significantly lower rate than other morphemes.
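The text does not state which estimator of marginal and conditional $r^2$ was used. One commonly used definition for random-intercept logistic models, given here only as an assumed illustration, partitions the variance on the latent (logit) scale into a fixed-effects component $\sigma_f^2$, a random-intercept component $\sigma_u^2$, and the fixed residual variance of the logistic distribution, $\pi^2/3$:

```latex
\[
r^2_{\text{marginal}} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_u^2 + \pi^2/3},
\qquad
r^2_{\text{conditional}} = \frac{\sigma_f^2 + \sigma_u^2}{\sigma_f^2 + \sigma_u^2 + \pi^2/3}
\]
```

Under this definition, a small gap between the two values, as in the models reported here, corresponds to a small by-speaker variance $\sigma_u^2$.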
Table 13 Summary of model of regressive harmony triggered by PG-origin items.

Variation in regressive nasal harmony triggered by only Spanish-origin items was modelled using mixed-effects logistic regression to isolate the factors relevant for those items. The total number of tokens included in this data set was 1,172. This model included eight fixed effects – gender, age, community affiliation, target identity, morphological category, frequency and two additional factors: root lexical stratum and distance – as well as a by-speaker random intercept, and an interaction between the target of harmony and the lexical stratum of the root. Again, none of the social factors significantly improved model fit at the threshold of $p<0.05$. Morphological category also had no significant impact on model fit. The lexical stratum of the root was not significant, either as a main effect or in an interaction with the target morpheme identity. The non-significant predictors were removed, resulting in the final model in Table 14. The marginal $r^2$ of this model is 0.398, and the conditional $r^2$ is 0.422. Plots are provided in Figure B3 in Appendix B.
Table 14 Summary of model of regressive harmony triggered by Spanish-origin items.

In summary, Set B agreement morphemes nasalise at a significantly lower rate than all other morphemes. The token frequency of the trigger of nasal harmony significantly positively correlates with nasalisation rate. Spanish-origin roots do not trigger nasalisation at a significantly lower rate than PG-origin roots; however, root etymological origin is significant in its interaction with certain Set B morphemes. In such cases, Spanish-origin roots trigger nasalisation of Set B morphemes at a lower rate. When looking specifically at Spanish-origin roots, the lexical stratum of the root does not significantly affect the rate of nasalisation. An increase in distance between trigger and target significantly negatively correlates with nasalisation rate.
5.2.2. Progressive nasalisation
In statistical analysis of the progressive nasalisation data, the incip suffix was excluded from the model due to the low number of available tokens ($n=7$). The total number of tokens included in the data set of progressive nasalisation was 880. The variation in progressive nasalisation was statistically modelled using mixed-effects logistic regression, including seven fixed effects – gender, age, community affiliation, target affix identity, root etymological origin, morphological category and frequency – and a by-speaker random intercept. The model also included an interaction term between the target morpheme identity and the etymological origin of the root. Again, none of the social factors significantly improved model fit. Additionally, neither the morphological category of the root nor root token frequency had a significant impact on model fit. The non-significant predictors were removed, resulting in the final model in Table 15. The marginal $r^2$ of this model is 0.779 and the conditional $r^2$ is 0.793. A plot is provided in Figure B4 in Appendix B.
Table 15 Summary of model of progressive nasalisation.

All three morphemes included in this model display distinctly different patterns of nasalisation. Spanish-origin roots are predicted to trigger nasalisation at a lower rate than PG-origin roots across the board; however, the size of this effect is specific to each morpheme.
Variation in progressive nasalisation triggered by only PG-origin items was modelled using mixed-effects logistic regression to isolate the factors relevant for those items. The total number of tokens included in this data set was 389. This model included six fixed effects – gender, age, community affiliation, target identity, morphological category and root token frequency – as well as a by-speaker random intercept and an interaction term between target identity and root frequency. Again, none of the social factors significantly improved model fit at the threshold of $p<0.05$. Morphological category also had no significant impact on model fit. The non-significant predictors were removed, resulting in the final model in Table 16. The marginal $r^2$ of this model is 0.228, and the conditional $r^2$ is 0.300.
Table 16 Summary of model of progressive nasalisation with PG-origin roots.

When only tokens of progressive nasalisation following a Spanish-origin root are included, the data set comprises 497 tokens. This model included nine fixed effects – gender, age, community affiliation, target affix identity, morphological category, frequency, root lexical stratum, stress and distance – and a by-speaker random intercept, as well as an interaction term between lexical stratum and affix morpheme identity. Again, none of the social factors significantly improved model fit, and neither the morphological category of the root nor frequency had a significant impact on model fit. The non-significant predictors were removed, resulting in the final model in Table 17. (Plots are provided in Figures B5 and B6 in Appendix B.) The marginal $r^2$ of this model is 0.376, and the conditional $r^2$ is 0.437. The probability of the nasal variant is predicted to be above zero only for those lexical items which most closely resemble PG-origin items in terms of lexical stratum, distance between target and trigger of nasalisation, and stress. Specifically, the Spanish-origin items which most closely resemble PG-origin items are those which include few, if any, violations of PG phonotactics, bear stress on a syllable containing a nasal consonant, and have little to no intervening material between that syllable and the suffix or enclitic.
Table 17 Summary of model of progressive nasalisation triggered by Spanish-origin items.

Looking at all contexts of progressive nasalisation, all morphemes nasalise at significantly different rates from one another, and Spanish-origin roots trigger nasalisation at a significantly lower rate than PG-origin roots. The interaction between root origin and morpheme identity is significant for all morphemes. When we consider only nasalisation triggered by Spanish-origin items, lexical stratum has a significant impact: the more phonotactically well-formed a root is in PG, the higher the rate of nasalisation. In fact, only those Spanish-origin roots which share their phonological properties with PG-origin roots are predicted to trigger progressive nasalisation at all. Roots in which the trigger of nasalisation appears within a stressed syllable trigger nasalisation at a significantly higher rate than those in which the trigger is in an unstressed syllable, signifying that the language-specific connection between stress and nasality in PG is maintained for Spanish-origin roots as well. An increase in distance between trigger and target significantly negatively correlates with nasalisation rate.
6. Discussion
Not all nasalisation effects in PG can be attributed to a single (morpho-)phonological process. Findings from both data collected through elicitation and statistical modelling of a corpus of sociolinguistic interviews provide strong support for an analysis in which regressive nasal harmony and progressive nasalisation are in fact distinct processes, rather than reflections of bidirectional harmony. Regressive and progressive nasalisation differ in their effects and domains. While regressive nasalisation affects all segments except voiceless obstruents, progressive nasalisation targets only suffix-initial syllables beginning in voiceless stops, and the actual realisation of nasalisation depends on the individual morpheme. Regressive nasalisation is productive, to the point that even a nasal consonant within a Spanish-origin root may trigger nasalisation of prefixes. Nevertheless, this nasalisation pattern is distinct from that triggered by PG-origin roots, because it can operate at a distance. Progressive nasalisation, on the other hand, productively applies only for a very small number of suffixes and enclitics. Even among that limited set, actual rates of nasalisation are specific to each morpheme. Variation is additionally conditioned by different factors for each type of nasalisation. The rate at which regressive nasal harmony applies is predicted to differ significantly between different classes in the morphology, as proclitics and prefixes pattern distinctly from each other. This relationship between harmony and the morphosyntax points to a synchronically active process in which phonological constraints interact differently with proclitics and prefixes. In progressive nasalisation, however, phonologically or morphologically similar morphemes do not pattern together; rather, each morpheme behaves distinctly.
Additionally, PG regressive nasal harmony has been extended to Spanish-origin items: neither etymological origin nor lexical stratum is significant as a predictor of the application rate of nasal harmony. Even though Spanish-origin roots do not participate in root-internal harmony, a nasal consonant within such a root can trigger the nasalisation of prefixes and proclitics, constituting a case of innovated long-distance consonant harmony (Russell Reference Russell2022a). Progressive nasalisation, though, has not been extended to Spanish-origin roots. Root etymological origin – PG vs. Spanish – accounts for the majority of the variation in progressive nasalisation. The rate of progressive nasalisation triggered by a Spanish-origin root is predicted to be above zero only when that Spanish-origin root closely resembles a PG-origin root in terms of phonotactics: lexical stratum, co-occurrence of stress and nasality on the same syllable, and distance between trigger and target of nasalisation are all significant. Interactions with root etymological origin point to the synchronic productivity of regressive nasal harmony, which is productive with all Spanish-origin items, as opposed to progressive nasalisation, which is limited only to the most nativised of Spanish-origin lexical items. The findings contribute to the larger literature concerning the adaptation of loaned material to harmony systems. Cross-linguistically, it has been claimed that harmony applies at a lower rate, if at all, to loanwords (Clements & Sezer Reference Clements, Sezer, van der Hulst and Smith1982; Ringen & Heinämäki Reference Ringen and Heinämäki1999; Kertész Reference Kertész2003; Puthaval Reference Puthaval2013). In this study, I have shown that Spanish-origin roots can trigger the nasalisation of prefixes and proclitics in PG, and that overall they do not do so at a significantly lower rate than PG-origin roots; root origin matters only in its interaction with certain Set B morphemes.
Nasal harmony and progressive nasalisation are dependent on different mechanisms, consistent with proposals made by Lapierre & Michael (Reference Lapierre and Michael2018) and Estigarribia (Reference Estigarribia2021), and counter to the claim that PG has bidirectional harmony (Lunt Reference Lunt and Anderson1973; Goldsmith Reference Goldsmith1976). Regressive nasal harmony in PG can be straightforwardly handled as a synchronically active phenomenon arising through the combination of two different nasalisation processes: agreement of adjacent syllable nuclei and coarticulation within syllables (Thomas Reference Thomas2014; Russell Reference Russell2022a). I assume that variation in nasal harmony arises from reweighting of phonological constraints, though I leave the specifics of this analysis for future work. I propose that progressive nasalisation, on the other hand, is best accounted for as suppletive allomorphy, in which each suffix or enclitic that productively nasalises is associated with both an oral and a nasal allomorph. The actual forms of the allomorphs are not predictable from the synchronic phonology, and are instead morpheme-specific vestiges of diachronic nasalisation processes. Specifically, root-initial voiceless stop pre-nasalisation is attributable to a historical nasalisation process that has ceased to be productive in PG (Estigarribia Reference Estigarribia2021; Russell Reference Russell, Marianne, Reisinger and Underhill2023). Rates of progressive nasalisation therefore represent different rates of selection of each allomorph, which are particular to the specific suffix or enclitic.
Findings about the differences between regressive and progressive nasalisation in PG have implications for our understanding of directionality in harmony. The default direction of harmony has been found to be regressive (Hansson Reference Hansson2001, Reference Hansson2010; Hyman Reference Hyman2002), which is reflected in PG. Directionality has also been argued to follow from morphological structure (Baković Reference Baković2000, Reference Baković2003). In such a proposal, harmony is stem-controlled, operating from the root outwards to affixes. This, therefore, predicts that prefixing languages will exhibit regressive harmony, suffixing languages will exhibit progressive harmony, and languages with both prefixes and suffixes will exhibit bidirectional harmony. The PG nasalisation system I have described here appears to constitute a counterexample to this last prediction, as PG has both prefixes and suffixes, and yet I argue that the observed patterns are not reflections of bidirectional harmony.Footnote 11 I propose that a more appropriate formulation of the proposal would involve invoking prosodic structure as well as, or in place of, morphological structure. The domain of nasal harmony in PG is the prosodic word, which includes prefixes but excludes suffixes. If directionality follows from prosodic structure, harmony is predicted to apply at the prosodic word level; such an analysis would not predict progressive nasal harmony in PG, which is indeed borne out here. Further research is necessary to assess the typological validity of this prosodic analysis of the directionality of harmony.
Though languages with vowel and/or consonant harmony systems are widespread around the world (Rose & Walker Reference Rose, Walker, Goldsmith, Riggle and Yu2011), many have never been examined through quantitative studies of variable harmony application rate, as I have presented here. A close examination of the factors that condition variation in harmony provides invaluable insight into the mechanisms underlying harmony. In the existing literature, several factors have been found to significantly affect the application rate of harmony across typologically diverse languages, including distance between trigger and target in Navajo (Martin Reference Martin2005; Palakurthy Reference Palakurthy2021), distance in terms of morphological template in Tommo So (McPherson & Hayes Reference McPherson and Hayes2016), and root token frequency in Uyghur (Mayer Reference Mayer2005). The present study of PG nasal harmony supports these previous findings, as well as adding morphosyntactic attachment as another relevant factor.
The application rate of sibilant harmony in Navajo (nav, Athabaskan, USA) decreases with distance (measured in syllables) between the trigger and target (Martin Reference Martin2005). The results of this corpus-based study of PG nasalisation parallel this finding, although distance is relevant only in the case of Spanish-origin roots. The importance of distance between the trigger and target of harmony is not particularly surprising, as cases of long-distance harmony are quite rare (Rose & Walker Reference Rose, Walker, Goldsmith, Riggle and Yu2011) – and a greater distance between trigger and target could lessen the effects of functional motivations for harmony, such as speech planning and coarticulation. In common analyses of harmony, such as Agreement by Correspondence (Rose & Walker Reference Rose and Walker2004), which treat harmony processes as a form of featural agreement between segments, correspondence relations exist between similar segments, such that corresponding segments agree on the surface with respect to some feature, such as nasality. Correspondence between segments may be limited at a specified distance (Bennett Reference Bennett2015; Shih & Inkelas Reference Shih and Inkelas2019). In order to model a pattern like that of nasal harmony with Spanish-origin items in PG, distance may thereby be represented, potentially gradiently, as a factor in the application of harmony.
Vowel harmony in suffixes in Tommo So (dto, Dogon, Mali) applies with diminishing frequency in outer layers of the morphology (McPherson & Hayes Reference McPherson and Hayes2016). Unlike the findings described for Navajo (and here for PG), the Tommo So pattern is not necessarily connected to distance in number of segments or syllables, but rather distance in terms of layers of the morphology. In PG, though, differential application of nasalisation to individual morphemes appears to be most closely connected to the distinction between cliticisation and affixation, rather than to the slot in the verbal template (cf. (2)). The negative prefix, for instance, is further removed from the root than agreement in terms of the verbal template, but nasalises at a higher rate than the Set B agreement morphemes. Additionally, the two sets of agreement morphemes compete for a single slot (Velázquez-Castillo Reference Velázquez-Castillo1991), and yet display significantly different rates of nasalisation. However, if we disregard the exceptional Set B morphemes, PG nasal harmony interactions with prefixes do appear to follow the Tommo So pattern, although nasalisation rates are quite similar across the board. The negation prefix, which is the outermost layer of the morphology examined in this study, has the lowest nasalisation rate, at 91.7%. Additionally, the epenthetic /j/, which is the innermost layer of morphology examined in this study, has the highest nasalisation rate, at 98%.
Backness harmony in Uyghur (uig, Turkic, China) is correlated with the token frequency of the root (Mayer Reference Mayer2005). In Uyghur, a vowel raising process converts harmonic vowels into transparent vowels, rendering the harmony pattern opaque. The rate of opaque harmony for a root is predicted by its token frequency: the more frequent a root is overall, the more likely it is to display opaque harmony. In this study, I have found frequency to be relevant for PG nasal harmony as well: the more frequent the root, the more likely it is to trigger nasal harmony. The interactions between frequency and harmony application rate present an opportunity for interesting future connections to the larger literature regarding frequency effects on phonological processes (e.g. Bybee Reference Bybee2002; Anttila Reference Anttila2006; Coetzee Reference Coetzee2008; Coetzee & Kawahara Reference Coetzee and Kawahara2012).
The PG corpus data shed light on another factor conditioning variation which has not as yet been discussed in the literature: different types of morphosyntactic attachment reflect distinct harmony patterns. I have shown that Set B morphemes nasalise at a significantly lower rate than all other prefixes. This difference is not attributable to any different phonological property of Set B morphemes, and instead may reflect a morphological difference between prefixation and cliticisation. Questions still remain, however, regarding the behaviour of specific morphemes as well as the interactions between root origin and Set B morphemes. The 2sg.b proclitic, for instance, does nasalise at a lower rate than prefixes, but to a lesser extent: it is possible that frequency effects could be at play here, as tokens of the 2sg.b proclitic are quite frequent in the corpus, particularly compared to other Set B proclitics. Further investigation of the interactions of PG morphosyntax and phonology is necessary; regardless, these data make an important contribution to the literature in demonstrating that morphosyntax could be a significant factor conditioning variation of harmony application.
Finally, these findings run counter to several persistent language ideologies. None of the social factors included in this study – gender, age and community affiliation (rural or urban) – were found to significantly improve model fit for any of the models presented here. Within Paraguay, ideologies of linguistic purism abound, in which rural communities are associated with ‘true’ PG, and urban communities with bilingualism and language mixing with Spanish (Rubin Reference Rubin1968; Gómez-Rendón Reference Gómez-Rendón, Matras and Sakel2007). Additionally, rural communities in Paraguay are reported to have a higher degree of language loyalty to Guaraní than urban communities (Solé Reference Solé1991; Gynan Reference Gynan1998). Ideologies about the relative purity of rural vs. urban PG go hand in hand with attitudes related to age. As urban areas are associated with youth in Paraguay, there is an expectation that younger generations of people in Paraguay are more likely to be bilingual in PG and Spanish, and more likely to be exposed to bilingual speech (Bittar Prieto Reference Bittar Prieto2021). Given that nasal harmony is a characteristic property of PG, and is absent in Spanish, one might hypothesise that rates of nasalisation would be higher for speakers in rural communities (and for older speakers) than for young speakers and those in urban communities. However, again, no social factors were found to significantly contribute to predicted nasalisation rate at all.Footnote 12 These findings are informative in that they potentially counter widespread ideologies, deconstructing the artificial construction of an ‘idyllic rural space’ and veneration of the speech of elders as more pure (Gordon Reference Gordon2019). It remains to be seen whether this lack of differentiation according to age and setting holds for other phonetic, phonological and morphosyntactic factors in PG.
7. Conclusion
Investigating a corpus of sociolinguistic interviews sheds light on the complex nature of nasalisation in PG. The two types of nasalisation in PG – regressive and progressive – are in fact distinct, and actual rates of progressive and regressive nasalisation differ significantly from each other. I argue that regressive nasal harmony is synchronically active, while morpheme-specific progressive nasalisation is phonologically conditioned suppletive allomorphy. Several different factors contribute to the observed nasalisation rate, including the direction of nasalisation, target morpheme identity, root etymological origin, root token frequency, relationship to stress and distance between the trigger and the target. The etymological origin of the root (Spanish vs. PG) significantly affects the rate of progressive, but not regressive, nasalisation, signalling that harmony has been extended to Spanish-origin items. Various phonological factors, like stress and distance, are relevant in determining the rate of nasalisation triggered by a Spanish-origin root. Regressive nasal harmony additionally interacts with the morphosyntax: Set B agreement morphemes, which have been analysed as clitics, nasalise at a significantly lower rate than all other prefixes. These findings contribute to a deeper understanding of nasalisation within PG and of the factors that condition variable harmony application rate across languages.
A. Abbreviations

B. Figures

Figure B1 Predicted probabilities of the nasal variant for significant main effects in regressive nasal harmony (see Table 12).

Figure B2 Predicted probability of nasal variant for the interactions of root etymological origin and target morpheme identity in regressive nasal harmony (see Table 12).

Figure B3 Predicted probability of nasal variant for the significant main effects in regressive nasalisation triggered by Spanish-origin roots (see Table 14).

Figure B4 Predicted probability of nasal variant for the interactions of root etymological origin and target morpheme identity in progressive nasalisation (see Table 15).

Figure B5 Predicted probability of nasal variant for the interactions of root lexical stratum and target morpheme identity in progressive nasalisation triggered by Spanish-origin roots (see Table 17).

Figure B6 Predicted probability of nasal variant for the significant effects in progressive nasalisation triggered by Spanish-origin roots (see Table 17).
Acknowledgements
I am immensely grateful to Maria Gómez and Irma Ovelar Easty, who have been so gracious in sharing so much of their time and knowledge with me – aguyjevete! A huge thank you to Josefina Bittar for her generosity in sharing this incredible resource of Guaraní data with me. Thanks to Lev Michael, Isaac Bleaman, Hannah Sande and Maksymilian Dąbkowski, as well as audiences at the LSA and three anonymous reviewers for comments and suggestions on various aspects of this work. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1752814. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Competing interests
The author declares none.