An acoustic study of Tetsǫ́t’ıné stress: Iambic stress in a quantity-sensitive tone language

Abstract This paper presents both distributional and acoustic phonetic evidence for iambic stress in Tetsǫ́t'ıné (ISO: CHP), a Dene (Athapaskan) language with contrastive vowel length and four contrastive tones. In our acoustic study, we find that the primary correlate of stress in Tetsǫ́t'ıné is duration, whereas intensity plays a secondary but statistically significant role. There was no statistically significant effect on F0 in our results. We discuss our results in relation to several proposals regarding the typology of stress systems. Based on the Functional Load Hypothesis (Berinstein 1979) and Dispersion Theory (Flemming 1995, 2001), we find that our results are to some extent unexpected. We suggest that our results are most consistent with the Iambic–Trochaic Law (Hayes 1995), which predicts that iambic stress systems prefer to use duration as their primary stress correlate.

In his now classic cross-linguistic study of the typology of stress systems, Hayes suggests that stress may be unique among phonological categories, in that its phonetic realisation depends almost entirely upon properties that are also used (even within the same language) for other purposes: (1) 'The multiple phonetic cues for stress and the subordinate role of loudness are particularly interesting when one considers that languages use duration and pitch in their phonological systems for entirely different purposes. Duration is the phonetic cue for vowel length, which is phonemic in many languages…. Further, pitch is the phonetic cue for tone in languages with phonemic tone systems and is also the basis of intonation. The basic point is this: aside from the marginal role of loudness, stress is parasitic in the sense that it invokes phonetic resources that serve other phonological ends' (Hayes 1995: 7). Hayes (1995: 7) goes on to suggest that languages with a phonemic vowel length contrast will avoid using duration as a correlate of stress. Tuttle (1998: 18) applies this same line of reasoning to tone, and provides evidence that pitch is not used as a correlate of stress in Lower Tanana Dene (Athabaskan), a language with contrastive tone. Along these same lines, all other things being equal, we might expect that a language with phonemic vowel length might rely more on pitch as a cue to stress, whereas a language with contrastive tone might rely more on duration. This typological generalisation has been termed the Functional Load Hypothesis (Berinstein 1979), as stated in (2).
(2) 'Change in F0, increased duration, and increased intensity, in that order, constitute the unmarked universal hierarchy for perception of stress in languages with no phonetic contrasts in tone or vowel length; in languages with such contrasts the perceptual cue correlated with that contrast (i.e. F0 with tone and duration with length) will be superseded by the other cues in the hierarchy' (Berinstein 1979: 2, cited in Lunden et al. 2017).
However, the Functional Load Hypothesis has proved controversial in the literature. For example, Lunden et al. (2017), based on a typological survey of 140 languages, found that there is no correlation between whether a language has contrastive vowel length and whether it uses vowel duration as a phonetic cue for stress. In this context, our paper poses the following empirical question: What would we expect stress to look like in a language with both contrastive tone and contrastive vowel length? In this paper, we will present an acoustic study of stress in Tetsǫ́t'ıné, a Dene (Athabaskan) language variety with four contrastive tones (high, low, rising, and falling) and a surface vowel length contrast (short vs. long) in both stems and prefixes. We will provide both acoustic phonetic and distributional evidence to argue that this language exhibits a quantity-sensitive, left-to-right iambic stress system, in addition to its tone and vowel length contrasts.
If the Functional Load Hypothesis were taken at face value, we might expect that a language such as Tetsǫ́t'ıné would employ intensity as its main stress correlate. Indeed, a very similar prediction is made by Dispersion Theory (Flemming 1995, 2001), as we will discuss in §7. At the same time, the statements in (1) and (2) both seem to assume a universal dispreference for using intensity as a primary stress correlate. If this dispreference for intensity were strong enough, it might be expected that a language such as we describe could not actually exist. Indeed, it has been claimed that no language can exhibit stress, tone and phonemic vowel length simultaneously. For example, Spahr (2016: 206) states that 'in a language with both stress and autosegmental tone, stress must be represented as a quantity contrast…leaving no room for an independent length contrast.' Since languages exhibiting stress, phonemic vowel length and tone are indeed typologically rare (though see Potisuk et al. 1996 for one example), we believe that an acoustic study of Tetsǫ́t'ıné stress will be an important contribution to our understanding of the phonetic typology of stress systems.
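The prediction that the Functional Load Hypothesis makes for each type of language can be spelled out mechanically. The following is a minimal sketch, not part of the original study: it assumes that 'superseded' in (2) means that a cue already used contrastively is moved below all unused cues, with the original hierarchy order otherwise preserved. The names `UNMARKED_HIERARCHY`, `CUE_USED_BY` and `predicted_cue_ranking` are hypothetical labels introduced for illustration.

```python
# Unmarked cue hierarchy as stated in (2): F0 > duration > intensity.
UNMARKED_HIERARCHY = ["F0", "duration", "intensity"]

# Which phonetic cue each phonemic contrast occupies, following (2).
CUE_USED_BY = {"tone": "F0", "length": "duration"}


def predicted_cue_ranking(contrasts):
    """Return stress cues in the order predicted under our reading of (2).

    `contrasts` is a set drawn from {"tone", "length"}.
    Assumption: cues that already serve a contrast are demoted below all
    free cues; within each group the unmarked order is kept.
    """
    occupied = {CUE_USED_BY[c] for c in contrasts}
    free = [cue for cue in UNMARKED_HIERARCHY if cue not in occupied]
    demoted = [cue for cue in UNMARKED_HIERARCHY if cue in occupied]
    return free + demoted


if __name__ == "__main__":
    for contrasts in (set(), {"tone"}, {"length"}, {"tone", "length"}):
        print(sorted(contrasts), "->", predicted_cue_ranking(contrasts))
```

Under these assumptions, a language with both tone and length contrasts is the only configuration in which intensity is ranked first, which is the face-value expectation for Tetsǫ́t'ıné described above.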
Our study also adds to a growing body of work on metrical structure in tone languages. The earliest such study that we are aware of is that of Rice (1990) on metrical structure in Hare. Rice uses trochaic footing to explain several phonological processes in Hare, including movement of high tones and vowel syncope. In her analysis of the Neo-Štokavian dialect of Serbian or Croatian, Zec (1999) presents evidence for a bidirectional interaction between tones and metrical structure whereby the foot inventory is constrained by tones, and the distribution of tones is constrained by feet. Finally, DeLacy (2002) appeals to the foot to account for the preferential attraction of stress by the falling HL tone over the simple H tone in Ayutla Mixtec. However, we believe that our study stands apart from these previous studies for several reasons: we provide a combination of distributional and acoustic phonetic evidence for iambic stress in Tetsǫ́t'ıné; the distributional evidence includes several different processes (tone movement, vowel length adjustment and consonant deletion); and we provide evidence for iterative stress (more than one stress per word). Therefore, it is our opinion that when one considers the evidence as a whole, Tetsǫ́t'ıné may provide the most compelling case for metrical structure in a tone language among all the examples just cited.
The remainder of this paper is organised as follows. In §2 we provide background on previous work on stress in Dene languages. In §3 we provide background information on Tetsǫ́t'ıné, including distributional evidence for contrastive vowel length, tone and stress. Sections 4-6 describe our acoustic study. In §4 we present our hypothesis, where we list four possible tone patterns for two-syllable words and eight for three-syllable words, and the hypothesised locations of stresses in each of these tone patterns. In §5 we describe our experimental methods, and in §6 we present our results. In §7, we discuss our results in terms of several theoretical proposals relating to stress: the Functional Load Hypothesis, which is closely related to Dispersion Theory (Flemming 1995, 2001), and the Iambic-Trochaic Law (Kager 1993; Hayes 1995). Broadly speaking, our results suggest that duration is the primary cue for stress in this language. This is predicted by the Iambic-Trochaic Law for iambic languages, but not by Dispersion Theory or the Functional Load Hypothesis, which predict, all other things being equal, that intensity ought to be the primary stress cue in a language with both contrastive vowel length and tone. Therefore, we interpret our results as providing phonetic evidence in support of the Iambic-Trochaic Law. §8 provides a summary and conclusion.

Previous work on stress in Dene languages
In early work on Dene languages, at least for those Dene languages that are tonal, it was often assumed that stress played little or no role in the phonological system (Rice and Hargus 2005b: 34). For example, Hoijer (1946, cited in Rice and Hargus 2005b: 34) describes Chiricahua Apache as 'a succession of evenly stressed syllables', and Everett (1998) provides some useful historical background which may explain this state of affairs. Whereas in early work (e.g. Bloomfield 1933) it was assumed that stress was realised primarily by increased amplitude, by the late 1950s stress came to be directly associated with fundamental frequency. This gave rise to the assumption held by many linguists then as now that stress and tone are somehow mutually exclusive: a 'tone language' cannot also be a 'stress language'.
More recently, several linguists have found evidence for stress in Dene languages, including metrical stress, from a variety of perspectives. Rice (1990) presents morphophonemic evidence for trochaic footing in Hare and Tuttle (1998) presents phonetic evidence for trochaic footing in Tanana. Hargus (2005a) provides evidence that high tone attracts stress in Sekani, whereas Hargus (2005b) finds near-minimal pairs for stress in both Sekani and Deg Xinag. One difference between Hargus's (2005b) study and ours is that Hargus specifically excluded verbs from the data set, whereas our study focuses entirely on verbs. This is because we believe that stress plays a significant role in shaping verbal morphophonemics (see §3.4). Finally, Leer (2005) argues that stress has played a significant role in the history of Dene languages, specifically in causing a shortening of suffixes with full (long) vowels after the stem.
Thus, whereas several previous authors have provided evidence for stress in Dene languages, our study is novel in that we present converging evidence from morphophonemics (§3.4) and acoustic phonetics (§6) for iambic stress in Tetsǫ́t'ıné. In addition, with the exception of Tuttle (1998), our study is unique in that we situate our results within the phonetic typology of stress systems (Gordon and Roettger 2017; Lunden et al. 2017): given the exceptionally crowded prosodic space of Tetsǫ́t'ıné (with a vowel length contrast and four contrastive tones), how is it possible to phonetically realise stress without neutralising the other prosodic contrasts in the language?

Segmental inventory
Tetsǫ́t'ıné is a dialect of Dëne Sųłıné (Ethnologue: CHP) spoken in Canada's Northwest Territories. It is a member of the Dene (Athabaskan) language family. In this section we will present the segmental inventory of Tetsǫ́t'ıné, as well as the transcription system which will be used in this paper. The consonant inventory in (3) is similar to the Dëne Sųłıné consonant inventory proposed by Li (1946: 398), the main difference being that Tetsǫ́t'ıné does not seem to possess a labial-velar series (cf. Cook 2004: 22-23), whereas Tetsǫ́t'ıné does seem to distinguish alveolar /n/ from palatal /ɲ/ underlyingly (Jaker and Kiparsky 2020). Our choice of symbols to represent the consonant inventory in (3) also differs from the standard orthography of the language, as well as from convention among Deneists. In the Dene linguistics literature, the 'plain' stops and affricates, which are phonetically weakly voiced or voiceless, are customarily transcribed as voiced ⟨d, dz, dl⟩, whereas the aspirate series is transcribed as voiceless ⟨t, ts, tɬ⟩ (e.g. Li 1946; Ackroyd 1982; Rice 1989; Cook 2004). In this paper, following IPA convention, we will transcribe the plain series as voiceless [t, ts, tɬ], and the aspirate series as aspirated [tʰ, tsʰ, tɬʰ], following, for example, Lovick (2020).
Dene languages are polysynthetic, prefixing languages, with a stem preceded by a number of prefix positions (Rice 1989). In Tetsǫ́t'ıné, a contrast between long and short vowels is found in both stems and prefixes. In stems, this contrast is largely a reflex of the historical (Proto-Dene) contrast between full and reduced vowels (cf. Krauss 1964, 1983). The contrast between long and short vowels in stems is fundamentally a length contrast, and is realised phonetically as a combination of duration and vowel quality differences (Jaker 2019), not unlike the contrast between 'tense' and 'lax' vowels in English. This is reflected in our transcription of vowel length in stems (e.g. /ʌ/ vs. /aː/, as shown in (4)). In prefixes, long vowels are of more recent historical origin, resulting from the deletion of intervocalic consonants (Cook 2004: 41), which may still be a synchronically active process (Jaker to appear). That is, long vowels in prefixes originate historically as sequences of two adjacent vowels which were previously separated by a consonant. Although we lack phonetic data on the correlation between vowel length and vowel quality in prefixes in Tetsǫ́t'ıné, Jaker's subjective impression is that it is phonetically more of a pure length difference.
(4) Tetsǫ́t'ıné vowel inventory
In previous work on Tetsǫ́t'ıné and related Dëne Sųłıné dialects, nearly all studies agree, for the dialects they examine, that Dëne Sųłıné exhibits some sort of vowel length contrast on the surface (Li 1933, 1946; Haas 1968; Krauss 1983; Cook 1983, 2004). Sources differ, however, as to whether this vowel length contrast is the same in stems as in prefixes, and as to whether the contrast is underlying or derived in the synchronic phonology. Cook (1983: 424), for example, explicitly denies the existence of an underlying vowel length contrast in Dëne Sųłıné.
In this paper, we will assume that long vowels in stems (i.e. full vowels) are underlying, whereas long vowels in prefixes are synchronically derived (Jaker to appear). However, nothing in our acoustic study (§4-6) or the interpretation of its results depends upon this assumption.

Stems
Stems in Tetsǫ́t'ıné exhibit an inventory of five long (i.e. full) vowels /iː/, /eː/, /aː/, /oː/, /uː/, and three short (i.e. reduced) vowels, /ə/, /ʌ/, /ʊ/ (in IPA broad transcription; see Jaker 2019 for a discussion of the precise phonetic quality of these vowels). The relationship between long and short vowels in stems is primarily a historical one, which has been mediated by a vowel reduction process that applied in Pre-Proto-Dene (Krauss 1964). A relationship between long and short stem vowels synchronically exists only in a few fossilised stem alternations (cf. Li 1933, 1946). The vowel system in (4) could be thought of as a standard five-vowel system for the long vowels, plus a three-vowel system for the short vowels. In terms of their phonological features (Jaker 2020), it is expected that /iː/ and /eː/ will reduce to /ə/, /aː/ will reduce to /ʌ/, and /uː/ and /oː/ will reduce to /ʊ/, wherever stem alternations involving vowel length are still found in the language.
For ease of exposition, we will compare only pairs of vowels which differ from each other solely in the property of being long or short, without additional vowel height differences. Thus, in the examples below we will compare /ʌ/ vs. /aː/, /ə/ vs. /eː/, and /ʊ/ vs. /uː/. Since it is often difficult to find true minimal pairs in Dene languages, due to the large phoneme inventory of these languages, some near-minimal pairs are also included. The vowels being compared are highlighted in bold.
iii. seʦʰuːné 'my grandmother'
iv. hutʰʊ́n 'hold' (3sg.)
iv. ʔerílt'uːs 'glue' (3sg.)
As mentioned above, many of the examples in (5)-(7) are not perfect minimal pairs, often involving additional differences in consonant place of articulation and tone. However, we are not aware of any evidence that consonant place of articulation, such as the difference between /ʦʰ/ and /tɬʰ/, could condition a vowel length difference, as in (6a.ii) and (6b.ii). The role of tone, however, does require some comment. In prefixes, as we will see in §3.4, there is some interaction between tone and vowel length, which is mediated by the fact that high tone attracts stress. In stems, however, there is no evidence that stem tone has any effect on whether the stem vowel is long or short. For example, in motion verbs, stems generally have a high tone in the imperfective, and a low tone in the perfective. But if the stem contains a long vowel, that vowel will remain long whether the stem tone is high or low (e.g. heetéːl 'they (pl.) leave', heéteːl 'they (pl.) left'). Similarly, a short vowel will remain short, whether its stem tone is high or low (e.g. heeʔʌ́s 'they (two) leave', heéʔʌs 'they (two) left'). Therefore, tone differences in the examples in (5)-(7) do not constitute a confound in relation to the contrast between long and short vowels.

Prefixes
In prefixes, Tetsǫ́t'ıné has an inventory of five vowels, /i/, /e/, /a/, /o/ and /u/, all of which can be made long in certain morphophonemic environments. Briefly, all long vowels in prefixes are the result of intervocalic consonant deletion, although it is not always the case that intervocalic consonant deletion results in a long vowel. (See Jaker (to appear) for a detailed study of consonant deletion and prefix vowel length in Tetsǫ́t'ıné.) Since all surface long vowels are derived via morphophonemic processes, it could be said that prefix vowel length is contrastive on the surface in Tetsǫ́t'ıné, in that it realises morphosyntactic distinctions on the surface, even though it is not contrastive underlyingly. Some examples will be provided below.
There are two situations in which vowel length can be used minimally to signal a grammatical contrast: number marking in 'bare verbs' and aspect marking in ɣe conjugation, d/l-classifier verbs. We will examine each of these in turn below.
In verbs which lack a thematic disjunct prefix or any other thematic material (Jaker and Cardinal (2020: 74) call these 'bare verbs'), the third-person singular imperfective forms contain a short vowel, whereas the corresponding third-person plural forms contain a long vowel. Some examples are given in (8).
In d/l-classifier, ɣe conjugation verbs (Jaker and Cardinal 2020: 88), the perfective forms are distinguished from their corresponding imperfective forms by vowel lengthening. Short vowels with high tone in the imperfective become long vowels with falling tone in the perfective, whereas short vowels with low tone in the imperfective become long vowels with low tone in the perfective. This pattern is best illustrated using the verb ʃétʰiː 'eat', as shown in (9).
(9) Perfectivity marked by vowel lengthening (Jaker & Cardinal 2020: 128-129)
Imperfective: short vowel    Perfective: long vowel
a. ʃéstʰĩː ʃéheːljiː 'they (pl.) ate'
To summarise, in this section we have seen that vowel length is used contrastively in Tetsǫ́t'ıné. In stems, it is used to distinguish different lexical items, as we saw in (5)-(7). In prefixes, vowel length is used to realise morphosyntactic distinctions, as we saw in (8)-(9). This has implications for the phonetic realisation of stress. Apart from those few cases where stress does seem to alter phonological vowel length (see §3.4), to the extent that vowel duration is used as a phonetic correlate of stress, we would expect it to be used in such a way as to not neutralise contrastive length oppositions. That is, all other things being equal, we would expect a lesser degree of vowel lengthening under stress in Tetsǫ́t'ıné as compared with a language which lacks contrastive vowel length.

Distributional evidence for tone contrast
In this section we will provide an overview of the tone contrasts that exist in Tetsǫ́t'ıné. The first general fact to establish is that the inventory of tonal contrasts depends both on position (stems versus prefixes) and vowel length (short versus long). Specifically, in prefixes, short vowels contrast two tones (high and low), whereas long vowels contrast four tones (high, low, rising and falling). In stems, on the other hand, there are only two tones, high or low, regardless of whether the vowel is long (full) or short (reduced). This is illustrated in (10).
(10) Tone contrasts in different positions (Jaker & Cardinal 2020: 116, 122, 124, 146, 173; Cardinal et al. 2021)
Prefixes    Stems
The generalisation that stem vowels contrast only two tones is true only when the nucleus of the stem syllable consists of a single vocalic root node. When the stem contains a diphthong, both level and contour tones are possible, as in θai 'sand' (low tone), θáí 'long ago' (high tone), ʔeɬdðái 'dry fish' (falling tone), teʧənðaí 'sawdust' (rising tone). Contour tones can also arise in stems due to postlexical consonant-vowel metathesis and vowel coalescence in a phrasal context: thus compare ʧí:ze 'whiskeyjack' and ʧî:z ʧʰo: 'hawk' (lit. 'big whiskeyjack'). The broader generalisation which emerges from these examples is that in both stems and prefixes, the number of possible tonal specifications is equal to the number of vocalic root nodes. Thus, in prefixes, all long vowels consist of two root nodes and are therefore able to host contour tones. In stems, long vowels can host only level tones, because they consist of a single vocalic root node, except in the cases of consonant-vowel metathesis or diphthongs, as described above.
Given this context, in the remainder of this section we will focus our presentation on long vowels in prefixes, since this is where the full set of tonal contrasts can be most easily illustrated. Although as part of our experimental design we excluded contour tones from the target words in our stimulus set (§4), it is important to consider the full set of tonal contrasts that exist in the language, since these define the overall 'tone space' and can potentially have an effect on how F0 is used as a stress correlate.
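The generalisation that the tone inventory follows from the number of vocalic root nodes can be illustrated with a small enumeration. This is a sketch under a stated assumption: each root node bears exactly one tonal specification, H or L, and a two-node sequence is named by its melody (LH as rising, HL as falling). The names `MELODY_NAMES` and `possible_tones` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Assumed mapping from H/L melodies to the tone labels used in the text.
MELODY_NAMES = {
    ("H",): "high",
    ("L",): "low",
    ("H", "H"): "high",
    ("L", "L"): "low",
    ("L", "H"): "rising",
    ("H", "L"): "falling",
}


def possible_tones(root_nodes):
    """List the tone categories available to a nucleus with 1 or 2 root nodes.

    Assumption: one H or L specification per vocalic root node.
    """
    if root_nodes not in (1, 2):
        raise ValueError("this sketch only covers one or two root nodes")
    melodies = product("HL", repeat=root_nodes)
    return sorted({MELODY_NAMES[m] for m in melodies})


if __name__ == "__main__":
    print(possible_tones(1))  # short vowels, and long stem vowels
    print(possible_tones(2))  # long prefix vowels, and stem diphthongs
```

On this view, one root node yields only the two level tones, whereas two root nodes yield the four-way contrast described for long prefix vowels and stem diphthongs.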
In the following examples, we will provide sets of morphologically related words, all with long vowels in the penultimate syllable, which contrast high versus falling tone, and low versus rising tone. In the terminative paradigms of motion and handling verbs, we find falling tones in the singular forms, and high tones in the first-person dual and plural forms. This is illustrated in (11).
(11) Terminative motion verbs: falling tone in singular, high tone in first-person dual/plural (Jaker & Cardinal 2020: 147-148, 184-185)
a. Singular: falling tone
In the third-person dual and plural forms of the inceptive paradigms of motion verbs, we find a low tone in the imperfective and a rising tone in the perfective. This is illustrated in (12).
(12) Inceptive motion verbs: low tone in imperfective, rising tone in perfective (Jaker & Cardinal 2020: 171-172; Jaker's field notes 2 November 2020)
a. Imperfective: low tone
i. heːʔʌs 'they (du.) leave (on land)'
ii. heːkʰíː 'they (du.) leave (by boat)'
iii. heːt'aːɣ 'they (du.) leave (by airplane)'
iv. heːtéːl 'they (pl.) leave'
b. Perfective: rising tone
i. hěːʔʌs 'they (du.) left (on land)'
ii. hěːkʰĩː 'they (du.) left (by boat)'
iii. hěːt'aːɣ 'they (du.) left (by airplane)'
iv. hěːteːl 'they (pl.) left'
As part of our experimental design, we excluded any words containing contour tones from the set of target words (see §5). However, in this section we have provided an illustration of the contrastive status of contour tones for the following reason: in a language with contrastive contour tones, it has been proposed that for every tone, the initial pitch target, the final pitch target, the magnitude of the pitch rise or fall, and the slope of the rise or fall are all phonetically specified in the phonetic component of the grammar (Flemming and Cho 2017). To the extent that lexical tonal contrasts are not neutralised, this severely constrains the extent to which F0 can be used as a phonetic correlate of stress. Therefore, we would expect, all other things being equal, that Tetsǫ́t'ıné would exhibit a smaller increase in F0 under stress than languages without contrastive tone, or even languages with only two contrastive tones.

Distributional evidence for stress
In Tetsǫ́t'ıné, in the great majority of cases, stress is non-neutralising. That is, if the position of long vowels or high tones conflicts with a left-to-right iambic stress pattern, it is the position of stress which is adjusted rather than the tone or vowel length (Jaker and Kiparsky 2020). This is true of the set of target words which we will examine in §4-6. However, there is a minority of cases in which conflict between the stress pattern and the tone or vowel length patterns is repaired by altering tone and/or vowel length. These cases are important, because they provide evidence that stress is an integral component of the lexical phonology of the language and not a feature which can be restricted to the postlexical and/or phonetic component of the grammar. Rather, stress plays an active role in the phonological computation.
Tetsǫ́t'ıné is a quantity-sensitive, left-to-right iambic system in which high tone attracts stress. If we put aside the role of tone, the interaction between quantity and iambic feet in Tetsǫ́t'ıné is typologically normal for an iambic system (cf. Hayes 1995). Thus, every heavy syllable constitutes the head of a foot. Light syllables can form a foot together with a heavy syllable to their immediate right, or else two light syllables can also form a foot. This is illustrated in the examples in (13).
Here, H and L refer to heavy and light syllables, respectively, rather than tones. The coda consonant ɬ does not add coda weight in (13c) and (13d). Note also that we assume that word-final light syllables are extrametrical, as in (13e).
(13) Syllable weight and foot parsing in Tetsǫ́t'ıné (Jaker & Cardinal 2020: 40-41, 76)
The stress pattern in (13) holds true for all words with level tone patterns (i.e. all-low tones or all-high tones). Where words contain a mix of low and high tones, high tone attracts stress, sometimes to a different position than we would expect solely based on the weight patterns in (13). For example, the word (ˈʃé.ne)(ˈtʰĩː) 'you eat' has trochaic stress, due to the initial high tone, despite having the same weight pattern as in (13c). The relationship between tone and stress will be discussed in greater detail in §4.
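Setting tone aside, the weight-based parsing procedure described above (heavy syllables head their own foot, a light syllable groups with the syllable to its right, and a word-final light syllable is extrametrical) can be sketched as a simple left-to-right scan. This is an illustrative sketch only, not the authors' formal analysis: the function name `parse_iambs`, the encoding of words as strings of "L" and "H", and the decision to leave a stray light syllable unfooted are all assumptions made here, and the attraction of stress by high tone is deliberately ignored.

```python
def parse_iambs(weights):
    """Parse a string of syllable weights ("L"/"H") into left-to-right iambs.

    Returns (feet, stressed): feet are tuples of syllable indices, and
    stressed lists the index of each foot's head (its rightmost syllable).
    Assumptions of this sketch: tone is ignored; a word-final light
    syllable is always extrametrical; a stray light syllable stays unfooted.
    """
    syllables = list(weights)
    if any(s not in ("L", "H") for s in syllables):
        raise ValueError("weights must contain only 'L' and 'H'")

    # Word-final light syllables are invisible to foot parsing.
    extrametrical = bool(syllables) and syllables[-1] == "L"
    core = syllables[:-1] if extrametrical else syllables

    feet = []
    i = 0
    while i < len(core):
        if core[i] == "H":
            feet.append((i,))        # a heavy syllable heads its own foot
            i += 1
        elif i + 1 < len(core):
            feet.append((i, i + 1))  # (L H) or (L L), head on the right
            i += 2
        else:
            i += 1                   # stray light syllable: left unparsed

    stressed = [foot[-1] for foot in feet]
    return feet, stressed


if __name__ == "__main__":
    for word in ("LH", "LLH", "LHLH", "HLHH", "LLL"):
        print(word, parse_iambs(word))
```

For instance, the weight string "HLHH" is parsed as (heavy)(light-heavy)(heavy), and "LHLH" as two canonical light-heavy iambs.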
In this section, we will examine three types of cases which provide distributional evidence for iambic stress. In §3.4.1, we will examine cases where stress conditions vowel length adjustment (i.e. the transfer of vowel length from its expected location onto a neighbouring vowel). In §3.4.2, we will examine cases of movement of tone, where high tone surfaces one syllable farther left than expected to align with the iambic stress pattern. Finally, in §3.4.3, we will examine cases where consonants delete intervocalically to make the word more easily parsable into iambic feet.

Stress conditions vowel length adjustment
In Tetsǫ́t'ıné, when the consonant ɣ deletes intervocalically, it often leaves behind a long vowel. In the repetitive paradigm of reflexive verbs (meaning 'do to oneself repeatedly'), in the singular forms, this long vowel surfaces in its expected position, the place from which ɣ was deleted, which also happens to be the second syllable from the left edge and, therefore, the strong position of an iambic foot. This is illustrated in (14). In (14) and elsewhere, we assume that word-final codas (which are also stem-final) are moraic, whereas most codas in prefixes are not moraic, with two exceptions to be described in §3.4.3.
(14) Long vowel surfaces in expected position in reflexive repetitive singular forms (Jaker & Cardinal 2020: 85, 218)
By contrast, in (15), based on the segmental phonology of the language, we would expect to find a long vowel in the third syllable of the word, following the prefix he, since this is where the consonant ɣ has been deleted intervocalically. Instead, we find a long vowel in the second syllable, where there is no reason for a long vowel to arise based on segmental processes. However, if Tetsǫ́t'ıné is an iambic language, then these forms have a straightforward explanation. In a left-to-right iambic system, a sequence of alternating light and heavy syllables can be parsed more harmonically than a light-light-heavy-heavy sequence of syllables. Specifically, (light-heavy) iambs are most harmonic according to the constraint GROUPINGHARMONY(IAMB) (Prince 1990, 1991). Therefore, in the examples in (15), it appears that a mora has floated one syllable leftwards from its expected location to allow for a more harmonic iambic parse.
(15) Long vowel surfaces in unexpected position in reflexive repetitive third-person plural forms (Jaker & Cardinal 2020: 85, 218)
Underlying form    Surface form    Expected form
a. /ʔete-he-ɣe-t-ʦ'ǝr/    (ʔe.ˈteː)(he.ˈʦ'ǝ́r)    *(ʔe.ˈte)(ˈheː)(ˈʦ'ǝr)
'they scratched themselves (repeatedly)'
b. /ʔete-he-ɣe-t-kʰáːr/    (ʔe.ˈteː)(he.ˈkʰáːr)    *(ʔe.ˈte)(ˈheː)(ˈkʰáːr)
'they slapped themselves (repeatedly)'
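The harmony comparison in (15) can be made concrete by counting canonical (light-heavy) feet in each candidate parse. The sketch below is a crude proxy for that comparison, not an implementation of Prince's constraint: the function name `count_canonical_iambs` is hypothetical, and the weight labels are read off the parses in (15a) on the assumption stated in the text that word-final codas are moraic, so that the final syllable counts as heavy.

```python
def count_canonical_iambs(feet):
    """Count feet of the shape (light, heavy) in a parse.

    `feet` is a list of tuples of weight labels, e.g. [("L", "H"), ("H",)].
    This is only a rough stand-in for a harmony evaluation.
    """
    return sum(1 for foot in feet if tuple(foot) == ("L", "H"))


# Weight parses read off (15a); labels are our own annotation.
surface_parse = [("L", "H"), ("L", "H")]       # (ʔe.ˈteː)(he.ˈʦ'ǝ́r)
expected_parse = [("L", "L"), ("H",), ("H",)]  # *(ʔe.ˈte)(ˈheː)(ˈʦ'ǝr)

if __name__ == "__main__":
    print("surface :", count_canonical_iambs(surface_parse))
    print("expected:", count_canonical_iambs(expected_parse))
```

On this count the attested surface form consists entirely of canonical iambs, whereas the segmentally expected form contains none, which is consistent with the direction of the mora shift described above.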

Stress conditions movement of tone
It has been previously observed that in some languages high tone attracts stress, an observation which DeLacy (2002, 2007) has formalised as the Tone-to-Stress Principle, or TSP. In Tetsǫ́t'ıné, most cases of potential mismatches between stress and tone are repaired by moving the position of stress, as we will see in §4. However, there is a minority of cases where stress-tone mismatches are repaired by moving a high tone one syllable leftwards from its otherwise expected position. This seems to occur only in reflexive semelfactive verbs (meaning 'to do to oneself once') (Jaker and Kiparsky 2020). Reflexive semelfactive forms contain the conjugation marker hí in the imperfective and optative (Jaker and Cardinal 2020: 113-114). In singular forms, where this conjugation marker falls on the second syllable from the left edge of the prosodic word (and thus in the strong position of an iambic foot), high tone surfaces in its expected position, as shown in (16).
(16) High tone surfaces in expected position in reflexive semelfactive singular forms (Jaker & Cardinal 2020: 100, 110, 217-218)
Underlying form    Surface form    Expected form
a. /ʔete-hí-t-tθ'íː/    (ʔe.ˈtí)(ˈtθ'íː)
On the other hand, in the corresponding plural forms, high tone occurs one syllable to the left of its expected position, as shown in (17). The result of this, on the surface, is that high tone still falls on the second syllable from the left edge of the prosodic word, a stressed position, by left-to-right iambic foot parsing.
(17) High tone surfaces in unexpected position in reflexive semelfactive plural forms (Jaker & Cardinal 2020: 100, 110, 217-218)
Underlying form    Surface form    Expected form
a. /ʔete-he-hí-ɣu-t-ì-t'uːs/    (ʔe.ˈté)(hul.ˈt'uːs)    *(ʔe.ˈte)(ˈhǔːl)(ˈt'uːs)
'they will punch themselves (once)'
If we compare the expected forms in (17) to the actual forms, we notice that the actual surface forms also involve vowel length adjustment. That is, in all of the expected forms, the third syllable contains a long vowel, whereas in the actual surface forms the vowel is shortened. This is also expected in an iambic system, since the third syllable is an odd-numbered syllable, which falls in the weak position of a foot by left-to-right iambic foot parsing. Unlike what we saw in §3.4.1, however, the length is not transferred to the preceding syllable; rather, the extra mora is simply deleted.

Stress conditions intervocalic consonant deletion
Finally, we will consider cases where a consonant is deleted intervocalically for prosodic reasons, specifically, as we shall see, to avoid a stress lapse. The initial consonant n of the ne qualifier prefix is normally preserved following all other prefixes, including the third-person plural subject prefix he. An example is the verb neɬʔĩː 'see' shown in (18); we include both the surface forms and a metrical parse into iambic feet.
(18) Imperfective paradigm of neɬʔĩː 'see', with iambic parse (Jaker & Cardinal 2020: 133)
   Underlying form    Surface form    Iambic parse    Subject
a. /ne-s-ɬ-ʔĩː/       nesʔĩː          (nes.ˈʔĩː)
Recall that in Tetsǫ́t'ıné, all consonants are moraic word-finally. Based on evidence from the morphophonemics of optative paradigms (Jaker and Cardinal 2020: 104-111), it appears that the coda consonants of the first-person-plural prefix hít and the second-person-plural prefix uh also count as moraic, whereas all other coda consonants in prefixes count as non-moraic. For this reason, the coda consonants in (18d) and (18e) count as moraic, and all of the forms in (18) can be parsed into a sequence of well-formed iambic feet. In verbs with additional syllables, however, this is not always possible. Consider the perfective paradigm of the verb háúteneltʰən 'learn', shown in (19).
(19) Perfective paradigm of háúteneltʰən 'learn', with iambic parse (Jaker's field notes, 17 July 2020)
In all of the surface forms in (19), the output is parsed into the same, well-formed sequence of iambic feet of the form (heavy)(light-heavy)(heavy). However, in (19f) this is achieved by deletion of n intervocalically. If n were retained in this form, given that high tone attracts stress in the language, the predicted output would be *(ˈháú)te (he.ˈnéɬ)(ˈtʰən), with a stress lapse between the second and third syllables. This deletion of n in the third-person plural form is obligatory in the imperfective, perfective and optative paradigms of the verbs 'teach' and 'learn'. It also occurs variably in verbs which use the PERSISTIVE derivational string (Jaker and Cardinal 2020: 194-204). An example is the verb níníjúː 'bring dogs to a place'; the imperfective forms are given in (20).
(20) Imperfective paradigm of níníjúː 'bring dogs to a place', with iambic parse (Jaker & Cardinal 2020: 196)
   Underlying form     Surface form    Iambic parse    Subject
a. /ní-ne-hí-s-júː/    níníʃúː
There does not seem to be any likely alternative phonological or morphological explanation for this consonant deletion other than to adhere to a left-to-right iambic prosodic pattern. And if there are cases where consonants are deleted to improve iambic foot parsing, this is one more piece of evidence that iambic stress is phonologically active in the language.

Summary
In this section, we have seen three types of evidence that stress is active in the phonological grammar of Tetsǫ́t'ıné: vowel length adjustments, movement of tone and intervocalic consonant deletion. These types of evidence are significant, because it has been claimed that in a language with contrastive tone, stress can be represented only 'covertly' on the length tier (Spahr 2016: 200). The idea of 'covert' stress implies that stress is somehow read off of other phonological structure but does not operate upon that structure. In this section, however, we have seen that metrical stress actively manipulates phonological material on the tone, moraic and segmental tiers. This would seem to suggest that in Tetsǫ́t'ıné stress exists on its own representational tier (the metrical grid) and plays a fully active role in the phonological computation.
More generally, we have seen evidence that Tetsǫ́t'ıné has a contrastive vowel length opposition (§3.2) and four contrastive tones (§3.3) in addition to an iambic stress system. At this point, we are left with an odd juxtaposition: in this language, stress must be audible enough to play an active role in the phonology and be acquirable by learners, while at the same time not neutralising the vowel length and tone contrasts that also exist in the language. In §4-6 we will describe an acoustic study we conducted to see what combination of acoustic stress correlates Tetsǫ́t'ıné employs to make this rather unusual combination possible.

Hypothesis
The basic hypothesis we sought to investigate was that Tetsǫ́t'ıné exhibits iambic stress realised phonetically as a combination of increased duration, increased amplitude and increased F0 of the stressed vowel when compared with neighbouring unstressed vowels. More technically speaking, we sought to reject the null hypothesis that the position of stress has no effect on the relative duration, amplitude or F0 of adjacent vowels. To formulate this hypothesis precisely, however, it is first necessary to define exactly which syllables were predicted to be stressed under our hypothesis. Specifically, both heavy syllables and high tone attract stress in Tetsǫ́t'ıné; therefore, it is necessary to specify the set of tone patterns and weight patterns used in the stimulus set.
In designing our stimulus set, the first step was to exclude syllable weight from the set of variables under investigation. Since, as discussed previously, some coda consonants in prefixes are moraic, this meant excluding all words with a medial consonant cluster. Thus, all target words were of the form CVCVː(C) for two-syllable words, and CVCVCVː(C) for three-syllable words. Regarding vowel length, we selected words where all of the prefix vowels were short, and the stem vowel, full (long). This resulted in a (light-heavy) weight pattern for all two-syllable words, and a (light-light)(heavy) weight pattern for all three-syllable words. While it would have been more desirable from an experimental design point of view to have all light syllables, this would have been impossible, given the prosodic morphology of the language, since all stems (the final syllable) are obligatorily heavy (Jaker and Cardinal 2020: 73). Since the stem syllable must be heavy regardless, we chose to use all full stem vowels, rather than all reduced stem vowels, since full vowels in stems are much more common. Finally, we allowed for an optional final coda consonant to increase the number of possible lexical items we could use. The end result was that the weight pattern was held constant, with a (light-heavy) pattern for all two-syllable words and a (light-light)(heavy) pattern for all three-syllable words.
The hypothesised locations of stresses were based primarily on distributional evidence, as illustrated in §3.4, wherever such evidence was applicable. In cases where no distributional evidence was applicable (such as the question of whether the final syllable is stressed in (22g)-(22h)), whether a syllable was hypothesised to be stressed or unstressed was based on Jaker's subjective impression of stress; Jaker is not a native speaker of the language. We propose that Tetsǫ́t'ıné stress is iambic by default. Because high tone attracts stress, in accordance with the TSP (DeLacy 2002, 2007), however, a high-low tone pattern will result in a trochee in two-syllable words. This is illustrated in (21), where we provide examples of all four logically possible tone patterns in two-syllable words.
(21) Tone patterns and resulting stress patterns for two-syllable words (Jaker & Cardinal 2020: 76, 171-172, 254)
   Tone pattern    Predicted stress pattern    Example
a. Low-Low         unstressed-stressed         (he.ˈkʰeː) 'they (du.) sit'
b. Low-High        unstressed-stressed         (hu.ˈsáː) 'I will leave'
c. High-Low        stressed-unstressed         (ˈhí.tãː) 'we drink'
d. High-High       unstressed-stressed         (hí.ˈtéːl) 'we (pl.) leave'
As shown in (21), our analysis predicts an unstressed-stressed pattern for the (low-low), (low-high), and (high-high) tone patterns in (21a), (21b) and (21d), and a stressed-unstressed pattern for the (high-low) tone pattern in (21c). For three-syllable words, the stress pattern of the first two syllables will follow the same patterns as in (21). As for the third syllable, it will be unstressed if it has a low tone and the preceding syllable has high tone; otherwise, it will be stressed. This is illustrated in (22).
(22) Tone patterns and resulting stress patterns for three-syllable words (Jaker & Cardinal 2020: 127-128, 141, 175-176, 179, 195)
As shown in (22), under our hypothesis, half of the tone patterns (four of eight) result in a stress clash between the second and third syllables. A stress clash does not occur in (22e) or (22f), where the second syllable is unstressed, or in (22c) or (22g), where the final syllable is unstressed. We suggest that the reason why the final syllable is unstressed in (22c) and (22g) is that a low-toned syllable becomes unstressed following a high-toned syllable (DeLacy 2002), although a formal analysis of this phenomenon is beyond the scope of this paper.
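The prose rules given for (21) and (22) can be restated as a short Python sketch that enumerates every tone pattern and derives the predicted stress pattern. It is offered only as an illustration of the stated generalisations, not as the authors' procedure: the function names `predicted_stress` and `has_clash` are hypothetical, tones are simplified to the labels "H" and "L", and weight is ignored because it is held constant in the stimuli.

```python
from itertools import product


def predicted_stress(tones):
    """Return one boolean per syllable (True = predicted stressed).

    `tones` is a sequence of "H"/"L" labels for a two- or
    three-syllable word (hypothetical encoding, for illustration).
    """
    tones = tuple(tones)
    if len(tones) not in (2, 3):
        raise ValueError("only two- and three-syllable words are covered")
    # First two syllables: iambic by default, trochaic for High-Low.
    stress = [True, False] if tones[:2] == ("H", "L") else [False, True]
    if len(tones) == 3:
        # Third syllable: unstressed only if Low directly after High.
        stress.append(not (tones[1] == "H" and tones[2] == "L"))
    return tuple(stress)


def has_clash(stress):
    """True if two adjacent syllables are both predicted stressed."""
    return any(a and b for a, b in zip(stress, stress[1:]))


for n_syllables in (2, 3):
    for tones in product("LH", repeat=n_syllables):
        stress = predicted_stress(tones)
        marks = "-".join("S" if s else "u" for s in stress)
        note = "clash" if has_clash(stress) else ""
        print("-".join(tones), marks, note)
```

Running the loop lists the four two-syllable and eight three-syllable tone patterns with their predicted stresses, which makes it easy to check which patterns contain adjacent stressed syllables.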
Two other issues require comment at this stage: the role of morphology and the role of primary versus secondary stress. Although our experimental design did not distinguish between primary and secondary stress, Jaker's subjective impression is that where there is more than one stress in a prosodic word, the rightmost stress is the primary stress. Regarding morphology, it should be noted that in all of the conditions in (21) and (22), the final syllable is also the stem syllable, which is, according to (21) and (22), stressed in three-quarters of the conditions. This raises the question of whether, when the final syllable is stressed, this stress can be attributed to its morphological status as the stem syllable rather than to its status as the head of an iambic foot. It is clear that morphology cannot be the sole factor which determines stress, since this would not explain why the final syllable is unstressed in cases of a final high-low tone pattern (as in (21c), (22c) and (22g)). In addition, morphological factors would also not explain why the second syllable of three-syllable words is stressed in all conditions except (22e) and (22f). Nevertheless, we do believe that morphology is relevant to the stress system, in that it is probably not an accident that a language in which stems are word-final has developed an iambic stress pattern, in which, by default, the final syllable is also stressed. Indeed, it has been observed that the more prefixing a language is, the higher is the likelihood of it exhibiting final stress (Gordon 2006: 209-213). The fact that morphological and prosodic dimensions of prominence largely coincide makes the system as a whole more diachronically stable and could be seen as an example of what is referred to as 'harmonic alignment' (Prince and Smolensky 2004).
To summarise, Tetsǫ́t'ıné has contrastive tone and vowel length, as well as a phonologically active stress system. In designing our stimuli, we selected a constant weight pattern for all target words: (light-heavy) for two-syllable words and (light-light)(heavy) for three-syllable words. There is a complex interaction in the language between tone and stress. In some cases, as we saw earlier in §3.4.2, stress conditions movement of tone. In other cases, as we have seen in this section, tone attracts stress onto a different syllable than where it would be expected to appear, based on just left-to-right iambic foot parsing. Specifically, initial (high-low) tone sequences result in trochaic, rather than iambic feet. In sections 5 and 6, we will describe the acoustic study conducted to find acoustic evidence for the stress patterns we proposed in (21) and (22), which, we hypothesise, result from the different tone patterns shown.

Experimental design and stimuli
The target words for the experimental stimuli were all two- and three-syllable verbs, taken from Jaker and Cardinal's (2020) Tetsǫ́t'ıné Verb Grammar. As described previously, when selecting the target words we excluded any words with word-medial consonant clusters, words with contour tones and words with non-final long vowels. The result is that all of the target words have only level tones (high or low) on all of the syllables, and all of the words have a constant weight pattern (light-heavy) for two-syllable words, and (light-light)(heavy) for three-syllable words.
With these factors controlled for, the independent variable was stress, whereas the dependent variables were the set of stress correlates (duration, intensity and F0). However, given that tone attracts stress, the only way to manipulate the position of stress was to manipulate tone, as shown in (21) and (22). Therefore, our goal was to find phonetic evidence for the stress patterns resulting from each possible tone pattern, in two- and three-syllable words. In two-syllable words, there are 2² or four possible tone patterns, as shown in (21), whereas in three-syllable words, there are 2³ or eight possible tone patterns, as shown in (22). We selected a total of 36 two-syllable words and 28 three-syllable words as stimuli, which are listed in the online supplementary material.
The stimuli were shown to subjects in the form of a PowerPoint presentation, the word being written out in the standard orthography as part of a carrier sentence, along with a picture to illustrate the sentence. The investigator first read the sentence aloud, then asked participants to repeat the sentence twice, and then repeat the target word by itself three times, for a total of five repetitions of the target word (the target word was highlighted in bold in the written stimuli). It was necessary for the investigator to read the sentence out loud because not all participants were proficient in the standard orthography. We acknowledge the potential risk of this procedure; namely, that the pronunciation of the investigator (who is not a native speaker) might influence the participants. However, in Jaker's experience, speakers do not hesitate to correct perceived errors on the part of the investigator; for example, if a sentence is either ungrammatical or not culturally appropriate (see below). It is thus reasonable to expect that native speakers would correct any mispronunciations on the part of the investigator to make them conform to a more native-like pronunciation, in particular, a native-like stress pattern.
In designing these stimuli, and especially the carrier sentences, we gave priority to creating sentences that were culturally appropriate and provided a natural context in which the target word could be used. An example is given in Figure 1. The reason for this is that speakers, especially those who are not accustomed to working with linguists, will often reject a word as ungrammatical if it is not provided in such a context (indeed, this happened several times in the current experiment, in spite of our efforts just mentioned). In a number of cases, the speaker chose to change the target word provided to a different word. In these cases, our policy was that if the new word conformed to the conditions outlined above (same weight pattern, no contour tones and no consonant clusters), then the new word was retained. If, on the other hand, the new word provided by the speaker contained a different weight pattern, a contour tone, and/or a word-medial consonant cluster, the word was excluded from our measurement set.
One disadvantage of the approach outlined above is that it does not control for many of the things which carrier sentences are usually used to control for; for example, boundary tones, and the effect of the tone, stress or weight of adjacent words. However, we have found this approach necessary, based on past experience that subjects require all target words to be provided in a natural context.
We interviewed a total of four subjects in Yellowknife and Łútsëlk'é, Northwest Territories, three female speakers and one male speaker. All four speakers were in their 50s or 60s, and also spoke English fluently. Three of the four speakers had basic familiarity with the standard orthography of the language, whereas one did not. However, this speaker was still able to complete the task by repeating the sentence uttered by the investigator.
Finally, we will briefly mention the question of vowel quality. Different vowel qualities have different inherent durations (House 1961; Toivonen et al. 2015) and different inherent intensities. Ideally, one would want to hold vowel quality constant, just as we did with weight, to avoid this potential confound. Unfortunately, given that verbal prefixes constitute a relatively small, closed set in Dene languages, we would not have been able to find target words to illustrate all of the tone patterns in (21) and (22), if we had, for example, restricted ourselves to only words with e (as in henejeː 'they grow' or henétʰeːs 'they are sleeping'). Therefore, it was necessary to include words with different vowel patterns; for example, náθijaː 'I went'. However, the effect of these different vowel qualities was accounted for post hoc as part of our statistical analysis (see §5.4).

Recording equipment
Subjects were recorded using a Marantz PMD 671 Compact Flash Recorder, recording at 24 bits and 44 kHz, using two cardioid condenser microphones placed approximately eight inches from the subjects. Recordings took place at the Yellowknives Dene First Nation Land and Environment Office in Yellowknife, and at the Co-op Bed and Breakfast in Łútsëlk'é. The recording environment was not soundproof, and there was occasional background noise; a small number of tokens needed to be thrown out for this reason.

Segmentation and measurements
Segmentation was done in Praat (Boersma and Weenink 2020), and all segments measured were vowels. When defining the beginning and the end of a vowel, the general principle was that transitions were part of the vowel. Thus, for example, in a sequence such as [aja] or [ana], only the region of relatively stable formant structure was counted as part of the consonants [j] or [n]; the transitions in and out of these consonants were counted as part of the neighbouring vowels. An illustration of segmentation involving sonorant and glide consonants is given in Figure 2, using the form henejeː 'they grow'.
Our decision to include target words with nasal, liquid and glide consonants was for the same reason that we included words with different vowel qualities: the need to achieve broad descriptive coverage. That is, there would not have been enough stimuli illustrating all the different tone patterns to measure, based on Jaker and Cardinal (2020), if we had restricted ourselves to only words with intervocalic obstruents. Thus, while the inclusion of intervocalic sonorants does introduce some potential for segmentation error, due to the inherent indeterminacy in demarcating sonorants, we feel that this disadvantage is balanced by the possibility of creating an exhaustive phonetic description of the stress system; that is, measurements of all possible tone and stress patterns.
Figure 2. Segmentation involving intervocalic n and j
There were also some additional considerations when demarcating stops and affricates. For the transition from a vowel into a stop or affricate, we counted the end of complex formant structure as the end of the vowel. For the transition from a stop or affricate into a vowel, it was also necessary to consider laryngeal features. For plain stops [t, k], the vowel begins immediately following the release burst; and for ejective stops [t', k'], immediately following the glottal release. However, it has been noted that aspirate stops in Dene languages are accompanied by a great deal of frication noise, such that they might almost be transcribed as affricates [tˣ, kˣ] rather than aspirates [tʰ, kʰ] (McDonough and Wood 2008). For this reason, following both aspirate stops and affricates, we counted the period of aspiration and/or frication noise as belonging to the consonant, and the beginning of complex formant structure as the beginning of the vowel. Similarly, with transitions in and out of fricatives, we counted the end of complex formant structure as the end of the vowel, and the beginning of complex formant structure as the beginning of the vowel.
Duration and intensity measures were extracted from each vowel in each syllable. Duration measures were taken in seconds. Intensity was extracted at ten equally spaced points over the duration of each vowel to facilitate dynamic comparisons of intensity for stressed and unstressed vowels, and was measured in decibels. F0 was measured in Hertz. For F0, we chose to use mean values rather than dynamic measurements, because there were a substantial number of unidentified pitch values in the output script.
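As a rough illustration of the dynamic intensity sampling just described, the following Python sketch computes ten equally spaced time points over a vowel interval. The text does not say whether the points include the vowel boundaries, so this sketch assumes the midpoints of ten equal sub-intervals; the function name `sample_times` and the example boundary times are hypothetical and are not taken from the study's extraction script.

```python
def sample_times(start, end, n_points=10):
    """Return n_points equally spaced times (in seconds) inside a vowel.

    Assumption: each point is the midpoint of one of n_points equal
    sub-intervals, so no sample falls exactly on a segment boundary.
    """
    if end <= start:
        raise ValueError("vowel end must be later than vowel start")
    step = (end - start) / n_points
    return [start + step * (i + 0.5) for i in range(n_points)]


# Hypothetical vowel lasting from 0.100 s to 0.200 s.
for t in sample_times(0.100, 0.200):
    print(f"{t:.3f}")
```

An intensity value in decibels would then be read at each of these times, giving a ten-point contour per vowel that can be compared across stressed and unstressed tokens regardless of their absolute duration.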

Statistical methods
Our statistical analysis primarily made use of linear models. The duration of the vowels in each syllable was compared statistically with linear mixed-effects models fitted using the lmer() function (Bates et al. 2015) in R (R Core Team 2021), and degrees of freedom were calculated using Satterthwaite's (1946) method with the lmerTest package (Kuznetsova et al. 2017). We fitted a total of two linear models, one for two-syllable words and one for three-syllable words. Each model had the fixed effects of stress (two levels: unstressed and stressed), tone (two levels: high and low), vowel quality (seven levels: a, ã, e, i, ĩ, u, ui), syllable (two levels: non-final and final), and an interaction between stress and syllable. We included random effects of participant and item (i.e., which word from the list the data was extracted from). Each of the random effects was coded as a random intercept in the model. Post-hoc tests were performed with the emmeans package (Lenth 2020). We generated dynamic intensity plots using generalised additive mixed models (GAMMs). With GAMMs, we can specify random smooth terms to account for differences across participants and words when calculating the group trends. We used the mgcv package (Wood 2011) in R. We included fixed effects of stress (two levels: stressed and unstressed), and we included smoothing terms for interval, and interval by stress. We included factor smooths (i.e. random effects) for interval: stress by participant and interval by word (i.e. which word the intensity was extracted from).
We also performed an F0 analysis for words with flat tone patterns (e.g. low-low or high-high-high) to see if stress had an effect. We compared the mean F0 in each syllable for each of the four word types (two linear models: two syllables, three syllables). Each of the linear models had a main effect of syllable (two or three levels: syllable one, syllable two and syllable three), tone (two levels: low tones and high tones), and vowel quality (six levels: e, i, u, ui, ã, ĩ; five levels: i, u, a, ã, ĩ) and a random effect of participant and of item (i.e. the word that data was extracted from). The random effects were coded as a random intercept in the model. However, in the three-syllable model, the model produced a singular fit, so we removed the random intercept for item. Linear discriminant analysis (LDA) was performed using flipMultivariates (Displayr 2021). For the LDA, we examined the difference between stressed and unstressed syllables to determine the cue weighting. We performed two LDA analyses with the package flipMultivariates: one for two-syllable words and one for three-syllable words. We coded duration, intensity and F0 into the model. We used the maximum intensity value for each word for each participant, and the mean F0 values. Before performing the LDA, we converted all the values into z-scores to control for inter-speaker differences in speaking rate and intensity. Statistical analysis was done with a two-tailed model, multiple comparisons correction, and false-discovery-rate correction applied to the entire table simultaneously. All plots were generated using the ggplot2 package (Wickham 2016).
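The z-score conversion mentioned above can be sketched in Python as follows. The text says only that values were converted to z-scores to control for inter-speaker differences; computing the scores separately within each speaker is an assumption made here for illustration. The function name `zscore_by_speaker`, the dictionary keys and the toy durations are all hypothetical and are not data from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_speaker(rows, key):
    """Return z-scores of rows[i][key], standardised within each speaker.

    Assumption: each speaker contributes at least two distinct values,
    so the sample standard deviation is defined and non-zero.
    """
    values = defaultdict(list)
    for row in rows:
        values[row["speaker"]].append(row[key])
    stats = {spk: (mean(v), stdev(v)) for spk, v in values.items()}
    return [
        (row[key] - stats[row["speaker"]][0]) / stats[row["speaker"]][1]
        for row in rows
    ]


# Made-up vowel durations (seconds) for a slower and a faster talker.
toy_rows = [
    {"speaker": "A", "duration": 0.10},
    {"speaker": "A", "duration": 0.20},
    {"speaker": "B", "duration": 0.20},
    {"speaker": "B", "duration": 0.40},
]
print(zscore_by_speaker(toy_rows, "duration"))
```

In the toy data, speaker B's vowels are twice as long as speaker A's, yet both speakers' short and long tokens receive the same z-scores, which is the sense in which standardisation removes differences in overall speaking rate.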
The linear models for two-syllable words revealed that duration is a correlate of stress: stressed syllables were longer than unstressed syllables in both the first and the second syllable position.
The linear models for three-syllable words revealed that in each case the stressed syllable had a longer duration than the unstressed syllable, suggesting that duration is a correlate of stress in this language.

Intensity results
The GAMM results for the intensity examination of two-syllable words showed the same trend: in both the first (n = 7,899) and the second syllable (n = 7,569), peak intensity was higher for stressed syllables than unstressed syllables. However, there were subtle differences in the trajectory of the intensity contour. In the first syllable, unstressed and stressed syllables shared a similar sharp increase until approximately the midpoint of the vowel. However, there was a sharper drop in the stressed syllables' intensity, resulting in a merging of the intensity values. In the second syllable, there was also a sharp increase in intensity; however, the increase peaked at only approximately one-quarter of the duration of the vowel. There was then a sharp drop in intensity for unstressed and stressed vowels. The overall trajectories and drop in intensity maintained a significant difference between unstressed and stressed syllables. Figure 5 presents the GAMM results for the intensity analysis of two-syllable words. Syllable one is on the left, and syllable two is on the right. Table III presents the approximate significance of smooth terms for syllable one and syllable two along with the R² values.
Figure 5. Dynamic intensity plots for the first (left) and second (right) syllable in two-syllable words. The solid line indicates stressed syllables, and the dashed line indicates unstressed syllables. Grey shading indicates 95% confidence intervals
The intensity examination also revealed significant differences between unstressed and stressed syllables in three-syllable words, as shown in Figure 6. In the first syllable (n = 5,206), intensity began higher and peaked higher for stressed syllables compared to unstressed syllables. However, after the midpoint of the vowel, the intensity for stressed syllables dropped, while the intensity for unstressed remained level. The drop in intensity resulted in an overall lower value for stressed syllables than unstressed syllables at the very end of the syllable. In the second syllable (n = 5,300), the intensity of stressed syllables was higher than that of unstressed syllables. However, they both shared a similar trajectory. There was a gradual increase in intensity until the midpoint of the vowel, when intensity began to drop. The drop was sharper in the stressed syllable but maintained a significant difference from the unstressed syllables over the entire duration. In the third syllable (n = 4,884), the unstressed and stressed syllables began with no significant difference. However, there was a longer increase in intensity for stressed syllables, resulting in a higher-intensity contour. The peak was just after one-quarter of the vowel and began a decline at the midpoint of the vowel. The stressed syllables had a sharper decline, resulting in no significant difference between unstressed and stressed syllables. However, as with the two-syllable words, the final syllable (here, the third syllable) had much sharper and greater declines in overall intensity for both unstressed and stressed syllables. Table IV presents the approximate significance of smooth terms for syllable one, syllable two and syllable three, along with the R² values.
The GAMM analysis revealed differences in overall trajectories and values for intensity in stressed and unstressed syllables in both two- and three-syllable words. Further, those differences also existed for each syllable within each word type. However, the overall tendency was for a higher peak intensity for stressed syllables compared to unstressed syllables.

F0 results
Figure 6. Dynamic intensity plots for the first (left), second (middle) and third (right) syllables in three-syllable words. Solid lines indicate stressed syllables; dashed lines indicate unstressed syllables. Grey shading indicates 95% confidence intervals
The results of the linear model for two-syllable words (n = 795) with high-high and low-low tone revealed a main effect of tone (F(1, 33) = 88.43; p < 0.001) and vowel (F(6, 148) = 3.79; p = 0.002), but the effect of syllable did not reach significance (F(1, 360) = 3.23; p = 0.070). The R² for the model was 0.571. Table V presents a summary of the fixed effects for the linear model for two-syllable words, and Figure 7 presents the violin plot of the mean F0 by syllable for low-low and high-high tone patterns.
The F0 analysis revealed that syllable position did not play a major role in F0. In fact, only the first syllable in the three-syllable context showed an increase in F0 based on position, suggesting that F0 falls as words increase in length. Thus, the data suggests that F0 is not a strong correlate of stress.
Figure 7. Violin plot of the mean F0 for high-high (left) and low-low (right) by syllable position
The LDA for two-syllable words produced correct predictions 74.62% of the time (stressed: 74.34%; unstressed: 74.9%). The LDA revealed that the strongest predictor of stress was duration (R² = 0.19), whereas both intensity (R² = 0.06) and F0 (R² = 0.04) were weaker predictors.
The LDA for three-syllable words produced 73.60% correct predictions (stressed: 76.88%; unstressed: 69.16%). The LDA revealed that the strongest predictor of stress was duration (R² = 0.23), whereas both intensity (R² = 0.03) and F0 (R² = 0.01) were weaker predictors. Table VIII presents the results of the LDA. The model correctly predicted stressed syllables 803 times (101 miscategorisations as unstressed), and unstressed syllables 487 times (181 miscategorisations as stressed).
The data revealed that the strongest correlate of stress was duration (R² = 0.19; 0.23). This was far stronger than intensity or F0, although all three variables do play a significant role as correlates of stress (all p < 0.001). In both cases, intensity was the second-strongest predictor, although only marginally (e.g. intensity: 0.03 vs. F0: 0.01). Therefore, we can determine the following hierarchy for correlates of stress in this language: duration > intensity > F0.
It should be noted, however, that our LDA results for F0 are not directly comparable with the F0 results reported in §6.3. This is because the F0 results in §6.3 included only words with level tone sequences such as high-high or low-low. The LDA results in Table VII, however, include words with all possible tone patterns, including, for example, high-low words with initial stress and low-high words with final stress.
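The cue hierarchy stated above (duration > intensity > F0) follows directly from ordering the reported R² values. The short Python sketch below makes that step explicit; the R² figures are those quoted in the text for the two LDA analyses, while the dictionary layout and the function name `rank_cues` are hypothetical conveniences for illustration.

```python
# R² values as reported in the text for each LDA.
reported_r2 = {
    "two-syllable": {"duration": 0.19, "intensity": 0.06, "F0": 0.04},
    "three-syllable": {"duration": 0.23, "intensity": 0.03, "F0": 0.01},
}


def rank_cues(r2_by_cue):
    """Return cue names ordered from largest to smallest R²."""
    return sorted(r2_by_cue, key=r2_by_cue.get, reverse=True)


for word_type, r2_by_cue in reported_r2.items():
    print(word_type, " > ".join(rank_cues(r2_by_cue)))
```

Both word types yield the same ordering, although the gap between intensity and F0 is small in each case, in line with the caveat that intensity is only marginally the second-strongest predictor.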

Discussion
In the preceding sections, we saw that there was a statistically significant effect of stress on all three stress correlates in Tetsǫ́t'ıné, which is broadly consistent with the evidence presented earlier in §3 that stress plays an active role in morphophonemic alternations in this language. The effect of intensity was always significant and in the direction expected, as seen in Figures 5 and 6, where stressed syllables had greater intensity than unstressed syllables. The effect on F0, on the other hand, was not statistically significant in two-syllable words. In three-syllable words, the effect actually went in the opposite direction from that expected: we observed an overall gradual decline in F0 across all three syllables. Since stress is typically associated with higher F0 cross-linguistically (DeLacy 2002, 2007; Gordon and Roettger 2017), it is likely that other factors may have been responsible for the lowering of F0 in our results, most probably the effect of phrase-final boundary tones. Thus, our results could be interpreted as showing a lack of effect of stress on F0 in this language. We also found a significant effect of stress on duration. Recall that our stimuli were designed such that all two-syllable words exhibited a (light-heavy) weight pattern, whereas three-syllable words exhibited a (light-light)(heavy) weight pattern, due to the fact that we chose stems containing long (full) vowels. However, this does not constitute a confound in our analysis, because the weight pattern was held constant throughout all the stimuli, and our analysis compared the same position (first, second or third syllable) when it was either stressed or unstressed. Thus, we found that for each position within the word, the same vowel (whether phonologically long or short) was longer when stressed than when unstressed.
We now return to the more general question with which we began this paper: How should we expect stress to be realised phonetically in a language which has both contrastive vowel length and four contrastive tones? That is, which of the stress correlates (duration, F0 or intensity) should we expect to be the primary stress correlate in this language? It is, of course, somewhat problematic to try to answer this question directly, because these three correlates are measured in three distinct units of measurement (milliseconds, decibels and Hertz), which are not directly comparable with each other. However, the discriminant analysis presented in §6.4 suggests that whereas all three stress correlates play a role in distinguishing stressed from unstressed syllables, duration is by far the most accurate predictor of stress in this language, with over ten times greater weight than the other two stress correlates. In other words, it appears that in Tetsǫ́t'ıné, duration is the primary correlate of stress.
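The unit problem mentioned above can be illustrated with a minimal sketch. It does not reproduce the discriminant analysis of §6.4; instead it uses a simpler, swapped-in technique, a standardised mean difference (Cohen's d), which expresses each correlate's stressed/unstressed difference in unit-free terms. All measurements below are hypothetical toy values invented for illustration, not data from this study, and the names `cohens_d` and `effect_sizes` are our own.

```python
from statistics import mean, stdev

# Hypothetical toy measurements (NOT data from the study): each row is one
# syllable token as (duration in ms, intensity in dB, F0 in Hz).
stressed = [
    (180.0, 68.0, 151.0),
    (200.0, 70.0, 149.0),
    (190.0, 69.0, 150.0),
    (210.0, 71.0, 152.0),
]
unstressed = [
    (140.0, 67.0, 150.0),
    (150.0, 68.0, 151.0),
    (145.0, 66.0, 149.0),
    (155.0, 69.0, 152.0),
]


def cohens_d(a, b):
    """Standardised mean difference of a over b, using a pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def effect_sizes(stressed_rows, unstressed_rows,
                 names=("duration", "intensity", "f0")):
    """Return a unit-free effect size for each acoustic measure (column)."""
    sizes = {}
    for i, name in enumerate(names):
        s = [row[i] for row in stressed_rows]
        u = [row[i] for row in unstressed_rows]
        sizes[name] = cohens_d(s, u)
    return sizes


sizes = effect_sizes(stressed, unstressed)
# Rank the correlates by the absolute size of their standardised difference.
ranking = sorted(sizes, key=lambda k: abs(sizes[k]), reverse=True)
```

Because each difference is divided by its own pooled standard deviation, the resulting values can be ranked against one another even though the raw measures are in milliseconds, decibels and Hertz. A discriminant analysis goes further by weighting the correlates jointly rather than one at a time.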
We are aware of only two detailed phonetic studies of stress in a language with both contrastive vowel length and tone. The first is Potisuk et al.'s (1996) study of the acoustic correlates of stress in Thai. Their main finding was similar to ours, in that duration was the primary correlate of stress in Thai. However, the authors found that change in the shape of F0 contours was a secondary cue to stress, whereas there was no significant effect of stress on intensity (Potisuk et al. 1996: 210). Just as in our study, the fact that F0 plays a minor role in realising stress was expected (since Thai has five contrastive tones), whereas the primary role of duration was unexpected (since Thai also has contrastive vowel length). Potisuk et al. (1996: 211) speculate that the different behaviours of F0 and duration may be due to the different functional loads of the two phonetic properties: there are far more minimal pairs involving tone in Thai than there are involving vowel length. It is not clear to us whether this line of explanation could be extended to Tetsǫ́t'ıné. That is, it is not obvious to us that in Tetsǫ́t'ıné, tone has a higher functional load than does vowel length, except in the very abstract sense that vowels contrast two degrees of length, but there are four contrastive tones. Additional lexicostatistical work on Tetsǫ́t'ıné would be necessary to properly evaluate this hypothesis. The other such study was by Everett (1998) on Pirahã. Pirahã has both contrastive vowel length and tone. Everett finds that the primary correlates of stress in this language are amplitude and duration, whereas F0 and vowel formant frequencies seem to play little or no role. Thus, both studies found that duration played a role in realising stress, in tone languages with vowel length, though only one study also found a significant effect of amplitude.
How might one interpret these results? It is possible, contrary to the Functional Load Hypothesis in (2), that there is actually a universal preference for realising stress using duration rather than F0. Based on a survey of 110 studies of 75 languages, Gordon and Roettger (2017: 8) found that duration was 'the most successful marker of stress, distinguishing stress in 90% […] of the languages studied'. However, in line with the Functional Load Hypothesis, they also found that F0 was used to cue stress in only two out of nine tone languages examined (Gordon and Roettger 2017: 9), and that six out of seven tone languages used intensity as a marker of stress (Gordon and Roettger 2017: 10). Thus, our Tetsǫ́t'ıné results might be seen as typologically normal for a tone language, except that Gordon and Roettger's sample of tone languages was small, and the authors do not specify how many of the tone languages examined also had contrastive vowel length. A study which specifically addressed the Functional Load Hypothesis in relation to contrastive vowel length (Lunden et al. 2017) found no support for it: that is, languages with contrastive vowel length were just as likely to employ duration as a correlate of stress as languages without contrastive vowel length.
In the remainder of this section, we will situate our results within a broader theoretical context. We will consider the predictions of Dispersion Theory (Flemming 1995, 2001), which is conceptually related to the Functional Load Hypothesis, as well as the Iambic-Trochaic Law (Kager 1993; Hayes 1995), and we will discuss how the predictions of these theories compare with our results.
The predictions of the Functional Load Hypothesis in (2) for Tetsǫ́t'ıné are fairly straightforward. Although the Functional Load Hypothesis posits a universal bias in favour of using F0 to realise stress, in a language in which both tone and vowel length are contrastive, it predicts that stress should be realised mainly by intensity. As we have seen, our results follow from the Functional Load Hypothesis only to the extent that F0 does not seem to be a correlate of stress in Tetsǫ́t'ıné. The fact that duration is the primary correlate of stress, whereas intensity plays a secondary role, is unexpected.
Dispersion Theory (Flemming 1995, 2001) seems to make very similar predictions to the Functional Load Hypothesis and runs into similar challenges in relation to our data. Dispersion Theory proposes a unified model of phonetics and phonology in which phonological inventories and their phonetic realisations, as well as their phonological behaviours, can be explained based on three general principles, as described in (23).
(23) General principles of Dispersion Theory (based on Flemming 2001: 25)
a. Maximise the number of contrasts (in any given context).
b. Maximise the distinctiveness of contrasts.
c. Minimise effort.
The majority of work that we are aware of within Dispersion Theory has focused on explaining the structure of vowel inventories (Schwartz et al. 1997; Padgett and Tabain 2005; Trudgill 2009; Becker-Kristal 2010; Hall 2011). Although the study by Padgett and Tabain (2005) examines the effect of stress on the Russian vowel space, we are not aware of any studies within the framework of Dispersion Theory which directly address the question of how stress itself should be realised, given other facts about the phonological inventory of a language. Nevertheless, we believe that if the principles in (23) are applied to the central question of this paper, a fairly clear prediction emerges. If we assume that vowel length and tonal contrasts are to be maintained in accordance with principle (23a), and that, by the same principle, stress must also be realised, then a conflict arises in the implementation of principle (23b). All other things being equal, to the extent to which stress is realised by increased F0, it makes stress more distinctive but tonal contrasts less distinctive (and conversely); likewise, to the extent to which stress is realised by increased duration, it makes stress more distinctive but vowel length contrasts less distinctive (and conversely). There is only one way to avoid this conflict, and this would be to employ a third stress correlate, intensity, which is not independently contrastive in the language. Thus, it seems clear to us that in a language such as Tetsǫ́t'ıné, in which both vowel length and tone are contrastive, Dispersion Theory predicts that intensity ought to be the primary correlate of stress.
Our results are only partially consistent with these predictions. As shown in Figures 7 and 8, the effect of stress on F0 was either non-significant or went in the opposite direction from that expected, suggesting that stress itself may not be responsible for the effect on F0 in these data. This would be consistent with Dispersion Theory, in that we would not expect a large effect of stress on F0 in a language with four contrastive tones. On the other hand, the LDA in Tables VII and VIII suggests that duration is a more reliable predictor of stress in this language than intensity. From a Dispersion Theory perspective, this is surprising for a language which has contrastive vowel length. The only way in which Dispersion Theory could be reconciled with our results would be to invoke principle (23c), minimise effort. That is, it could be assumed that the realisation of stress as increased intensity inherently involves greater effort (perhaps due to the greater subglottal pressure required) than the realisation of stress as F0 or duration. However, we are not aware of any means, at present, by which articulatory effort can be compared across different stress correlates. There is no way to determine, for example, that an increase in intensity of 3 dB requires greater effort than an increase in duration of 50 ms. Alternatively, a reviewer suggests another possible explanation based on perceptibility: while there are many languages in the world with durational contrasts, and many languages with tonal contrasts, there are no languages with a contrast based purely on intensity. It may thus be that intensity is inherently less perceptible than other phonetic correlates of stress. However, this explanation raises a similar issue of how to compare perceptibility across different stress correlates.
Thus, while our results may still be compatible with Dispersion Theory on a conceptual level, we do not believe it is possible to formally model our results in this framework until such time as a method is devised to quantify articulatory effort and perceptibility across different stress correlates, as suggested above.
Another theoretical proposal we wish to consider in relation to our results is the Iambic-Trochaic Law (Bolton 1894; Kager 1993; Hayes 1995; Mellander 2003). The definition of the Iambic-Trochaic Law given by Hayes (1995) is reproduced in (24).
(24) Iambic-Trochaic Law (Hayes 1995: 80)
a. Elements contrasting in intensity naturally form groupings with initial prominence.
b. Elements contrasting in duration naturally form groupings with final prominence.
As stated by Hayes (1995), the Iambic-Trochaic Law refers to foot parsing: syllables may be grouped together into units of initial prominence (trochaic feet) or final prominence (iambic feet), depending on how the syllable nuclei differ from each other in intensity or in duration. However, the converse of this principle has also been assumed: that the grouping of syllables into groups with initial prominence or final prominence predicts how that prominence will be realised. Thus, Kager's (1993) formulation of the Iambic-Trochaic Law is given in (25).
(25) Iambic-Trochaic Law (Kager 1993: 382)
a. Trochaic systems have durationally even feet.
b. Iambic systems have durationally uneven feet.
The logic in (25) could similarly be applied to intensity: under the Iambic-Trochaic Law, we expect that trochaic systems would have feet with uneven intensity, whereas iambic systems would have feet with even intensity. This generalisation may ultimately be grounded in perceptibility: if an iambic system were to employ intensity as its primary stress correlate, then a sequence of (soft-loud)(soft-loud) syllables would be reinterpreted as soft(loud-soft)(loud), given what appears to be a universal tendency for syllables varying in intensity to be grouped head-initially (Crowhurst 2016, 2018, 2020; Crowhurst and Teodocio Olivares 2014). In other words, listeners would re-analyse the system as trochaic. Viewed in this light, the Iambic-Trochaic Law predicts a correlation between metrical rhythm type and the primary stress correlate used: trochaic systems are predicted to employ intensity as a primary stress correlate, whereas iambic systems are predicted to employ duration.
As we saw in §3.4, there is ample distributional evidence that Tetsǫ́t'ıné is an iambic language. Iambic feet condition vowel length adjustments (§3.4.1), movement of tone (§3.4.2) and deletion of consonants to avoid stress lapse (§3.4.3). As we saw in §6, duration also appears to be the most robust correlate of stress in this language. From the perspective of the Iambic-Trochaic Law, these facts are expected: Tetsǫ́t'ıné employs duration rather than intensity as its main stress correlate because it is an iambic language. Thus, on the whole, our results seem to be more consistent with the Iambic-Trochaic Law than with Dispersion Theory or the Functional Load Hypothesis, to the extent that these models make predictions about the phonetic realisation of stress cross-linguistically.

Conclusion
It has been observed that, cross-linguistically, stress is typically realised by some combination of increased F0 and/or increased duration (Gordon and Roettger 2017). It has also been claimed that languages avoid using phonetic properties which are independently contrastive as correlates of stress and also disprefer intensity as a stress correlate (Berinstein 1979; Hayes 1995). This set of assumptions is potentially problematic for a language such as Tetsǫ́t'ıné, in which vowel length and tone are both contrastive. It has been claimed that in such a language, having a phonologically active stress system is impossible, and that stress can be 'covert' only (Spahr 2016). However, in this paper, we have provided both phonological distributional evidence and acoustic phonetic evidence for iambic stress in Tetsǫ́t'ıné. Regarding the manner in which stress is realised, we have found that duration rather than intensity is the primary correlate of stress in this language, which would follow from our understanding of the Iambic-Trochaic Law (Hayes 1995). This is in spite of the fact that vowel length is independently contrastive in Tetsǫ́t'ıné. However, as Lunden et al. (2017) have observed, increased vowel duration under stress does not necessarily obscure contrastive vowel length distinctions. Phonetic data from Tetsǫ́t'ıné itself would seem to support this view: in the present study, word-final stressed vowels were approximately 13% longer than unstressed vowels (Figures 3 and 4). In contrast, a study of contrastive vowel length by Jaker (2019) found that phonemically full (long) stem vowels were approximately 50% longer than reduced (short) vowels, and that this contrastive length difference was also accompanied by marked differences in vowel quality.
In other words, phonetic lengthening under stress in Tetsǫ́t'ıné is non-neutralising, because phonetic lengthening is of a much lesser magnitude than phonemic length differences, and the latter are also enhanced by other acoustic cues.
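The size of the margin can be made concrete with a rough worked illustration. It rests on assumptions that neither study states: that both percentages can be applied multiplicatively to a common baseline, and that the 13% figure (measured on word-final vowels) and the 50% figure (measured on stem vowels in Jaker 2019) are comparable. The symbols $d$, $d_{\text{short,str}}$ and $d_{\text{long,unstr}}$ are introduced here purely for illustration, with $d$ the duration of an unstressed short vowel.

```latex
\begin{align*}
d_{\text{short,str}}   &\approx 1.13\,d  && \text{(short vowel lengthened under stress)}\\
d_{\text{long,unstr}}  &\approx 1.50\,d  && \text{(long vowel, unstressed)}\\
\frac{d_{\text{long,unstr}}}{d_{\text{short,str}}} &\approx \frac{1.50}{1.13} \approx 1.33
\end{align*}
```

Under these assumptions, even in the least favourable comparison (a stressed short vowel against an unstressed long vowel), the long vowel would remain roughly a third longer, so the length contrast would not be neutralised by stress lengthening alone.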
If the use of duration as the primary stress cue in Tetsǫ́t'ıné is indeed a result of the Iambic-Trochaic Law, as we have suggested, a question for future research is whether what we have observed in Tetsǫ́t'ıné may hold true of iambic languages more generally. The Iambic-Trochaic Law predicts that, in iambic languages, duration will be the primary correlate of stress, whereas intensity should play only a minor role. Indeed, this is our impression. For example, in Menominee, an iambic language, the vowel in the head syllable of a foot is phonetically lengthened (Milligan 2006). In this context, it is noteworthy that in a recent survey of the literature on perceptual studies relating to the Iambic-Trochaic Law (Crowhurst 2020), all of the languages surveyed were either trochaic (e.g. English, German and Spanish) or else had no clear rhythm type (e.g. French and Japanese); we are not aware of any studies of rhythmic grouping biases in speakers of iambic languages. There have been similarly few production studies examining the realisation of stress in iambic languages, even though it is our impression that this is the most widespread rhythm type in the indigenous languages of North America. Therefore, we believe our results underscore the need for additional phonetic studies on stress in iambic languages to verify whether these languages do, as a group, differ fundamentally from trochaic languages at the phonetic level.
Supplementary material. The supplementary material for this article can be found at https://doi.org/10.1017/S0952675722000069