Kazakh

Kazakh (ISO 639-3, kaz) is a Kipchak (Northwestern) Turkic language with approximately ten million speakers (Muhamedowa 2015). While the majority of Kazakh speakers live in the Republic of Kazakhstan, significant Kazakh-speaking populations exist throughout Central Asia. See Figure 1 for a map of the region. Kazakh spoken in Kazakhstan is described as having three or four dialects, but many researchers agree that differences between dialects are small and largely lexical (Kara 2002, Grenoble 2003, Muhamedowa 2015; see Amanzholov 1959 for more on Kazakh dialects).

[Consonant table; surviving rows: Approximant w j; Lateral approximant l5]
[...] whether an alternation is between two phonemes or between allophones of a single phoneme. In addition to the consonants shown above, Kazakh speakers variably incorporate non-native sounds, such as the Russian labiodental fricatives [f] and [v] and the voiceless glottal fricative [h] from Arabic. These are exemplified by words like /fAwn5 A/ 'fauna', /vAgon5 / 'railway car', and /Z É ijhAn5 / 'world'. The glottal fricative shows a great deal of variation both within and between speakers. For example, our consultant produced [h] [...]. This class of non-native sounds is not uncommon among Kazakhs in Kazakhstan, but speakers report that these sounds are rare or even non-existent in the speech of Kazakhs in China or Mongolia. The Kazakh consonantal inventory is exemplified below in onset and coda positions.
Voiceless plosives in Kazakh tend to be aspirated, but aspiration is often reduced in connected speech. Word-initially, voiced plosives are generally prevoiced. Mean voice onset time (VOT, n = 73) by place of articulation is shown in Table 1. Observe that among the voiced plosives, /g/ is prevoiced more noticeably than /d5 / or /b/. As for the voiceless plosives, /p/ is realized with more aspiration than /t5 /, /k/, or /q/.
The velar and uvular plosives, as well as other dorsal obstruents, undergo alternations based on the backness of adjacent vowels. The velar obstruents occur with front vowels and the uvular obstruents occur with back vowels. In loans, however, the dorsal obstruents may occur with both front and back vowels. This is exemplified by the loan /krAn5 / 'faucet'. In addition, both velar and uvular obstruents may occur adjacent to / É ij/, as in /q É ij/ 'manure' and /k É ij/ 'wear.IMP', as well as in loans like /" É ijmAr5 At5 / 'building' and /g É ijt5 Ar5 A/ 'guitar'.
The uvular plosive varies between a true plosive, as in /qUs/ 'bird' and /wAq/ 'time', and a voiceless uvular fricative in connected speech, as in /m´qt5 d5 É i 9e-g É i 9en5 / [m´Xt5 d5 É i 9eg É i 9en5 ] 'strong say-PFV' produced in the passage below. In this way, the contrast between /q/ and /X/ is often neutralized in the spoken language. Perhaps due to influence from Russian, the voiceless uvular fricative is, in some instances, produced as a voiceless velar fricative, [x], as in /XAn5 / [xAn5 ] 'khan'.
Within a root, both voiced and voiceless plosives may occur in syllable onset and coda position, as demonstrated below. Note especially that voiceless plosives may occur intervocalically within morphological roots, as in /bAqA/ 'frog'. The voiceless velar plosive is also found intervocalically in the possessive pronominal suffix, as in /m É i 9en5 -IkI/ '1S-POSS.PRO' and /s5 Iz5 -d5 IkI/ '2S.FORM-POSS.PRO'. In Table 2, also observe that obstruents in coda position agree with a following obstruent in voicing.

[Table columns: Stem | Gloss | stem-plural | stem-accusative | stem-ablative | stem = Q]
Related to suffix onset desonorization, suffix onsets undergo nasal harmony when flanked by two nasals (Balakayev 1962, Davis 1998, Kuhn 2014, Gopal 2015). Compare the accusative-inflected forms to the genitive- and ablative-inflected forms in Tables 4 and 5. The accusative and genitive suffixes have an underlying initial dental nasal, evidenced by [Apt5 A-n5] 'week-ACC' and [Apt5 A-n´N] 'week-GEN', whereas the ablative suffix has an underlying /d5 /, as in [Apt5 A-d5 An5 ] 'week-ABL' (Table 4a). When the suffix-initial nasal of the accusative suffix is concatenated to a stem-final nasal, the suffix nasal undergoes desonorization to /d5 /, as in [n5 An5 -d5´] 'bread-ACC' (Table 4e). However, in the genitive morpheme, which has a suffix-final nasal, the onset does not undergo desonorization. Instead, the onset nasal is retained, as in [n5 An5 -n5N] 'bread-GEN' (Table 5c). In contrast to other forms, the initial /d5 / of the ablative morpheme is nasalized to [n5 ] when following a stem-final nasal, as seen in [n5 An5 -n5 An5 ] 'bread-ABL'.
The preceding nasal must be stem-final for nasal harmony to occur. In /n5 Ar5 / 'dromedary' (Table 4c, Table 5a), the nasal is stem-initial, so in the case-inflected forms below it is ignored for nasal harmony. Generally, suffixal /d5 / is nasalized to [n5 ] if it is immediately preceded by a nasal and followed by a vowel and a second nasal consonant (Table 5). In Table 5a, the stem-initial nasal does not trigger nasalization of the initial consonant of the genitive or ablative suffixes, which both possess a final nasal consonant. In Table 5b-d, though, the stem-final consonant is nasal, which, along with the suffix-final nasal in the ablative and genitive morphemes, triggers nasalization of /d5 / to [n5 ]. Note that the stem-final nasal does not trigger nasalization on its own, since the accusative suffix does not undergo nasalization in Table 5.
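The generalization stated above (suffixal /d/ surfaces as [n] only when it is immediately preceded by a stem-final nasal and followed by a vowel plus a second nasal) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' formalization: the function name, the simplified transcription symbols, and the decision to feed every suffix in its /d/-initial post-consonantal shape are assumptions made for the example, and only consonant-final stems are covered.

```python
# Simplified segment classes (assumed for illustration only).
NASALS = {"m", "n", "ŋ"}
VOWELS = {"ɑ", "ə", "e", "ɪ"}


def attach_d_suffix(stem: str, suffix: str) -> str:
    """Attach a /d/-initial suffix to a consonant-final stem.

    The suffix onset surfaces as [n] only if it is flanked by two nasals:
    a stem-final nasal on the left, and a vowel + nasal on the right.
    Otherwise the onset stays [d].
    """
    if not suffix.startswith("d"):
        raise ValueError("this sketch only handles /d/-initial suffix shapes")
    flanked = (
        stem[-1] in NASALS          # immediately preceding nasal (stem-final)
        and len(suffix) >= 3
        and suffix[1] in VOWELS     # followed by a vowel ...
        and suffix[2] in NASALS     # ... and a second nasal
    )
    onset = "n" if flanked else "d"
    return f"{stem}-{onset}{suffix[1:]}"


if __name__ == "__main__":
    print(attach_d_suffix("nɑn", "dɑn"))  # 'bread-ABL': onset nasalized
    print(attach_d_suffix("nɑn", "də"))   # 'bread-ACC': no suffix-final nasal
    print(attach_d_suffix("nɑr", "dɑn"))  # 'dromedary-ABL': stem-initial nasal ignored
```

Because the check looks only at the stem-final segment, a stem-initial nasal such as the one in 'dromedary' never triggers nasalization, which mirrors the locality condition described in the text.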

Vowels
More so than any other issue in Kazakh phonology, scholars have provided vastly divergent accounts of the vowel inventory. Descriptions posit anywhere from five to eleven phonemic vowels in the language (Dzhunisbekov 1972; Kirchner 1992, 1998; Vajda 1994; Kara 2002; Yessenbayev, Karabalayeva & Sharipbayev 2012; Sharipbayev 2013; McCollum 2015; Washington 2016). Many writers, though, have suggested that nine vowels are phonemic in Kazakh. In this paper we propose an eleven-vowel inventory /A o´U Q e P I Y ij uw/, exemplified by the vowel space shown below (see Sharipbayev 2013).
To determine more precisely the vowel qualities present in the Kazakh inventory we measured the first and second formants (F1 and F2) of vowels collected during elicitation. As vowel quality in non-initial syllables is severely restricted due to vowel harmony, F1 and F2 were measured at the midpoint of initial-syllable vowels (n = 272). Mean formant values with one standard deviation ellipses are presented in Figure 2 and Table 6.  Beginning with the low vowels, the front vowel, /Q/, is somewhat distinct from the other vowels in that it historically derives from /A/, often from Arabic and Persian loans (Kirchner 1998: 319). This might explain why /Q/ may trigger front or back vowel suffixes, as in [Ql5 -s5 Iz5 ] 'strength-PRV' and [kYn5 Q-"A] 'error-DAT'. Additionally, this vowel is not as common as many of the other vowels in the language, and is typically limited to initial syllables. As for /A/, this vowel is typically produced with a central-back articulation, although it may vary noticeably in its backness. Before the velarized variant of /l5 /, most speakers produce this vowel as [A= ] or even [Å], [mA= ®] 'livestock', although this is less noticeable for our consultant. Remaining variance seems to correlate with position, where initial positions tend to be more backed than subsequent syllables. See discussion in the Stress and Intonation section below for more on gradient fronting of /A/.
Moving upward in the vowel space, the vowel transcribed as /´/ is variously rendered /F/, /µ/, or /Ú -/ in other works (Abuov 1994, Johanson & Csató 1998, Bowman & Lokshin 2013, Muhamedowa 2015). The realization of this vowel in Kazakh, as well as in Turkish and Kyrgyz, is typically central and schwa-like (Kiliç & Ögüt 2004, Washington 2016). Like /A/, /´/ tends to shift forward in non-initial syllables. As for the back vowels /o/ and /U/, as noted above, /U/ is sometimes transcribed as /u/ or /o/, and /o/ is sometimes transcribed as /ç/ (Balakayev 1962; Dzhunisbekov 1972, 1980; Abuov 1994; Vajda 1994; Yessenbayev et al. 2012). In Washington (2016), these vowels are transcribed as /uU/ and /U/ (see also Vajda 1994). In both Yessenbayev et al. (2012) and Washington (2016), these two vowels show significant overlap, as they do in Figure 2. The higher vowel /U/ alternates with /´/ for labial harmony. The mid vowel /o/ almost never surfaces via labial harmony, but when it does, it alternates with /A/, as seen in /Zol5 AwS´-"A/ [Zo®owS´-"A] 'traveler-DAT'.
Among the more front-central round vowels, /Y/ is sometimes transcribed as /ö/ or /O/ (Dzhunisbekov 1972, Yessenbayev et al. 2012). This vowel may be front or, as is often the case, central [™4 ]. In contrast, /P/ is, as far as we know, always central, and often described as diphthongal (Dzhunisbekov 1972, 1980; Vajda 1994; Washington 2016). These two vowels have recently been analyzed as /Y/ and /Y˘/ (McCollum 2018), as well as /™/ and /y™/ (Washington 2016). When /Y/ is centralized, duration helps discriminate between /Y/ and /P/, as the mid vowel is longer than the high vowel (Washington 2016, McCollum 2018). Both front rounded vowels may occur in initial syllables. The shorter vowel, /Y/, may also occur in non-initial positions, alternating with /I/ for labial harmony. However, /P/ rarely occurs in non-initial syllables, and if it does, it alternates with / É i 9e/. Among the front vowels, / É i 9e/ is noticeably diphthongal.
Many researchers have argued that this vowel is a phonemic diphthong (Dzhunisbekov 1972, 1980; Vajda 1994; Washington 2016). Krueger (1980) suggests that the diphthongal nature of this vowel may be conditioned by position, where, in absolute initial position, this vowel is more diphthongal than elsewhere. McCollum (2018) finds that / É i 9e/ is more monophthongal in colloquial speech. However, observe the realization of / É i 9e/ in / É i 9et5 / 'meat' shown in Figure 3. Note the decrease in F2 and the increase in F1 throughout the vowel. During the first third of the vowel, F2 decreases slightly, averaging around 2800 Hz. By the end of the vowel, though, F2 averages approximately 2000 Hz. Throughout the vowel, F1 rises steadily from around 300 Hz to over 600 Hz.
The front vowel that we transcribe as /I/ has also been analyzed as /i/ and /Š/ (Dzhunisbekov 1972, 1980; Vajda 1994; Kirchner 1998; Washington 2016). This vowel is shorter and more prone to centralization than / É i 9e/. It also tends to lower in non-initial positions.
We analyze /I/, /Y/, /´/, and /U/ as forming one natural class, and / É i 9e/, /P/, /A/, and /o/ as forming a separate class. Traditionally, these two groups of vowels have been analyzed as differing in height (Balakayev 1962; Dzhunisbekov 1972, 1980; Abuov 1994; Kirchner 1998). However, some recent work has proposed that these classes of vowels differ in length, not height. For instance, McCollum (2015a, 2018a) and Washington (2016) report that the second set of vowels are over twice as long as the first set. From a historical perspective, those arguing for a length distinction have suggested that it arose from a height distinction, with the short vowels deriving from historically higher vowels (Johanson 1998, Washington 2016). During elicitation, durations varied significantly for each phoneme, and no trends supported the more recent length-based analysis. As a result, we analyze these classes as differing in height, but note that length may also be a distinguishing factor. Whether distinguished by length or height, the high (short) vowels are subject to reduction, even to the point of deletion in non-final syllables, as in /kQs5 Ip-k Éi 9er5 / [kQs5 pk Éi 9er5 ] 'profession-AGT'.

Diphthongs
As noted above, / É i 9e P o/ are often analyzed as diphthongs. To better understand the realization of these three vowels, we measured F1 and F2 at three points during each vowel: 25%, 50%, and 75% of its duration. These are plotted in Figure 4 alongside the other six vowels commonly accepted as phonemic in Kazakh. In Figure 4, these three putative diphthongs are black, and monophthongs are gray. We also present results from a third category, vowels that are typically treated as vowel + glide sequences, / É ij/ and / É uw/. We discuss the status of these vowels later, suggesting that these are (marginal) phonemes. Observe that, unsurprisingly, the monophthongs do not show significant F1 or F2 movement throughout their production. Among the monophthongs, the greatest amount of formant dynamism is seen for /Q/. Among the putative diphthongs, / É i 9e/ and /P/ show more movement in the vowel space than /o/. The front diphthong transitions from a more [i]-like vowel early in its production to a more [e] or even [E]-like vowel toward the end of its production. The central round vowel moves forward in the vowel space (this is also evident in Figure 10 below, in the discussion of vowel harmony), but this shift is not as drastic as one might expect from some of the transcriptions provided in the literature, e.g. (Washington 2016 and Vajda 1994, respectively). Furthermore, the transcription [ É y™] predicts that this vowel is backed and not fronted throughout its articulation. As for the mid back round vowel, /o/, this vowel is transcribed as [ É wU] and [ É uU] in Vajda (1994) and Washington (2016), respectively. These authors thus use its putative diphthongal character to differentiate it from /U/. In our data, however, there is no acoustic shift consistent with this analysis, suggesting that, at least for our consultant, this vowel is better represented as a monophthong.
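The three-point measurement described above (formant values at 25%, 50%, and 75% of the vowel) can be illustrated with a short Python sketch. It assumes a formant track is available as parallel lists of time stamps and frequency values and uses simple linear interpolation between frames; the function name and the toy track are hypothetical and are not the authors' measurement procedure or data.

```python
def sample_at_proportions(times, values, proportions=(0.25, 0.5, 0.75)):
    """Linearly interpolate a formant track at proportional time points.

    times  -- increasing time stamps (seconds) spanning the vowel
    values -- formant values (Hz) measured at those time stamps
    """
    start, end = times[0], times[-1]
    samples = []
    for p in proportions:
        t = start + p * (end - start)
        # Find the first frame at or after t and interpolate from the previous one.
        for i in range(1, len(times)):
            if t <= times[i]:
                t0, t1 = times[i - 1], times[i]
                v0, v1 = values[i - 1], values[i]
                samples.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
                break
    return samples


if __name__ == "__main__":
    # Toy F1 track rising from 300 Hz to 600 Hz over a 200 ms vowel (made-up numbers).
    f1_times = [0.0, 0.1, 0.2]
    f1_values = [300.0, 450.0, 600.0]
    print(sample_at_proportions(f1_times, f1_values))
```

A monophthong would yield three nearly identical samples, whereas a diphthong shows a clear difference between the 25% and 75% values.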
In addition to / É i 9e/, we suggest that there exist two other diphthongs in the language, / É ij/ and / É uw/. The contrastive status of these two diphthongs is more marginal, as they occur infrequently in the lexicon. These two sounds are indicated as marginal in Figure 4, in contrast to the other sounds, monophthongal and diphthongal, whose contrastive status in the language is much clearer. A number of writers have argued that these two vowels, represented orthographically as ⟨и⟩ and ⟨у⟩ respectively, are not part of the underlying inventory, but are vowel + glide sequences. Dzhunisbekov (1972) contends that ⟨и⟩ is composed of an unrounded vowel, /I/~/´/, that alternates according to palatal harmony, and the palatal glide (see also Kirchner 1998). Furthermore, it is typically assumed that ⟨у⟩ is also complex, consisting of a round vowel, /Y/~/U/, that alternates according to palatal harmony, and the labial-velar glide (Balakayev 1962, Dzhunisbekov 1972, Vajda 1994, Kirchner 1998; see also Bowman & Lokshin 2014). Alternatively, it is possible to treat these graphemes as two additional vowels in the inventory, / É ij/ and / É uw/ (Aralbayev 1970, McCollum 2015). We discuss these two possible analyses below.
Good evidence for analyzing these graphemes as vowel + glide sequences comes from desonorization and the realization of the non-past verbal suffix. First, these vowels trigger desonorization when they occur stem-finally. Second, the non-past suffix is realized as a non-high vowel, alternating between / É i 9e/ and /A/ after consonant-final roots, as in [k É i 9et5 -i 9e-d5 I] 'leave-NPST-3' and [qUr5 -A-d5´] 'construct-NPST-3'. However, the non-past suffix surfaces as a palatal glide after vowel-final roots, as in [qAr5 A-j-d5´] 'look-NPST-3'. Thus, if these are actually vowels, they should trigger the glide allomorph of the non-past suffix. If they are a vowel + consonant sequence, though, they should trigger the vowel allomorph of the non-past suffix. Furthermore, when the non-past suffix attaches to other vowel-final stems, such as /t5 An5/ ⟨таны⟩ 'recognize.IMP', this suffix, as noted above, is realized by a palatal glide. Thus, vowel-final verb roots inflected in the non-past may create /I/ + /j/ or /´/ + /j/ sequences. When they do, /I+j/ and /´+j/ sequences are written as ⟨и⟩. So, [t5 An5-j-d5´] 'recognize-NPST-3' and [ É i 9es5 t5 I-j-d5 I] 'hear-NPST-3' are written as ⟨таниды⟩ and ⟨естиді⟩. This morphologically conditioned sequence of vowel + glide is orthographically encoded as ⟨и⟩, which is evidence for this grapheme as a sequence of vowel + glide.
However, phonetic evidence is at odds with the above morphological and phonological evidence for treating these graphemes as vowel + glide sequences. Consider the waveforms and spectrograms of ⟨и⟩ before front and back vowel suffixes in Figure 5. First, from Table 6 above, recall that mean F2 for /I/ and /´/ are 2040 Hz and 1409 Hz, respectively. Thus, if ⟨и⟩ is composed of /I+j/ or /´+j/, then the initial portion of the first-syllable vowel in ⟨итті⟩ [ É ijt5 -t5 I] 'dog-ACC' should be located near 2040 Hz.
In Figure 5, this is not the case, and F2 of the vocalic portion of [ É ijt5 ] is stable, with a mean value of 2846 Hz. While the acoustic targets of diphthongs may differ from those of monophthongs in a given language, they do not typically exhibit a difference of 800 Hz. Moreover, in ⟨миды⟩ [m É ij-d5] 'brain-ACC', where a back vowel suffix follows ⟨и⟩, F2 of the initial vowel remains high, averaging 2938 Hz. Consequently, there is no phonetic evidence that ⟨и⟩ is composed of two separate phonemes, /I/ + /j/ or /´/ + /j/. Given the phonetic realization of these vowels, it is possible that ⟨и⟩ and ⟨у⟩ simply represent two additional phonemes in the language.
In order to maintain the vowel + glide analysis, it is possible to treat the surface realizations in Figure 5 as a byproduct of regressive assimilation, where the preceding vowel assimilates to the frontness of the following glide. Problematically, this analysis predicts that /´/+/j/ should never surface in the language, and should always be repaired to something like /i/ or / É ij/. However, a [´j] sequence is attested in ⟨сый⟩ /s5j/ 'gift', which is shown in Figure 6. Compare the clear rise in F2 from vowel onset in this word to the minimal F2 rise in ⟨миды⟩ /m É ij-d5/ above. This sequence of /´/+/j/ is clearly different from the initial-syllable vowel in /m É ij-d5/.
As for the vowel ⟨у⟩ / É uw/, this vowel is phonetically distinct from /U/. In Figure 2, it is apparent that F1 of / É uw/ is lower than that of /U/ or /Y/. Moreover, F2 of / É uw/ is lower than F2 of /U/. If, as in many prior analyses, ⟨у⟩ is composed of a short round vowel /Y/~/U/ and the labial-velar glide, then one would expect to see lowering of F1, a drastic reduction in spectral energy around F3, and a decrease in amplitude. In ⟨ту⟩ /t5 É uw/ 'flag', which is shown in Figure 7, none of these characteristics are found.
Before moving on to vowel harmony, note that the grapheme ⟨у⟩ may also occur post-vocalically, representing /w/, as in ⟨тау⟩ /t5 Aw/ 'mountain'. This grapheme may also occur in non-initial positions as a gerundial suffix. In this context, most writers assume that the gerundial suffix varies by the backness of the stem (Balakayev 1962; Dzhunisbekov 1972, 1980; Kirchner 1998; Fazylzhanova 2016; Kuderinova et al. 2016; McCollum 2019). After front vowels, as in /Il5 -É uw/ 'hang-GER', F2 of / É uw/ falls between values typical for the short front and short back round vowels, with noticeable F2 depression throughout the course of the vowel. In the token shown in Figure 8, one might narrowly transcribe the vowel portion of the gerundial suffix as [ É Y™4 ]. We interpret the centralization of / É uw/ in this context as an effect of read speech in the lab (see also Bowman & Lokshin 2014). In casual speech, the gerundial suffix in front vowel contexts approximates underlying /Y/, although it is typically longer than initial-syllable /Y/.

Vowel harmony
Two harmony processes operate in Kazakh. The first and more pervasive harmony process is palatal (or backness) harmony. Excluding the relatively few cases of disharmony in the language, words contain only vowels from the front, / É i 9e P I Y/, or back, /A o´U/, set. As / É ij/ and /Q/ may trigger both front and back vowel suffixes, and due to the rarity of root-internal / É uw/, these vowels are excluded from further discussion. The effect of harmony is evident within roots (Table 7), as well as across morphological boundaries (Table 8). The set of possible root-internal vowel sequences after initial unrounded vowels below illustrates this significant restriction. In Table 7, all root-internal vowels agree in backness. In the top rows, all root-internal vowels are [+back], while in the lower rows all root-internal vowels are [−back]. Within roots there are some instances of disharmony. These typically arise through loanword incorporation, as in /r5 É ijz5 A/ 'satisfied' from Arabic, and /l5 É ijmon5 / 'lemon' from Russian, although compounding may also result in disharmony. The high front vowel, / É ij/, often occurs in disharmonic sequences, as in the two cases above. Disharmony may occur with other vowels, too, although it is less common.
Across morphological boundaries, the backness of the most proximate vowel dictates that all suffix vowels agree in backness (a small set of morphemes do not alternate for backness; see Kirchner 1998, Kara 2002, Muhamedowa 2015). The set of vowels permissible across morpheme boundaries is illustrated for unrounded roots in Table 8. Observe that /A/ alternates with / É i 9e/ in the locative suffix, and /´/ with /I/ in the accusative suffix. Front vowel variants of these two suffixes occur when the stem is [−back], and back vowel variants of these suffixes occur when the stem is [+back].
Kazakh exhibits a second harmony pattern, labial harmony. Root-internal vowel sequences after initial round vowels are demonstrated in Table 9. Note that labial harmony operates alongside palatal harmony, so the sequences of vowels shown below all obey palatal harmony. After round vowels, /´/ may round to /U/, /I/ may round to /Y/, and / É i 9e/ may round to /P/, although this last alternation is far less common. Least common of all, /A/ almost never rounds to /o/ (compare, however, /Zol5 AwS´-"A/ [Zo®owS´-"A] 'traveler-DAT' from the text below). Variation in labial harmony is common in the language, but only slight variation emerged from our corpus (see McCollum 2018). Labial harmony is decidedly gradient in Kazakh, often triggering incomplete assimilation of the target vowel. Consider two tokens of /t5 Yl5 É i 9ek/ 'chick' below in Figure 9. In the first spectrogram, maximum F2 during the second syllable nears 2500 Hz, while in the second, maximum F2 reaches only 2200 Hz. In this second token, the second-syllable vowel does not approximate surface / É i 9e/ or /P/, but shows significant coarticulation, so we represent this surface token as [ É i 9e ¶ ] with phonetic rounding. In contrast, the second-syllable vowel of the first token exhibits no coarticulation, surfacing as [ É i 9e].
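The backness alternation in suffixes described above can be sketched as a small Python function: the suffix is listed in its back-vowel shape, and its vowels are fronted when the nearest harmonic stem vowel is front. This is a hedged illustration only. The simplified transcription symbols, the `FRONT_OF` pairing (limited to the two alternations named in the text, locative /ɑ/~/e/ and accusative /ə/~/ɪ/), and the function name are assumptions; labial harmony, non-alternating morphemes, and the vowels excluded from the discussion are not modeled.

```python
# Simplified harmonic vowel sets (symbols assumed for illustration).
FRONT = set("eøɪʏ")
BACK = set("ɑoəʊ")
# Back suffix vowel -> front counterpart, per the two alternations in the text.
FRONT_OF = {"ɑ": "e", "ə": "ɪ"}


def harmonize_suffix(stem: str, back_suffix: str) -> str:
    """Attach a suffix given in its back-vowel shape, fronting it after front stems."""
    # Scan leftward from the stem edge for the most proximate harmonic vowel.
    for segment in reversed(stem):
        if segment in FRONT:
            stem_is_front = True
            break
        if segment in BACK:
            stem_is_front = False
            break
    else:
        raise ValueError("stem contains no harmonic vowel")
    if stem_is_front:
        suffix = "".join(FRONT_OF.get(s, s) for s in back_suffix)
    else:
        suffix = back_suffix
    return f"{stem}-{suffix}"


if __name__ == "__main__":
    print(harmonize_suffix("ses", "tɑ"))  # 'sound-LOC' with a front suffix vowel
    print(harmonize_suffix("tøs", "tɑ"))  # 'chest-LOC': fronted, but not rounded
    print(harmonize_suffix("nɑn", "də"))  # 'bread-ACC' keeps the back suffix vowel
```

Only the nearest stem vowel is consulted, which matches the statement that the most proximate vowel determines suffix backness.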
Across morpheme boundaries, labial harmony is even further restricted, as seen in the words in Table 10. High vowels optionally undergo harmony while non-high vowels typically do not. Compare the spectrograms for [t5 Ps5 -t5 Éi 9e] 'chest-LOC' and [s5 Éi 9es5 -t5 Éi 9e] 'sound-LOC' in Figure 10. The initial vowel of /t5 Ps5 -t5 Éi 9e/ has a mean F2 of 1902 Hz. In contrast, mean F2 of the initial vowel in this token of /s5 É i 9es5 -t5 Éi 9e/ is 2412 Hz. If harmony were categorical, F2 of a non-high vowel after /P/ should resemble F2 of initial /P/. However, mean F2 of the second-syllable vowel in this token of /t5 Ps5 -t5 Éi 9e/ is 2364 Hz, far more closely approximating F2 of initial and final / É i 9e/ in [s5 Éi 9es5 -t5 Éi 9e], which are 2412 Hz and 2307 Hz, respectively.
When these restrictions on labial harmony are compared to older descriptions of the language (Menges 1947, Balakayev 1962, Korn 1969, Dzhunisbekov 1972), it is evident that labial harmony is diminishing in contemporary Kazakh (McCollum 2015). This is particularly true in colloquial speech, although harmony is more often retained in more formal, literary speech (Abuov 1994). Lastly, Dzhunisbekov (1980) argues that all consonants are produced with allophonic backing and labialization, in accordance with both harmony processes (see also McCollum 2015). So, harmony may affect both vowels and consonants in the language.

Syllable structure
Kazakh syllables typically consist of an onset and a nucleus, CV, especially in polysyllabic words. In monosyllables, though, CVC forms predominate. There are, in fact, almost no CV monosyllabic words in the language, and those that are attested are function words. Other syllable types include a lone vowel, V, as in [I.l5 É Yw] 'hang-GER' and [o.®Ar5 ] '3P'. Complex onsets are not attested in native words, but may occur in borrowings. In many cases, loans with complex onsets are realized with an intrusive vowel between the onset cluster, as in the Russian loan, /kr5 An5 / [k´( RAn5 ] 'faucet'. Complex codas are rare, but may occur under three conditions: one, the first member of the coda is a sonorant; two, the second member of the coda is voiceless; and three, both members of the coda agree in place of articulation. Complex codas are exemplified in Table 11 with words like /r5 É i 9eNk/ 'color' and /bUl5 t5 / 'cloud'. To repair illicit complex codas, a vowel is inserted between the two consonants (see Krippes 1993, Kara 2002; see also Clements & Sezer 1982 for Turkish), as in /XAl5 q/ [XA.®´q] 'nation' and /Awz5 / [A.w´z5 ] 'mouth'. In these unmarked nominative forms epenthesis occurs, but when a vowel-initial suffix is concatenated to the root, the root-final consonant syllabifies with the suffix, as in /XAl5 q-´/ [XA®.q´] 'nation-POSS.3' and /Awz5 -´/ [Aw.z5] 'mouth-POSS.3', removing the need to insert a vowel.
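The three conditions on complex codas, and the epenthesis repair for clusters that fail them, can be sketched as follows. This is a minimal illustration under stated assumptions: the segment classes and place labels are simplified and chosen for the example, the epenthetic vowel is fixed to a default rather than derived from harmony, and the function names are hypothetical.

```python
# Assumed (illustrative) natural classes; simplified relative to the inventory.
SONORANTS = {"m", "n", "ŋ", "l", "r", "w", "j"}
VOICELESS = {"p", "t", "k", "q", "s", "ʃ", "χ"}
PLACE = {
    "m": "labial", "p": "labial", "b": "labial", "w": "labial",
    "n": "coronal", "t": "coronal", "d": "coronal", "s": "coronal",
    "z": "coronal", "l": "coronal", "r": "coronal", "ʃ": "coronal",
    "j": "palatal",
    "ŋ": "velar", "k": "velar", "g": "velar",
    "q": "uvular", "χ": "uvular",
}


def is_licit_coda(c1: str, c2: str) -> bool:
    """Check the three conditions: sonorant first, voiceless second, same place."""
    return c1 in SONORANTS and c2 in VOICELESS and PLACE[c1] == PLACE[c2]


def repair_coda(root: str, epenthetic: str = "ə") -> str:
    """Insert a vowel into an illicit root-final consonant cluster."""
    if len(root) < 2:
        return root
    c1, c2 = root[-2], root[-1]
    if c1 not in PLACE or c2 not in PLACE:
        return root  # the root does not end in a consonant cluster
    if is_licit_coda(c1, c2):
        return root
    return root[:-1] + epenthetic + c2


if __name__ == "__main__":
    for word in ("reŋk", "bʊlt", "χɑlq", "ɑwz"):
        print(word, "->", repair_coda(word))
```

In this sketch 'nation' fails the place condition and 'mouth' fails the voicelessness condition, so both receive an epenthetic vowel, while 'color' and 'cloud' surface unchanged.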
In connected speech, significant reduction of the initial vowel is common, even to the point of deletion. This vowel syncope process affects high vowels more significantly than non-high vowels (Kirchner 1998: 319; Muhamedowa 2015: 273-276; Washington 2016: 145). High vowels are often elided while non-high vowels rarely undergo complete elision.

Stress
Descriptions of Kazakh have typically suggested that stress falls word-finally in the language (Balakayev 1962: 78; Muhamedowa 2015: 285-288). Kirchner (1998: 320), however, argues that Kazakh words bear two accents: a word-initial stress accent, realized by increased intensity, and a word-final pitch accent, realized by a rising tone. Johanson (1998: 34-35) notes that in Turkic the word-initial accent is also affected by syllable weight, with heavier syllables attracting stress. In contrast, others contend that Kazakh does not have word-level stress, but rather only phrase-level prominence (Dzhunisbekov 1980, 1987; Abuov 1994: 41-42; Vajda 1994: 644-647). The words collected for the previous sections of the paper cannot address this topic since words were elicited in isolation, and in some cases, there is clear list intonation in the recordings.
We conducted a small production study to assess these claims from the literature. Specifically, we tested whether Kazakh words are marked by increased intensity on the initial syllable and rising f0 on the final syllable. The speaker produced 271 words (n = 899 syllables) in the carrier phrase [o® __ s5 Pz5 In5 qAjt5 A®5 Ap Ajt5 t5] ⟨Ол __ сөзін қайталап айтты⟩ 'S/he repeated the word __'. Mean intensity (in decibels, dB) was measured over the entire duration of the target vowel. Maximum fundamental frequency (f0; in Hertz) was recorded for each vowel. No obvious tracking errors were noticed upon initial inspection.
We manipulated three variables for our study: vowel height, position in the word, and syllable type. We restricted our study to two vowels, /A/ and /´/, controlling for backness and rounding. For our predictors, vowel height was binary, since we used two vowels of distinct heights. Position was ternary (initial, medial, and final syllables), and syllable type was also ternary (open, simple coda, complex coda). Vowels from monosyllabic words were classified as word-final. Six lexical items were used for stimuli, and each lexical item was produced with zero to four suffixes, resulting in words from one to six syllables in length. The words used in our study are shown in Table 12.
We fit a linear mixed effects model to each dependent variable (duration, intensity, maximum f0, F1, and F2) using the three predictors above (vowel height, position, and syllable type) as fixed effects in conjunction with a random intercept for lexical root and random slopes for each fixed effect. The significance of each predictor was determined by likelihood ratio testing.
To determine if initial syllables are louder than non-initial syllables, the intensity of each vowel was measured, and the overall means of these measurements are presented in Table 13. Initial syllables in the target words were characterized by greater amplitude than non-initial syllables (χ²(1) = 58.40, p < .001), in conformity with Kirchner (1998) and Johanson (1998), and unsurprisingly, the low vowel was produced with more intensity than the higher vowel (χ²(1) = 403.38, p < .001). However, decreases in intensity by position were smaller with the higher vowel. Thus, the interaction between position and vowel height was also significant (χ²(1) = 37.94, p < .001). However, syllable type was not significant, suggesting that syllable complexity does not correlate with increased intensity in Kazakh.
Next, we examined Kirchner's claim that the final syllable is marked with a rising tone, which should correspond to an increased maximum f0 for final syllables. Mean maximum f0 by position is presented in Table 14. Initial syllables and closed syllables exhibited significantly higher maximum f0 in target words (syllable type: χ²(1) = 33.68, p < .001; position: χ²(1) = 51.99, p < .001). The interaction between syllable type and position was not significant. We found no evidence for a word-final pitch accent, as described in Kirchner (1998) and Johanson (1998), since f0 decreased throughout the word. In addition to maximum f0, we also measured mean f0 across the entire vowel and found the same results.
Thus far we have seen that initial syllables are marked by greater intensity than non-initial syllables, and that final syllables do not show evidence of a pitch accent manifested through rising f0. In other words, we found no evidence for claims that Kazakh exhibits a final pitch accent, and perhaps only marginal evidence for an initial stress accent.
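For readers who want to check how a likelihood ratio statistic with one degree of freedom maps onto a p-value, the following Python sketch uses only the standard library. It relies on the identity that the chi-square survival function with one degree of freedom equals erfc(sqrt(x / 2)). The function names are hypothetical, and the sketch does not refit the mixed-effects models; it only converts the statistics reported above into p-values.

```python
import math


def lrt_statistic(loglik_full: float, loglik_reduced: float) -> float:
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_pvalue_df1(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Statistics reported in the text for the intensity model.
    for label, stat in [("position", 58.40), ("vowel height", 403.38),
                        ("position x height", 37.94)]:
        print(f"{label}: chi2(1) = {stat:.2f}, p = {chi2_pvalue_df1(stat):.3g}")
```

All three statistics yield p-values far below .001, in line with the values reported in the text.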
The evidence from intensity is marginal because intensity tends to decrease across a phrase independent of stress, so it is not surprising that the initial syllables are louder than other syllables. This same tendency held across the entire carrier phrase: earlier syllables in the phrase were louder than later syllables.
To further examine potential acoustic manifestations of stress, we measured duration and vowel quality. Duration was measured from the onset to the offset of the second formant. Since vowel elision is common, particularly in initial syllables, duration measures reported below do not include elided vowels. Vowel quality was examined using F1 and F2, which were measured at the midpoint of the vowel.
Before moving on to vowel quality, it is important to note that the duration results above do not necessarily indicate that final syllables are stressed. This monotonic increase in vowel duration could derive either from a word-level stress pattern or from phrase-level lengthening. To examine the potential relationship between position in the word and vowel quality, we measured F1 and F2 of vowels in target words. Generally, F1 decreases across the words, indicating a general raising pattern for both /A/ and /´/ (χ²(1) = 95.53, p < .001), shown in Table 16. Vowel quality was also affected by syllable type, with closed syllables having higher F1 and lower F2 (F1: χ²(1) = 17.46, p < .001; F2: χ²(1) = 74.07, p < .001). Interestingly, back vowels were significantly fronted in non-initial syllables (χ²(1) = 220.08, p < .001), which is seen in Table 17 and Figure 11 (see also McCollum 2015). This pattern is also found in the recorded passage transcribed below and appears to neutralize backness distinctions in later syllables. For instance, mean F1 and F2 over the entire duration of the word-final /A/ in /Ar5 Al5 Ar5n5 d5 A/ 'among themselves' were 658 Hz and 1932 Hz, respectively. Mean F1 and F2 of /Q/ in /d5 Ql5 / 'exactly' were 679 Hz and 1920 Hz, indicating a relatively full neutralization of F1-F2 differences between /A/ and /Q/. Lastly, F2 of /´/ was significantly higher than F2 of /A/ (χ²(1) = 4.60, p = .03).
To summarize, duration and F2 increased throughout the word while intensity, maximum f0, and F1 decreased throughout the word. Since the only vowels examined were back vowels, vowel quality results tentatively suggest that initial-syllable vowels may be produced with more peripheral vowel qualities. It is unclear, though, if the results necessarily relate to stress. From our results, we do not extrapolate to argue more concretely for either the existence of stress or its placement in Kazakh, in part because only one speaker participated in the study.
Future work will need to further explore acoustic prominence and how it relates to putative stress in the language.

Intonation
Declarative statements involve a gradual decline in f0. Yes/no questions are typically produced with a phrase-final rise in f0. As for wh-questions, the question word is often associated with rising f0. A minimal triplet consisting of a declarative statement, a yes/no question, and a wh-question is presented below in Figure 12 (see Bazarbayeva 2008 for more on Kazakh intonation). In summary, Kazakh intonation is marked by gradual downdrift in declaratives, but rising f0 marks question morphemes, both on independent wh-words, and on the question enclitic.

Transcription of the recorded passage
The version of 'The North Wind and the Sun' presented below was first translated by the speaker from the Russian version presented in Yanushevskaya & Bunčić (2015) and then recorded.