Kazakh

Adam G. McCollum; Si Chen

doi:10.1017/S0025100319000185


Published online by Cambridge University Press: 26 February 2020


Adam G. McCollum: Rutgers University, adam.mccollum@rutgers.edu
Si Chen: Hong Kong Polytechnic University, sarah.chen@polyu.edu.hk



Type: Illustrations of the IPA
Information: Journal of the International Phonetic Association, Volume 51, Issue 2, August 2021, pp. 276-298

DOI: https://doi.org/10.1017/S0025100319000185
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © International Phonetic Association 2020

Kazakh (ISO 639-3, kaz) is a Kipchak (Northwestern) Turkic language with approximately ten million speakers (Muhamedowa 2015). While the majority of Kazakh speakers live in the Republic of Kazakhstan, significant Kazakh-speaking populations exist throughout Central Asia. See Figure 1 for a map of the region. Kazakh spoken in Kazakhstan is described as having three or four dialects, but many researchers agree that differences between dialects are small and largely lexical (Kara 2002, Grenoble 2003, Muhamedowa 2015; see Amanzholov 1959 for more on Kazakh dialects).

Figure 1 Political map of Central Asia.

The analysis derives largely from sound files recorded from one Kazakh speaker in San Diego, California, but is also informed by significant fieldwork in Kazakhstan. The consultant is a woman in her early thirties from the Zhambul region of southeastern Kazakhstan. She has lived in the U.S. for five years, and speaks Kazakh, Russian, Turkish, and English. Excluding the data collected for the analysis of stress, as well as ‘The North Wind and the Sun’ passage, all words were produced in isolation.

Consonants

The Kazakh consonantal inventory consists of the twenty contrastive sounds listed below (Balakayev 1962: 40). Note that many (and possibly all) of these consonants alternate for the backness and roundness of flanking vowels, making it difficult in some cases to determine whether an alternation is between two phonemes or between allophones of a single phoneme. In addition to these consonants, Kazakh speakers variably incorporate non-native sounds, like the Russian labiodental fricatives, [f] and [v], as well as the voiceless glottal fricative, [h], from Arabic. These are exemplified by words like /fɑwn̪ɑ/ ‘fauna’, /vɑɡon̪/ ‘railway car’, and /ʒi͡jhɑn̪/ ‘world’. For the glottal fricative, there is a great deal of variation both within and between speakers. For example, our consultant produced [h] in <жиhан> [ʒi͡jhɑn̪] ‘world’ but [χ] in <жиhаз> [ʒi͡jχɑz̪] ‘furniture’. This class of non-native sounds is not uncommon among Kazakhs in Kazakhstan, but speakers report that these sounds are rare or even non-existent in the speech of Kazakhs in China or Mongolia. The Kazakh consonantal inventory is exemplified below in onset and coda positions.¹

Voiceless plosives in Kazakh tend to be aspirated, but aspiration is often reduced in connected speech. Word-initially, voiced plosives are generally prevoiced. Mean voice onset time (VOT, n=73) by place of articulation is shown in Table 1. Observe that among the voiced plosives, /ɡ/ is prevoiced more noticeably than /d̪/ or /b/. As for the voiceless plosives, /p/ is realized with more aspiration than /t̪/, /k/, or /q/.

Table 1. Voice onset time (VOT, in ms) for voiced and voiceless plosives by place of articulation.

The velar and uvular plosives, as well as other dorsal obstruents, undergo alternations based on the backness of adjacent vowels. The velar obstruents occur with front vowels and uvular obstruents occur with back vowels. In loans, however, the dorsal obstruents may occur with both front and back vowels. This is exemplified by the loan /krɑn/ ‘faucet’. In addition, both velar and uvular obstruents may occur adjacent to /i͡j/, as in /qi͡j/ ‘manure’ and /ki͡j/ ‘wear.imp’, as well as in loans like /ʁi͡jmɑrɑt̪/ ‘building’ and /ɡi͡jt̪ɑrɑ/ ‘guitar’. The uvular plosive varies between a true plosive, as in [qʊs̪] ‘bird’ and [wɑq] ‘time’, and a voiceless uvular fricative in connected speech, as in /məqt̪ə d̪i̯͡e-ɡi̯͡en̪/ [məχt̪ə d̪i̯͡eɡi̯͡en̪] ‘strong say-pfv’ produced in the passage below. In this way, the contrast between /q/ and /χ/ is often neutralized in the spoken language. Perhaps due to influence from Russian, the voiceless uvular fricative is, in some instances, produced as a voiceless velar fricative, [x], as in /χɑn̪/ [xɑn̪] ‘khan’.

Within a root, both voiced and voiceless plosives may occur in syllable onset and coda position, as demonstrated below. Note especially that voiceless plosives may occur intervocalically within morphological roots, as in /bɑqɑ/ ‘frog’. The voiceless velar plosive is also found intervocalically in the possessive pronominal suffix, as in /mi̯͡en̪-ɪkɪ/ ‘1s-poss.pro’ and /s̪ɪz̪-d̪ɪkɪ/ ‘2s.form-poss.pro’. In Table 2, also observe that obstruents in coda position agree with a following obstruent in voicing.

Table 2. Obstruent voicing in onset and coda positions.

As seen above, root-internal plosives may be voiced or voiceless. Word-finally, however, plosives are voiceless. The voiced dental plosive does not occur root-finally.² Intervocalically, the voiceless dental plosive does not undergo voicing, as in [ki̯͡et̪-i̯͡e-d̪ɪ] ‘leave-npst-3’. However, the voiced dental plosive may devoice when it comes to stand word-finally through deletion of the word-final vowel, as in [ki̯͡el̪-i̯͡e-d̪ɪ] ‘come-npst-3’, which is colloquially produced as [ki̯͡el̪i̯͡et̪].

At morpheme boundaries, intervocalic plosives are often voiced and spirantized, as in /kɵp-ɪ/ [kɵβɪ] ‘much-poss.3’ and the forms in Table 3. Plosives may also undergo spirantization across word boundaries, as in /ɑq i̯͡eʃkɪ/ [ɑʁ i̯͡eʃkɪ] ‘white goat’. Of the plosives, the dental plosives seem least affected by spirantization, but for some speakers spirantization of /d̪/ to [ð] still occurs.

Table 3 Intervocalic spirantization of obstruents.

Unlike the plosives, the voiced sibilants /z/ and /ʒ/ are allowed in coda position, as in the pronominal /bɪz̪/ ‘we’ and the Persian loan /t̪æʒ/ ‘crown’. Further, unlike the plosives, intervocalic fricatives do not undergo voicing at morpheme boundaries, as can be seen in forms like /t̪ɑs̪-ə/ ‘stone-poss.3’ and /qɑz̪-ə/ ‘goose-poss.3’. The sibilants are often subject to assimilation word-internally, in compounds, and across word boundaries. Within words, sibilants may trigger regressive devoicing of voiced coda sibilants, as in /ki̯͡ez̪-s̪i̯͡e/ [ki̯͡es̪-s̪i̯͡e] ‘travel-cond’ and /t̪æʒ-s̪ɪz̪/ [t̪æʃ-s̪ɪz̪] ‘crown-PRV’. Across constituent members of a compound both devoicing and minor place assimilation are evident in words like /t̪ɑs̪ ʒol̪/ [t̪ɑʃ ʃoɫ] ‘stone road (paved road)’ and /bi̯͡es̪-ʒʏz̪/ [bi̯͡eʃ-ʃʏz̪] ‘five hundred’ (Krippes 1993). In onsets, postalveolar fricatives are realized as affricates in some dialects (Amanzholov 1959). Some speakers report this distinction being used as a regional shibboleth. As with the uvular plosive, the uvular fricative is subject to noticeable variation. On several occasions during elicitation the fricative was produced as a voiced uvular plosive, as in [ʃɑpɑɴ-ɢɑ] ‘coat-dat’, and once as a uvular trill, in [ʒoɫowʃə-ʀɑ] ‘traveler-dat’.

Among the sonorants, nasals are produced at three major places of articulation, although it is reported that the velar nasal is realized as a uvular in back vowel contexts (Vajda 1994). Also, the dental nasal undergoes assimilation to the place of articulation of a following obstruent, exemplified by /ʃɑpɑn̪-ʁɑ/ [ʃɑpɑɴ-ɢɑ] ‘coat-dat’ (Krippes 1993). The lateral approximant is velarized in back vowel contexts, as in /mɑl̪/ [mɑɫ] ‘livestock’. The trill may be reduced to a tap in connected speech, as in /qər̪ɑn̪/ [qər̪ɑn̪] ‘hawk’ and /qʊr̪-ɑ-s̪əz̪-d̪ɑr̪=mɑ/ [qʊɾɑs̪əz̪d̪ɑr̪mɑ] ‘construct-npst-2.form-pl=q’. In word-initial position, /r̪/ is often preceded by a prothetic vowel, as in /r̪i̯͡eŋk/ [ɪri̯͡eŋk] ‘color’. The palatal and labiovelar approximants are less common than other sonorants, particularly in word-initial position.

Two further consonantal processes are deserving of mention: desonorization and nasal harmony. Desonorization targets suffix (and in some cases, enclitic) onsets, specifically /n l m/, creating heterosyllabic consonantal sequences with falling sonority (Vajda 1994, Davis 1998, Kara 2002, Gouskova 2004, Gopal 2015). This is evident in the plural suffix, /-l̪ɑr/, the accusative suffix, /-n̪ə/, and the question enclitic, /=mɑ/, shown in Table 4. Suffixes with an initial /n̪/, like the accusative suffix, undergo desonorization to /d̪/ after approximants, trills, laterals, nasals, and voiced obstruents. Suffixes with an initial lateral, like the plural suffix, are produced with /l̪/ ([ɫ] below, due to backness harmony) after stem-final vowels, approximants, and trills (Table 4a–c), but are produced with /d̪/ after stem-final laterals, nasals, and voiced obstruents (Table 4d–f). Morphemes with initial /m/, like the question enclitic, undergo desonorization to /b/ after nasals and voiced obstruents (Table 4d–f). All three segments are produced as voiceless plosives when preceded by voiceless obstruents, as seen below (Table 4g, h).
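The desonorization pattern described above can be sketched as a small lookup. This is only an illustrative sketch, not the authors' formalization: the function and class names are hypothetical, dental diacritics are omitted, and the retention of /m/ after vowels, approximants, trills, and laterals is an assumption inferred from the text, which states only where /m/ hardens.

```python
# Simplified segment classes (hypothetical; dental diacritics omitted).
CLASSES = {
    "vowel": set("ɑoəʊæeɵɪʏ"),
    "approximant": set("jw"),
    "trill": set("r"),
    "lateral": set("l"),
    "nasal": set("mnŋ"),
    "voiced_obstruent": set("bdgzʒʁ"),
    "voiceless_obstruent": set("ptkqsʃχ"),
}

# Stem-final classes after which each suffix onset surfaces as a voiced plosive.
HARDENING_CONTEXTS = {
    "n": {"approximant", "trill", "lateral", "nasal", "voiced_obstruent"},
    "l": {"lateral", "nasal", "voiced_obstruent"},
    "m": {"nasal", "voiced_obstruent"},
}
VOICED_OUTPUT = {"n": "d", "l": "d", "m": "b"}
VOICELESS_OUTPUT = {"n": "t", "l": "t", "m": "p"}


def segment_class(segment: str) -> str:
    """Return the name of the class that contains the segment."""
    for name, members in CLASSES.items():
        if segment in members:
            return name
    raise ValueError(f"unknown segment: {segment!r}")


def suffix_onset(underlying: str, stem_final: str) -> str:
    """Surface form of a suffix-initial /n l m/ after a given stem-final segment."""
    context = segment_class(stem_final)
    if context == "voiceless_obstruent":
        return VOICELESS_OUTPUT[underlying]
    if context in HARDENING_CONTEXTS[underlying]:
        return VOICED_OUTPUT[underlying]
    return underlying


print(suffix_onset("n", "ɑ"))  # accusative after a vowel-final stem
print(suffix_onset("n", "n"))  # accusative after a nasal-final stem
print(suffix_onset("l", "r"))  # plural after a trill-final stem
```

The sketch deliberately ignores nasal harmony, which overrides desonorization in some suffixes.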

Table 4 Suffix desonorization and nasal harmony.

Related to suffix onset desonorization, suffix onsets undergo nasal harmony when flanked by two nasals (Balakayev 1962, Davis 1998, Kuhn 2014, Gopal 2015). Compare the accusative-inflected forms to the genitive- and ablative-inflected forms above. The accusative and genitive suffixes have an underlying dental nasal initially, evidenced by [ɑpt̪ɑ-n̪ə] ‘week-acc’ and [ɑpt̪ɑ-n̪əŋ] ‘week-gen’, whereas the ablative suffix has an underlying /d̪/, as in [ɑpt̪ɑ-d̪ɑn̪] ‘week-abl’ (Table 4a). When the suffix-initial nasal of the accusative suffix is concatenated to a stem-final nasal, the suffix nasal undergoes desonorization to /d̪/, as in [n̪ɑn̪-d̪ə] ‘bread-acc’ (Table 4e). However, in the genitive morpheme, which has a suffix-final nasal, the onset does not undergo desonorization. Instead, the onset nasal is retained, as in [n̪ɑn̪-n̪əŋ] ‘bread-gen’ (Table 5c). In contrast to other forms, the initial /d̪/ of the ablative morpheme is nasalized to [n̪] when following a stem-final nasal, as seen in [n̪ɑn̪-n̪ɑn̪] ‘bread-abl’. The preceding nasal must be stem-final for nasal harmony to occur. In /n̪ɑr/ ‘dromedary’ (Table 4c, Table 5a), the nasal is stem-initial, but in the case-inflected forms below the stem-initial nasal is ignored for nasal harmony. Generally, suffixal /d̪/ is nasalized to [n̪] if it is immediately preceded by a nasal and followed by a vowel and a second nasal consonant (Table 5). In Table 5a, the stem-initial nasal does not trigger nasalization of the initial consonant of the genitive or ablative suffixes, which both possess a final nasal consonant. In Table 5b–d, though, the stem-final consonant is nasal, which, along with the suffix-final nasal in the ablative and genitive morphemes, triggers nasalization of /d̪/ to [n̪]. Note that the stem-final nasal does not trigger nasalization on its own, since the accusative suffix does not undergo nasalization in Table 5.
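The conditioning environment for nasal harmony can be made explicit with a short check. The sketch below is a hypothetical illustration under simplifying assumptions: suffixes are treated as plain strings with the shape onset + vowel (+ consonant), dental diacritics are omitted, and the function names are invented for this example.

```python
NASALS = set("mnŋ")
VOWELS = set("ɑoəʊæeɵɪʏ")


def nasal_harmony_applies(stem: str, suffix: str) -> bool:
    """True when a suffix onset /d/ or /n/ is flanked by a stem-final nasal
    and by a nasal that closes the suffix syllable (onset + vowel + nasal)."""
    return (
        len(suffix) >= 3
        and stem[-1] in NASALS
        and suffix[0] in "dn"
        and suffix[1] in VOWELS
        and suffix[2] in NASALS
    )


def apply_nasal_harmony(stem: str, suffix: str) -> str:
    """Return the suffix with a nasal onset if harmony applies, else unchanged."""
    if nasal_harmony_applies(stem, suffix):
        return "n" + suffix[1:]
    return suffix


print(apply_nasal_harmony("nɑn", "dɑn"))  # ablative after 'bread'
print(apply_nasal_harmony("nɑr", "dɑn"))  # ablative after 'dromedary'
print(apply_nasal_harmony("nɑn", "nə"))   # accusative: no suffix-final nasal
```

Where the function returns the accusative unchanged, the onset is still subject to desonorization, which the text reports as yielding [n̪ɑn̪-d̪ə] ‘bread-acc’; that step is outside the scope of this sketch.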

Table 5 Nasal harmony.

Vowels

More so than any other issue in Kazakh phonology, scholars have provided vastly divergent accounts of the vowel inventory. Descriptions posit as few as five and as many as eleven phonemic vowels in the language (Dzhunisbekov 1972; Kirchner 1992, 1998; Vajda 1994; Kara 2002; Yessenbayev, Karabalayeva & Sharipbayev 2012; Sharipbayev 2013; McCollum 2015; Washington 2016). Many writers, though, have suggested nine vowels are phonemic in Kazakh. In this paper we propose an eleven-vowel inventory /ɑ o ə ʊ æ i̯͡e ɵ ɪ ʏ i͡j u͡w/, exemplified by the vowel space shown below (see Sharipbayev 2013).

To determine more precisely the vowel qualities present in the Kazakh inventory we measured the first and second formants (F1 and F2) of vowels collected during elicitation. As vowel quality in non-initial syllables is severely restricted due to vowel harmony, F1 and F2 were measured at the midpoint of initial-syllable vowels (n=272). Mean formant values with one standard deviation ellipses are presented in Figure 2 and Table 6.

Figure 2 Mean F1 and F2 (Bark) of vowel phonemes with one-standard deviation ellipses.

Table 6 Mean F1, F2, and F3 for each phoneme (in Hz and Bark).
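Because the formant values are reported in both Hz and Bark, a conversion sketch may help readers reproduce the scale. The source does not state which Hz-to-Bark formula was used; the function below assumes Traunmüller's widely used approximation and omits its low- and high-frequency corrections, so its output need not match the published table exactly.

```python
def hz_to_bark(frequency_hz: float) -> float:
    """Convert a frequency in Hz to Bark.

    Assumed formula (not confirmed by the source):
    z = 26.81 * f / (1960 + f) - 0.53
    """
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


# Mean F2 values cited later in the text for two of the short vowels.
for label, f2_hz in [("ɪ", 2040.0), ("ə", 1409.0)]:
    print(label, round(hz_to_bark(f2_hz), 2))
```

Plotting on the Bark scale compresses differences at high frequencies, which is why front vowels with large F2 differences in Hz appear closer together in a Bark-scaled vowel space.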

Beginning with the low vowels, the front vowel, /æ/, is somewhat distinct from the other vowels in that it historically derives from /ɑ/, often from Arabic and Persian loans (Kirchner 1998: 319). This might explain why /æ/ may trigger front or back vowel suffixes, as in [æl̪-s̪ɪz̪] ‘strength-PRV’ and /kʏn̪æ-ʁɑ/ ‘error-dat’. Additionally, this vowel is not as common as many of the other vowels in the language, and is typically limited to initial syllables. As for /ɑ/, this vowel is typically produced with a central–back articulation, although it may vary noticeably in its backness. Before the velarized variant of /l̪/, most speakers produce this vowel as [ɑ̱] or even [ɒ], as in [mɑ̱ɫ] ‘livestock’, although this is less noticeable for our consultant. Remaining variance seems to correlate with position: vowels in initial syllables tend to be more backed than those in subsequent syllables. See discussion in the Stress and Intonation section below for more on gradient fronting of /ɑ/.

Moving upward in the vowel space, the vowel transcribed as /ə/ herein is variously rendered /ɤ/, /ɯ/, or /ï/ in other works (Abuov 1994, Johanson & Csató 1998, Bowman & Lokshin 2013, Muhamedowa 2015). The realization of this vowel in Kazakh, as well as in Turkish and Kyrgyz, is typically central and schwa-like (Kiliç & Öğüt 2004, Washington 2016). Like /ɑ/, /ə/ tends to shift forward in non-initial syllables.

As for the back vowels /o/ and /ʊ/, as noted above, /ʊ/ is sometimes transcribed as /u/ or /o/, and /o/ is sometimes transcribed as /ɔ/ (Balakayev 1962; Dzhunisbekov 1972, 1980; Abuov 1994; Vajda 1994; Yessenbayev et al. 2012). In Washington (2016), these vowels are transcribed as /uʊ/ and /ʊ/ (see also Vajda 1994). In both Yessenbayev et al. (2012) and Washington (2016), these two vowels show significant overlap, as they do in Figure 2. The higher vowel /ʊ/ alternates with /ə/ for labial harmony, whereas the mid vowel /o/ almost never surfaces via labial harmony; when it does, it alternates with /ɑ/, as seen in /ʒol̪ɑwʃə-ʁɑ/ [ʒoɫowʃə-ʀɑ] ‘traveler-dat’.

Among the more front–central round vowels, /ʏ/ is sometimes transcribed as /ü/ or /ø/ (Dzhunisbekov 1972, Yessenbayev et al. 2012). This vowel may be front or, as is often the case, central [ʉ̞]. In contrast, /ɵ/ is, as far as we know, always central, and often described as diphthongal (Dzhunisbekov 1972, 1980; Vajda 1994; Washington 2016). These two vowels have recently been analyzed as /ʏ/ and /ʏː/ (McCollum 2018), as well as /ʉ/ and /yʉ/ (Washington 2016). When /ʏ/ is centralized, duration helps discriminate between /ʏ/ and /ɵ/, as the mid vowel is longer than the high vowel (Washington 2016, McCollum 2018). Both front rounded vowels may occur in initial syllables. The shorter vowel, /ʏ/, may also occur in non-initial positions, alternating with /ɪ/ for labial harmony. However, /ɵ/ rarely occurs in non-initial syllables, and if it does, it alternates with /i̯͡e/.

Among the front vowels, /i̯͡e/ is noticeably diphthongal. Many researchers have argued that this vowel is a phonemic diphthong (Dzhunisbekov 1972, 1980; Vajda 1994; Washington 2016). Krueger (1980) suggests that the diphthongal nature of this vowel may be conditioned by position, where, in absolute initial position, this vowel is more diphthongal than elsewhere. McCollum (2018) finds that /i̯͡e/ is more monophthongal in colloquial speech. However, observe the realization of /i̯͡e/ in /i̯͡et̪/ ‘meat’ shown in Figure 3. Note the decrease in F2 and the increase in F1 throughout the vowel. During the first third of the vowel, F2 decreases slightly, averaging around 2800 Hz. By the end of the vowel, though, F2 averages approximately 2000 Hz. Throughout the vowel, F1 rises steadily from around 300 Hz to over 600 Hz.

Figure 3 Waveform and spectrogram of [i̯͡et̪] ‘meat’.

The front vowel that we transcribe as /ɪ/ has also been analyzed as /i/ and /ɘ/ (Dzhunisbekov 1972, 1980; Vajda 1994; Kirchner 1998; Washington 2016). This vowel is shorter and more prone to centralization than /i̯͡e/. This vowel also tends to lower in non-initial positions.

We analyze /ɪ/, /ʏ/, /ə/, and /ʊ/ as forming one natural class, and /i̯͡e/, /ɵ/, /ɑ/, and /o/ as forming a separate class. Traditionally, these two groups of vowels have been analyzed as differing in height (Balakayev 1962; Dzhunisbekov 1972, 1980; Abuov 1994; Kirchner 1998). However, some recent work has proposed that these classes of vowels differ in length, not height. For instance, McCollum (2015a, 2018a) and Washington (2016) report that the second set of vowels are over twice as long as the first set. From a historical perspective, those arguing for a length distinction have suggested that this arose from a height distinction, with the short vowels deriving from historically higher vowels (Johanson 1998, Washington 2016). During elicitation, durations varied significantly for each phoneme, and no trends supported the more recent length-based analysis. As a result, we analyze these as differing in height, but note that length may also be a distinguishing factor. Whether distinguished by length or height, the high (short) vowels are subject to reduction, even to the point of deletion in non-final syllables, as in /kæs̪ɪp-ki̯͡er̪/ [kæs̪pki̯͡er̪] ‘profession-agt’.

Diphthongs

As noted above, /i̯͡e ɵ o/ are often analyzed as diphthongs. To better understand the realization of these three vowels, we measured F1 and F2 at three points during each vowel, 25%, 50%, and 75%. These are plotted in Figure 4 alongside the other six vowels commonly accepted as phonemic in Kazakh. In Figure 4, these three putative diphthongs are black, and monophthongs are gray. We also present results from a third category, vowels that are typically treated as vowel + glide sequences, /i͡j/ and /u͡w/. We discuss the status of these vowels later, suggesting that these are (marginal) phonemes.

Figure 4 Mean F1 and F2 (Bark) at 25%, 50%, and 75% points of each vowel.

Observe that, unsurprisingly, the monophthongs do not show significant F1 or F2 movement throughout their production. Among the monophthongs, the greatest amount of formant dynamism is seen for /æ/. Among the putative diphthongs, /i̯͡e/ and /ɵ/ show more movement in the vowel space than /o/. The front diphthong transitions from a more [i]-like vowel earlier in its production to a more [e] or even [ɛ]-like vowel toward the end of its production. The central round vowel moves forward in the vowel space (this is also evident in Figure 10 below, in the discussion of vowel harmony), but this shift is not as drastic as one might expect from some of the transcriptions provided in the literature, [y͡ʉ] or [w͡ʉ] (Washington 2016 and Vajda 1994, respectively). Furthermore, the transcription [y͡ʉ] predicts that this vowel is backed and not fronted throughout its articulation. As to the mid back round vowel, /o/, this vowel is transcribed as [w͡ʊ] and [u͡ʊ] in Vajda (1994) and Washington (2016), respectively. These authors thus use its putative diphthongal character to differentiate it from /ʊ/. In our data, however, there is no acoustic shift consistent with this analysis, suggesting that, at least for our consultant, this vowel is better represented as a monophthong.

In addition to /i̯͡e/, we suggest that there exist two other diphthongs in the language, /i͡j/ and /u͡w/. The contrastive status of these two diphthongs is more marginal, as they occur infrequently in the lexicon. These two sounds are indicated as marginal in Figure 4, in contrast to the other sounds, monophthongal and diphthongal, whose contrastive status in the language is much clearer. A number of writers have argued that these two vowels, represented orthographically as <и> and <у> respectively, are not part of the underlying inventory, but are vowel + glide sequences. Dzhunisbekov (1972) contends that <и> is composed of an unrounded vowel, /ɪ/~/ə/, that alternates according to palatal harmony, and the palatal glide (see also Kirchner 1998). Furthermore, it is typically assumed that <у> is also complex, consisting of a round vowel, /ʏ/~/ʊ/, that alternates according to palatal harmony, and the labiovelar glide (Balakayev 1962, Dzhunisbekov 1972, Vajda 1994, Kirchner 1998; see also Bowman & Lokshin 2014). Alternatively, it is possible to treat these graphemes as two additional vowels in the inventory, /i͡j/ and /u͡w/ (Aralbayev 1970, McCollum 2015). We discuss these two possible analyses below.

Good evidence for analyzing these graphemes as vowel + glide sequences comes from desonorization and the realization of the non-past verbal suffix. First, these vowels trigger desonorization when they occur stem-finally. Second, the non-past suffix is realized as a non-high vowel, alternating between /i̯͡e/ and /ɑ/ after consonant-final roots, as in [ki̯͡et̪-i̯͡e-d̪ɪ] ‘leave-npst-3’ and [qʊr-ɑ-d̪ə] ‘construct-npst-3’. However, the non-past suffix surfaces as a palatal glide after vowel-final roots, as in [qɑrɑ-j-d̪ə] ‘look-npst-3’. Thus, if these are actually vowels, they should trigger the glide allomorph of the non-past suffix. If they are a vowel + consonant sequence, though, they should trigger the vowel allomorph of the non-past suffix. Furthermore, when the non-past suffix attaches to other vowel-final stems, such as /t̪ɑn̪ə/ <таны> ‘recognize.imp’, this suffix, as noted above, is realized by a palatal glide. Thus, vowel-final verb roots inflected in the non-past may create /ɪ/ + /j/ or /ə/ + /j/ sequences. When they do, /ɪ+j/ and /ə+j/ sequences are written as <и>. So, /t̪ɑn̪ə-j-d̪ə/ ‘recognize-npst-3’ and /i̯͡es̪t̪ɪ-j-d̪ɪ/ ‘hear-npst-3’ are written as <таниды> and <естиді>. This morphologically conditioned sequence of vowel + glide is orthographically encoded as <и>, which is evidence for this grapheme as a sequence of vowel + glide.
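The allomorphy that underlies this diagnostic can be sketched as follows. This is a hypothetical illustration only: the function name is invented, /i̯͡e/ is written as plain "e", dental diacritics are omitted, and roots containing /æ/, /i͡j/, or /u͡w/ are not handled.

```python
FRONT_VOWELS = set("eɵɪʏ")
BACK_VOWELS = set("ɑoəʊ")
VOWELS = FRONT_VOWELS | BACK_VOWELS


def nonpast_allomorph(root: str) -> str:
    """Pick the non-past allomorph: a glide after vowel-final roots,
    otherwise a non-high vowel agreeing in backness with the root."""
    if root[-1] in VOWELS:
        return "j"
    # Consonant-final root: the nearest root vowel decides backness.
    for segment in reversed(root):
        if segment in BACK_VOWELS:
            return "ɑ"
        if segment in FRONT_VOWELS:
            return "e"
    raise ValueError(f"no harmonic vowel found in root: {root!r}")


for root in ["ket", "qʊr", "qɑrɑ", "tɑnə"]:
    print(root, "->", nonpast_allomorph(root))
```

Under the vowel + glide analysis, a root ending in <и> or <у> counts as consonant-final and is expected to select the vowel allomorph; under the single-vowel analysis it is expected to select the glide.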

However, phonetic evidence is at odds with the above morphological and phonological evidence for treating these graphemes as vowel + glide sequences. Consider the waveforms and spectrograms of <и> before front and back vowel suffixes in Figure 5. First, from Table 6 above, recall that mean F2 for /ɪ/ and /ə/ are 2040 Hz and 1409 Hz, respectively. Thus, if <и> is composed of /ɪ+j/ or /ə+j/, then the initial portion of the first-syllable vowel in <итті> [i͡jt̪-t̪ɪ] ‘dog-acc’ should be located near 2040 Hz. In Figure 5, this is not the case, and F2 of the vocalic portion of [i͡jt̪] is stable, with a mean value of 2846 Hz. While the acoustic targets of diphthongs may differ from those of monophthongs in a given language, they do not typically exhibit a difference of 800 Hz. Moreover, in <миды> [mi͡j-d̪ə] ‘brain-acc’, where a back vowel suffix follows <и>, F2 of the initial vowel remains high, averaging 2938 Hz. Consequently, there is no phonetic evidence that <и> is composed of two separate phonemes, /ɪ/ + /j/ or /ə/ + /j/. Given the phonetic realization of these vowels, it is possible that <и> and <у> simply represent two additional phonemes in the language.
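The size of the mismatch can be checked directly from the figures quoted above. The sketch below simply subtracts the mean F2 of the vowel that the sequence analysis would predict from the F2 actually observed; pairing ‘brain-acc’ with /ə/ is an assumption based on its back vowel suffix, and the dictionary keys are labels invented for this example.

```python
# Mean F2 (Hz) for the short unrounded vowels, as quoted in the text.
MEAN_F2_HZ = {"ɪ": 2040, "ə": 1409}

# Observed mean F2 (Hz) of the first-syllable vowel, as quoted in the text.
OBSERVED_F2_HZ = {"dog-acc": 2846, "brain-acc": 2938}

# Vowel predicted by the vowel + glide analysis (assumed from suffix backness).
PREDICTED_VOWEL = {"dog-acc": "ɪ", "brain-acc": "ə"}


def f2_gap(word: str) -> int:
    """Observed F2 minus the mean F2 of the predicted short vowel, in Hz."""
    return OBSERVED_F2_HZ[word] - MEAN_F2_HZ[PREDICTED_VOWEL[word]]


for word in OBSERVED_F2_HZ:
    print(word, f2_gap(word), "Hz")
```

The gap is roughly 800 Hz for the front-harmonic word and considerably larger for the back-harmonic word, which is the asymmetry the argument relies on.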

Figure 5 Waveforms and spectrograms of <итті> [i͡jt̪-t̪ɪ] ‘dog-acc’ and <миды> [mi͡j-d̪ə] ‘brain-acc’.

In order to maintain the vowel + glide analysis, it is possible to treat the surface realizations in Figure 5 as a byproduct of regressive assimilation, where the preceding vowel assimilates to the frontness of the following glide. Problematically, this predicts that /ə/+/j/ should never surface in the language, and should always be repaired to something like /i/ or /i͡j/. However, an [əj] sequence is attested in <сый> /s̪əj/ ‘gift’, which is shown in Figure 6. Compare the clear rise in F2 from vowel onset in this word to the minimal F2 rise in <миды> /mi͡j-d̪ə/ above. This sequence of /ə/+/j/ is clearly different from the initial-syllable vowel in /mi͡j-d̪ə/.

Figure 6 Waveform and spectrogram of <сый> [s̪əj] ‘gift’.

As for the vowel <у> /u͡w/, this vowel is phonetically distinct from /ʊ/. In Figure 2, it is apparent that F1 of /u͡w/ is lower than that of /ʊ/ or /ʏ/. Moreover, F2 of /u͡w/ is lower than F2 of /ʊ/. If, as in many prior analyses, <у> is composed of a short round vowel /ʏ/ ~ /ʊ/ and the labiovelar glide, then one would expect to see lowering of F1, a drastic reduction in spectral energy around F3, and a decrease in amplitude. In <ту> /t̪u͡w/ ‘flag’, which is shown in Figure 7, none of these characteristics are found.

Figure 7 Waveform and spectrogram of <ту> /t̪u͡w/ ‘flag’.

Before moving on to consider vowel harmony in the language, note that the grapheme <у> may also occur post-vocalically, where it represents /w/, as in <тау> /t̪ɑw/ ‘mountain’. This grapheme may also occur in non-initial positions as a gerundial suffix. In this context, most writers assume that the gerundial suffix varies by the backness of the stem (Balakayev 1962; Dzhunisbekov 1972, 1980; Kirchner 1998; Fazylzhanova 2016; Kuderinova et al. 2016; McCollum 2019). After front vowels, as in /ɪl̪-u͡w/ ‘hang-ger’, F2 of /u͡w/ falls between values typical for the short front and short back round vowels, with noticeable F2 depression throughout the course of the vowel. In the token shown in Figure 8, one might narrowly transcribe the vowel portion of the gerundial suffix as [ʏ͡ʉ̞]. We interpret the centralization of /u͡w/ in this context as an effect of read speech in the lab (see also Bowman & Lokshin 2014). In casual speech, the gerundial suffix in front vowel contexts approximates underlying /ʏ/, although it is typically longer than initial-syllable /ʏ/.

Figure 8 Waveform and spectrogram of <ілу> /ɪl̪-u͡w/ [ɪl̪ʏ͡ʉw] ‘hang-ger’.

Vowel harmony

Two harmony processes operate in Kazakh. The first and more pervasive harmony process is palatal (or backness) harmony.³ Excluding the relatively few cases of disharmony in the language, words contain only vowels from the front, /i̯͡e ɵ ɪ ʏ/, or back, /ɑ o ə ʊ/, set. As /i͡j/ and /æ/ may trigger both front and back vowel suffixes, and due to the rarity of root-internal /u͡w/, these vowels are excluded from further discussion. The effect of harmony is evident within roots (Table 7), as well as across morphological boundaries (Table 8). The set of possible root-internal vowel sequences after initial unrounded vowels below illustrates this significant restriction. In Table 7, all root-internal vowels agree in backness. In the top rows, all root-internal vowels are [+back], while in the lower rows all root-internal vowels are [-back].

Table 7 Palatal harmony within roots.

Table 8 Palatal harmony across morpheme boundaries.

Within roots there are some instances of disharmony. These typically arise through loanword incorporation, as in /ri͡jz̪ɑ/ ‘satisfied’ from Arabic, and /l̪i͡jmon̪/ ‘lemon’ from Russian, although compounding may also result in disharmony. The high front vowel, /i͡j/, often occurs in disharmonic sequences, as in the two cases above, but disharmony may occur with other vowels, too, although it is less common.

Across morphological boundaries, the backness of the most proximate vowel dictates that all suffix vowels agree in backness (a small set of morphemes do not alternate for backness; see Kirchner 1998, Kara 2002, Muhamedowa 2015). The set of vowels permissible across morpheme boundaries is illustrated for unrounded roots in Table 8. Observe that /ɑ/ alternates with /i̯͡e/ in the locative suffix, and /ə/ with /ɪ/ in the accusative suffix. Front vowel variants of these two suffixes occur when the stem is [-back], and back vowel variants of these suffixes occur when the stem is [+back].
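The suffix alternations just described amount to a choice driven by the nearest stem vowel, which can be sketched as below. The sketch rests on several simplifying assumptions: suffixes are stored in their back-vowel form, /i̯͡e/ is written as plain "e", dental diacritics are omitted, labial harmony is ignored, and stems whose only vowels are /æ/, /i͡j/, or /u͡w/ are rejected. The function names are hypothetical.

```python
FRONT_VOWELS = set("eɵɪʏ")
BACK_VOWELS = set("ɑoəʊ")

# Back suffix vowels and their front counterparts (unrounded pairs only).
BACK_TO_FRONT = {"ɑ": "e", "ə": "ɪ"}


def stem_is_back(stem: str) -> bool:
    """Backness of the stem vowel closest to the suffix."""
    for segment in reversed(stem):
        if segment in BACK_VOWELS:
            return True
        if segment in FRONT_VOWELS:
            return False
    raise ValueError(f"no harmonic vowel found in stem: {stem!r}")


def harmonize_suffix(stem: str, back_form: str) -> str:
    """Return the suffix variant that agrees in backness with the stem."""
    if stem_is_back(stem):
        return back_form
    return "".join(BACK_TO_FRONT.get(segment, segment) for segment in back_form)


print("tɑs-" + harmonize_suffix("tɑs", "ə"))   # 'stone-poss.3'
print("kɵp-" + harmonize_suffix("kɵp", "ə"))   # 'much-poss.3'
print("tɵs-" + harmonize_suffix("tɵs", "tɑ"))  # 'chest-loc'
```

Scanning from the right edge of the stem reflects the statement that the most proximate vowel determines suffix backness, which matters for disharmonic loans.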

Kazakh exhibits a second harmony pattern, labial harmony. Root-internal vowel sequences after initial round vowels are demonstrated in Table 9. Take into account that labial harmony co-operates with palatal harmony, so the sequences of vowels shown below all obey palatal harmony. After round vowels, /ə/ may round to /ʊ/, /ɪ/ may round to /ʏ/, and /i̯͡e/ may round to /ɵ/, although this last alternation is far less common. Least common of all, /ɑ/ almost never rounds to /o/ (compare, however, /ʒol̪ɑwʃə-ʁɑ/ [ʒoɫowʃə-ʀɑ] ‘traveler-dat’ from the text below).⁴ Variation in labial harmony is common in the language, but only slight variation emerged from our corpus (see McCollum 2018).

Table 9 Labial harmony within roots.

Labial harmony is decidedly gradient in Kazakh, often triggering incomplete assimilation of the target vowel. Consider two tokens of /t̪ʏl̪i̯͡ek/ ‘chick’ below in Figure 9. In the first spectrogram, maximum F2 during the second syllable nears 2500 Hz, while in the second, maximum F2 reaches only 2200 Hz. In this second token, the second-syllable vowel does not approximate surface /i̯͡e/ or /ɵ/, but shows significant coarticulation, so we represent this surface token as [i̯͡e̹] with phonetic rounding. In contrast, the second-syllable vowel of this first token exhibits no coarticulation, surfacing as [i̯͡e].

Figure 9 Two productions of <түлек> /t̪ʏl̪i̯͡ek/ ‘chick’. The left-hand example exhibits no coarticulation, while the right-hand example shows noticeable subphonemic rounding.

Across morpheme boundaries, labial harmony is even further restricted, as seen in the words in Table 10. High vowels optionally undergo harmony while non-high vowels typically do not. Compare the spectrograms for /t̪ɵs̪-t̪i̯͡e/ ‘chest-loc’ and /s̪i̯͡es̪-t̪i̯͡e/ ‘sound-loc’ in Figure 10. The initial vowel of /t̪ɵs̪-t̪i̯͡e/ has a mean F2 of 1902 Hz. In contrast, mean F2 of the initial vowel in this token of /s̪i̯͡es̪-t̪i̯͡e/ is 2412 Hz. If harmony were categorical, F2 of a non-high vowel after /ɵ/ should resemble F2 of initial /ɵ/. However, mean F2 of the second syllable vowel in this token of /t̪ɵs̪-t̪i̯͡e/ is 2364 Hz, far more closely approximating F2 of initial and final /i̯͡e/ in /s̪i̯͡es̪-t̪i̯͡e/, which are 2412 Hz and 2307 Hz, respectively.
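
The comparison in the preceding paragraph can be made explicit with simple arithmetic on the reported mean F2 values. The variable names below are hypothetical, and the raw difference in Hz is used only as a rough illustration of similarity, not as the authors' own metric.

```python
# Mean F2 values (Hz) reported in the text for the two tokens in Figure 10.
f2_initial_round = 1902        # initial /ɵ/ in 'chest-loc'
f2_suffix_after_round = 2364   # second-syllable vowel in 'chest-loc'
f2_unrounded = [2412, 2307]    # initial and final vowels in 'sound-loc'

# Distance from the suffix vowel to the round trigger vowel.
dist_to_round = abs(f2_suffix_after_round - f2_initial_round)
# Distances from the suffix vowel to the two unrounded reference vowels.
dist_to_unrounded = [abs(f2_suffix_after_round - f) for f in f2_unrounded]

print(dist_to_round)       # 462
print(dist_to_unrounded)   # [48, 57]
```

The suffix vowel lies within about 60 Hz of both unrounded reference vowels but more than 400 Hz from the initial round vowel, which is the pattern expected if the non-high suffix vowel has not undergone harmony.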

Table 10 Labial harmony across morpheme boundaries.

When these restrictions on labial harmony are compared to older descriptions of the language (Menges 1947, Balakayev 1962, Korn 1969, Dzhunisbekov 1972), it is evident that labial harmony is diminishing in contemporary Kazakh (McCollum 2015). This is particularly true in colloquial speech, although harmony is more often retained in more formal, literary speech (Abuov 1994). Lastly, Dzhunisbekov (1980) argues that all consonants are produced with allophonic backing and labialization, in accordance with both harmony processes (see also McCollum 2015). So, harmony may affect both vowels and consonants in the language.

Figure 10 Waveforms and spectrograms of <төсте> [t̪ɵs̪-t̪i̯͡e] ‘chest-loc’ and <сесте> [s̪i̯͡es̪-t̪i̯͡e] ‘sound-loc’.

Syllable structure

Kazakh syllables typically consist of an onset and a nucleus, CV, especially in polysyllabic words. In monosyllables, though, CVC forms predominate. There are, in fact, almost no CV monosyllabic words in the language, and those that are attested are function words. Other syllable types include a lone vowel, V, as in [ɪ.l̪ʏ͡w] ‘hang-ger’ and [o.ɫɑr] ‘3p’. Complex onsets are not attested in native words, but may occur in borrowings. In many cases, loans with complex onsets are realized with an intrusive vowel between the two members of the onset cluster, as in the Russian loan /krɑn̪/ [kə̆ɾɑn̪] ‘faucet’.

Complex codas are rare, but may occur under three conditions: one, the first member of the coda is a sonorant; two, the second member of the coda is voiceless; and three, both members of the coda agree in place of articulation. Complex codas are exemplified in Table 11 with words like /r̪i̯͡eŋk/ ‘color’ and /bʊl̪t̪/ ‘cloud’.

Table 11 Syllable types.

To repair illicit complex codas, a vowel is inserted between the two consonants (see Krippes 1993, Kara 2002; see also Clements & Sezer 1982 for Turkish), as in /χɑl̪q/ [χɑ.ɫəq] ‘nation’ and /ɑwz̪/ [ɑ.wəz̪] ‘mouth’. In these unmarked nominative forms epenthesis occurs, but when a vowel-initial suffix is concatenated to the root, the root-final consonant syllabifies with the suffix, as in /χɑl̪q-ə/ [χɑɫ.qə] ‘nation-poss.3’ and /ɑwz̪-ə/ [ɑw.z̪ə] ‘mouth-poss.3’, removing the need to insert a vowel.
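
The coda conditions and the epenthetic repair can be sketched as follows. This is a minimal illustration that assumes all three conditions must hold together; the feature tables PLACE, SONORANTS and VOICELESS, the place labels, and the function names are hypothetical and cover only the consonants needed for the examples. The default epenthetic vowel is the [ə] seen in the two examples above, and surface allophony (such as [ɫ]) is ignored.

```python
# Illustrative feature assignments (assumptions made for this sketch only).
PLACE = {
    "m": "labial", "p": "labial", "b": "labial",
    "n̪": "dental", "t̪": "dental", "d̪": "dental",
    "s̪": "dental", "z̪": "dental", "l̪": "dental", "r̪": "dental",
    "ŋ": "velar", "k": "velar", "ɡ": "velar",
    "q": "uvular", "χ": "uvular",
    "w": "labial-velar", "j": "palatal",
}
SONORANTS = {"m", "n̪", "ŋ", "l̪", "r̪", "w", "j"}
VOICELESS = {"p", "t̪", "k", "q", "s̪", "χ"}


def is_licit_coda(c1, c2):
    """Check the three conditions: sonorant C1, voiceless C2, shared place."""
    return c1 in SONORANTS and c2 in VOICELESS and PLACE[c1] == PLACE[c2]


def repair(segments, vowel="ə"):
    """Insert a vowel into a word-final consonant cluster if it is illicit."""
    if len(segments) < 2:
        return list(segments)
    c1, c2 = segments[-2], segments[-1]
    if c1 in PLACE and c2 in PLACE and not is_licit_coda(c1, c2):
        return list(segments[:-1]) + [vowel, c2]
    return list(segments)


print(repair(["χ", "ɑ", "l̪", "q"]))   # place mismatch, so a vowel is inserted
print(repair(["b", "ʊ", "l̪", "t̪"]))   # licit coda, left unchanged
```

A suffixed form such as /χɑl̪q-ə/ ends in a vowel, so the function leaves it unchanged, in line with the resyllabification described above.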

In connected speech, significant reduction of the initial vowel is common, even to the point of deletion. This vowel syncope process affects high vowels more significantly than non-high vowels (Kirchner 1998: 319; Muhamedowa 2015: 273–276; Washington 2016: 145). High vowels are often elided while non-high vowels rarely undergo complete elision.

Stress and intonation

Stress

Descriptions of Kazakh have typically suggested that stress falls word-finally in the language (Balakayev 1962: 78; Muhamedowa 2015: 285–288). Kirchner (1998: 320), however, argues that Kazakh words bear two accents: a word-initial stress accent, realized by increased intensity, and a word-final pitch accent, realized by a rising tone. Johanson (1998: 34–35) notes that in Turkic the word-initial accent is also affected by syllable weight, with heavier syllables attracting stress. In contrast, others contend that Kazakh does not have word-level stress, but rather only phrase-level prominence (Dzhunisbekov 1980, 1987; Abuov 1994: 41–42; Vajda 1994: 644–647). The words collected for the previous sections of the paper cannot address this topic since words were elicited in isolation, and in some cases, there is clear list intonation in the recordings.

We conducted a small production study to assess these claims from the literature. Specifically, we tested whether Kazakh words are marked by increased intensity on the initial syllable and rising f0 on the final syllable. The speaker produced 271 words (N=899 syllables) in the carrier phrase, /oɫ _____ s̪ɵz̪ɪn̪ qɑjt̪ɑɫ̪ɑp ɑjt̪t̪ə/ <Ол _____ сөзін қайталап айтты> ‘S/he repeated the word _____’. Mean intensity (in decibels, dB) was measured over the entire duration of the target vowel. Maximum fundamental frequency (f0; in Hertz) was reported for each vowel. No obvious tracking errors were noticed upon initial inspection.

We manipulated three variables for our study: vowel height, position in the word, and syllable type. We restricted our study to two vowels, /ɑ/ and /ə/, controlling for backness and rounding. For our predictors, vowel height was binary, since we used two vowels of distinct heights. Position was ternary (initial, medial, and final syllables), and syllable type was also ternary (open, simple coda, complex coda). Vowels from monosyllabic words were classified as word-final. Six lexical items were used for stimuli, and each lexical item was produced with zero to four suffixes, resulting in words from one to six syllables in length. The words used in our study are shown in Table 12.

Table 12 Example stimuli for stress study.
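
The coding of the two ternary predictors can be sketched as below. The function names, the zero-based syllable index, and the use of a coda consonant count as input are assumptions of this illustration; the only rule taken directly from the text is that vowels in monosyllabic words count as word-final.

```python
def position(index, n_syllables):
    """Code a syllable as initial, medial, or final (index is zero-based)."""
    if index == n_syllables - 1:
        return "final"      # also covers monosyllables, as in the study
    if index == 0:
        return "initial"
    return "medial"


def syllable_type(n_coda_consonants):
    """Code a syllable by the number of consonants in its coda."""
    if n_coda_consonants == 0:
        return "open"
    if n_coda_consonants == 1:
        return "simple coda"
    return "complex coda"


# A four-syllable word: one initial, two medial, one final syllable.
print([position(i, 4) for i in range(4)])
# A monosyllable is classified as word-final.
print(position(0, 1))
```

Checking for the final position first is what makes monosyllables come out as word-final rather than word-initial.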

We fit a linear mixed effects model to each dependent variable (duration, intensity, maximum f0, F1, and F2) using the three predictors above (vowel height, position, and syllable type) as fixed effects in conjunction with a random intercept for lexical root and random slopes for each fixed effect. The significance of each predictor was determined by likelihood ratio testing.
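
As a rough illustration of likelihood ratio testing, the sketch below computes the test statistic from two log-likelihoods and the p-value for a chi-square distribution with one degree of freedom. It uses only the Python standard library; the function names are hypothetical, and it does not reproduce the model fitting itself, only the final comparison step. For one degree of freedom the upper-tail probability equals erfc(sqrt(x/2)).

```python
import math


def lr_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_pvalue_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Two of the statistics reported in this section, both with 1 df.
p_small = chi2_pvalue_df1(4.60)    # about 0.03, matching the reported p = .03
p_large = chi2_pvalue_df1(58.40)   # far below .001

print(round(p_small, 2))
print(p_large < 0.001)
```

Tests with more than one degree of freedom would need the general chi-square survival function, which is not covered here.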

To determine if initial syllables are louder than non-initial syllables, intensity of each vowel was measured, and the overall means of these measurements are presented in Table 13. Initial syllables in the target words were characterized by greater amplitude than non-initial syllables (χ²(1) = 58.40, p < .001), in conformity with Kirchner (1998) and Johanson (1998), and unsurprisingly, the low vowel was produced with more intensity than the higher vowel (χ²(1) = 403.38, p < .001). However, decreases in intensity by position were smaller with the higher vowel. Thus, the interaction between position and vowel height was also significant (χ²(1) = 37.94, p < .001). Syllable type, by contrast, was not significant, suggesting that syllable complexity does not correlate with increased intensity in Kazakh.

Table 13 Mean intensity by position and syllable type for /ə/ and /ɑ/ (in dB, with one standard deviation).

Next, we examined Kirchner’s claim that the final syllable is marked with a rising tone, which should correspond to an increased maximum f0 for final syllables. Mean maximum f0 by position is presented in Table 14. Initial syllables and closed syllables exhibited significantly higher maximum f0 in target words (syllable type: χ²(1) = 33.68, p < .001; position: χ²(1) = 51.99, p < .001). The interaction between syllable type and position was not significant. We found no evidence for a word-final pitch accent, as described in Kirchner (1998) and Johanson (1998), since f0 decreased throughout the word. In addition to maximum f0, we also measured mean f0 across the entire vowel and found the same results.

Table 14 Mean maximum f0 by position and syllable type for /ə/ and /ɑ/ (in Hz, with one standard deviation).

Thus far we have seen that initial syllables are marked by greater intensity than non-initial syllables, and that final syllables do not show evidence of a pitch accent manifested through rising f0. In other words, we found no evidence for claims that Kazakh exhibits a final pitch accent, and perhaps only marginal evidence for an initial stress accent. The evidence from intensity is marginal because intensity tends to decrease across a phrase independent of stress, so it is not surprising that the initial syllables are louder than other syllables. This same tendency held across the entire carrier phrase: earlier syllables in the phrase were louder than later syllables.

To further examine potential acoustic manifestations of stress, we measured duration and vowel quality. Duration was measured from the onset to the offset of the second formant. Since vowel elision is common, particularly in initial syllables, duration measures reported below do not include elided vowels. Vowel quality was examined using F1 and F2, which were measured at the midpoint of the vowel.

For duration, /ɑ/ was significantly longer than /ə/ (χ²(1) = 382.23, p < .001). Note that /ɑ/ is about twice as long as /ə/, as seen in Table 15. Open syllables were also marked by significantly longer vowels than closed syllables (χ²(1) = 168.28, p < .001). Additionally, vowel duration significantly increased throughout the word (χ²(1) = 192.71, p < .001). Observe in Table 15 that vowel duration increases monotonically from initial to medial to final syllables.

Table 15 Mean duration by position and syllable type for /ə/ and /ɑ/ (in ms, with one standard deviation).

Before moving on to vowel quality, it is important to note that the duration results above do not necessarily indicate that final syllables are stressed. This monotonic increase in vowel duration could derive either from a word-level stress pattern or from phrase-level lengthening.

To examine the potential relationship between position in the word and vowel quality, we measured F1 and F2 of vowels in target words. Generally, F1 decreases across the words, indicating a general raising pattern for both /ɑ/ and /ə/ (χ²(1) = 95.53, p < .001), shown in Table 16. Vowel quality was also affected by syllable type, with closed syllables having higher F1 and lower F2 (F1: χ²(1) = 17.46, p < .001; F2: χ²(1) = 74.07, p < .001).

Table 16 Mean F1 at vowel midpoint by position and syllable type for /ə/ and /ɑ/ (in Hz and Bark, with one standard deviation).

Table 17 Mean F2 at vowel midpoint by position and syllable type for /ə/ and /ɑ/ (in Hz and Bark, with one standard deviation).

Interestingly, back vowels were significantly fronted in non-initial syllables (χ²(1) = 220.08, p < .001), which is seen in Table 17 and Figure 11 (see also McCollum 2015). This pattern is also found in the recorded passage transcribed below and appears to neutralize backness distinctions in later syllables. For instance, mean F1 and F2 over the entire duration of the word-final /ɑ/ in /ɑrɑl̪ɑrən̪d̪ɑ/ ‘among themselves’ was 658 Hz and 1932 Hz, respectively. Mean F1 and F2 of /æ/ in /d̪æl̪/ ‘exactly’ was 679 Hz and 1920 Hz, indicating a relatively full neutralization of F1–F2 differences between /ɑ/ and /æ/. Lastly, F2 of /ə/ was significantly higher than F2 of /ɑ/ (χ²(1) = 4.60, p = .03).
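
To give a sense of how close the two tokens above are, the following sketch converts the reported formant means to Bark and computes their distance in the F1–F2 plane. The conversion formula is Traunmüller's (1990) approximation, which is an assumption of this sketch: the text does not state which Hz-to-Bark conversion was used. The function names are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Hz to Bark using Traunmüller's approximation (assumed, see lead-in)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_distance(f1_a, f2_a, f1_b, f2_b):
    """Euclidean distance between two vowels in the F1-F2 Bark plane."""
    d1 = hz_to_bark(f1_a) - hz_to_bark(f1_b)
    d2 = hz_to_bark(f2_a) - hz_to_bark(f2_b)
    return math.hypot(d1, d2)


# Word-final /ɑ/ (658 Hz, 1932 Hz) versus /æ/ (679 Hz, 1920 Hz).
distance = bark_distance(658, 1932, 679, 1920)
print(round(distance, 2))
```

Under this conversion the two tokens are separated by well under half a Bark, consistent with the near-complete neutralization described above.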

Figure 11 Mean F1 and F2 of /ɑ/ and /ə/ by position (in Bark, with one-standard deviation ellipses).

To summarize, duration and F2 increased throughout the word while intensity, maximum f0, and F1 decreased throughout the word. Since the only vowels examined were back vowels, vowel quality results tentatively suggest that initial-syllable vowels may be produced with more peripheral vowel qualities. It is unclear, though, if the results necessarily relate to stress. From our results, we do not extrapolate to argue more concretely for either the existence of stress or its placement in Kazakh, in part because only one speaker participated in the study. Future work will need to further explore acoustic prominence and how it relates to putative stress in the language.

Intonation

Declarative statements involve a gradual decline in f0. Yes/no questions are typically produced with a phrase-final rise in f0. As for wh-questions, the question word is often associated with rising f0. A minimal triplet consisting of a declarative statement, a yes/no question, and a wh-question is presented below in Figure 12 (see Bazarbayeva 2008 for more on Kazakh intonation).

Figure 12 Minimal triplet of three intonational patterns (top = declarative, middle = yes/no question, bottom = wh-question).

In summary, Kazakh intonation is marked by gradual downdrift in declaratives, but rising f0 marks question morphemes, both on independent wh-words and on the question enclitic.

Transcription of the recorded passage

The version of ‘The North Wind and the Sun’ presented below was first translated by the speaker from the Russian version presented in Yanushevskaya & Bunčić (2015) and then recorded.

Orthographic version

Бір күні солтүстік жел мен күн екеуі араларында кім мықты екенін шеше алмай бәсікелеседі. Дәл осы мезетте жол бойында шапанға оранып келе жатқан жолаушыны кезіктіреді. Екеуіне ой келеді, кім де кім жолаушыға үстіндегі шапанын шешкізе алса, сол мықты деген шешімге келеді. Солтүстік жел бар күшімен жел үрлей бастайды, ол қатты үрлеген сайын жолаушы шапанына орана түседі. Амалы таусылған солтүстік жел кезекті күнге берді. Күн жарқырап шығып, жолаушының үстіне нұрын шаша бастағанда, күн нұрына жылынған жолаушы үстіндегі шапанын шешеді. Осы оқиғадан кейін солтүстік желі күннің өзінен мықты екенін мойындауға тура келді.

Phonemic transcription

bɪr̪ kʏn̪ɪ s̪ol̪t̪ʏs̪t̪ɪk ʒi̯͡el̪ mi̯͡en̪ kʏn̪ i̯͡eki̯͡eʏwɪ ɑr̪ɑl̪ɑr̪ən̪d̪ɑ kɪm məqt̪ə i̯͡eki̯͡en̪ɪn̪ ʃi̯͡eʃi̯͡e ɑl̪mɑj ǀ bæs̪i̯͡eki̯͡eli̯͡es̪i̯͡ed̪ɪ ǁ d̪æl̪ os̪ə mi̯͡ez̪i̯͡et̪t̪i̯͡e ʒol̪ bojən̪d̪ɑ ʃɑpɑn̪ʁɑ or̪ɑn̪əp ki̯͡el̪i̯͡e ʒɑt̪qɑn̪ ʒol̪ɑwʃən̪ə ki̯͡ez̪ɪkt̪ɪr̪i̯͡ed̪ɪ ǁ i̯͡eki̯͡eʏɪn̪i̯͡e oj ki̯͡el̪i̯͡ed̪ɪ ǀ kɪm d̪i̯͡e kɪm ʒol̪ɑwʃəʁɑ ʏs̪t̪ɪn̪d̪i̯͡eɡɪ ʃɑpɑn̪ən̪ ʃi̯͡eʃkɪz̪i̯͡e ɑl̪s̪ɑ ǀ s̪ol̪ məqt̪ə d̪i̯͡eɡi̯͡en̪ ʃi̯͡eʃɪmɡi̯͡e ki̯͡el̪i̯͡ed̪ɪ ǀ s̪ol̪t̪ʏs̪t̪ɪk ʒi̯͡el̪ bɑr̪ kʏʃɪmi̯͡en̪ ʒi̯͡el̪ ʏr̪l̪i̯͡ej bɑs̪t̪ɑjd̪ə ǁ ol̪ qɑt̪t̪ə ʏr̪l̪i̯͡eɡi̯͡en̪ s̪ɑjən̪ ʒol̪ɑwʃə ʃɑpɑn̪ən̪ɑ or̪ɑn̪ɑ t̪ʏs̪i̯͡ed̪ɪ ǁ ɑmɑl̪ə t̪ɑws̪əl̪ʁɑn̪ s̪ol̪t̪ʏs̪t̪ɪk ʒi̯͡el ki̯͡ez̪i̯͡ekt̪ɪ kʏn̪ɡi̯͡e bi̯͡er̪d̪ɪ ǁ kʏn̪ ʒɑr̪qər̪ɑp ʃəʁəp ʒol̪ɑwʃən̪əŋ ʏs̪t̪ɪn̪i̯͡e n̪ʊr̪ən̪ ʃɑʃɑ bɑs̪t̪ɑʁɑn̪d̪ɑ kʏn̪ n̪ʊr̪ən̪ɑ ʒəl̪ən̪ʁɑn̪ ʒol̪ɑwʃə ʏs̪t̪ɪn̪d̪i̯͡eɡɪ ʃɑpɑn̪ən̪ ʃi̯͡eʃi̯͡ed̪ɪ ǁ os̪ə oqiʁɑd̪ɑn̪ ki̯͡ejɪn̪ s̪ol̪t̪ʏs̪t̪ɪk ʒi̯͡el̪ɪ kʏn̪n̪ɪŋ ɵz̪ɪn̪i̯͡en̪ məqt̪ə i̯͡eki̯͡en̪ɪn̪ mojən̪d̪ɑwʁɑ t̪u͡wr̪ɑ ki̯͡el̪d̪ɪ

Phonetic transcription

bɪr̪ kʰʏ̆n̪ɪː s̪oˑɫt̪st̪ɪk ʒi̯͡el̪ mi̯͡en̪ kʰʏn̪ i̯͡ekʰjʏ͡w ɑr̪ɑɫaɾn̪̩d̪æˑ kʰɘm͡ məχt̪ɪ̆ i̯͡ekʰi̯͡en̪n̪̩ ʃi̯͡eʃi̯͡e ɑɫmaj ǀ bæs̪ekʰi̯͡el̪es̪i̯̰͡ḛd̪ɪ̆ ǁ d̪æl̪ ɔ̟s̪ɪ̆ mi̯͡ez̪i̯͡et̪t̪i̯͡e ʒoɫ bojin̪d̪æː ʃɑpɑɴɢɑ͡ oˑɾan̪ɪ̆p kʰi̯͡el̪i̯͡e ʒɑt̪̚qɑn̪ ʒoɫɑwʃə̆n̪ɘ̆ kʰi̯͡ez̪ɪk̚t̪ɪɾi̯͡ed̪ɪ ‖ i̯͡ekʰj͡ʉwɪn̪j͡e ŭ͡oj kʰi̯͡el̪i̯͡et̪̚ ǀ kʰɘm d̪i̯͡e kʰɘm ʒoɫowʃə̆ʀɑː s̪t̪n̪̩d̪i̯͡eɣɪ ʃapɑn̪n̪̩ ʃi̯͡eʃkɪz̪i̯͡e ɑˑɫsɑ̰ː ǀ s̪oˑɫ məχt̪ə d̪i̯͡eɣi̯͡en̪ ʃi̯͡eʃɪ̆mɡḛ ki̯͡el̪i̯͡ed̪ə̆ ǀ s̪oɫt̪s̪t̪ɪk ʒi̯͡el̪ ǀ bɑːr̪ kʏ̆ʃʏmi̯͡e̪n ʒi̯͡el̪ ʏɾl̪i͡j βɜˑst̪ajd̪ɪ ǁ oɫ qɑːt̪̚t̪ə͡ ʏ̆r̪li̯͡eɣi̯͡en̪ s̪ɑjə̃n̪ ʒoɫɒwʃə̆ ʃɐ̆pɑn̪n̪ɑ oɾɑn̪ɑ t̪s̪i̯͡ḛ̹d̪ɪ̰ ǁ ɑmɑˑɫə t̪ɑwsɫ̩ʁɑn̪ s̪oɫst̪ʰʏk ʒi̯͡el̪ ǀ kʰi̯͡ez̪i̯͡ekt̪ə kʉn̪͡ŋɡi̯͡e bi̯͡eɾd̪ɪˑ ǁ kʏn̪ ʒɑːr̪qə̆r̪ɑːp̚ ʃəʁəp ʃoɫowʃn̪əŋ s̪t̪n̪i̯͡e n̪əɾn̪̩ ʃɐ̥ʃɑˑ βɐs̪t̪aʀan̪d̪æː ǀ kʏ̆n̪͡ n̪ʊɾən̪ɑˑ ʒəɫɴ̩ɢɑːn̪ ʤoɫɑwʃə ǀ ʏ̆s̪t̪n̪̩d̪i̯͡eɡɪ ʃɑ̆pʰɑn̪n̪̩ ʃɪʃi̯͡ḛd̪ɪ̰ˑ ǁ oˑs̩ʊ͡ oˑq͡χɨʁɑd̪ɑˑn̪ kʰi͡jn̪̩ ǀ s̪oɫt̪s̪t̪ʰʏ̆k ʒi̯͡el̪ɪ kʰʏ̆n̪n̪ɘŋ u̯͡ɵzn̪i̯͡en̪ məχt̪ɛ̆͜ ͡i̯ekʰi̯͡en̪ɘn mojn̪̩d̪ɑwʁɑˑ t̪ʰṵɾɑ̰ kʰi̯͡ḛl̪d̪ɪ̰ˑ

Acknowledgements

We would like to thank the Kazakh speaker who generously shared her time and language with us. In addition, we are grateful to Marc Garellek, Sharon Rose, Amalia Arvaniti, and an anonymous reviewer for their insightful comments and suggestions, which have greatly improved the paper.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S0025100319000185.

Footnotes

1 The following glosses are used herein: 3 = third person, abl = ablative, acc = accusative, agt = agentive, cond = conditional, dat = dative, form = formal, gen = genitive, ger = gerund, imp = imperative, loc = locative, neg = negative, npst = non-past, pfv = perfective, pl = plural, poss = possessive, pro = pronominal, prv = privative, pst = past, Q = question, ‘-’ precedes a suffix, ‘=’ precedes an enclitic.

2 One potential exception to this is the compound <медбике> /mi̯͡ed̪bi͡jki̯͡e/ ‘nurse’, which is a compound loan from the Russian word for nurse, <медсестра> /mʲedsʲestra/, and /bi͡jki̯͡e/ from Arabic for ‘wife’.

3 However, Vajda (1994) and Washington (2015, 2016) have suggested that putative palatal harmony is actually tongue root harmony.

4 The rounding of the second-syllable vowel here may derive from both the initial vowel and the following labial-velar glide.

References

Abuov, Zhoumaghaly. 1994. The phonetics of Kazakh and the theory of synharmonism. UCLA Working Papers in Phonetics 88, 39–54.

Amanzholov, Sarsen. 1959. Voprosy dialektologii i istorii kazakhskogo i︠a︡zyka [Issues of the dialectology and history of the Kazakh language]. Alma-Ata: National Instructional Institute in the name of Abai.

Aralbayev, Zh. A. 1970. Vokalizm kazaxskogo jazyka: Ocherki po eksperimental’noj fonetiki i fonologii [Vocalism of the Kazakh language: Essays on experimental phonetics and phonology]. Alma-Ata: Nauka.

Balakayev, Maulen. 1962. Sovremennyj kazakhskij jazyk: Fonetika i morfologija [The modern Kazakh language: Phonetics and morphology]. Alma-Ata: Nauka.

Bazarbayeva, Z. M. 2008. Kazaxskaya intonatsiya [Kazakh intonation]. Almaty: Daik Press.

Bowman, Samuel R. & Lokshin, Benjamin. 2014. Idiosyncratically transparent vowels in Kazakh. Proceedings of the 2013 Annual Meeting on Phonology.

Clements, George N. & Sezer, Engin. 1982. Vowel and consonant disharmony in Turkish. In van der Hulst, Harry & Smith, Norval (eds.), The structure of phonological representations, Part II, 213–255. Dordrecht: Foris.

Davis, Stuart. 1998. Syllable contact in Optimality Theory. Korean Journal of Linguistics 23, 181–211.

Dzhunisbekov, Alimkhan. 1972. Glasnye kazakhskogo jazyka [Vowels of the Kazakh language]. Alma-Ata: Nauka.

Dzhunisbekov, Alimkhan. 1980. Singarmonizm v kazakhskom jazyke [Sinharmonism in the Kazakh language]. Alma-Ata: Nauka.

Dzhunisbekov, Alimkhan. 1987. The Turkic word prosody problem. Proceedings of the 11th Congress of the International Phonetics Association (ICPhS XI), 321–323.

Fazylzhanova, Anar (ed.). 2016. Zhanga ulttyq alipbi negizinde qazaq zhazuyn reformalaw: Teorijasy men praktikasy [Kazakh writing reform based on the new national alphabet: Theory and practice]. Almaty: Qazaq tili baspasy.

Gopal, Deepthi. 2015. Markedness and syllable contact in Kazakh internal sandhi. Poster presented at The 23rd Manchester Phonology Meeting.

Gouskova, Maria. 2004. Relational hierarchies in Optimality Theory: The case of syllable contact. Phonology 21, 201–250.

Grenoble, Lenore. 2003. Language policy in the Soviet Union. Dordrecht: Kluwer.

Johanson, Lars. 1998. Structure of Turkic. In Johanson & Csató (eds.), 30–66.

Johanson, Lars & Csató, Éva Ágnes (eds.). 1998. The Turkic languages. New York: Routledge.

Kara, David Somfai. 2002. Kazak. Munich: Lincom Europa.

Kiliç, Mehmet Akif & Öğüt, Fatih. 2004. A high unrounded vowel in Turkish: Is it a central or back vowel? Speech Communication 43, 143–154.

Kirchner, Mark. 1992. Phonologie des Kasachischen. Untersuchungen anhand von Sprachaufnahmen aus der kasachischen Exilgruppe in Istanbul [Kazakh phonology: Investigations based on voice recordings from the Kazakh exile group in Istanbul]. Wiesbaden: Harrassowitz.

Kirchner, Mark. 1998. Kazakh and Karakalpak. In Johanson & Csató (eds.), 318–332.

Korn, David. 1969. Types of labial vowel harmony in the Turkic languages. Anthropological Linguistics 11, 98–106.

Krippes, Karl A. 1993. Kazakh grammatical sketch with affix list. Columbia, MD: Dunwoody.

Krueger, John R. 1980. Introduction to Kazakh: Grammatical outline, Kazakh reader, Kazakh–English phrasebook, and Kazakh–English glossary. Bloomington, IN: Indiana University Research Institute for Inner Asian Studies.

Kuderinova, Q., Amanbayeva, A., Fazylzhanova, A., Amirzhanova, N. & Zhumabayeva, Zh. 2016. Tazhiribe natizhesinin tiltanymdyq taldanymy: alipbidegi aripterdi zheke sipattamasy [Linguistic analysis of experimental results]. In Fazylzhanova (ed.), 266–388.

Kuhn, Jeremy. 2014. Harmony via positive agreement: Evidence from trigger-based count effects. Proceedings of NELS 43, 253–264.

McCollum, Adam G. 2015. Labial harmonic shift in Kazakh: Mapping the pathways and motivations for change. BLS 41, 329–352.

McCollum, Adam G. 2018. Vowel dispersion and Kazakh labial harmony. Phonology 35, 287–326.

McCollum, Adam G. 2019. The theoretical consequences of data collection practices: A case study in vowel harmony. Linguistic Discovery 16(2), 72–110.

Menges, Karl. 1947. Qaraqalpaq grammar. New York: King’s Crown Press.

Menges, Karl. 1995. The Turkic languages and peoples: An introduction to Turkic studies, 2nd edn. Wiesbaden: Harrassowitz.

Muhamedowa, Raihan. 2015. Kazakh: A comprehensive grammar. New York: Routledge.

Sharipbayev, A. A. 2013. Problems and prospects of computer processing of the Kazakh language. Proceedings of the 1st International Conference on Computer Processing of the Turkic Languages, 18–22.

Vajda, Edward. 1994. Kazakh phonology. In Edward H. Kaplan & Donald W. Whisenhunt (eds.), Opuscula altaica: Essays presented in honor of Henry Schwarz, 603–650. Bellingham, WA: Western Washington University.

Washington, Jonathan North. 2015. An ultrasound study of the articulatory correlates of vowel anteriority in three Turkic languages. Poster presented at Tu+1, University of Massachusetts, Amherst.

Washington, Jonathan North. 2016. An investigation of vowel anteriority in three Turkic languages using ultrasound tongue imaging. Ph.D. dissertation, Indiana University.

Yanushevskaya, Irena & Bunčić, Daniel. 2015. Russian. Journal of the International Phonetic Association 45, 221–228.

Yessenbayev, Zhandos, Karabalayeva, Muslima & Sharipbayev, Altynbek. 2012. Formant analysis and mathematical model of Kazakh vowels. Presented at Computer Modelling and Simulation (UKSim), 2012 UKSim 14th International Conference on IEEE.
