Measuring Speech

Section III - Measuring Speech

Published online by Cambridge University Press: 11 November 2021

Edited by Rachael-Anne Knight (City, University of London) and Jane Setter (University of Reading)


Type: Chapter
Information: The Cambridge Handbook of Phonetics, pp. 259–404

DOI: https://doi.org/10.1017/9781108644198

Publisher: Cambridge University Press

Print publication year: 2021


References

10.7 References

Adank, P., van Hout, R. & Smits, R. (2001). A comparison between human vowel normalization strategies and acoustic vowel transformation techniques. In Proceedings of the 7th International Conference on Speech Communication and Technology (Eurospeech 2001). Aalborg, Vol. I. pp. 481–4.

Best, C. T. (1995). A direct realist perspective on cross-language speech perception. In Strange, W., ed., Speech Perception and Linguistic Experience: Issues in Cross-Language Research. Timonium, MD: York Press, pp. 167–200.

Best, C. T. & Tyler, M. D. (2007). Nonnative and second-language speech perception: Commonalities and complementarities. In Munro, M. J. & Bohn, O.-S., eds., Language Experience in Second Language Speech Learning: In honor of James Emil Flege. Amsterdam: John Benjamins, pp. 13–34.

Bigi, B. & Hirst, D. (2019). Speech phonetization alignment and syllabification (SPPAS): A tool for the automatic analysis of speech prosody. www.sppas.org/.

Boersma, P. & Weenink, D. (2019). Praat: Doing Phonetics by Computer [computer program]. www.fon.hum.uva.nl/praat/.

Bohn, O.-S. (2017). Cross-language and second language speech perception. In Fernandez, E. M. & Cairns, H. S., eds., The Handbook of Psycholinguistics. New York: John Wiley and Sons, pp. 213–39.

Catford, J. C. (1994). A Practical Introduction to Phonetics. Oxford: Oxford University Press.

Chiba, T. & Kajiyama, M. (1941). The Vowel, Its Nature and Structure. Tokyo: Tokyo-Kaiseikan.

Delattre, P. (1948). Un triangle acoustique des voyelles orales du Français. The French Review, 21(6), 477–84.

Durand, J., Gut, U. & Kristoffersen, G. (2017). The Oxford Handbook of Corpus Phonology. Oxford: Oxford University Press.

Fant, G. (1960). Acoustic Theory of Speech Production. The Hague: Mouton.

Fant, G. (1967). A note on vocal tract size factors and non-uniform F-pattern scaling. Speech Transmission Laboratory: Quarterly Progress and Status Reports, 4, 22–30.

Fant, G. (1973). Speech Sounds and Features. Cambridge, MA: MIT Press.

Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In Strange, W., ed., Speech Perception and Linguistic Experience: Issues in Cross-Language Research. Timonium, MD: York Press, pp. 233–77.

Flege, J. E. (1999). Age of learning and constraints on second-language speech. In Birdsong, D., ed., Second Language Acquisition and the Critical Period Hypothesis. Mahwah, NJ: Lawrence Erlbaum Associates, pp. 101–31.

Flynn, N. (2011). Comparing vowel formant normalisation procedures. York Papers in Linguistics Series, 2(11), 1–28.

Fowler, C. A. & Housum, J. (1987). Talkers’ signalling of ‘new’ and ‘old’ words in speech and listeners’ perception and use of the distinction. Journal of Memory and Language, 26, 489–504.

Garofolo, J. S., Lamel, L. F., Fisher, W. M., Fiscus, J. G., Pallett, D. S., Dahlgren, N. L. et al. (1993). TIMIT. Acoustic-phonetic continuous speech corpus. https://catalog.ldc.upenn.edu/LDC93S1.

Gick, B., Wilson, I. & Derrick, D. (2013). Articulatory Phonetics. Chichester, UK: Wiley-Blackwell.

Hagiwara, R. (1997). Dialect variation and formant frequency: The American English vowels revisited. Journal of the Acoustical Society of America, 102(1), 655–8.

Harrington, J. (2006). Phonetic Analysis of Speech Corpora. Malden, MA: Blackwell.

Hermann, L. (1894). Beiträge zur Lehre von der Klangwahrnehmung. Pflügers Arch., 56, 467–99.

Hillenbrand, J., Getty, L. A., Clark, M. J. & Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America, 97(5), 3099–111.

Hillenbrand, J. (2019). Alvin3. http://homepages.wmich.edu/~hillenbr/.

International Phonetic Association. (2019). IPA Chart. www.internationalphoneticassociation.org/content/ipa-chart.

Johnson, K., Flemming, E. & Wright, R. (1993). The hyperspace effect: Phonetic targets are hyperarticulated. Language, 69(3), 505–28.

Jones, D. (1917). An English Pronouncing Dictionary. London: Dent.

Joos, M. (1948). Acoustic phonetics. Language Monographs, 23, 136.

Keen, J. A. (1940). A note on the comparative size of the cochlear canal in mammals. Journal of Anatomy, 73(4), 524–7.

Kreiman, J. & Gerratt, B. R. (2010). Perceptual sensitivity to first harmonic amplitude in the voice source. Journal of the Acoustical Society of America, 128(4), 2085–9.

Ladefoged, P. (2001). Vowels and Consonants: An Introduction to the Sounds of Languages. Malden, MA: Blackwell.

Ladefoged, P. & Maddieson, I. (1996). The Sounds of the World’s Languages. Malden, MA: Blackwell.

Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H-H theory. In Hardcastle, W. J. & Marchal, A., eds., Speech Production and Speech Modelling. London: Kluwer Academic Press, pp. 403–39.

Lindblom, B. & Sundberg, J. (1971). Acoustical consequences of lip, tongue, jaw, and larynx movement. Journal of the Acoustical Society of America, 50, 1166–79.

Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America, 49(2B), 606–8.

Maddieson, I. (1984). Patterns of Sounds. Cambridge: Cambridge University Press.

Nordström, P. E. & Lindblom, B. (1975). A normalization procedure for vowel formant data. In Proceedings of the 8th International Congress of Phonetic Sciences in Leeds, August, paper 212.

Öhman, S. (1964). Note on palatalization in Russian. MIT Quarterly Progress Report, 73, 167–71.

Passy, P. (1888). Our revised alphabet. The Phonetic Teacher, 7–8, 57–60.

Peterson, G. E. & Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 24, 175–84.

Pfitzinger, H. & Niebuhr, O. (2011). Historical development of phonetic vowel systems: The last 400 years. In Proceedings of the 17th International Congress of Phonetic Sciences, Hong Kong, China, 160–3.

Pisoni, D. B. (1975). Auditory short-term memory and vowel perception. Memory and Cognition, 3, 7–18.

Pitt, M. A., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E. & Fosler-Lussier, E. (2007). Buckeye Corpus of Conversational Speech (2nd release). Columbus, OH: Department of Psychology, Ohio State University. https://buckeyecorpus.osu.edu/.

Potter, R. K. & Steinberg, J. C. (1950). Towards the specification of speech. Journal of the Acoustical Society of America, 22, 803–23.

Reetz, H. & Jongman, A. (2009). Phonetics: Transcription, Production, Acoustics, and Perception. Chichester, UK: Wiley-Blackwell.

Renwick, M. E. L. & Ladd, D. R. (2016). Phonetic distinctiveness vs. lexical contrastiveness in non-robust phonemic contrasts. Laboratory Phonology, 7(1), 1–29.

Sóskuthy, M. (2019). Generalised Additive Mixed Models for Dynamic Analysis in Linguistics: A Practical Introduction [Computing Research Repository]. https://arxiv.org/abs/1703.05339v1.

Stevens, S. S., Volkmann, J. & Newman, E. B. (1937). A scale for the measurement of the psychological magnitude pitch. Journal of the Acoustical Society of America, 8(3), 185–90.

Stevens, K. N., Kasowski, S. & Fant, G. (1953). An electrical analog of the vocal tract. Journal of the Acoustical Society of America, 25(4), 734–42.

Titze, I. R. (2011). Vocal fold mass is not a useful quantity for describing f₀ in vocalization. Journal of Speech, Language, and Hearing Research, 54(2), 520–2.

Traunmüller, H. (1990). Analytical expressions for the tonotopic sensory scale. Journal of the Acoustical Society of America, 88(1), 97–100.

Van Hoof, S. & Verhoeven, J. (2011). Intrinsic vowel f₀, the size of vowel inventories and second language acquisition. Journal of Phonetics, 39, 168–77.

Vilain, C., Berthommier, F. & Boë, L.-J. (2015). A brief history of articulatory–acoustic vowel representation. In 1st International Workshop on the History of Speech Communication Research (HSCR 2015), Dresden, Germany.

Watt, D. & Fabricius, A. (2002). Evaluation of a technique for improving the mapping of multiple speakers’ vowel spaces in the F1~F2 plane. Leeds Working Papers in Linguistics and Phonetics, 9, 159–73.

Whalen, D. H. & Levitt, A. G. (1995). The universality of intrinsic f₀ of vowels. Journal of Phonetics, 23, 349–66.

Whalen, D. H., Magen, H. S., Pouplier, M., Kang, A. M. & Iskarous, K. (2004a). Vowel production and perception: Hyperarticulation without a hyperspace effect. Language and Speech, 47(2), 155–74.

Whalen, D. H., Magen, H. S., Pouplier, M., Kang, A. M. & Iskarous, K. (2004b). Vowel target without a hyperspace effect. Language, 80(3), 377–80.

Wood, S. N. (2006). Generalised Additive Mixed Models: An Introduction, with R. Boca Raton, FL: CRC Press.

Wright, R. (2003). Factors of lexical competition in vowel articulation. In Local, J., Ogden, R. & Temple, R., eds., Papers in Laboratory Phonology VI. Cambridge: Cambridge University Press, pp. 75–87.

Yang, B. (1990). Development of Vowel Normalization Procedures: English and Korean. Doctoral dissertation, University of Texas at Austin. http://fonetiks.info/bgyang/db/yangphd.pdf.

Yang, B. (1996). A comparative study of American English and Korean vowels produced by male and female speakers. Journal of Phonetics, 24(2), 245–61.

Yang, B. (2006). Discrimination of synthesised English vowels by American and Korean listeners. Phonetics and Speech Sciences, 13(1), 7–27.

Yang, B. (2009a). Formant trajectories of English vowels produced by American males. Phonetics and Speech Sciences, 1(3), 65–72.

Yang, B. (2009b). English vowel spaces produced and perceived by Americans and Koreans. In Lee, C., Simpson, G. B. & Kim, Y., eds., The Handbook of East Asian Psycholinguistics. Volume III: Korean. New York: Cambridge University Press, pp. 390–7.

Yang, B. (2010). Formant trajectories of English high tense and lax vowels produced by Korean and American speakers. Korean Journal of Linguistics, 35(2), 407–21.

Yang, B. (2018). Pitch trajectories of English vowels produced by American men, women, and children. Phonetics and Speech Sciences, 10(4), 31–7.

Yang, B. (2019). A comparison of normalized formant trajectories of English vowels produced by American men and women. Phonetics and Speech Sciences, 11(1), 1–8.

Yang, B. & Whalen, D. H. (2015). Perception and production of English vowels by American males and females. Australian Journal of Linguistics, 35(2), 121–41.

Yost, W. A. (2000). Fundamentals of Hearing: An Introduction. London: Academic Press.

Yun, W., Yoon, K., Park, S., Lee, J., Cho, S., Kang, D. et al. (2015). The Korean corpus of spontaneous speech. Phonetics and Speech Sciences, 7(2), 103–9.

Zwicker, E. (1961). Subdivision of the audible frequency range into critical bands. Journal of the Acoustical Society of America, 33(2), 248.

Zwicker, E. & Terhardt, E. (1980). Analytical expressions for critical-band rate and critical bandwidth as a function of frequency. Journal of the Acoustical Society of America, 68(5), 1523–5.

11.7 References

Abramson, A. S. & Whalen, D. H. (2017). Voice Onset Time (VOT) at 50: Theoretical and practical issues in measuring voicing distinctions. Journal of Phonetics, 63, 75–86.

Ashby, M. & Maidment, J. (2005). Introducing Phonetic Science. Cambridge: Cambridge University Press.

Bauer, M. (2005). Lenition of the flap in American English. University of Pennsylvania Working Papers in Linguistics, 10(2), 31–43.

Behrens, S. J. & Blumstein, S. E. (1988). Acoustic characteristics of English voiceless fricatives: A descriptive analysis. Journal of Phonetics, 16(3), 295–8.

Bennett, R. (2010). Contrast and laryngeal states in Tz’utujil. In McGuire, G., ed., UC Santa Cruz Linguistics Research Center Annual Report. Santa Cruz, CA: LRC Publications, pp. 93–120.

Bevier, Jr., L. (1900). The acoustic analysis of the vowels from the phonographic record. Physical Review (Series I), 10(4), 193–203.

Bjorndahl, C. (2015). The phonetics and phonology of segment classification: A case study of /v/. In Raimy, E. & Cairns, C. E., eds., The Segment in Phonetics and Phonology. Malden, MA: John Wiley & Sons, pp. 236–50.

Blumstein, S. E. & Stevens, K. N. (1979). Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants. Journal of the Acoustical Society of America, 66(4), 1001–17.

Blumstein, S. E., Cooper, W. E., Zurif, E. B. & Caramazza, A. (1977). The perception and production of voice-onset time in aphasia. Neuropsychologia, 15(3), 371–83.

Boersma, P. & Weenink, D. (2018). Praat: Doing Phonetics by Computer [computer program]. Version 6.0.39, www.praat.org/.

Carballo, G. & Mendoza, E. (2000). Acoustic characteristics of trill productions by groups of Spanish children. Clinical Linguistics & Phonetics, 14(8), 587–601.

Carrasco, P., Hualde, J. I. & Simonet, M. (2012). Dialectal differences in Spanish voiced obstruent allophony: Costa Rican versus Iberian Spanish. Phonetica, 69(3), 149–79.

Chen, M. & Clumeck, H. (1975). Denasalization in Korean: A search for universals. In Hyman, L. M. & Ohala, J. J., eds., Nasálfest: Papers from a Symposium on Nasals and Nasalization. Stanford, CA: Stanford University Press, pp. 125–31.

Cho, T. & Ladefoged, P. (1999). Variation and universals in VOT: Evidence from 18 languages. Journal of Phonetics, 27(2), 207–29.

Cho, T., Jun, S. A. & Ladefoged, P. (2002). Acoustic and aerodynamic correlates of Korean stops and fricatives. Journal of Phonetics, 30(2), 193–228.

Chodroff, E. & Wilson, C. (2014). Burst spectrum as a cue for the stop voicing contrast in American English. Journal of the Acoustical Society of America, 136(5), 2762–72.

Cooper, F. S., Delattre, P. C., Liberman, A. M., Borst, J. M. & Gerstman, L. J. (1952). Some experiments on the perception of synthetic speech sounds. Journal of the Acoustical Society of America, 24(6), 597–606.

Cordeiro, G. F., Montagnoli, A. N., Ubrig, M. T., Menezes, M. H. M. & Tsuji, D. H. (2015). Comparison of tongue and lip trills with phonation of the sustained vowel /ε/ regarding the periodicity of the electroglottographic waveform and the amplitude of the electroglottographic signal. Open Journal of Acoustics, 5(04), 226–38.

Crandall, I. B. & Sacia, C. F. (1924). A dynamical study of the vowel sounds. Bell System Technical Journal, 3(2), 232–7.

Delattre, P. C., Liberman, A. M. & Cooper, F. S. (1955). Acoustic loci and transitional cues for consonants. Journal of the Acoustical Society of America, 27(4), 769–73.

Derrick, D. & Schultz, B. (2013). Acoustic correlates of flaps in North American English. In Proceedings of Meetings on Acoustics ICA2013, Montreal, Canada, pp. 1–5.

Donders, F. C. (1864). Zur Klangfarbe der Vocale. Vorläufige Notiz. Annalen der Physik, 199(11), 527–8.

Donders, F. C. (1870). De Physiologie der Spraakklanken: in het bijzonder van die der Nederlandsche taal geschetst. Utrecht: van der Post, Jr.

Dorman, M. F., Raphael, L. J. & Isenberg, D. (1980). Acoustic cues for a fricative-affricate contrast in word-final position. Journal of Phonetics, 8(4), 397–405.

Eliason, N. E. (1942). Two notes on vowel and consonant quantity. American Speech, 17(3), 166–8.

Fant, G. (1960). Acoustic Theory of Speech Production. The Hague: Mouton.

Figueroa, M., Painequeo, J., Márquez, C., Salamanca, G. & Bertín, D. (2019). Evidencia del contraste interdental/alveolar en el mapudungun hablado en la costa: un estudio acústico-estadístico. Onomázein, 44(09), 191–216.

Fourier, J. B. J. (1822). Théorie Analytique de la Chaleur. Paris: Firmin Didot.

Fujimura, O. (1962). Analysis of nasal consonants. Journal of the Acoustical Society of America, 34(12), 1865–75.

Garnes, S. (1975). An acoustic analysis of double articulations in Ibibio. In Herbert, R. K., ed., Proceedings of the 6th Conference on African Linguistics. Columbus: Ohio State, pp. 44–5.

Geng, P., Gu, W. & Fujisaki, H. (2018). Acoustic and perceptual characteristics of Mandarin speech in homosexual and heterosexual male speakers. In Proceedings of INTERSPEECH 2018, Hyderabad, India, pp. 1726–30.

Gick, B., Wilson, I. & Derrick, D. (2012). Articulatory Phonetics. Malden, MA: John Wiley & Sons.

Gordon, M., Barthmaier, P. & Sands, K. (2002). A cross-linguistic acoustic study of voiceless fricatives. Journal of the International Phonetic Association, 32(2), 141–74.

Hedrick, M. S. & Ohde, R. N. (1993). Effect of relative amplitude of frication on perception of place of articulation. Journal of the Acoustical Society of America, 94(4), 2005–26.

Heinz, J. M. & Stevens, K. N. (1961). On the properties of voiceless fricative consonants. Journal of the Acoustical Society of America, 33(5), 589–96.

Hualde, J. I., Simonet, M., Shosted, R. & Nadeu, M. (2010). Quantifying Iberian spirantization: Acoustics and articulation. In 40th Linguistic Symposium on Romance Languages, Seattle, WA, pp. 26–8.

Hughes, G. W. & Halle, M. (1956). Spectral properties of fricative consonants. Journal of the Acoustical Society of America, 28(2), 303–10.

Husain, R. A. & Husain, T. M. (2017). Acoustic measurement of voiced implosives: Evidence of voiced implosives in a US dialect. Southern Journal of Linguistics, 41(1), 62–87.

Impieri, D., Tønseth, K. A., Hide, Ø., Brinck, E. L., Høgevold, H. E. & Filip, C. (2018). Impact of orthognathic surgery on velopharyngeal function by evaluating speech and cephalometric radiographs. Journal of Plastic, Reconstructive & Aesthetic Surgery, 71(12), 1786–95.

Iskarous, K., Fowler, C. A. & Whalen, D. H. (2010). Locus equations are an acoustic expression of articulator synergy. Journal of the Acoustical Society of America, 128(4), 2021–32.

Jannedy, S. & Weirich, M. (2017). Spectral moments vs. discrete cosine transformation coefficients: Evaluation of acoustic measures distinguishing two merging German fricatives. Journal of the Acoustical Society of America, 142(1), 395–405.

Jenkin, F. & Ewing, J. A. (1878). On the harmonic analysis of certain vowel sounds. Transactions of The Royal Society of Edinburgh, 28(3), 745–75.

Jessen, M. (2002). An acoustic study of contrasting plosives and click accompaniments in Xhosa. Phonetica, 59(2–3), 150–79.

Johnson, K. (1993). Acoustic and auditory analyses of Xhosa clicks and pulmonics. UCLA Working Papers in Phonetics, 83, 33–45.

Johnson, K. (2012). Acoustic and Auditory Phonetics, 3rd ed. Oxford: Wiley-Blackwell.

Jongman, A., Wayland, R. & Wong, S. (2000). Acoustic characteristics of English fricatives. Journal of the Acoustical Society of America, 108(3), 1252–63.

Kewley-Port, D. & Preston, M. S. (1974). Early apical stop production: A voice onset time analysis. Journal of Phonetics, 2, 195–210.

Kim, H. (2001). The place of articulation of the Korean plain affricate in intervocalic position: An articulatory and acoustic study. Journal of the International Phonetic Association, 31(2), 229–57.

Kim, Y. S. (2011). An Acoustic, Aerodynamic, and Perceptual Investigation of Word-initial Denasalization in Korean. Unpublished doctoral dissertation, University College London.

Kingston, J. (2008). Lenition. In Colantoni, L. & Steele, J., eds., Selected Proceedings of the 3rd Conference on Laboratory Approaches to Spanish Phonology. Somerville, MA: Cascadilla Proceedings Project, pp. 1–31.

Kurowski, K. & Blumstein, S. E. (1987). Acoustic properties for place of articulation in nasal consonants. Journal of the Acoustical Society of America, 81(6), 1917–27.

Ladefoged, P. (2003). Phonetic Data Analysis: An Introduction to Fieldwork and Instrumental Techniques. Malden, MA: Blackwell.

Ladefoged, P. & Johnson, K. (2011). A Course in Phonetics, 6th ed. Boston, MA: Wadsworth.

Ladefoged, P. & Maddieson, I. (1996). Sounds of the World’s Languages. Oxford: Blackwell.

Laver, J. (1994). Principles of Phonetics. Cambridge: Cambridge University Press.

Lee, H. & Jongman, A. (2016). A diachronic investigation of the vowels and fricatives in Korean: An acoustic comparison of the Seoul and South Kyungsang dialects. Journal of the International Phonetic Association, 46(2), 157–84.

Li, F., Bunta, F. & Tomblin, J. B. (2017). Alveolar and postalveolar voiceless fricative and affricate productions of Spanish–English bilingual children with cochlear implants. Journal of Speech, Language, and Hearing Research, 60(9), 2427–41.

Li, S. & Gu, W. (2015). Acoustic analysis of Mandarin affricates. In Sixteenth Annual Conference of the International Speech Communication Association, Dresden, Germany, pp. 1–5.

Lisker, L. (1986). ‘Voicing’ in English: A catalogue of acoustic features signaling /b/ versus /p/ in trochees. Language and Speech, 29(1), 3–11.

Lisker, L. & Abramson, A. S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20(3), 384–422.

Lisker, L. & Abramson, A. S. (1967). Some effects of context on voice onset time in English stops. Language and Speech, 10(1), 1–28.

Ma, J., Chen, X., Wu, Y. & Zhang, L. (2018). Effects of age and sex on voice onset time: Evidence from Mandarin voiceless stops. Logopedics Phoniatrics Vocology, 43(2), 56–62.

Martin, S. E. (1951). Korean phonemics. Language, 27(4), 519–33.

Martínez-Celdrán, E. (2004). Problems in the classification of approximants. Journal of the International Phonetic Association, 34(2), 201–10.

Miller, A. & Shah, S. (2009). The acoustics of Mangetti Dune !Xung clicks. In Uther, M., Moore, R. & Cox, S., eds., Proceedings of the 10th Annual Conference of the International Speech Communication Association. Brighton, UK: Causal Productions, pp. 2283–6.

Miller-Ockhuizen, A. & Sands, B. E. (2000). Contrastive lateral clicks and variation in click types. In Proceedings of the Sixth International Conference on Spoken Language Processing ICSLP, Beijing, China, pp. 1–4.

Modell, J. D. & Rich, G. J. (1915). A preliminary study of vowel qualities. The American Journal of Psychology, 26(3), 453–6.

Munson, B. & Urberg Carlson, K. (2016). An exploration of methods for rating children’s productions of sibilant fricatives. Speech, Language and Hearing, 19(1), 36–45.

Nissen, S. L. & Fox, R. A. (2005). Acoustic and spectral characteristics of young children’s fricative productions: A developmental perspective. Journal of the Acoustical Society of America, 118(4), 2570–8.

Nittrouer, S., Lowenstein, J. H. & Tarr, E. (2013). Amplitude rise time does not cue the /bɑ/–/wɑ/ contrast for adults or children. Journal of Speech, Language, and Hearing Research, 56(2), 427–40.

Paget, R. A. (1924). The nature and artificial production of consonant sounds. Proceedings of the Royal Society of London, 106(736), 150–74.

Parmenter, C. E. & Carman, J. N. (1932). Some remarks on Italian quantity. Italica, 9(4), 103–8.

Patil, V. & Rao, P. (2008). Acoustic cues to manner of articulation of obstruents in Marathi. In Proceedings of Frontiers of Research on Speech and Music FRSM, Kolkata, India, pp. 1–5.

Penney, J., Cox, F., Miles, K. & Palethorpe, S. (2018). Glottalisation as a cue to coda consonant voicing in Australian English. Journal of Phonetics, 66, 161–84.

Piccinini, P. & Arvaniti, A. (2015). Voice onset time in Spanish–English spontaneous code-switching. Journal of Phonetics, 52, 121–37.

Piñeros, C. E. (2002). Markedness and laziness in Spanish obstruents. Lingua, 112(5), 379–413.

Qi, Y. & Fox, R. A. (1992). Analysis of nasal consonants using perceptual linear prediction. Journal of the Acoustical Society of America, 91(3), 1718–26.

Raymond, M. & Parker, S. (2005). Initial and medial geminate trills in Arop-Lokep. Journal of the International Phonetic Association, 35(1), 99–111.

Recasens, D. & Espinosa, A. (2007). An electropalatographic and acoustic study of affricates and fricatives in two Catalan dialects. Journal of the International Phonetic Association, 37(2), 143–72.

Reetz, H. & Jongman, A. (2009). Phonetics: Transcription, Production, Acoustics, and Perception. Cambridge, MA: Wiley-Blackwell.

Reidy, P. F., Kristensen, K., Winn, M. B., Litovsky, R. Y. & Edwards, J. R. (2017). The acoustics of word-initial fricatives and their effect on word-level intelligibility in children with bilateral cochlear implants. Ear and Hearing, 38(1), 42.

Saz, O., Deena, S., Doulaty, M., Hasan, M., Khaliq, B., Milner, R. et al. (2018). Lightly supervised alignment of subtitles on multi-genre broadcasts. Multimedia Tools and Applications, 77(23), 30533–50.

Shin, J. (2019). Vowels and Consonants. In Brown, L. & Yeon, J., eds., The Handbook of Korean Linguistics. Chichester, UK: Wiley-Blackwell, pp. 1–21.

Spajić, S., Ladefoged, P. & Bhaskararao, P. (1996). The trills of Toda. Journal of the International Phonetic Association, 26(1), 1–21.

Spinu, L. & Lilley, J. (2016). A comparison of cepstral coefficients and spectral moments in the classification of Romanian fricatives. Journal of Phonetics, 57, 40–58.

Stevens, K. N. (2000). Acoustic Phonetics. Cambridge, MA: MIT Press.

Strevens, P. (1960). Spectra of fricative noise in human speech. Language and Speech, 3(1), 32–49.

Sussman, H. M., McCaffrey, H. A. & Matthews, S. A. (1991). An investigation of locus equations as a source of relational invariance for stop place categorization. Journal of the Acoustical Society of America, 90(3), 1309–25.

Tabain, M. (1998). Non-sibilant fricatives in English: Spectral information above 10 kHz. Phonetica, 55, 107–30.

Thirumuru, R. & Vuppala, A. K. (2018). Automatic detection of retroflex approximants in a continuous Tamil speech. Circuits, Systems and Signal Processing, 37(7), 2837–51.

Turk, A., Nakai, S. & Sugahara, M. (2006). Acoustic segment durations in prosodic research: A practical guide. Methods in Empirical Prosody Research, 3, 1–28.

Umeda, H. (1957). The phonemic system of Modern Korean. Journal of the Linguistic Society of Japan, 32, 60–82.

Upadhyay, N. & Rosales, H. G. (2018). Robust recognition of English speech in noisy environments using frequency warped signal processing. National Academy Science Letters, 41(1), 15–22.

Warner, N. & Tucker, B. V. (2017). An effect of flaps on the fourth formant in English. Journal of the International Phonetic Association, 47(1), 1–15.

Yang, B. (1993). A voice onset time comparison of English and Korean stop consonants. Research Journal of Dongeui University, 20, 41–59.

Yoo, K. (2015). Domain-initial denasalisation in Busan Korean: A cross-generational case study. In Proceedings of the 18th International Congress of Phonetic Sciences ICPhS, Glasgow, pp. 1–5.

Yoshida, K. (2008). Phonetic implementation of Korean ‘denasalization’ and its variation related to prosody. IULC Working Papers Online, 8(1), 1–23.

Zhu, J. & Chen, Y. (2016). Effect of several acoustic cues on perceiving Mandarin retroflex affricates and fricatives in continuous speech. Journal of the Acoustical Society of America, 140(1), 461–70.

Zsiga, E. C. (2013). The Sounds of Language: An Introduction to Phonetics and Phonology. New York: Wiley-Blackwell.

12.7 References

Abercrombie, D. (1967). Elements of General Phonetics. Edinburgh: Edinburgh University Press.

Arvaniti, A. (2009). Rhythm, timing and the timing of rhythm. Phonetica, 66, 46–63.

Arvaniti, A. (2012a). The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40(3), 351–73.

Arvaniti, A. (2012b). Rhythm classes and speech perception. In Niebuhr, O. & Pfitzinger, H., eds., Prosodies: Context, Function, and Communication. Berlin: Walter de Gruyter, pp. 75–92.

Arvaniti, A. & Rathcke, T. (2015). The role of stress in syllable monitoring. Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK: The University of Glasgow. www.icphs2015.info/pdfs/Papers/ICPHS0212.pdf.

Arvaniti, A. & Rodriquez, T. (2013). The role of rhythm class, speaking rate, and f₀ in language discrimination. Laboratory Phonology, 4(1), 7–38.

Auer, P., Couper-Kuhlen, E. & Müller, F. (1999). Language in Time: The Rhythm and Tempo of Spoken Interaction. New York: Oxford University Press.

Balasubramanian, T. (1980). Timing in Tamil. Journal of Phonetics, 8, 449–67.

Baltazani, M. (2007). Prosodic rhythm and the status of vowel reduction in Greek. In Selected Papers on Theoretical and Applied Linguistics from the 17th International Symposium on Theoretical and Applied Linguistics, vol. 1. Thessaloniki: Department of Theoretical and Applied Linguistics, pp. 31–43.

Barry, W. & Andreeva, B. (2001). Cross-language similarities and differences in spontaneous speech patterns. Journal of the International Phonetic Association, 31, 51–66.

Barry, W. J., Andreeva, B., Russo, M., Dimitrova, S. & Kostadinova, T. (2003). Do rhythm measures tell us anything about language type? In Proceedings of 15th International Congress of Phonetic Sciences, Barcelona, pp. 2693–6.

Beckman, M. E. (1986). Stress and Non-Stress Accent. Dordrecht: Foris.Google Scholar

Bertinetto, P. M. (1989). Reflections on the dichotomy ‘stress’ vs. ‘syllable-timing’. Revue de Phonétique Appliquée, 91-92-93, 99–130.Google Scholar

Bertrán, A. P. (1999). Prosodic typology: On the dichotomy between stress-timed and syllable-timed languages. Language Design, 2, 103–30.Google Scholar

Bohannon, J., Koch, D., Homm, P. & Driehaus, A. (2015). Chocolate with high cocoa content as a weight-loss accelerator. International Archives of Medicine, Section: Endocrinology 8(55). https://doi.org/10.3823/1654.Google Scholar

Bolton, T. L. (1894). Rhythm. The American Journal of Psychology, 6(2), 145–238.Google Scholar

Borzone de Manrique, A. M. & Signorini, A. (1983). Segmental duration and rhythm in Spanish. Journal of Phonetics, 11, 117–28.Google Scholar

Chung, Y. & Arvaniti, A. (2013). Speech rhythm in Korean: Experiments in speech cycling. Proceedings of Meetings on Acoustics (POMA): Proceedings of 21st International Congress of Acoustics, Montréal, 2–7 June 2013. http://scitation.aip.org/content/asa/journal/poma.Google Scholar

Clarke, E. F. (1999). Rhythm and timing in music. In Deutsch, D., ed., The Psychology of Music. New York: Academic Press, pp. 473–500.Google Scholar

Classe, A. (1939). The Rhythm of English Prose. Oxford: Basil Blackwell.Google Scholar

Cummins, F. & Port, R. F. (1998). Rhythmic constraints on stress-timing in English. Journal of Phonetics, 31, 139–48.Google Scholar

Cummins, F. (2009). Rhythm as an affordance for the entrainment of movement. Phonetica, 66(1–2), 15–28.Google Scholar

Cutler, A. & Otake, T. (1994). Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language, 33, 824–44.Google Scholar

Cutler, A., Mehler, J., Norris, D. & Seguí, J. (1986). The syllable’s differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385–400.Google Scholar

Dankovicová, J. & Dellwo, V. (2007). Czech speech rhythm and the rhythm class hypothesis. In Proceedings of 16th International Congress of Phonetic Sciences, Saarbrücken, Germany, pp. 1241–4.Google Scholar

Dauer, R. M. (1983). Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11, 51–62.Google Scholar

Dauer, R. M. (1987). Phonetic and phonological components of language rhythm. In Proceedings of 11th International Congress of Phonetic Sciences, Tallinn, pp. 447–9.Google Scholar

Dellwo, V. (2006). Rhythm and speech rate: A variation coefficient for deltaC. In Karnowski, P. & Szigeti, I., eds., Language and Language-Processing: Proceedings of the 38th Linguistic Colloquium. Frankfurt: Peter Lang, pp. 231–41.Google Scholar

Dellwo, V., Aschenberner, B., Dankovicová, J. & Wagner, P. (2004). The BonnTempo Corpus and Tools: A database for the combined study of speech rhythm and rate. In Proceedings of the 8th International Conference on Spoken Language Processing, Jeju Island, Korea, pp. 777–80.

Dilley, L. C. & McAuley, J. D. (2008). Distal prosodic context affects word segmentation and lexical processing. Journal of Memory and Language, 59, 294–311.Google Scholar

Dowling, W. J. & Harwood, D. L. (1986). Music Cognition. Orlando, FL: Academic Press.Google Scholar

Farnetani, E. & Kori, S. (1990). Rhythmic structure in Italian noun phrases: A study of vowel durations. Phonetica, 47, 50–65.Google Scholar

Fletcher, J. (1991). Rhythm and final lengthening in French. Journal of Phonetics, 19(2), 193–212.Google Scholar

Fraisse, P. (1963). The Psychology of Time. New York: Harper & Row.

Fraisse, P. (1982). Rhythm and tempo. In Deutsch, D., ed., The Psychology of Music. New York: Academic Press, pp. 149–80.Google Scholar

Friberg, A. & Sundberg, J. (1995). Time discrimination in a monotonic, isochronous sequence. Journal of the Acoustical Society of America, 98(5), 2524–31.Google Scholar

Frota, S. & Vigário, M. (2001). On the correlates of rhythmic distinctions: The European/Brazilian Portuguese case. Probus, 13, 247–75.Google Scholar

Goswami, U. (2011). A temporal sampling framework for developmental dyslexia. Trends in Cognitive Sciences, 15, 3–10.Google Scholar

Goswami, U. & Leong, V. (2013). Speech rhythm and temporal structure: Converging perspectives? Laboratory Phonology, 4(1), 67–92.Google Scholar

Grabe, E. & Low, E. L. (2002). Acoustic correlates of rhythm class. In Gussenhoven, C. & Warner, N., eds., Laboratory Phonology 7. Berlin: Mouton de Gruyter, pp. 515–46.Google Scholar

Hannon, E. E., Lévêque, Y., Nave, K. M. & Trehub, S. E. (2016). Exaggeration of language-specific rhythms in English and French children’s songs. Frontiers in Psychology, 7, 939. https://doi.org/10.3389/fpsyg.2016.00939.

Harris, M. J. & Gries, S. T. (2011). Measures of speech rhythm and the role of corpus-based word frequency: A multifactorial comparison of Spanish(-English) speakers. International Journal of English Studies, 11(2), 1–22.Google Scholar

Harris, M. J., Gries, S. T. & Miglio, V. G. (2014). Prosody and its applications to forensic linguistics. Linguistic Evidence in Security, Law and Intelligence, 2(2). https://doi.org/10.5195/lesli.2014.12.Google Scholar

Hawkins, S. (2003). Roles and representations of systematic fine phonetic detail in speech understanding. Journal of Phonetics, 31, 373–405.Google Scholar

Hawkins, S. (2014). Situational influences on rhythmicity in speech, music, and their interaction. Philosophical Transactions of the Royal Society of London B. 369, 20130398. https://dx.doi.org/10.1098/rstb.2013.0398.Google Scholar

Hayes, B. (1995). Metrical Stress Theory: Principles and Case Studies. Chicago, IL: University of Chicago Press.Google Scholar

Horton, R. & Arvaniti, A. (2013). Clusters and classes in the rhythm metrics. San Diego Linguistic Papers, 4, 28–52. http://escholarship.org/uc/item/0tt1j553.

James, W. (1890/1950). The Principles of Psychology. New York: Dover Reprint. (Originally published 1890.)Google Scholar

Jeon, H. & Arvaniti, A. (2017). The effects of prosodic context on word segmentation: Rhythmic irregularity and localised lengthening in Korean. Journal of the Acoustical Society of America, 141, 4251–63.Google Scholar

Jinbo, K. (1927/1980). Kokugo no onseijou no tokushitsu [The top phonetic characteristics of Japanese]. In Shibata, T., Kitamura, H. & Kindaichi, H., eds., Nihon no gengogaku [Linguistics of Japan]. Tokyo: Taishukan, pp. 5–15. (Originally published 1927.)

Jones, D. (1972). An Outline of English Phonetics, 9th ed. Cambridge: Cambridge University Press. (Originally published 1918.)Google Scholar

Jones, M. R. (1981). Only time can tell: On the topology of mental space and time. Critical Inquiry, 7, 557–76.Google Scholar

Jun, S. (2005). Korean intonational phonology and prosodic transcription. In Jun, S., ed., Prosodic Typology: The Phonology of Intonation and Phrasing. Oxford: Oxford University Press, pp. 201–29.Google Scholar

Kaminskaïa, S., Tennant, J. & Russell, A. (2016). Prosodic rhythm in Ontario French. Journal of French Language Studies, 26(2), 183–208.Google Scholar

Keane, E. (2006). Rhythmic characteristics of colloquial and formal Tamil. Language and Speech, 49, 299–332.Google Scholar

Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59, 1208–21.Google Scholar

Knight, R. (2011). Assessing the temporal reliability of rhythm metrics. Journal of the International Phonetic Association, 41(3), 271–81.Google Scholar

Kohler, K. (2009). Rhythm in speech and language: A new research paradigm. Phonetica, 66, 29–45.Google Scholar

Lee, C. S. & Todd, N. P. M. A. (2004). Towards an auditory account of speech rhythm: Application of a model of the auditory ‘primal-sketch’ to two multi-language corpora. Cognition, 93, 225–54.

Lehiste, I. (1977). Isochrony reconsidered. Journal of Phonetics, 5, 253–63.Google Scholar

Lerdahl, F. & Jackendoff, R. (1981). A Generative Theory of Tonal Music. Cambridge, MA: MIT Press.Google Scholar

Li, A. & Post, B. (2014). L2 acquisition of prosodic properties of speech rhythm. Studies in Second Language Acquisition, 36(2), 223–55.Google Scholar

Lin, H. & Wang, Q. (2007). Mandarin rhythm: An acoustic study. Journal of Chinese Language and Computing, 17(3), 127–40.Google Scholar

Lloyd James, A. (1940). Speech Signals in Telephony. London: Pitman & Sons.Google Scholar

Loehr, D. (2007). Aspects of rhythm in gesture and speech. Gesture, 7(2), 179–214.

London, J. (2012). Hearing in Time: Psychological Aspects of Musical Meter. Oxford: Oxford University Press.Google Scholar

Loukina, A., Kochanski, G., Rosner, B., Keane, E. & Shih, C. (2011). Rhythm measures and dimensions of durational variation in speech. Journal of the Acoustical Society of America, 129(5), 3258–70.Google Scholar

Low, E. L., Grabe, E. & Nolan, F. (2000). Quantitative characterisations of speech rhythm: ‘Syllable-timing’ in Singapore English. Language and Speech, 43, 377–401.Google Scholar

Lowit, A. (2014). Quantification of rhythm problems in disordered speech: A re-evaluation. Philosophical Transactions of the Royal Society B, 369(1658). https://doi.org/10.1098/rstb.2013.0404.

Luo, H. & Poeppel, D. (2007). Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron, 54, 1001–10.Google Scholar

Mattys, S. L. & Melhorn, J. F. (2005). How do syllables contribute to the perception of spoken English? Insight from the migration paradigm. Language and Speech, 48(2), 223–53.Google Scholar

Miller, M. (1984). On the perception of rhythm. Journal of Phonetics, 12, 75–83.Google Scholar

Mok, P. (2009). On the syllable-timing of Cantonese and Beijing Mandarin. Chinese Journal of Phonetics, 2, 148–54.Google Scholar

Molnar, M., Gervain, J. & Carreiras, M. (2014). Within-rhythm class native language discrimination abilities of Basque-Spanish monolingual and bilingual infants at 3.5 months of age. Infancy, 19(3), 326–37.Google Scholar

Moon-Hwan, C. (2004). Rhythm typology of Korean speech. Cognitive Processing, 5, 249–53.Google Scholar

Murty, L., Otake, T. & Cutler, A. (2007). Perceptual tests of rhythmic similarity: I. Mora rhythm. Language and Speech, 50, 77–99.Google Scholar

Nakatani, L. H., O’Connor, K. D. & Aston, C. H. (1981). Prosodic aspects of American English speech rhythm. Phonetica, 38, 84–106.Google Scholar

Nazzi, T. & Ramus, F. (2003). Perception and acquisition of linguistic rhythm by infants. Speech Communication, 41, 233–43.Google Scholar

Nazzi, T., Jusczyk, P. W. & Johnson, E. K. (2000). Language discrimination by English-learning 5-month-olds: Effects of rhythm and familiarity. Journal of Memory and Language, 43, 1–19.Google Scholar

Nespor, M. & Vogel, I. (1989). On clashes and lapses. Phonology, 6, 69–116.

Nolan, F. & Asu, E. L. (2009). The Pairwise Variability Index and coexisting rhythms in language. Phonetica, 66, 64–77.Google Scholar

Nolan, F. & Jeon, H. (2014). Speech rhythm: A metaphor? Philosophical Transactions of the Royal Society B, 369. https://doi.org/10.1098/rstb.2013.0396.

Parker Jones, O. (2006). Durational variability and stress-timing in Hawaiian. In Warren, P. & Watson, C. I., eds., Proceedings of the 11th Australian International Conference on Speech Science & Technology, pp. 417–20.

Pellegrino, F., Coupé, C. & Marsico, E. (2011). A cross-language perspective on speech information rate. Language, 87, 539–58.Google Scholar

Pike, K. (1945). The Intonation of American English. Ann Arbor, MI: University of Michigan Press.Google Scholar

Pointon, G. E. (1980). Is Spanish really syllable-timed? Journal of Phonetics, 8, 293–304.Google Scholar

Pointon, G. E. (1995). Rhythm and duration in Spanish. In Lewis, J. W., ed., Studies in General and English Phonetics: Essays in Honour of Professor J. D. O’Connor. New York: Routledge, pp. 266–9.Google Scholar

Post, B. & Payne, E. (2018). Speech rhythm in development: What is the child acquiring? In Prieto, P. & Esteve-Gibert, N., eds., The Development of Prosody in First Language Acquisition. Amsterdam: John Benjamins, pp. 125–44.Google Scholar

Prieto, P., Vanrell, M., Astruc, L., Payne, E. & Post, B. (2012). Phonotactic and phrasal properties of speech rhythm: Evidence from Catalan, English, and Spanish. Speech Communication, 54(6), 681–702.Google Scholar

Ramus, F., Nespor, M. & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73, 265–92.Google Scholar

Ramus, F., Dupoux, E. & Mehler, J. (2003). The psychological reality of rhythm class: Perceptual studies. In Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, pp. 337–40.Google Scholar

Renwick, M. E. L. (2013). Quantifying rhythm: Interspeaker variation in %V. Proceedings of Meetings on Acoustics (POMA), 14, 060011. http://dx.doi.org/10.1121/1.4854657.Google Scholar

Roach, P. (1982). On the distinction between ‘stress-timed’ and ‘syllable-timed’ languages. In Crystal, D., ed., Linguistic Controversies: Essays in Linguistic Theory and Practice in Honour of F. R. Palmer. London: Edward Arnold, pp. 73–9.Google Scholar

Rouas, J., Farinas, J., Pellegrino, F. & André-Obrecht, R. (2005). Rhythmic unit extraction and modelling for automatic language identification. Speech Communication, 47, 436–56.Google Scholar

Scott, D., Isard, S. D. & de Boysson-Bardies, B. (1985). Perceptual isochrony in English and French. Journal of Phonetics, 13, 155–62.Google Scholar

Sebastian, N. & Costa, A. (1997). Metrical information in speech segmentation in Spanish. Language and Cognitive Processes, 12(5–6), 883–7.

Skoruppa, K., Pons, F., Christophe, A., Bosch, L., Dupoux, E., Sebastián-Gallés, N., et al. (2009). Language-specific stress perception by nine-month-old French and Spanish infants. Developmental Science, 12(6), 914–19.

Stockmal, V., Markus, D. & Bond, D. (2005). Measures of native and non-native rhythm in a quantity language. Language and Speech, 48, 55–63.Google Scholar

Tajima, K. & Port, R. F. (2003). Speech rhythm in English and Japanese. In Local, J., Ogden, R. & Temple, R., eds., Phonetic Interpretation: Papers in Laboratory Phonology VI. Cambridge: Cambridge University Press, pp. 322–39.Google Scholar

Tan, R. S. K. & Low, E. L. (2014). Rhythmic patterning in Malaysian and Singapore English. Language and Speech, 57(2), 196–214.Google Scholar

Tilsen, S. (2016). Selection and coordination: The articulatory basis for the emergence of phonological structure. Journal of Phonetics, 55, 53–77.Google Scholar

Tilsen, S. & Arvaniti, A. (2013). Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages. Journal of the Acoustical Society of America, 134(1), 628–39.Google Scholar

Tsiartsioni, E. (2003). The Acquisition of Features of Rhythm and Stop Voicing in Greek and English L2. Unpublished M.Phil. Dissertation, Trinity College Dublin.Google Scholar

Turk, A. E. & Shattuck-Hufnagel, S. (2000). Word-boundary-related duration patterns in English. Journal of Phonetics, 28, 397–440.Google Scholar

Tzakosta, M. (2004). Acquiring variable stress in Greek: An Optimality-Theoretic approach. Journal of Greek Linguistics, 5, 97–125.Google Scholar

Vaissière, J. (1991). Rhythm, accentuation and final lengthening in French. In Sundberg, J., Nord, L. & Carlson, R., eds., Music, Language, Speech and Brain. London: Palgrave, pp. 108–20.Google Scholar

Wagner, P. S. & Dellwo, V. (2004). Introducing YARD (Yet Another Rhythm Determination) and re-introducing isochrony to rhythm research. Proceedings of Speech Prosody, Nara, Japan, 2004. www.isca-speech.org/iscaweb/index.php/archive/online-archive.Google Scholar

Warner, N. & Arai, T. (2001). Japanese mora-timing: A review. Phonetica, 58, 1–25.Google Scholar

White, L. & Mattys, S. L. (2007). Calibrating rhythm: First language and second language studies. Journal of Phonetics, 35, 501–22.Google Scholar

White, L., Mattys, S. L. & Wiget, L. (2012). Language categorization by adults is based on sensitivity to durational cues, not rhythm class. Journal of Memory and Language, 66, 665–79.Google Scholar

Wiget, L., White, L., Schuppler, B., Grenon, I., Rauch, O. & Mattys, S. L. (2010). How stable are acoustic metrics of contrastive speech rhythm? Journal of the Acoustical Society of America, 127, 1559–69.Google Scholar

Woodrow, H. (1951). Time perception. In Stevens, S. S., ed., Handbook of Experimental Psychology. New York: Wiley, pp. 1224–36.Google Scholar

Zawaydeh, B. A., Tajima, K. & Kitahara, M. (2002). Discovering Arabic rhythm through a speech cycling task. In Parkinson, D. B. & Benmamoun, E., eds., Perspectives on Arabic Linguistics XIII-XIV. Amsterdam: John Benjamins, pp. 39–58.Google Scholar

13.7 References

Beranek, L. L. (1949). Acoustical Measurements. Melville, NY: Acoustical Society of America [revised edition 1988].Google Scholar

Bigi, B. (2015). SPPAS – Multi-lingual approaches to the automatic annotation of speech. The Phonetician (International Society of Phonetic Sciences), 111–112(I–II), 54–69.Google Scholar

Boersma, P. & Weenink, D. (2019). Praat: Doing Phonetics by Computer [computer program]. Version 6.0.56, June 2019, www.praat.org.

Braun, M. (2001). Speech mirrors norm-tones: Absolute pitch as a normal but precognitive trait. Acoustics Research Letters Online, 2(3), 85–90.Google Scholar

Braun, M. (2006). A retrospective study of the spectral probability of spontaneous otoacoustic emissions: Rise of octave shifted second mode after infancy. Hearing Research, 215, 39–46.Google Scholar

Braun, M. & Chaloupka, V. (2005). Carbamazepine induced pitch shift and octave space representation. Hearing Research, 210, 85–92.Google Scholar

Brøndsted, T. (1997). Intonation contours distorted by tone patterns of stress groups and word accent. In Botinis, A., ed., Intonation: Theory, Models and Applications (Proceedings of an ISCA workshop). Athens: Athanasopoulos, pp. 55–8.Google Scholar

Chentir, A., Guerti, M. & Hirst, D. J. (2009). Extraction of standard Arabic micromelody. Journal of Computer Science, 5(2), 86–9.Google Scholar

Cho, H. & Rauzy, S. (2008). Phonetic pitch movements of accentual phrases in Korean read speech. In Proceedings of the 4th International Conference on Speech Prosody, Campinas, Brazil.Google Scholar

De Looze, C. (2010). Analyse et interprétation de l’empan temporel des variations prosodiques en français et en anglais. PhD thesis, Université de Provence, Aix-en-Provence, France.Google Scholar

De Looze, C. & Hirst, D. J. (2008). Detecting changes in key and range for the automatic modelling and coding of intonation. In Proceedings of 4th International Conference on Speech Prosody. Campinas, Brazil, pp. 135–8.Google Scholar

De Looze, C. & Hirst, D. J. (2014). The OMe (Octave-Median) scale: A natural scale for speech melody. Proceedings of the 7th International Conference on Speech Prosody, Dublin, pp. 910–13.Google Scholar

Di Cristo, A. & Hirst, D. J. (1986). Modelling French micromelody: Analysis and synthesis. Phonetica, 43(1–3), 11–30.

Fant, G. (1968). Analysis and synthesis of speech processes. In Malmberg, B., ed., Manual of Phonetics. Amsterdam: North Holland, pp. 173–7.Google Scholar

Fant, G. (2004). Speech Acoustics and Phonetics. Dordrecht: Kluwer.Google Scholar

Fourcin, A. J. & Abberton, E. (1971). First applications of a new laryngograph. Medical and Biological Illustration, 21, 172–82.Google Scholar

Fujisaki, H. (2004). Information, prosody, and modeling – with emphasis on tonal features of speech. In Proceedings of the Second International Conference on Speech Prosody, Nara, Japan, pp. 1–10.Google Scholar

Fujisaki, H. & Nagashima, S. (1969). A model for the synthesis of pitch contours of connected speech. Annual Report of the Engineering Research Institute, 28, 53–60.Google Scholar

Gårding, E. (1998). Intonation in Swedish. In Hirst, D. J. & Di Cristo, A., eds., Intonation Systems: A Survey of Twenty Languages. Cambridge: Cambridge University Press, pp. 117–36.

Goldsmith, J. A. (1990). Autosegmental and Metrical Phonology. Cambridge, MA: Blackwell.Google Scholar

Graddol, D. (1986). Discourse specific pitch behaviour. In Johns Lewis, C., ed., Intonation in Discourse. Edinburgh: Croom Helm, pp. 221–38.Google Scholar

Halle, M. & Vergnaud, J.-R. (1987). An Essay on Stress. Cambridge, MA: MIT Press.Google Scholar

Hanson, H. (2009). Effects of obstruent consonants on fundamental frequency at vowel onset in English. Journal of the Acoustical Society of America, 125, 425–41.Google Scholar

’t Hart, J., Collier, R. & Cohen, A. (1990). A Perceptual Study of Intonation: An Experimental-Phonetic Approach to Speech Melody. Cambridge: Cambridge University Press.Google Scholar

Hermes, D. J. & van Gestel, J. C. (1991). The frequency scale of speech intonation. Journal of the Acoustical Society of America, 90, 97–102.

Hess, W. (1983). Pitch Determination of Speech Signals: Algorithms and Devices. Berlin: Springer-Verlag.

Hirst, D. J. (1981). Phonological implications of a production model of English intonation. Phonologica, 1980, 195–201.Google Scholar

Hirst, D. J. (1983). Structures and categories in prosodic representations. In Cutler, A. & Ladd, D. R., eds., Prosody: Models & Measurements. Berlin: Springer, pp. 93–109.Google Scholar

Hirst, D. J. (2007). A Praat plugin for Momel and INTSINT with improved algorithms for modelling and coding intonation. In Proceedings of the XVIth International Congress of Phonetic Sciences (paper 1443), Saarbrücken, pp. 1233–6.

Hirst, D. J. (2012). Diapason.praat. Praat script. www.researchgate.net/publication/327764721_diapason.Google Scholar

Hirst, D. J. (2015). ProZed: A speech prosody editor for linguists, using analysis-by-synthesis. In Hirose, K. & Tao, J., eds., Speech Prosody in Speech Synthesis. Modeling and Generation of Prosody for High Quality and Flexible Speech Synthesis. Berlin: Springer-Verlag, pp. 3–17.Google Scholar

Hirst, D. J. & Espesser, R. (1993). Automatic modelling of fundamental frequency using a quadratic spline function. Travaux de l’Institut de Phonétique d’Aix, 15, 75–85.Google Scholar

Hirst, D. J., Di Cristo, A. & Espesser, R. (2000). Levels of representation and levels of analysis for intonation. In Horne, M., ed., Prosody: Theory and Experiment. Dordrecht: Kluwer Academic Publishers, pp. 51–87.Google Scholar

Hirst, D. J., Cho, H., Kim, S. & Yu, H. (2007). Evaluating two versions of the Momel pitch modeling algorithm on a corpus of read speech in Korean. In Proceedings of INTERSPEECH, VIII. Antwerp, Belgium, pp. 1649–52.Google Scholar

House, A. & Fairbanks, G. (1953). The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of the Acoustical Society of America, 25, 105–13.Google Scholar

House, D. (1990). Tonal Perception in Speech. Lund: Lund University Press.Google Scholar

Iivonen, A. (1998). Intonation in Finnish. In Hirst, D. J. & Di Cristo, A., eds., Intonation Systems: A Survey of Twenty Languages. Cambridge: Cambridge University Press, pp. 331–47.

Imig, T. J. & Morel, A. (1985). Tonotopic organization in ventral nucleus of medial geniculate body in the cat. Journal of Neurophysiology, 53, 309–40.Google Scholar

Jassem, W. (1952). Intonation of Conversational English (educated Southern British). Wrocław: Wrocławskie Towarzystwo Naukowe [PDF available from the Speech and Language Data Repository, http://sldr.org/sldr000777/en].Google Scholar

Jones, D. (1909). Intonation Curves. Leipzig: Teubner.Google Scholar

Kiessling, A., Kompe, R., Niemann, H., Nöth, E. & Batliner, A. (1995). Voice source state as a source of information in speech recognition: Detection of laryngealizations. NATO ASI Series of Computer and Systems Sciences, 147, 329–32.

Kuttner, F. A. (1975). Prince Chu Tsai-Yu’s life and work: A re-evaluation of his contribution to equal temperament theory. Ethnomusicology, 19(2), 163–206.Google Scholar

Liberman, M. (2017). Pitch contour perception. http://languagelog.ldc.upenn.edu/nll/?p=34251.Google Scholar

Lindley, M. (2001). Well-tempered clavier. In Sadie, S. & Tyrrell, J., eds., The New Grove Dictionary of Music and Musicians, 2nd ed. London: Macmillan.

Liu, J., Wang, N., Li, J., Shi, B. & Wang, H. (2009). Frequency distribution of synchronized spontaneous otoacoustic emissions showing sex-dependent differences and asymmetry between ears in 2- to 4-day-old neonates. International Journal of Pediatric Otorhinolaryngology, 73(5), 731–6.

Maghbouleh, A. (1998). ToBI accent type recognition. In Proceedings of the Sixth International Conference on Spoken Language Processing, Paper 0632.

Martin, P. (1981). Extraction de la fréquence fondamentale par intercorrélation avec une fonction peigne. 12e Journées d’Etude sur la Parole, SFA, Montréal.Google Scholar

Mertens, P. (2004). The Prosogram: Semi-automatic transcription of prosody based on a tonal perception model. In Proceedings of the 2nd International Conference on Speech Prosody, Nara, Japan, pp. 549–52.Google Scholar

Mertens, P. (2018). Prosogram, v 2.15. Pitch contour stylization based on a tonal perception model. https://sites.google.com/site/prosogram/home.Google Scholar

Mertens, P. & d’Alessandro, C. (1995). Pitch contour stylization using a tonal perception model. In Proceedings of the 13th International Congress of Phonetic Sciences vol. 4, pp. 228–31.Google Scholar

Mixdorff, H.-J. (1999). A novel approach to the fully automated extraction of Fujisaki model parameters. In Proceedings of ICASSP 1999, pp. 1281–4.

Moore, B. C. J. & Glasberg, B. R. (1983). Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. Journal of the Acoustical Society of America, 74, 750–3.Google Scholar

Moore, B. C. J. & Glasberg, B. R. (1996). A revision of Zwicker’s loudness model. Acta Acustica, 82, 335–45.Google Scholar

Morel, A. (1980). Codage des sons dans le corps genouillé médian du chat: évaluation de l’organisation tonotopique de ses différents noyaux. PhD dissertation, Université de Lausanne, Juris, Zurich.

Morest, D. K. (1965). The laminar structure of the medial geniculate body of the cat. Journal of Anatomy, 99, 143–60.

Nolan, F. (2003). Intonational equivalence: An experimental evaluation of pitch scales. In Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, pp. 771–4.

Nooteboom, S. (1999). The prosody of speech melody and rhythm. In Hardcastle, W. J. & Laver, J., eds., The Handbook of Phonetic Sciences. London: Blackwell, pp. 640–73.Google Scholar

O’Shaughnessy, D. (1987). Speech Communication: Human and Machine. Reading, MA: Addison-Wesley, p. 150.Google Scholar

Paeschke, A. & Sendlmeier, W. F. (2000). Prosodic characteristics of emotional speech: Measurements of fundamental frequency movements. In Proceedings of the ISCA Workshop on Speech and Emotion, Belfast, Ireland, pp. 75–80.Google Scholar

Rossi, M. (1971). Le seuil de glissando ou seuil de perception des variations tonales pour les sons de la parole. Phonetica, 23, 1–33.Google Scholar

Silverman, K. (1986). f₀ segmental cues depend on intonation: The case of the rise after voiced stops. Phonetica, 43(1–3), 76–91.Google Scholar

Steele, J. (1779). Prosodia Rationalis: or, an Essay towards Establishing the Melody and Measure of Speech, to be Expressed and Perpetuated by Peculiar Symbols, 2nd ed. London: J. Nichols.Google Scholar

Stevens, S., Volkmann, J. & Newman, E. (1937). A scale for the measurement of the psychological magnitude of pitch. Journal of the Acoustical Society of America, 8, 185–90.

Taylor, P. (1995). The rise/fall/connection model of intonation. Speech Communication, 15(1–2), 169–86.Google Scholar

Traunmüller, H. (1990). Analytical expressions for the tonotopic sensory scale. Journal of the Acoustical Society of America, 88, 97–100.Google Scholar

Traunmüller, H. (1997). Auditory scales of frequency representation. www2.ling.su.se/staff/hartmut/bark.htm.Google Scholar

Umesh, S., Cohen, L. & Nelson, D. (1999). Fitting the Mel-scale. In Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing, 1, Phoenix, Arizona, USA, March 1999, pp. 217–20.Google Scholar

Véronis, J., Hirst, D. J. & Ide, N. (1994). NL and speech in the Multext project. In Proceedings of AAAI Workshop on Integration of Natural Language and Speech, Seattle, USA, pp. 72–8.Google Scholar

Wightman, C. & Campbell, N. (1995). Improved labeling of prosodic structure. In IEEE Transactions on Speech and Audio Processing.Google Scholar

Wikipedia. (2018). Pitch detection algorithm. https://en.wikipedia.org/wiki/Pitch_detection_algorithm.Google Scholar

Wright, A. A., Rivera, J. J., Hulse, S. H., Shyan, M. & Neiworth, J. J. (2000). Music perception and octave generalization in rhesus monkeys. Journal of Experimental Psychology: General, 129(3), 291–307.

Zwicker, E. (1961). Subdivision of the audible frequency range into critical bands (Frequenz-gruppen). Journal of the Acoustical Society of America, 33, 248.Google Scholar

Zwirner, E. & Zwirner, Z. K. (1937). Über das Hören und Messen der Sprachmelodie. Archiv für vergleichende Phonetik, 1, 35–47.

14.7 References

Anderson, V. B. (2000). Giving Weight to Phonetic Principles: The Case of Place of Articulation in Western Arrernte. PhD Thesis, UCLA.

Articulate Instruments Ltd. (2010). Articulate Assistant User Guide: Version 1.18, Edinburgh, UK: Articulate Instruments Ltd.

Bell-Berti, F. & Krakow, R. A. (1991). Anticipatory velar lowering: A coproduction account. Journal of the Acoustical Society of America, 90(1), 112–23.

Bernhardt, B., Gick, B., Bacsfalvi, P. & Adler-Bock, M. (2005). Ultrasound in speech therapy with adolescents and adults. Clinical Linguistics and Phonetics, 19(6–7), 605–17.

Bouhuys, A., Proctor, D. F. & Mead, J. (1966). Kinetic aspects of singing. Journal of Applied Physiology, 21(2), 483–96.

Browman, C. & Goldstein, L. (1992). Articulatory Phonology: An overview. Phonetica, 49(3–4), 155–80.

Brunner, J., Fuchs, S. & Perrier, P. (2009). On the relationship between palate shape and articulatory behavior. Journal of the Acoustical Society of America, 125(6), 3936–49.

Byrd, D. & Saltzman, E. (1998). Intragestural dynamics of multiple prosodic boundaries. Journal of Phonetics, 26(2), 173–99.

Byrd, D., Tobin, S., Bresch, E. & Narayanan, S. (2009). Timing effects of syllable structure and stress on nasals: A real-time MRI examination. Journal of Phonetics, 37(1), 97–110.

Chen, E. (2017, August 20). Guess the Word. Retrieved 26 September 2018, from https://ericlgame.itch.io/guess-the-word.

Cheng, H. Y., Murdoch, B. E., Goozée, J. V. & Scott, D. (2007). Electropalatographic assessment of tongue-to-palate contact patterns and variability in children, adolescents, and adults. Journal of Speech, Language, and Hearing Research, 50(2), 375–92.

Chiba, T. & Kajiyama, M. (1941). The Vowel: Its Nature and Structure. Tokyo: Tokyo-Kaiseikan.

Childers, D. G. & Krishnamurthy, A. K. (1985). A critical review of electroglottography. Critical Reviews in Biomedical Engineering, 12(2), 131–61.

Cusack, R., Cumming, N., Bor, D., Norris, D. & Lyzenga, J. (2005). Automated post-hoc noise cancellation tool for audio recordings acquired in an MRI scanner. Human Brain Mapping, 24(4), 299–304.

Dart, S. N. (1991). Articulatory and acoustic properties of apical and laminal articulations. In UCLA Working Papers in Phonetics, 79, 1–155.

Davidson, L. (2006). Comparing tongue shapes from ultrasound imaging using smoothing spline analysis of variance. Journal of the Acoustical Society of America, 120, 407–15.

Delvaux, V., Demolin, D., Harmegnies, B. & Soquet, A. (2008). The aerodynamics of nasalization in French. Journal of Phonetics, 36(4), 578–606.

Demolin, D. (2011). Aerodynamic techniques for phonetic fieldwork. In Proceedings of the 17th International Congress of Phonetic Sciences. City University of Hong Kong: Hong Kong, 84–7.

Ellis, L. & Hardcastle, W. (2002). Categorical and gradient properties of assimilation in alveolar to velar sequences: Evidence from EPG and EMA data. Journal of Phonetics, 30(3), 373–96.

Esling, J. H. (1996). Pharyngeal consonants and the aryepiglottic sphincter. Journal of the International Phonetic Association, 26(2), 65–88.

Esling, J. H., Fraser, K. E. & Harris, J. G. (2005). Glottal stop, glottalized resonants, and pharyngeals: A reinterpretation with evidence from a laryngoscopic study of Nuuchahnulth (Nootka). Journal of Phonetics, 33(4), 383–410.

Esposito, C. M. (2012). An acoustic and electroglottographic study of White Hmong tone and phonation. Journal of Phonetics, 40(3), 466–76.

Fant, G. (1970). Acoustic Theory of Speech Production: with Calculations Based on X-Ray Studies of Russian Articulations, vol. 2. Berlin: Walter de Gruyter.

Firth, J. (1948). Word-palatograms and articulation. Bulletin of the School of Oriental and African Studies, 12(3–4), 857–64.

Fowler, C. A. & Saltzman, E. (1993). Coordination and coarticulation in speech production. Language and Speech, 36(2–3), 171–95.

Frisch, S. A. & Wodzinski, S. M. (2016). Velar–vowel coarticulation in a virtual target model of stop production. Journal of Phonetics, 56, 52–65.

Fuchs, S. & Koenig, L. L. (2009). Simultaneous measures of electropalatography and intraoral pressure in selected voiceless lingual consonants and consonant sequences of German. Journal of the Acoustical Society of America, 126(4), 1988.

Fujimura, O., Kiritani, S. & Ishida, H. (1973). Computer controlled radiography for observation of movements of articulatory and other human organs. Computers in Biology and Medicine, 3(4), 371–84.

Gafos, A. I., Charlow, S., Shaw, J. A. & Hoole, P. (2014). Stochastic time analysis of syllable-referential intervals and simplex onsets. Journal of Phonetics, 44, 152–66.

Gibbon, F. E. (1990). Lingual activity in two speech-disordered children’s attempts to produce velar and alveolar stop consonants: Evidence from electropalatographic (EPG) data. International Journal of Language & Communication Disorders, 25(3), 329–40.

Giles, S. B. & Moll, K. L. (1975). Cinefluorographic study of selected allophones of English /l/. Phonetica, 31(3–4), 206–27.

Hardcastle, W. J. (1972). The use of electropalatography in phonetic research. Phonetica, 25(4), 197–215.

Herbst, C. T., Fitch, W. T. & Švec, J. G. (2010). Electroglottographic wavegrams: A technique for visualizing vocal fold dynamics noninvasively. Journal of the Acoustical Society of America, 128(5), 3070–8.

Horiguchi, S. & Bell-Berti, F. (1987). The Velotrace: A device for monitoring velar position. The Cleft Palate Journal, 24(2), 104–11.

Isshiki, N. (1964). Regulatory mechanism of voice intensity variation. Journal of Speech, Language, and Hearing Research, 7(1), 17–29.

Johnson, K. (2003). Acoustic and Auditory Phonetics, 2nd ed. Oxford: Blackwell.

Keating, P. A. (1990). The window model of coarticulation: Articulatory evidence. In Kingston, J. & Beckman, M., eds., Papers in Laboratory Phonology I. Cambridge: Cambridge University Press, pp. 451–70.

Keating, P. A. (1991). Coronal places of articulation. In Paradis, C. & Prunet, J., eds., Phonetics and Phonology, Volume 2: The Special Status of Coronals. Cambridge, MA: Academic Press, pp. 29–48.

Kelsey, C. A., Minifie, F. D. & Hixon, T. (1969). Applications of ultrasound in speech research. Journal of Speech, Language, and Hearing Research, 12(3), 564.

Kemp, J. A. (1995). Phonetics: Precursors to modern approaches. In E. F. K. Koerner & R. E. Asher, eds., Concise History of the Language Sciences. Amsterdam: Elsevier, pp. 371–88.

Khatiwada, R. (2007). Nepalese retroflex stops: A static palatography study of inter- and intra-speaker variability. In Proceedings of the 8th INTERSPEECH, pp. 1422–5.

Krakow, R. A. (1999). Physiological organization of syllables: A review. Journal of Phonetics, 27(1), 23–54.

Krausert, C. R., Olszewski, A. E., Taylor, L. N., McMurray, J. S., Dailey, S. H. & Jiang, J. J. (2011). Mucosal wave measurement and visualization techniques. Journal of Voice, 25(4), 395–405.

Ladefoged, P. (1968). A Phonetic Study of West African Languages: An Auditory-Instrumental Survey. Cambridge: Cambridge University Press.

Li, M., Akgul, Y. & Kambhamettu, C. (2005). EdgeTrak [Computer Program]. Version 1.0.0.4.

Lieberman, P. (1968). Direct comparison of subglottal and esophageal pressure during speech. Journal of the Acoustical Society of America, 43(5), 1157–64.

Lin, S., Beddor, P. S. & Coetzee, A. W. (2014). Gestural reduction, lexical frequency, and sound change: A study of post-vocalic /l/. Laboratory Phonology, 5(1), 9–36.

Lin, S. & Demuth, K. (2015). Children’s acquisition of English onset and coda /l/: Articulatory evidence. Journal of Speech, Language, and Hearing Research, 58(1), 13–27.

Lingala, S. G., Sutton, B. P., Miquel, M. E. & Nayak, K. S. (2016). Recommendations for real-time speech MRI. Journal of Magnetic Resonance Imaging, 43(1), 28–44.

Lohscheller, J., Eysholdt, U., Toy, H. & Dollinger, M. (2008). Phonovibrography: Mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Transactions on Medical Imaging, 27(3), 300–9.

McAllister Byun, T. & Hitchcock, E. R. (2012). Investigating the use of traditional and spectral biofeedback approaches to intervention for /r/ misarticulation. American Journal of Speech-Language Pathology, 21(3), 207–21.

McAllister Byun, T., Buchwald, A. & Mizoguchi, A. (2016). Covert contrast in velar fronting: An acoustic and ultrasound study. Clinical Linguistics & Phonetics, 30(3–5), 249–76.

Ménard, L., Toupin, C., Baum, S. R., Drouin, S., Aubin, J. & Tiede, M. (2013). Acoustic and articulatory analysis of French vowels produced by congenitally blind adults and sighted adults. Journal of the Acoustical Society of America, 134(4), 2975–87.

Mielke, J., Baker, A. & Archangeli, D. (2010). Variability and homogeneity in American English /r/ allophony and /s/ retraction. In Fougeron, C., Kuehnert, B., Imperio, M. & Vallee, N., eds., Papers in Laboratory Phonology X. Berlin: Mouton De Gruyter, pp. 699–730.

Mielke, J., Olson, K. S., Baker, A. & Archangeli, D. (2011). Articulation of the Kagayanen interdental approximant: An ultrasound study. Journal of Phonetics, 39(3), 403–12.

Mielke, J., Carignan, C. & Thomas, E. R. (2017). The articulatory dynamics of pre-velar and pre-nasal /æ/-raising in English: An ultrasound study. Journal of the Acoustical Society of America, 142(1), 332–49.

Miller, A. & Finch, K. (2011). Corrected high-frame rate anchored ultrasound with software alignment. Journal of Speech, Language, and Hearing Research, 54(2), 471–86.

Moisik, S. R., Lin, H. & Esling, J. H. (2014). A study of laryngeal gestures in Mandarin citation tones using simultaneous laryngoscopy and laryngeal ultrasound (SLLUS). Journal of the International Phonetic Association, 44(1), 21–58.

Narayanan, S., Nayak, K., Lee, S., Sethy, A. & Byrd, D. (2004). An approach to real-time magnetic resonance imaging for speech production. Journal of the Acoustical Society of America, 115(4), 1771–6.

Narayanan, S., Toutios, A., Ramanarayanan, V., Lammert, A., Kim, J., Lee, S. et al. (2014). Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC). Journal of the Acoustical Society of America, 136(3), 1307–11.

Öhman, S. & Stevens, K. (1963). Cineradiographic studies of speech: Procedures and objectives. Journal of the Acoustical Society of America, 35(11), 1889.

Ramanarayanan, V., Tilsen, S., Proctor, M., Töger, J., Goldstein, L., Nayak, K. S. et al. (2018). Analysis of speech production real-time MRI. Computer Speech & Language, 52, 1–22.

Rothenberg, M. (1992). A multichannel electroglottograph. Journal of Voice, 6(1), 36–43.

Russell, G. O. (1929). The mechanism of speech. Journal of the Acoustical Society of America, 1(1), 83–109.

Schönle, P. W., Gräbe, K., Wenig, P., Höhne, J., Schrader, J. & Conrad, B. (1987). Electromagnetic articulography: Use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract. Brain and Language, 31(1), 26–35.

Scobbie, J. M., Gibbon, F., Hardcastle, W. J. & Fletcher, P. (2000). Covert contrast as a stage in the acquisition of phonetics and phonology. In Broe, M. & Pierrehumbert, J., eds., Papers in Laboratory Phonology V. Cambridge: Cambridge University Press, pp. 194–207.

Scobbie, J. M., Wrench, A. & van der Linden, M. (2008). Head-probe stabilisation in ultrasound tongue imaging using a headset to permit natural head movement. In Proceedings of the Eighth International Seminar on Speech Production, Strasbourg, pp. 373–6.

Scobbie, J. M., Turk, A., Geng, C., King, S., Lickley, R. & Richmond, K. (2013). The Edinburgh Speech Production Facility DoubleTalk Corpus. In Proceedings of the 14th INTERSPEECH, pp. 764–6.

Stevens, K. N. (1989). On the quantal nature of speech. Journal of Phonetics, 17, 3–46.

Stevens, K. N. & House, A. S. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of America, 27(3), 484–93.

Stone, M. (2005). A guide to analysing tongue motion from ultrasound images. Clinical Linguistics and Phonetics, 19(6–7), 455–502.

Stone, M., Davis, E. P., Douglas, A. S., Aiver, M. N., Gullapalli, R., Levine, W. S. et al. (2001). Modeling tongue surface contours from cine-MRI images. Journal of Speech, Language, and Hearing Research, 44(5), 1026–40.

Strenger, F. (1959). Methods for direct and indirect measurement of the sub-glottal air-pressure in phonation. Studia Linguistica, 13(1–2), 98–112.

Styler, W., Krivokapic, J., Parrell, B. & Kim, J. (2017). Using machine learning to identify articulatory gestures in time course data. Journal of the Acoustical Society of America, 142(4), 2579.

Švec, J. G. & Schutte, H. K. (1996). Videokymography: High-speed line scanning of vocal fold vibration. Journal of Voice, 10(2), 201–5.

Tabain, M., Fletcher, J. & Butcher, A. (2011). An EPG study of palatal consonants in two Australian languages. Language and Speech, 54(2), 265–82.

Titze, I. R. (1990). Interpretation of the electroglottographic signal. Journal of Voice, 4(1), 1–9.

Westbury, J., Milenkovic, P., Weismer, G. & Kent, R. (1990). X-ray microbeam speech production database. Journal of the Acoustical Society of America, 88(S1), S56–S56.

Wrench, A. (1999). MOCHA-TIMIT, speech database. Department of Speech and Language Sciences, Queen Margaret University College, Edinburgh.

Yehia, H., Rubin, P. & Vatikiotis-Bateson, E. (1998). Quantitative association of vocal-tract and facial behavior. Speech Communication, 26(1–2), 23–43.

Yuan, J. & Liberman, M. (2008). Speaker identification on the SCOTUS corpus. Journal of the Acoustical Society of America, 123(5), 5687–890.

Zharkova, N., Gibbon, F. E. & Lee, A. (2017). Using ultrasound tongue imaging to identify covert contrasts in children’s speech. Clinical Linguistics & Phonetics, 31(1), 21–34.

Zhou, X., Espy-Wilson, C., Boyce, S., Tiede, M., Holland, C. & Choe, A. (2008). A magnetic resonance imaging-based articulatory and acoustic study of ‘retroflex’ and ‘bunched’ American English /r/. Journal of the Acoustical Society of America, 123(6), 4466–81.

Zue, V., Seneff, S. & Glass, J. (1990). Speech database development at MIT: TIMIT and beyond. Speech Communication, 9(4), 351–6.

15.7 References

Adank, P., Stewart, A. J., Connell, I. & Wood, J. (2013). Accent imitation positively affects language attitudes. Frontiers in Psychology, 4, 280.

Bachorowski, J. A. & Owren, M. J. (1999). Acoustic correlates of talker sex and individual talker identity are present in a short vowel segment produced in running speech. Journal of the Acoustical Society of America, 106(2), 1054–63.

Black, A. W., Zen, H. & Tokuda, K. (2007). Statistical parametric speech synthesis. Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 4, pp. IV-1229–32.

Cohen, M. H., Giangola, J. P. & Balogh, J. (2004). Voice User Interface Design. Redwood City, CA: Addison-Wesley Longman.

Collins, S. A. (2000). Male voices and women’s choices. Animal Behaviour, 60(6), 773–80.

Feinberg, D. R., Jones, B. C., Little, A. C. & Perrett, D. I. (2005). Manipulations of fundamental and formant frequencies influence the attractiveness of human male voices. Animal Behaviour, 69(3), 561–8.

Flanagan, J. L. (1965). Speech Analysis, Synthesis and Perception. Berlin: Springer-Verlag.

Flanagan, J. L. (1972). Voices of men and machines. Journal of the Acoustical Society of America, 51, 1375–87.

Fitch, W. T. & Giedd, J. (1999). Morphology and development of the human vocal tract: A study using magnetic resonance imaging. Journal of the Acoustical Society of America, 106(3), 1511–22.

Hartman, D. E. & Danhauer, J. L. (1976). Perceptual features of speech for males in four perceived age decades. Journal of the Acoustical Society of America, 59(3), 713–15.

Hochreiter, S. & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–80.

Kalchbrenner, N., Elsen, E., Simonyan, K., Noury, S., Casagrande, N., Lockhart, E. et al. (2018). Efficient neural audio synthesis. arXiv, 1802.08435.

Jia, Y., Zhang, Y., Weiss, R. J., Wang, Q., Shen, J., Ren, F. et al. (2019). Transfer learning from speaker verification to multispeaker text-to-speech synthesis. arXiv, 1806.04558.

Kinsella, B. (2019). Why tech giants are so desperate to provide your voice assistant. Harvard Business Review, https://hbr.org/2019/05/why-tech-giants-are-so-desperate-to-provide-your-voice-assistant.

Knudson, J. (2019). Digital publishers prepare for the voice revolution. Econtent Magazine, www.econtentmag.com/Articles/Editorial/Feature/Digital-Publishers-Prepare-for-the-Voice-Revolution-130768.htm.

Light, J. C. & McNaughton, D. (2014). Communicative competence for individuals who require augmentative and alternative communication: A new definition for a new era of communication? Augmentative and Alternative Communication, 30(1), 1–18.

Linville, S. (1998). Acoustic correlates of perceived versus actual sexual orientation in men’s speech. Folia Phoniatrica et Logopaedica, 50(1), 35–48.

Munson, B., McDonald, E., DeBoe, N. & White, A. (2006). The acoustic and perceptual bases of judgments of women and men’s sexual orientation from read speech. Journal of Phonetics, 34(2), 202–40.

Peschke, C., Ziegler, W., Eisenberger, J. & Baumgaertner, A. (2012). Phonological manipulation between speech perception and production activated a parieto-frontal circuit. NeuroImage, 59, 788–99.

Pierrehumbert, J., Bent, T., Munson, B., Bradlow, A. R. & Bailey, J. M. (2004). The influence of sexual orientation on vowel production. Journal of the Acoustical Society of America, 116, 1905–8.

Rabiner, L. & Juang, B. J. (1993). Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice Hall.

Ridley, L. [Lost Voice Guy]. (2012). Voice by Choice. Comedy sketch by Lee Ridley, Lost Voice Guy [Video File]. Retrieved from www.youtube.com/watch?v=CMm_XL3Ipbo.

Schabus, D. (2009). Interpolation of Austrian German and Viennese Dialect/Sociolect in HMM-based Speech Synthesis. Thesis, Vienna University of Technology.

Smyth, R., Jacobs, G. & Rogers, H. (2003). Male voices and perceived sexual orientation: An experiment and theoretical approach. Language in Society, 32(2), 329–50.

Stevens, K. (1998). Acoustic Phonetics. Cambridge, MA: MIT Press.

Taylor, P. (2009). Text-to-Speech Synthesis. Cambridge: Cambridge University Press.

Tokuda, K., Nankaku, Y., Toda, T., Zen, H., Yamagishi, J. & Oura, K. (2013). Speech synthesis based on Hidden Markov Models. Proceedings of the IEEE, 101(5), 1234–52.

Toman, M. (2016). Transformation and Interpolation of Language Varieties for Speech Synthesis. Thesis, Vienna University of Technology.

Toman, M., Pucher, M. & Moosmüller, S. (2015). Unsupervised and phonologically controlled interpolation of Austrian German language varieties for speech synthesis. Speech Communication, 72, 176–93.

Toman, M., Meltzner, G. S. & Patel, R. (2018). Data requirements and augmentation for DNN-based speech synthesis from crowdsourced data. In Proceedings of INTERSPEECH 2018, Hyderabad, pp. 2878–82.

van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A. et al. (2016). WaveNet: A Generative Model for Raw Audio. arXiv, 1609.03499.

Walton, J. & Orlikoff, R. (1994). Speaker race identification from acoustic cues in the vocal signal. Journal of Speech, Language, and Hearing Research, 37(4), 738–45.

Wang, Y., Skerry-Ryan, R. J., Stanton, D., Wu, Y., Weiss, R. J., Jaitly, N. et al. (2017). Tacotron: Towards end-to-end speech synthesis. Proceedings of INTERSPEECH 2017, Stockholm, pp. 4006–10.

Young, S. (2010). Cognitive user interfaces. IEEE Signal Processing Magazine, 27(3), 128–40.

Zen, H., Senior, A. & Schuster, M. (2013). Statistical parametric speech synthesis using deep neural networks. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7962–6.

Zen, H., Agiomyrgiannakis, Y., Egberts, N., Henderson, F. & Szczepaniak, P. (2016). Fast, compact, and high quality LSTM-RNN-based statistical parametric speech synthesizers for mobile devices. In Proceedings of INTERSPEECH 2016, San Francisco, pp. 2273–7.

Zuckerman, M. & Miyake, K. (1993). The attractive voice: What makes it so? Journal of Nonverbal Behavior, 17(2), 119–35.