
Speech rates converge in scripted turn-taking conversations



When speakers engage in conversation, acoustic features of their utterances sometimes converge. We examined how the speech rate of participants changed when a confederate spoke at fast or slow rates during readings of scripted dialogues. A beat-tracking algorithm extracted the periodic relations between stressed syllables (beats) from acoustic recordings. The mean interbeat interval (IBI) between successive stressed syllables was compared across speech rates. Participants’ IBIs were smaller in the fast condition than in the slow condition; the difference between participants’ and the confederate's IBIs decreased across utterances. Cross-correlational analyses demonstrated mutual influences between speakers, with greater impact of the confederate on participants’ beat rates than vice versa. Beat rates converged in scripted conversations, suggesting speakers mutually entrain to one another's beat.
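The two measures named above, the mean interbeat interval and the cross-correlation between speakers, can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: it assumes beat onset times (in seconds) have already been produced by some beat tracker, and the function names, the lag convention, and the per-utterance pairing are hypothetical choices made only for illustration.

```python
from statistics import mean


def interbeat_intervals(beat_times):
    """Differences between successive beat onsets, in seconds."""
    return [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]


def mean_ibi(beat_times):
    """Mean interbeat interval (IBI) of one utterance."""
    return mean(interbeat_intervals(beat_times))


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def lagged_correlation(leader, follower, lag=1):
    """Correlate leader[i] with follower[i + lag] (assumed convention, lag >= 0).

    Each list holds one mean IBI per utterance. A high value suggests the
    follower's rate tracks the rate the leader used `lag` utterances earlier.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = min(len(leader), len(follower) - lag)
    return pearson(leader[:n], follower[lag:lag + n])


# Made-up per-utterance mean IBIs (seconds), for illustration only.
confederate = [0.40, 0.42, 0.38, 0.41, 0.39]
participant = [0.55, 0.50, 0.47, 0.44, 0.43]

# Absolute IBI difference per utterance pair; convergence would show as a decrease.
differences = [abs(c - p) for c, p in zip(confederate, participant)]
```

Comparing lagged_correlation(confederate, participant) with lagged_correlation(participant, confederate) gives a rough sense of which speaker's rate leads the other's, which is the kind of asymmetry the abstract reports.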


Corresponding author

ADDRESS FOR CORRESPONDENCE Caroline Palmer, Department of Psychology, McGill University, 1205 Dr. Penfield Avenue, Montreal, QC H3A 1B1, Canada. E-mail:


