7.1 Introduction
The production and perception of acoustic communication signals in different species, including humans, have intrigued scientists all over the world and from diverse research fields such as linguistics, biology, and psychology. Some questions are important to answer in all fields. One is about the temporal structure of acoustic communication, and how it is used to convey information. These temporal structures or rhythms can act on different levels, within a phrase or between phrases, and they can also help in determining phrase boundaries. Perspectives on these connected questions are manifold, and cross-talk between the three disciplines is often limited. In this chapter, we are advocating for more cross-talk using two prosodic markers connected to rhythms as examples to showcase the advantages of combining linguists’ and biologists’ knowledge about the respective phenomena.
Rhythms can be defined in various ways (for an overview, see Turk and Shattuck-Hufnagel, 2013). One example is given by Patel, where rhythm is the “systematic patterning of sounds in terms of timing, accent and grouping” (2008:96). Building on this, we consider rhythm as a nonrandom, ordered, predictable, and repeated alternation of different elements in a sequence. The motivation to use this definition is to be relatively independent of theoretical concepts in one or the other research domain. We think a broader perspective including humans and nonhuman animals is important because it allows us to better understand the underlying principles and may root rhythms of human language in evolution.
According to this definition, the building blocks of rhythm are elements in a sequence. Even if it is a challenge on its own (Fletcher, 2010), determining these elements in a known human language, that is, linguistic units such as syllables, words, and prosodic phrases, might be more straightforward than determining the building blocks of rhythm in nonhuman vocalizations. Among other approaches, rhythmic markers might be a way to tackle this issue in nonhuman animals. We will focus on two prosodic markers of rhythm, fundamental frequency (f0) declination and final lengthening. Both are frequently found in human communication and have also been reported in nonhuman communication.
Final lengthening refers to a phenomenon where a final or penultimate syllable of an utterance or prosodic phrase is produced with a longer duration than when the same syllable is uttered within an utterance or prosodic phrase (Fletcher, 2010) (Figure 7.1A). Lengthening can signal the end of a unit, and its degree varies nonlinearly with the strength of the boundary (Kentner et al., 2023). It can co-occur with pauses and f0 lowering (Petrone et al., 2017). On the perceptual side, it is a phonetic signature that might help to mark the end of a unit (Schel et al., 2009).
Figure 7.1 Explanation of f0 declination and final lengthening in two examples.
A) Oscillogram of human language. The spoken text is: “Always there had been war between the giants and the gods.” The duration of the “s” of “giants” and “gods” is annotated, and the final “s” is much longer, demonstrating final lengthening. B) The spectrogram for the same human sentence is shown with a solid line indicating the f0. A clear f0 decline is visible. C) An oscillogram of a budgerigar twittering. D) The corresponding spectrogram of the twittering. The f0 is shown as an extra solid line, and an f0 decline is visible.

Figure 7.1 Long description
Panel A: A waveform, showing the sound's amplitude over time. Darker areas represent louder parts of the sound. A small graphic of a person is depicted at the top right. Panel B: A spectrogram, showing the sound's frequency over time. Darker areas represent higher sound energy at a given frequency. A small graphic of a person is depicted at the top right. Panel C: Another waveform, similar to panel A, along with a bird graphic at the top right. Panel D: Another spectrogram, similar to panel B, along with a bird graphic.
F0 declination has been widely discussed in phonological (Ladd, 1988, 2008) and phonetic terms (Strik and Boves, 1995). We broadly define f0 declination as “the gradual decrease of f0 throughout an utterance” (Fuchs et al., 2015:35) to be as inclusive as possible concerning nonhuman animal communication (Figure 7.1B and D). F0 declination is common in statements as opposed to questions. From a phonetic perspective, it is calculated as a linear regression through f0 values in a given temporal window (e.g., 1–4 s in Yuan and Liberman, 2014:69) that often corresponds to interpausal units or annotated prosodic phrases. The linear regression slope is negative by definition (f0 decline) and flattens with utterance length (e.g., Cooper and Sorensen, 1981; Swerts et al., 1996; Fuchs et al., 2015).
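The regression-based measure described above can be illustrated with a minimal sketch. It assumes that f0 has already been extracted as one value per analysis frame, that unvoiced frames are coded as 0 or None, and that f0 is expressed in Hz (a semitone scale could be substituted). The function name, the parameter names, and the 1-second default for the minimum window are illustrative assumptions, not the procedure of any of the cited studies.

```python
from typing import Optional, Sequence, Tuple


def declination_slope(
    times_s: Sequence[float],
    f0_hz: Sequence[Optional[float]],
    min_duration_s: float = 1.0,  # assumed default, loosely following the 1 s lower bound
) -> Tuple[float, float]:
    """Fit a least-squares line through (time, f0) points.

    Returns (slope in Hz per second, intercept in Hz).
    A negative slope corresponds to f0 declination.
    """
    if len(times_s) != len(f0_hz):
        raise ValueError("times_s and f0_hz must have the same length")

    # Keep voiced frames only (assumption: unvoiced frames are 0 or None).
    points = [(t, f) for t, f in zip(times_s, f0_hz) if f]
    if len(points) < 2:
        raise ValueError("need at least two voiced frames")

    # Very short windows are dominated by local f0 movements, so reject them.
    duration = points[-1][0] - points[0][0]
    if duration < min_duration_s:
        raise ValueError("window is shorter than min_duration_s")

    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_f = sum(f for _, f in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in points)
    if sxx == 0:
        raise ValueError("all voiced frames share the same time stamp")

    slope = sxy / sxx
    intercept = mean_f - slope * mean_t
    return slope, intercept


# Toy utterance: f0 falls from 220 Hz to 180 Hz over 2 s, one unvoiced frame.
toy_times = [0.0, 0.5, 1.0, 1.5, 2.0]
toy_f0 = [220.0, 210.0, 0.0, 190.0, 180.0]
toy_slope, toy_intercept = declination_slope(toy_times, toy_f0)
```

In the toy utterance the voiced frames fall on a straight line, so the fitted slope is -20 Hz per second. For nonhuman vocalizations the minimum window would probably have to be adapted to the species.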
Together, f0 declination and final lengthening can create a sense of units in a rhythmic sequence: They help to signal boundaries between units and convey information about the length of a unit, and such units may be repeated over time to produce temporally structured events.
There is a lot of potential in studying this in a comparative way between species. Linguistics could help to answer long-standing questions in animal communication, for example using knowledge of human languages to determine phrase boundaries or meaningful units. The other way around, we can use knowledge about animal communication and the opportunities we might have when studying a wide variety of animal species with different cognitive abilities or physical constraints to solve debates in linguistics on how to delineate phenomena influenced by cognitive abilities or biophysical mechanisms. There might be universal underlying motor principles in humans and nonhuman animals that we can only find when studying and comparing both.
We can observe variations in pitch and timing in nonhuman animal communication similar to prosodic features in human speech (Briefer, 2012; Hotchkin and Parks, 2013; Filippi, 2016). Oftentimes, these changes are caused by physiological alterations (for example, the emotional state can influence muscle tension [Briefer, 2012]). Different kinds of information can be conveyed by those changes, whether it be individual identity in birds (and many other species, e.g., Linhart et al., 2022) or context in primates (Crockford et al., 2018), among others.
Many recently published papers report more and more similarities between human and nonhuman animal communication, for example that both penguins and gibbons adhere to Zipf’s Law of Brevity, according to which the most frequent elements in communication are shorter in duration (Favaro et al., 2020; Huang et al., 2020; Valente et al., 2021). In biology, more research is needed to explore the extent and complexity of prosodic features in nonhuman communication. Animal communication may resemble human prosody, but there are important differences as far as we know now: Many animal vocalizations are innate, not learned through acquisition, and may be less flexible and expressive than human speech (Janik and Slater, 2000; Tyack, 2019). At the same time, the spectrum between innate, adjusted, and learned vocalizations gives interesting options to study the prerequisites for prosodic phenomena.
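As an illustration of what adherence to Zipf’s Law of Brevity means in practice, the following sketch relates how often each element type occurs to its mean duration, using a Spearman rank correlation. A negative coefficient is in line with the law. The input format (a list of label and duration pairs) and all function names are assumptions made for this example, and the statistic is not necessarily the one used in the cited studies.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / (sxx * syy) ** 0.5


def brevity_correlation(elements: Sequence[Tuple[str, float]]) -> float:
    """Spearman correlation between element-type frequency and mean duration."""
    durations: Dict[str, List[float]] = defaultdict(list)
    for label, duration_s in elements:
        durations[label].append(duration_s)
    labels = sorted(durations)
    counts = [float(len(durations[label])) for label in labels]
    means = [sum(durations[label]) / len(durations[label]) for label in labels]
    # Spearman's rho is the Pearson correlation of the ranks.
    return pearson(average_ranks(counts), average_ranks(means))


# Toy repertoire: the most frequent type "a" is also the shortest.
toy_elements = [("a", 0.10)] * 5 + [("b", 0.20)] * 3 + [("c", 0.40)]
toy_rho = brevity_correlation(toy_elements)
```

In the toy repertoire the ranking by frequency is the exact reverse of the ranking by duration, so the coefficient is -1. Real repertoires would need many more element types for the coefficient to be informative.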
To illustrate another field that would benefit greatly from the comparative approach, we will focus on acoustic signals of humans and nonhuman animals using two concrete phenomena: final lengthening and f0 declination. These two phonetic markers have been chosen because they are involved in determining the units that can form a rhythmic sequence. Final lengthening is a phonetic signature of the end of a phrase and signals a boundary, while the slope of f0 declination can signal the length of an entire phrase. Both are frequent in human communication, and they have been considered as being universal even if language-specific modifications may be found (Fletcher, 2010). There have been debates about the origin of prosodic phenomena in general, for example whether they rely on linguistic representations or general physical properties. Similar to other scientists (Pika et al., 2018; Matzinger and Fitch, 2021; Pouw and Fuchs, 2022; Hersh et al., 2023; Hoeschele et al., 2023), we strongly believe that more interdisciplinary cross-fertilization is needed to better understand what properties are shared among human and nonhuman animals in their rhythmic communication. By comparing the use of prosodic features in human and nonhuman animal communication, we can get insights into their evolutionary origins and into the development of speech rhythms. We can also shed light on the cognitive and biological underpinnings of human language and communication. On a different level, the comparative approach can provide a valuable perspective on the diversity of prosodic features used across different languages and language families and therefore might raise new questions in more specific research areas.
We will be able to better identify universal patterns and language-specific variations. The road toward such interdisciplinary exchange is not without obstacles, but the joint venture, in our case between a biologist and a linguist, can enhance the collection of species producing selected phenomena, can initiate new recording and open source databases (Hersh et al., 2023), and may reduce the human-centric view on the evolution of complex acoustic communication and the insider bias when doing comparative work (Hoeschele et al., 2023).
7.2 F0 Declination in Acoustic Signals: Human versus Nonhuman Animals
7.2.1 Characteristics of F0 Declination in Human Speech
F0 declination, the gradual decrease of f0 throughout an utterance, spans several words in human language and is a macro rhythm (a rhythm with longer time units) in human speech. Strik and Boves (1995), among others, proposed to start analyzing time units larger than 1 second; otherwise, the calculation of the declination slope might be heavily affected by local f0 variations. F0 declination is not a rhythm in itself, but similar to final lengthening, it is a prosodic signature of a unit that can be repeated over time and form a rhythm.
Although cross-linguistic work on f0 declination using the same methodology is still missing, the phenomenon has been reported for various languages, such as Danish, Dutch, English, French, German, Greek, Japanese, and Spanish (see Fuchs et al., 2015, for an overview), so there is reason to believe it is a relatively robust phenomenon (Hauser and Fowler, 1992), at least in Indo-European languages. However, it is not mandatory and can be modulated; for example, f0 can rise at the end of a phrase (called continuation rise) to signal the speaker’s motivation to continue talking, or it can rise signaling question intonation. There are also language specificities as well as other factors. For example, Lieberman et al. (1985) found more negative slopes in reading than in spontaneous speech.
A potential physiological origin of this phenomenon has been postulated. Lieberman (1967) measured subglottal pressure and f0 declination in three human participants reading declarative sentences and found a positive correlation between the two variables. He therefore suggested a potential origin in respiratory behavior, because respiration is a driving force of phonation. Others have argued that muscular tension in the vocal folds may be at the origin of f0 declination (e.g., Ohala, 1978) and that tensioning of the vocal folds is independent of respiration because the primary function of the larynx is to save lives and protect the lungs from foreign bodies, hence it must be independent and quick (Ohala, 1990). The challenge for or against one or the other argument or a mix of the two is that measuring subglottal pressure and laryngeal tension is very invasive, so empirical data is limited.
Apart from the physiological origin of f0 declination, cognitive processes seem plausible as well. Since the slope of the f0 declination is correlated with the length of the upcoming utterance, some anticipatory planning may be involved (Yuan and Liberman, 2014).
7.2.2 F0 Declination in Nonhuman Animal Communication
We occasionally find descriptions hinting at f0 declination in monkeys. For example, in a description of the vocalizations of the black and white colobus monkey, we find the following: “The final phrase is often deeper pitched than the others” (Marler, 1972:181). This refers to the alarm call “roar.” This led Schel et al. (2009) to hypothesize that this phenomenon is perceptually conspicuous and marks the end of the sequence. In colobus monkeys a roaring sequence consists of one or more roaring phrases, where a phrase is a basic unit, made up of ~15 “pulses,” each with an average duration of 0.7 seconds, which makes a whole phrase around 10 seconds long (Marler, 1972).
Confusingly, the exact opposite was also reported for colobus monkeys, at least for the black colobus monkey (Colobus satanas): “Initial phrases all decreased in pitch during delivery, the terminal phrases all increased” in pitch (Oates and Trocco, 1983:100). This could have different explanations: The phenomenon could depend on unknown parameters that differed between the studies, or the decline in final phrases may simply have been observed in a different colobus monkey species.
The most detailed study on f0 declination in nonhuman animals was conducted in vervet monkeys (Cercopithecus aethiops) and rhesus macaques (Macaca mulatta), two very common model species (Hauser and Fowler, 1992). In both species, vocal production shows f0 declination, which is suggested to serve a similar communicative function as in human language (Hauser and Fowler, 1992). Under investigation were vocalizations uttered during aggressive interaction for vervet monkeys in the wild and an affiliative vocalization for wild rhesus macaques. For both species an almost linear decline in f0 could be shown over call bouts of two calls and of three calls. Furthermore, for two-call bouts of vervet monkeys, a correlation between the duration of the bout and the f0 decline could be shown, as expected from the human language literature (e.g., Yuan and Liberman, 2014). No correlation could be found between the duration of the bout and the magnitude of the f0 decline in rhesus macaques. It is also worth noting that the structured decline of f0 in vervet monkeys could only be seen in adults, not in juveniles. This could be explained by the fact that inter-call intervals in juveniles are generally longer, indicating that juveniles, in contrast to adults, might take a breath between calls. This makes it even more interesting to study the phenomenon further.
Another instance of f0 declination being reported in monkeys stems from baboons, where f0 is also reported to decline within bouts of calling. This is described to be independent of rank and age in a species where f0 generally is highly correlated to these two parameters (Fischer et al., 2004), making it a more general phenomenon worthy of further investigation.
In birds, f0 decline was found in the vocalizations of the budgerigar (Melopsittacus undulatus). Here, “mean F0 measurements were lower for segments in syllable-final position when compared to medial segments” (Mann et al., 2021:6, Figure 4 caption).
7.2.3 Comparing F0 Declination in Human and Nonhuman Animal Communication
The number and detail of papers published on f0 declination are clearly more substantial in the linguistic domain, which should not imply that this is an exclusively human phenomenon. Papers published on f0 declination in animal communication include primates and birds. They often lack empirical work on the underlying mechanisms (this is different for final lengthening; see below). While papers on animal communication are a stepping stone toward a broader perspective, they are mostly descriptive.
Whether or not f0 declination is primarily the result of a decrease in subglottal pressure, a reduction in laryngeal tension throughout an utterance, a marker of anticipatory planning of an utterance, or a mixture of these is still unclear. The invasiveness of recording data in favor of the first two explanations limits the empirical evidence in humans.
There may also be some challenges when comparing humans and nonhuman animals. The minimum length of an utterance that can be considered for calculating the f0 declination slope may have to be adjusted for various animal species, and the same holds for the terminology used. For a linguist unfamiliar with animal communication, the term “syllable” may have a very different connotation than for a biologist, for whom it may be clear that this is an utterance between silent pauses. Joint venture investigations on humans and nonhuman animals might also reveal deeper insights because they differ in their respiratory, vocal, and cognitive repertoire.
7.3 Final Lengthening: Humans versus Nonhuman Animals
7.3.1 Final Lengthening in Human Speech
Phonetic studies in a variety of languages found that final lengthening is a reliable phonetic marker determining the end of a speech chunk and is very pronounced next to a following pause (e.g., Klatt, 1976; Edwards et al., 1991; for a review on languages, see Paschen et al., 2022). There have also been considerations that the lengthening of the final segment is part of the pause (e.g., Krivokapić et al., 2022) or can, in extreme cases, be produced instead of a pause in fast speech. Indeed, a pause is an important determiner of rhythm.
Final lengthening has been claimed to be universal in human language (Fletcher, 2010) with additional language specificities (e.g., Nakai et al., 2009). Since the term “universal” can be ambiguous in meaning (Bickel, 2011), we refer to a statistical universal here, which relies on robust statistical evidence across languages but also allows for exceptions. Only recently has empirical evidence, obtained with the same methodology for 25 mostly understudied languages, shown that the lengthening of vowels is a statistically robust cross-linguistic phenomenon (Paschen et al., 2022). Language-specific variations, driven by phonological vowel length, were found as well. Sound-specific variations have also been reported, for example by Berkovits (1993) for Hebrew, who described stronger lengthening effects for final fricatives than for stops. Paschen (2023) provided evidence for Lower Sorbian that lengthening occurs in vowels, sonorants, and fricatives, but not in stops. While these latter segmental influences may be specific to human language, they may also give some hints that continuous airstream mechanisms make final lengthening more likely.
The degree of lengthening has been extensively discussed. For example, the pi-gesture model (Byrd and Saltzman, 2003) proposes that the closer segments are to the boundary, the more they are lengthened. Moreover, lengthening varies with the boundary type. Final segments at major boundaries, for example at the end of a sentence, may be longer than segments at phrase boundaries within a sentence. While language-specific variants exist, this does not exclude the assumption that the underlying principles are physical in nature but have been shaped in various ways by the properties of the sounds and the users of individual languages.
What are the underlying mechanisms that may cause final lengthening? Do we also find them in other behavior or other species?
7.3.2 Final Lengthening in Nonhuman Animal Communication
Final lengthening is getting increased attention in nonhuman animal acoustic studies. It was found in birds and primates. Like f0 declination, it can be found in the budgerigar, a vocal learning parrot. It was reported that segments at the end of vocalizations were more likely to be longer. In 14 adult budgerigars, it was found that segments in syllable-final positions are on average longer than medial segments (Mann et al., 2021). The budgerigar, thus, is the only nonhuman species so far where both final lengthening and f0 declination were observed. This likely reflects a sampling bias, as these phenomena have not been studied in many species. In another study on 80 different songbirds, the same could be shown: Song-final notes were significantly longer than nonfinal notes (Tierney et al., 2011). This is especially impressive as the analysis was not conducted per species but across all species, indicating this to be a generally observable phenomenon in songbirds. A more detailed analysis to find possible differences between families would be interesting. Both papers argue that final lengthening can be observed in these birds as well as in humans because of similar motor constraints. Both humans and songbirds show high control of their vocal articulators with the possibility to rapidly adjust them during vocal production. Nevertheless, abrupt termination of these movements might be difficult, as opposed to a gradual relaxation and therefore slowing of articulators, resulting in final lengthening. This argument is further strengthened by the fact that budgerigar segments in particular are produced within a single breath (Tierney et al., 2011; Mann et al., 2021). It would be interesting to investigate respiratory kinematics and final lengthening in human speech production.
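The comparison of final and nonfinal elements described above can be sketched as a simple duration ratio. The sketch assumes that each unit (a song, a syllable, or a phrase, depending on the species and the terminology chosen) has already been segmented into elements with known durations. The function names and the toy values are illustrative assumptions. Unlike the studies on human speech, the sketch does not control for the identity of the element that is compared across positions.

```python
from statistics import mean
from typing import Sequence


def final_lengthening_ratio(durations_s: Sequence[float]) -> float:
    """Duration of the final element divided by the mean duration
    of all nonfinal elements in the same unit.

    A ratio above 1 indicates that the final element is longer.
    """
    if len(durations_s) < 2:
        raise ValueError("need a final element and at least one nonfinal element")
    return durations_s[-1] / mean(durations_s[:-1])


def mean_lengthening_ratio(units: Sequence[Sequence[float]]) -> float:
    """Average the per-unit ratios over several units, e.g. several songs."""
    return mean(final_lengthening_ratio(unit) for unit in units)


# Toy data: note durations in seconds for two hypothetical songs.
toy_songs = [
    [0.10, 0.12, 0.08, 0.20],  # final note about twice as long as the others
    [0.10, 0.10, 0.15],        # final note 1.5 times as long as the others
]
toy_ratio_first = final_lengthening_ratio(toy_songs[0])
toy_ratio_mean = mean_lengthening_ratio(toy_songs)
```

A ratio per unit keeps the measure independent of overall tempo, which differs widely between species. Whether the ratios differ reliably from 1 would still have to be tested statistically.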
Final lengthening was also found in at least three different primate species: two crested gibbon species and the indri (Huang et al., 2020; Valente et al., 2021). In gibbons, this effect was found on two different structural levels, in vocal sequences and in bouts (where a vocal sequence is a short unit and a bout is made up of several sequences). The suggested reasons differ from those discussed in humans. The authors suggest a connection to males advertising their quality; that is, gibbons might be modulating the frequency of their calls rapidly at the end of sequences to advertise their individual quality to females as potential mating partners. An increase in frequency modulations, even though fast, would potentially lead to an increase in the duration of notes (Huang et al., 2020). If this is true, the observed phenomenon of final lengthening in crested gibbons would only be a byproduct of other processes.
7.3.3 Comparing Final Lengthening in Human and Nonhuman Animal Communication
There are different explanations for final lengthening in the literature. It has been understood as a general motor property rather than being innate. Tierney et al. (2011) attribute it to the energy efficiency of the underlying motor actions in humans and singing birds. As mentioned above, Huang et al. (2020) explain it as a byproduct rather than a phenomenon on its own. Matzinger and Fitch (2021) mention the possibility that the slowing down of articulators could be a result of a change from exhalation to inhalation; that is, respiratory dynamics and the occurrence of a breathing pause cause final lengthening. We think that it could be a plausible explanation for those species that produce segments within one breath, but in humans, final lengthening can also be found without a change from exhalation to inhalation. Nevertheless, the relation of one breath to one segment (or one phrase in human communication) may have been at the origin of human language evolution. In spontaneous interactive dialogues, more than 50% of all turns consisted of only one breathing cycle (Rochet-Capellan and Fuchs, 2014). Linguistic studies have mostly focused on language- and segment-specific properties in the implementation of final lengthening. Because the number of published papers on these specificities is so much larger than in animal communication, one may implicitly assume that it is a phenomenon of human language.
Even if language variations exist, this does not exclude the possibility of an underlying motor principle. Humans and nonhuman animals clearly have motor constraints, but these may not be persistent under each and every situation, and all animals may also be able to compensate for them if needed. Monocausalities are rather rare in biological systems. For example, the available exhalation air in human speech may be physically constrained by the lung volume (vital capacity) of a speaker. In human and nonhuman acquisition, a high positive correlation between lung volume and utterance/vocalization length has been found (see the review in Fuchs and Rochet-Capellan, 2021). Evidence for a similarly strong correlation between utterance length and vital capacity in adults is missing, because humans with smaller lung capacities may adjust their laryngeal resistance and lose less expiratory air to compensate for their physical constraints.
All in all, evidence for final lengthening in nonhuman animals has changed the perspective that the phenomenon is a purely linguistic one to the notion that it might be grounded in general motor constraints.
7.4 Future Directions
7.4.1 A Roadmap
In the venture of finding the biological underpinnings of prosodic phenomena such as f0 declination and final lengthening through a comparative approach combining knowledge and advantages from studying human and nonhuman communication, there are several issues to overcome and steps to take. We will lay out a possible roadmap to achieve this here, with concrete steps.
1) Establishing a common vocabulary and conceptual framework: This touches on issues mentioned earlier, where terms such as “syllable” might be understood differently between linguists and biologists, but also within biology. Clear terminology and glossaries and clear but adjustable definitions and criteria for measuring and analyzing features are important steps.
2) Establishing the necessary data basis: We would need to develop a comprehensive database of vocalizations from a wide range of species to study these and other phenomena comparatively. A database should optimally include both natural or spontaneous and elicited vocalizations. Most importantly, meta-information about the social or ecological contexts is important to know, as well as potential age and sex.
3) Identifying universal and species-specific patterns: Once established, researchers can begin to identify robust patterns in the use of prosodic features across species, as well as species-specific variations, utilizing the database. Eventually, this can help to shed light on the evolutionary and ecological factors that shape the use of these features in different species with different needs and skills.
4) Integrating insights from different disciplines: With a comprehensive knowledge of theories, mechanisms, and constraints in linguistics, biology, psychology, and neuroscience from interdisciplinary collaborations, we can identify new research questions and generate novel insights into the biological and cognitive foundations of communication.
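As a minimal sketch of what one entry in such a database could look like (step 2 of the roadmap), the following record type stores the meta-information named above next to the annotated elements. All class and field names are hypothetical and would need to be agreed upon as part of the common vocabulary in step 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """One annotated element of a vocalization (hypothetical schema)."""
    label: str        # unit type, named according to the shared glossary
    onset_s: float    # start time within the recording, in seconds
    offset_s: float   # end time within the recording, in seconds

    @property
    def duration_s(self) -> float:
        return self.offset_s - self.onset_s


@dataclass
class VocalizationRecord:
    """One recording plus the meta-information needed for comparison."""
    species: str
    recording_id: str
    elicited: bool                   # False for natural or spontaneous vocalizations
    context: str                     # social or ecological context of the recording
    age_class: Optional[str] = None  # left empty when unknown
    sex: Optional[str] = None        # left empty when unknown
    elements: List[Element] = field(default_factory=list)


# Toy entry with made-up values.
toy_record = VocalizationRecord(
    species="Melopsittacus undulatus",
    recording_id="demo-001",
    elicited=False,
    context="vocalizing in a social group",
)
toy_record.elements.append(Element(label="segment", onset_s=0.0, offset_s=0.25))
```

Keeping age and sex optional reflects that this information is often unavailable for free-ranging animals, while species and context are treated as mandatory here.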
7.4.2 Prosodic Rhythm Markers and Respiration
A future endeavor toward a better understanding of prosodic markers of rhythm could be to investigate the interaction between respiration, phonation (voice quality), and anticipatory planning in humans and nonhuman animals. Respiration itself is a biological rhythm, and the duration of breathing cycles can constrain how long an animal can phonate. At the end of long sequences when lung volume is reduced, laryngeal adjustments may be required to continue phonation. While this has been modelled for humans (Zhang, 2016), it may also exist in nonhuman animals. The amount of air inhaled may further shape acoustic properties such as intensity and to some extent f0 (Watson et al., 2003).
Vocalizations in nonhuman primates such as screams or grunts may have specific prosodic features that can be shaped by respiratory control. Birds have more complex respiratory and laryngeal systems than humans, including not only lungs but several additional air sacs that are necessary during avian flight. Phonation is produced with a syrinx that is much closer to the lungs than in humans, at the place where the trachea forks into the lungs. These physiological differences and control mechanisms are, on the one hand, a big challenge for comparative work; on the other hand, they may give us insights into which physiological or cognitive systems can produce these patterns and which features they have. Recent technological developments in thermography allow us to record breathing and acoustics even in free-ranging animals (Demartsev et al., 2022).
7.5 Conclusion
This chapter examined the intersection between biophysical and cognitive constraints on vocal communication. To that end, we investigated what is known about two concrete prosodic phenomena connected to rhythm in human language and nonhuman animal communication: final lengthening and f0 declination. Both prosodic phenomena have been found in different nonhuman animal species, mostly in birds but also in primates. We are sure that more examples will be found in different species. For final lengthening the evidence in nonhuman animals has changed the perspective that the phenomenon is a purely linguistic one to the notion that it might be grounded in general motor constraints. Taking both humans and nonhuman animals into account is clearly an advantage when searching for such explanations.
Summary
Focusing on final lengthening and f0 declination as prosodic rhythm markers in human language and nonhuman animal communication, we explored the intersection of biophysical and cognitive factors. Examples from birds and primates challenge the notion of them being purely linguistic, suggesting a basis in general motor constraints. Further cross-species investigation promises additional insights.
Implications
F0 declination and final lengthening have been discussed as two properties demarcating the units of a rhythm. Descriptions of rhythms in speech communication would benefit from the inclusion of nonhuman animal data, both for a better understanding of the underlying principles and for rooting rhythms of communication in evolution.
Gains
The evidence for f0 declination and final lengthening in nonhuman animals may change our human-centric perspective from viewing them as purely linguistic phenomena to grounding them in general motor constraints and in learned and innate behavior. A potential road toward further comparative analyses has been described.
