This chapter reviews a classical problem, perception of tones, and suggests that our understanding of this topic may be enhanced by considering it as part of a larger topic: that of perception of acoustic repetition. As we shall see, periodic sounds repeated at tonal and infratonal frequencies appear to form a single perceptual continuum, with study in one range enhancing understanding in the other.
Terminology
Some terms used in psychoacoustics are ambiguous. The American National Standards Institute (ANSI, 1976/1999) booklet Acoustical Terminology defines some basic technical words as having two meanings, one applying to the stimulus and the other to the sensation produced by the stimulus. Confusing the terms that describe stimuli with those that describe their sensory correlates is an old (and continuing) source of serious conceptual error, a danger that in 1730 led Newton (1952, p. 124) to warn that it is incorrect to use such terms as red light or yellow light, since “... the Rays to speak properly are not coloured.” However, the ANSI definitions for the word “tone” reflect current usage, and state that the word can refer to: “(a) Sound wave capable of exciting an auditory sensation having pitch. (b) Sound sensation having pitch.” The ANSI definitions for the word “sound” formalize a similar ambiguity, with the same term denoting both stimulus and sensation. Both terms will be used here to describe only the stimuli.
The comprehension of speech and the appreciation of music require listeners to distinguish between different arrangements of component sounds. It is often assumed that the temporal resolution of successive items is required for these tasks, and that a blurring and perceptual inability to distinguish between permuted orders takes place if sounds follow each other too rapidly. However, there is evidence indicating that this common-sense assumption is false. When components follow each other at rates that are too rapid to permit the identification of order or even the sounds themselves, changes in their arrangement can be recognized readily. This chapter examines the rules governing the perception of sequences and other stimulus patterns, and how they apply to the special continua of speech and music.
Rate at which component sounds occur in speech and music
Speech is often considered to consist of a sequence of acoustic units called phones, which correspond to linguistic units called phonemes (the nature of phonemes will be discussed in some detail in Chapter 7). Phonemes occur at rates averaging more than 10 per second, with the order of these components defining syllables and words. Conversational English contains on average about 135 words per minute, and since the average word has about 5 phonemes, this corresponds to an average duration of about 90 ms per phoneme (Efron, 1963). It should be kept in mind that individual phonemes vary greatly in duration, and that the boundaries separating temporally contiguous phonemes are often not sharply defined.
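The arithmetic behind the figure of about 90 ms per phoneme can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the inputs are simply the averages quoted above (135 words per minute, about 5 phonemes per word).

```python
def mean_phoneme_duration_ms(words_per_minute: float, phonemes_per_word: float) -> float:
    """Average duration of one phoneme, in milliseconds (hypothetical helper)."""
    phonemes_per_second = words_per_minute * phonemes_per_word / 60.0
    return 1000.0 / phonemes_per_second


# Averages quoted in the text for conversational English.
rate_per_second = 135 * 5 / 60.0
duration_ms = mean_phoneme_duration_ms(135, 5)

print(f"{rate_per_second:.2f} phonemes per second")
print(f"{duration_ms:.1f} ms per phoneme")
```

The result is an average only; as noted above, individual phonemes vary greatly in duration.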
This chapter provides a brief introduction to the physical nature of sound, the manner in which it is transmitted and transformed within the ear, and the nature of auditory neural responses.
The nature of auditory stimuli
The sounds responsible for hearing consist of rapid changes in air pressure that can be produced in a variety of ways – for example, by vibrations of objects such as the tines of a tuning fork or the wings of an insect, by puffs of air released by a siren or our vocal cords, and by the noisy turbulence of air escaping from a small opening. Sound travels through the air at sea level at a velocity of about 335 meters per second, or 1,100 feet per second, for all but very great amplitudes (extent of pressure changes) and for all waveforms (patterns of pressure changes over time). Special interest is attached to periodic sounds, or sounds having a fixed waveform repeated at a fixed frequency. Frequency is measured in hertz (Hz), or numbers of repetitions of a waveform per second; thus, 1,000 Hz corresponds to 1,000 repetitions of a particular waveform per second. The time required for one complete statement of an iterated waveform is its period. Periodic sounds from about 20 through 16,000 Hz can produce a sensation of pitch and are called tones.
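The relation between frequency and period described above can be made concrete with a minimal sketch. The function names are hypothetical. The wavelength calculation uses the standard relation wavelength = velocity / frequency, which the text does not state explicitly, and the tonal range is the approximate 20 to 16,000 Hz range given above.

```python
SPEED_OF_SOUND_M_PER_S = 335.0  # approximate sea-level value quoted in the text


def period_ms(frequency_hz: float) -> float:
    """Time for one repetition of a periodic waveform, in milliseconds."""
    return 1000.0 / frequency_hz


def wavelength_m(frequency_hz: float, speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Distance travelled during one period (assumed relation: speed / frequency)."""
    return speed_m_per_s / frequency_hz


def is_tonal(frequency_hz: float) -> bool:
    """True if the frequency lies in the approximate range that can evoke pitch."""
    return 20.0 <= frequency_hz <= 16000.0


for f in (20.0, 1000.0, 16000.0):
    print(f"{f:>8.0f} Hz: period {period_ms(f):.4f} ms, wavelength {wavelength_m(f):.4f} m")
```

For a 1,000 Hz tone, the period is 1 ms and, at 335 meters per second, the wavelength is roughly a third of a meter.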
Earlier chapters dealing with nonlinguistic auditory perception treated humans as receivers and processors of acoustic information. But when dealing with speech perception, it is necessary also to consider humans as generators of acoustic signals. The two topics of speech production and speech perception are closely linked, as we shall see.
We shall deal first with the generation of speech sounds and the nature of the acoustic signals. The topic of speech perception will then be described in relation to general principles, which are applicable to nonspeech sounds as well as to speech. Finally, the topic of special characteristics and mechanisms employed for the perception of speech will be examined.
Speech production
The structures used for producing speech have evolved from organs that served other functions in our prelinguistic ancestors and still perform nonlinguistic functions in humans.
It is convenient to divide the system for production of speech into three regions (see Figure 7.1). The subglottal system delivers air under pressure to the larynx (located within the Adam's apple) which contains a pair of vocal folds (also called vocal cords). The opening between the vocal folds is called the glottis, and the rapid opening and closing of the glottal slit interrupts the air flow, resulting in a buzz-like sound. The buzz is then spectrally shaped to form speech sounds or phonemes by the supralaryngeal vocal tract having the larynx at one end and the lips and the nostrils at the other.
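The two-stage description above, a glottal buzz that is then spectrally shaped by the vocal tract, can be illustrated with a deliberately crude sketch. Everything in it is an assumption made for illustration rather than a model taken from the text: the buzz is approximated by an impulse train, the vocal tract by a single two-pole resonator, and the sample rate, fundamental frequency, resonance frequency, and bandwidth are arbitrary example values.

```python
import math


def glottal_buzz(f0_hz: int, sample_rate_hz: int, n_samples: int) -> list:
    """Impulse train: one unit pulse per glottal cycle (crude stand-in for the buzz)."""
    period = sample_rate_hz // f0_hz  # samples per cycle
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal: list, centre_hz: float, bandwidth_hz: float, sample_rate_hz: int) -> list:
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)  # pole radius, below 1
    theta = 2.0 * math.pi * centre_hz / sample_rate_hz      # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


SAMPLE_RATE_HZ = 16000  # assumed example value
buzz = glottal_buzz(f0_hz=100, sample_rate_hz=SAMPLE_RATE_HZ, n_samples=1600)
shaped = resonator(buzz, centre_hz=500.0, bandwidth_hz=80.0, sample_rate_hz=SAMPLE_RATE_HZ)

print(f"pulses in 0.1 s: {int(sum(buzz))}")
print(f"peak amplitude after shaping: {max(abs(v) for v in shaped):.3f}")
```

A real vocal tract has several resonances that change as the articulators move; a single fixed resonator only shows how the same periodic source can be reshaped spectrally without altering its repetition rate.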
Bird songs are among the most beautiful, complex sounds produced in the natural world and have inspired some of our greatest poets and composers. Whilst biologists are equally impressed, their curiosity is also aroused. How and why has such an elaborate form of communication developed among birds? Charles Darwin was one of many who struggled to attempt an answer, and the elaborate songs of male birds such as nightingales clearly influenced his thinking as he developed the theory of sexual selection. Since then, biologists from many different disciplines, ranging from molecular biology to ecology, have found bird song to be a fascinating and productive area for research. The scientific study of bird song has made important contributions to such areas as neurobiology, ethology and evolutionary biology. In doing so, it has generated a large and diverse literature, which can be frustrating to those attempting to enter or survey the field. At the moment, the choice is largely between wrestling with the original literature or tackling advanced, multi-author volumes. Although our book is aimed particularly at students of biology, we hope that our colleagues in different branches of biology and psychology will find it a useful introduction. We have also tried to make it accessible to the growing numbers of ornithologists and naturalists who increasingly want to know more about the animals they watch and study.