Published online by Cambridge University Press: 22 April 2026
With the development of the sound spectrograph there is evidence of renewed interest on the part of linguists in the possibilities of defining in physical terms the elements that linguistics sets up by essentially different criteria. The spectrograph has many obvious advantages over other instruments at the disposal of the investigator of speech. It yields a continuous picture of the frequency–intensity relations over a time interval of about 2.4 seconds, and performs this analysis in well under five minutes. Moreover, it obviates the necessity of relying on the results of laborious and time-consuming analyses of speech sounds originally produced subject to various ‘unnatural’ conditions. The rapidity with which the instrument operates makes it possible to deal with large quantities of analyzed material, while its ability to handle rapidly changing sounds makes it unnecessary to restrict one's study to sung or artificially prolonged vowels, or to regulate the level of pitch or intensity (loudness) at which the sounds to be studied are produced.
1 Material for this study was collected as part of a research program made possible by a grant from the Committee on the Language Program of the American Council of Learned Societies. The writer expresses his indebtedness to Ralph E. Potter, director of research of the Bell Telephone Laboratories, who placed certain facilities at his disposal, as well as to John C. Steinberg and—most especially—to Gordon E. Peterson, both of the Bell Laboratories staff, for their generous assistance. He is indebted also to Zellig S. Harris and Martin Joos for detailed criticism of this paper.
2 For a description of the design and operation of the sound spectrograph see W. Koenig, H. K. Dunn, and L. Y. Lacy, The sound spectrograph, Journal of the Acoustical Society of America 17.19–49 (1946). For reproductions of spectrograms of various speech sounds in various combinations, see R. K. Potter, G. A. Kopp, and H. C. Green, Visible speech (New York, 1947).
3 For basic acoustic phonetic theory and for applications of the sound spectrograph to linguistic problems see Martin Joos, Acoustic phonetics (Language Monograph No. 23, 1948). The present paper assumes an acquaintance with Joos's book, to which reference is hereby made once and for all for definitions of the terms used in this paper, for the necessary acoustic and electric-circuit theory, and for the theory of harmonic analysis implicit in the present discussion.
4 Loudness, an aspect of sound sensation, is not synonymous with intensity, a physical dimension of sound; but the one varies directly with the other. Strictly speaking, loudness depends on the frequency and quality of the sound heard, as well as on its intensity. Pitch and frequency are similarly related. Psycho-acoustics tells us to what extent we can talk about frequency and intensity as if they were identical with pitch and loudness respectively.
5 See for example L. Hjelmslev, Über die Beziehungen der Phonetik zur Sprachwissenschaft, Archiv für vergleichende Phonetik, Vol. 2, Nos. 3 and 4 (1938).
6 Generally it is left to linguists to make statements about the precise meaning of the linear orthography that is conventionally employed. See for example C. F. Hockett, Peiping phonology, JAOS 67.254 (1947).
7 Joos §2.40 ff.
8 Joos, chapter 1 and §§2.0–29.
9 Assuming that these two are practically independent of each other; see Joos §2.12. For a procedure based on this assumption, in which the transmission characteristics of the vocal cavities are investigated by holding the vocal organs in a fixed position and varying the pitch of the glottal tone, see Don Lewis, Vocal resonance, Journal of the Acoustical Society of America (1936).
10 Joos §2.25.
11 In Potter–Kopp–Green, Visible speech, the term ‘bar’ is used where Joos, in agreement with many of the European writers, has ‘formant’. Joos §§2.25–9.
12 See Koenig–Dunn–Lacy, The sound spectrograph, for reproductions of spectrograms made with filters of various widths. Potter–Kopp–Green, Visible speech, deal largely with broad-band spectrograms, while Joos prefers those made with the narrow band width. For discussion of reasons for preferring one to the other see Joos §3.17 and Fig. 32.
13 In accord with the procedure adopted by the Bell Telephone Laboratories staff, the formant frequency is found by measuring to the center of the broad resonance bar and adding 150~ (cycles per second). The center of the broad filter's response curve is lower than that of the narrow filter by that amount (in effect). See Koenig–Dunn–Lacy, JASA 1946, Fig. 13.
14 This is true for the [æ] and [ɛ] vowels, not generally. For spectrograms of ‘pep’ see Potter–Kopp–Green, Visible speech 84–5.
15 As some check on the appropriateness of the procedure, a number of segments chosen at random were measured for formant frequencies at intervals of 1/120th second over the entire time-span of the vowels. Measurements indicated that over a considerable fraction of a segment's duration (roughly the middle third) the variation in the frequencies does not exceed the range of variation for the point-time measurements of all the segments. For one of several possible alternative conventions that might be adopted see Joos §5.16.
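The check described in this note can be sketched as follows. This is only an illustration in Python, not the procedure used in the study: the function name `middle_third_range` and the sample values are hypothetical, and the track is assumed to be a list of formant frequencies read at equal intervals (1/120th second in the note).

```python
def middle_third_range(track):
    """Return (max - min) of the readings in the middle third of a track.

    `track` is a list of formant frequencies (cycles per second) taken at
    equal time intervals across one vowel segment. The slicing rule below
    is an assumption; the note only says "roughly the middle third".
    """
    n = len(track)
    if n < 3:
        raise ValueError("need at least three readings")
    start = n // 3          # drop the first third
    end = n - n // 3        # drop the last third
    middle = track[start:end]
    return max(middle) - min(middle)


# Invented readings for illustration only (not data from the paper).
example_track = [500, 520, 540, 550, 552, 551, 549, 530, 510]
example_range = middle_third_range(example_track)
```

A segment would pass the check if this range is no larger than the spread of the point-time measurements taken over all segments.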
16 The position of formant 3 relative to that of formant 2 appears to be of prime significance as acoustic correlate of r-color: Joos §§4.3 ff.
17 For a discussion of various definitions of probability and randomness and the experimentalist's way out of the circularity involved, see C. West Churchman, Probability theory, Philosophy of Science, Vol. 12, No. 3 (1945).
18 This is analogous to the convention whereby the breadth of a filter's response curve, which theoretically never falls to zero, is measured between the points on the curve at which the ordinate values are 0.707 times the value of the peak. See Joos §1.51.
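The filter convention mentioned in this note can be illustrated with a short Python sketch. It assumes a response curve sampled as two parallel lists (frequencies and amplitudes) with a single peak, and it locates the 0.707 points by linear interpolation between samples; the function name `half_power_bandwidth` and the triangular test curve are hypothetical. The factor 0.707 is approximately 1/√2, so on an amplitude scale these are the half-power points.

```python
def half_power_bandwidth(freqs, amps, level=0.707):
    """Width of a sampled response curve between the points where the
    amplitude equals `level` times the peak amplitude."""
    peak_i = max(range(len(amps)), key=amps.__getitem__)
    threshold = level * amps[peak_i]

    def crossing(i_inside, i_outside):
        # Interpolate linearly between a sample at or above the threshold
        # and its neighbour below the threshold.
        f0, a0 = freqs[i_inside], amps[i_inside]
        f1, a1 = freqs[i_outside], amps[i_outside]
        return f0 + (threshold - a0) * (f1 - f0) / (a1 - a0)

    # Walk outward from the peak while samples stay at or above threshold.
    lo = peak_i
    while lo > 0 and amps[lo - 1] >= threshold:
        lo -= 1
    hi = peak_i
    while hi < len(amps) - 1 and amps[hi + 1] >= threshold:
        hi += 1
    if lo == 0 or hi == len(amps) - 1:
        raise ValueError("curve does not fall below the threshold on both sides")
    return crossing(hi, hi + 1) - crossing(lo, lo - 1)


# Invented triangular curve peaking at 1000 cycles per second.
freqs = [700, 800, 900, 1000, 1100, 1200, 1300]
amps = [0.0, 1 / 3, 2 / 3, 1.0, 2 / 3, 1 / 3, 0.0]
width = half_power_bandwidth(freqs, amps)
```

The same idea, a fixed fraction of the peak standing in for an edge that is never reached, is what the note draws on.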
19 Taking the ratio of the two formant frequencies did not give values useful in this connection.
20 The random error involved in measuring the formant frequencies was found to be about 12~. This means that the true value has a 50–50 chance of being within 12~ of the measured value.
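The "50–50 chance" wording corresponds to what is usually called the probable error. As a hedged illustration, and assuming (the note does not say so) that the measurement errors are normally distributed, the corresponding standard deviation can be computed with Python's standard library. The function names are hypothetical.

```python
from statistics import NormalDist

PROBABLE_ERROR = 12.0  # cycles per second, the figure given in the note


def sigma_from_probable_error(probable_error):
    """Standard deviation of a normal error distribution whose central
    50% interval has half-width `probable_error` (assumes normality)."""
    # The 75th percentile of the standard normal is about 0.6745.
    return probable_error / NormalDist().inv_cdf(0.75)


def chance_within(delta, sigma):
    """Probability that a normal error with standard deviation `sigma`
    lies within +/- `delta` of zero."""
    dist = NormalDist(0.0, sigma)
    return dist.cdf(delta) - dist.cdf(-delta)


sigma = sigma_from_probable_error(PROBABLE_ERROR)
half_chance = chance_within(PROBABLE_ERROR, sigma)
```

Under the normality assumption, a probable error of 12~ corresponds to a standard deviation of a little under 18~.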
21 The relation of one speaker's distinction between [æ] and [ɛ] in the environment p–p to another speaker's distinction between the same segments in the same environment can probably be stated in the same way.
22 Joos §§5.00 ff.
23 Joos §5.22.
24 Joos §5.68.