This book is not, as its title may seem to promise, an authoritative survey of an established science. Acoustic phonetics is now in its infancy. The time when general agreement will be reached on its principles is still far in the future. That is, we are at that point of its development where the appropriate sort of publication, according to settled custom, would consist of numerous short articles by various workers, reporting on single discoveries or presenting arguments on single points of theory. Only later, when data and discussion had accumulated to such an extent that the well-known phenomena of convergence had begun to show up, would a summary publication normally be in order, a book which could be trusted as a whole and in all but a small fraction of its details. Such a book was Eduard Sievers' Grundzüge der Phonetik, the culmination of decades of study and theorizing, which will never be truly superseded. The present book is certain to be superseded as a whole, and to be proved wrong or to require restatement at so many points that if the circumstances were at all ordinary there would be no justification for covering the ground that it covers in book form either now or for some years to come. But the present situation in acoustic phonetics is so very extraordinary that immediate book publication of a broad survey appears inescapable. There are two main reasons for this. One is the relative unpreparedness of linguists to deal with acoustic data; the other is the sudden, recent, and unpublicized development of the instruments for laboratory acoustic phonetics.
1 The author of this book had no part in the development of the key instrument—the Acoustic Spectrograph—but accidentally became the first linguist permitted to use it extensively, beginning over two years before the public disclosure of November 9, 1945 (in Science, Life, Time, and other publications). The plan of the book, conceived very early, matured during the 1947 Linguistic Institute at the University of Michigan, where roughly half its content (parts of Chapter 2, half of Chapter 3, and nearly all of Chapter 5) developed out of discussions with members of the Institute, visitors to the Summer Meeting of the Linguistic Society of America, and especially the members and visitors of a course in Instrumental Phonetics conducted by the author. As co-authors in this sense it is a great pleasure to be able to name Bernard Bloch, Y. R. Chao, Ernest F. Haden, Floyd Lounsbury, and W. F. Twaddell as constant contributors, and J. M. Cowan, Charles F. Hockett, Kenneth L. Pike, Viola Waterhouse, and Joshua Whatmough as occasional contributors to the development of the book, with an apology to the rather larger number of persons whose contributions remain without explicit acknowledgment here simply because no minutes were kept. None of these can be held in any way responsible for the form which the book assumed subsequently: only Bloch and Twaddell have had any chance to criticize the version offered here, in each case too late for other than minor revisions.
2 Of perception we know less than we know about the other points listed here, principally because it is so hard to set up hypotheses without becoming mentalistic—that is, because it is so hard to devise crucial experiments; but we know enough to be able to say already that the perception of speech sounds differs in some important way from the perception of any other sound. For example, even though a certain [a] may be precisely matched by a certain trumpet note (this is a question of sound as sound), yet the relation of this [a] to other [a] sounds and to other vowels is totally different from the relation of this trumpet note to other trumpet notes and to the notes of other instruments (this is now a question of perception and involves the brain as well as the ear, and not merely the brain as an organ with certain potentialities but the brain as trained to respond to speech sounds as linguistic signals, and as trained to respond to the trumpet sound as a musical element), so that the trumpet invariant called timbre is essentially different from the linguistically invariant [a] quality, for which a different name has to be found. Cf. §2.25.
3 ‘Talking nonsense’ here does not mean making false statements, nor does it mean using terms unfamiliar to physicists; it means making statements which cannot be converted into meaningful statements (true or false) by any translation of terminology. Discussions of sonority in phonetic literature generally belong in this category.
4 Listening to a person in the next room striking a match, one can hear whether it is a cardboard match struck on a paper folder or a wooden safety-match struck on a box of thin wood. But what would we think of anybody who classified match-lighting sounds into ‘paper’ and ‘box’, or discussed the rigidity of boxes as if it were a quality of sound? Yet it is precisely that procedure which has led to the present confusion and inadequacy in our discussion of American r sounds, to cite one outstanding inadequacy among many. It happens that the r-color question is one which has been pretty well cleared up by the methods of acoustic phonetics (§§4.31 ff.); others will doubtless follow.
5 H. Klinghardt, Artikulations- und Hörübungen, 2nd ed. (Cöthen, 1914).
6 N. S. Trubetzkoy, Zur allgemeinen Theorie der phonologischen Vokalsysteme, TCLP 1.41 (1929): ‘Die Bewegung des Unterkiefers bewirkt verschiedene Grade der Öffnung des Vokals, denen akustisch verschiedene Stufen der Schallfülle entsprechen. Die Bewegungen der Lippen und der Zunge bewirken die Veränderung der Form und des Umfanges (namentlich der Länge) des Ansatzrohres, denen akustisch verschiedene Stufen der Eigentonhöhe entsprechen.’ [‘The movement of the lower jaw produces various degrees of opening of the vowel, to which correspond, acoustically, various degrees of sonority. The movements of the lips and the tongue produce changes in the shape and the size (notably the length) of the vocal tract, to which correspond, acoustically, various levels of characteristic pitch.’] Eigentonhöhe means ‘characteristic pitch’ but is often replaced by Eigenton (literally ‘characteristic note’). Schallfülle properly means ‘sonority’, whereas ‘saturation’ is Sättigung; both words were used by the Prague school, often interchangeably, but also often with the difference that Schallfülle names the putative acoustic feature in the sound as sound, Sättigung the corresponding ‘Vorstellung’ or psychological reality, the latter being the favorite sort of reality in that school.
7 Richard A. S. Paget, Human Speech 42 (table), 45 bottom, and ch. 5 (New York and London, 1930). A. W. de Groot, Phonologie und Phonetik als Funktionswissenschaften, TCLP 3 (1931), deals with the two pitches or formants much as we shall do in this book, but apologetically, for he considers Trubetzkoy's categories to be ‘phenomenologically’ superior.
8 To the outsider it may not seem to matter that the Prague school phonologist spoke of ‘saturation’ when he ought to have spoken of a second pitch, but actually this is a matter of great consequence. Not that a name makes any difference by itself—in the 20th century we ought to be beyond that; but as one of a pair or set, the choice may be crucial. Here the choice of terms implies that the term Eigentonhöhe exhausts the information about placement on the musical scale and subsumes all its linguistically pertinent items—that is, that ‘pitch’ needs to be mentioned only once—so that a second linguistically significant feature must be a non-pitch feature (Schallfülle or Sättigung), which presumably similarly exhausts and subsumes another set of pertinent acoustic items; and of course if a third dimension of linguistically pertinent quality were wanted it would have to be a subsuming of still another set of items, no item being assigned to more than one set. This, as we hope will become abundantly clear to readers of this book, is nonsense by definition (footnote 3), but it clearly is the only way that Trubetzkoy's pair of terms can be interpreted as a pair. And it is pernicious nonsense, because anybody who follows it is thereby debarred from seeking a second and a third feature that can be called a Tonhöhe or pitch as well as the first, and that, as we shall see, is precisely what has to be done in setting up acoustic vowel theory.
9 Bernard Bloch, A set of postulates for phonemic analysis, Lang. 24.3 ff. (1948), especially §0.2 and footnote 6. The elegance of Bloch's results calls for a re-examination of our acoustic data to see if they cannot be otherwise interpreted at certain points; some questions which he left open call for hints from the laboratory; and the weight of some of our laboratory data suggests that certain of his postulates may have to be replaced by others, e.g. his Postulate 11: see our footnote 85. The nearly simultaneous publication of this survey and that article will, it is to be hoped, lead to a friendly contention with immense possibilities for profit. The only regrettable feature of the situation is that there are a few conflicts in terminology. The different uses of the word phoneme will do no harm (our footnote 85); but it should be pointed out that here we find two totally different uses of the words congruent, phase, and aspect. This is nobody's fault, and may be corrected soon; meanwhile it should not be allowed to confuse the picture.
10 The term power will be used frequently where colloquial usage has energy. In technical writings both words are used, but with different meanings. Energy is measured in kilowatt-hours, for example, and we pay one dollar for a certain amount of it. Power is the rate of flow of energy, measured for example in kilowatts. It takes a certain amount of power to keep an electric light burning; it takes a certain amount of energy to keep it burning overnight. It takes a certain amount of energy to pronounce a word; it takes a certain amount of power to keep intoning [a].
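The distinction can be put into a line of arithmetic. The following sketch uses invented wattage figures purely for illustration:

```python
# Energy is power integrated over time; at constant power, energy = power x time.
# The figures here (a 100-watt lamp burning for 8 hours) are illustrative only.
power_watts = 100.0                                   # power: the rate of flow of energy
hours_overnight = 8.0
energy_kwh = power_watts * hours_overnight / 1000.0   # energy, in kilowatt-hours
# The electric company bills for the energy (here 0.8 kWh), not for the wattage.
```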
11 See footnote 10, where energy and power are discriminated.
12 It may seem that there is a loss rather than a profit in undertaking to pursue a number of sinusoids through their history, rather than to pursue one wave described precisely as a whole, say by making an accurate drawing of it. Experience has shown that there is instead a very great gain in dealing with the numerous components rather than the single wave. First, that single wave will change its shape again and again during its history up to the time that it reaches the listener's ear, sometimes with drastic effects upon how it will sound, sometimes (when the change is principally a phase-shift) with no perceptible effect, and these changes in shape cannot be discussed economically by operating on the wave as a whole. Second, they can be discussed economically when the shape has been analysed into sinusoidal components: the fate of a wave at a certain point in its history can always be stated by a summary statement (usually a simple one) of how the fates of components depend on their frequency, e.g. ‘the percentage of transmission is inversely proportional to the sum of the loss constant and the square of the frequency divergence from the center frequency’ for a sharp resonator through which the wave is to pass, a statement from which the resultant wave after the particular transformation can be swiftly determined.
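The quoted rule can be sketched as a function of frequency alone. Only the form of the rule comes from the text; every numerical value below (center frequency, loss constant, the list of components) is invented for illustration:

```python
# Transmission through a sharp resonator, per the rule quoted above:
# inversely proportional to (loss constant + squared divergence from center).
# All numerical values are illustrative; the factor is scaled so that a
# component exactly at the center frequency passes unchanged.
def transmission(freq, center=800.0, loss=10000.0):
    """Transmission factor for a sinusoidal component at `freq` cycles."""
    return loss / (loss + (freq - center) ** 2)

# The fate of the whole wave is stated once, component by component:
components = [400.0, 600.0, 800.0, 1000.0]
filtered = {f: round(transmission(f), 3) for f in components}
# filtered: {400.0: 0.059, 600.0: 0.2, 800.0: 1.0, 1000.0: 0.2}
```

Note that the single summary statement determines the fate of every component at once, which is the economy the footnote claims for the component-wise view.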
13 The exact equivalence—the one-to-one correspondence—of perceived quality and DPF is a cardinal point in acoustic theory. It is known as Ohm's Acoustical Law. The nature or implications of this equivalence will clear up gradually in the course of this book. At this point, however, it may be well to put one statement on record: If a given DPF is modified by perceptibly increasing or decreasing the power at a certain frequency, the original perceptual quality can not be restored by any compensating change at one or more different frequencies. This emphasizes an essential difference between acoustic perception and optical color perception, for if e.g. a shade of brown is slightly but perceptibly modified by removing several weak color components, the original shade can be exactly restored by adding other spectral colors different from those removed. Nothing like this can be done with sound.
14 Of course exact repetition is strictly impossible in phonetics, so that this sounds like an irrelevant remark; however, exact repetitiveness is a fair approximation and a convenient approximate description of what does occur. (Exactly repetitive is a clumsy but explicit paraphrase of the mathematical word periodic; the latter is not used here because it would inevitably be misunderstood—taken in too loose a sense—because of its popular connotations.) The occasion for the remark is that the anharmonic vowel theory of Scripture and others implies incommensurable frequencies in phonetics, and this implication is false, being based on a misunderstanding of the nature of harmonic analysis. See Oskar Vierling, Der Formantbegriff, Annalen der Physik 1936.219–32.
15 If this intermediate disturbance is very much stronger than shown in the figure, the spectrum will show alternate (even-numbered) harmonics stronger, a phenomenon which turns up occasionally in some speakers.
16 Something very like this (only slightly modified by resonance) has been observed with a tiny microphone inserted through a fistula in the throat just above the glottis, and another even more cogent observation has been communicated privately by Prof. Oskar Vierling of the Hannover Technische Hochschule.
17 This does not exclude the possibility that some other glottis adjustment, besides the pitch adjustment, may habitually accompany the utterance of particular vowels, e.g. the ‘creaky voice’ of some languages. All we are doing here is postulating the essential independence of two mechanisms; but of course any two mechanisms both controlled by the same brain may have their behavior correlated.
18 It is worth while to make sure that this sentence has been fully understood by setting up an artificial problem and working out the arithmetic. For example: glottal spectrum with 1 microwatt of power at 600~, 0.9 microwatt at 800~, 0.8 microwatt at 1000~; filter with 40 percent transmission at 600~, 80 percent at 800~, 30 percent at 1000~. It is understood that the filter has a characteristic percentage of transmission at every frequency (e.g. 75 percent at 780~), but this does not require mention because there is no glottal power between the frequencies named.
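The arithmetic of the problem just stated can be worked through directly; this sketch simply multiplies glottal power by filter transmission at each named frequency:

```python
# Output power = glottal power x filter transmission, frequency by frequency.
glottal_uw = {600: 1.0, 800: 0.9, 1000: 0.8}     # microwatts at each frequency
filter_frac = {600: 0.40, 800: 0.80, 1000: 0.30} # transmission fractions
output_uw = {f: round(glottal_uw[f] * filter_frac[f], 2) for f in glottal_uw}
# output_uw: {600: 0.4, 800: 0.72, 1000: 0.24} microwatts. The filter's
# 75 percent transmission at 780~ never enters the calculation, since
# there is no glottal power at that frequency.
```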
19 Francis J. Carmody, X-ray studies of speech articulation, University of California Publications in Modern Philology, Vol. 20, No. 4 (1937).
20 This instrument, developed in the early 1940's and promised to be commercially available in 1948, is what makes detailed and precise discussion possible in acoustic phonetics. The author had the opportunity, unique among linguists, of using it continuously a large part of the time for nearly three years, but under conditions which forbade publication at the time and required leaving behind almost all data but what could be retained in the memory. This has often complicated discussion here by making it impossible to cite the best evidence for an important statement, so that the statement has had to be left unsupported until the crucial experiments could be repeated coram publico, or has had to be supported with second-best arguments. — This is not the place to describe the Acoustic Spectrograph in detail, but see §3.12 for a sketchy schematic description. The best published description is in Bell Telephone System Monograph B-1415, a reprint of articles published in the Journal of the Acoustical Society of America 17.1–89 (1946).
21 It is this spacing, apparently, which is somehow used by the brain to determine the pitch; the fundamental frequency itself need not be perceptible to give a correct pitch judgment according to the definition at the end of §1.26. The derivation of a pitch sensation from harmonic spacing is not dependent on the generation of difference tones in the ear, as is often said, but is a function of the brain. We can say this with complete confidence because the apparent pitch of a complex tone does not jump when the tone is weakened to the point (say 20 decibels above threshold) where the difference tones are known to be imperceptible.
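The point about harmonic spacing can be illustrated with invented frequencies: delete the fundamental, and the spacing of the surviving harmonics still determines it.

```python
from functools import reduce
from math import gcd

# Illustrative harmonic series with the fundamental (200 cycles) removed.
harmonics = [600, 800, 1000, 1200]
spacing = harmonics[1] - harmonics[0]          # uniform spacing between neighbors
implied_fundamental = reduce(gcd, harmonics)   # greatest common divisor
# Both come to 200 cycles: the pitch assigned by the brain, even though
# no power is present at 200 cycles itself.
```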
22 Two different modifications of the Acoustic Spectrograph have already been invented which produce spectrograms from which the intensity can be read off directly and fairly precisely—to within one decibel from a spectrogram made without compression. Such spectrograms may yet be needed to answer certain questions—e.g. the difference between [i] and [y]—but for the present all we need is the uncalibrated spectrograms we already have, for reasons which will appear presently.
23 Examples in Potter, Kopp, and Green, Visible Speech 330–7 (New York, 1947), Figs. 3c, 4c, 10c.
24 Francis J. Carmody, op.cit.
25 Op.cit. 230.
26 Daniel Jones, Das System der Association Phonétique Internationale, in M. Heepe, Lautzeichen und ihre Anwendung in verschiedenen Sprachgebieten (Berlin, 1928); or Daniel Jones and Amerindo Camilli, Fondamenti di grafia fonetica (London, 1933).
27 The French phonograph records of the USAFI course in Spoken French were used (now commercially available from Henry Holt & Co.); therefore the pronunciation, although on the average not at all careless, is still completely normal. Quite a few samples of each vowel were studied; more details in §5.17.
28 Daniel Jones, op.cit. 20, and Daniel Jones and Amerindo Camilli, op.cit. 5: ‘Cardinal [i] is the vowel with the highest and frontest possible tongue position; a more extreme tongue position gives fricative [j]. Cardinal [α] is the lowest and backest possible vowel, beyond which we get a uvular fricative. Cardinal [e, ε, a] are three intermediate stages between [i] and [α] chosen so as to give four equal acoustic intervals; cardinal [ɔ, o, u] continue this series with the same acoustic intervals. The tongue positions of cardinal [i, a, α, u] have been determined by X-ray measurements.’ (This quotation is a synthesis of the German and Italian texts.) It is clear that a single acoustic series, conceived as unidirectional, from [i] through the other cardinal vowels to [u], can only be impressionistic.
29 Martin Joos, Narrow transcription of General American, Le Maître Phonétique 3.11.48 (1933).
30 Svend Smith, Analysis of vowel sounds by ear, Archives néerlandaises de phonétique expérimentale 20 (1947).
31 Op.cit., especially 207 ff., §90.
32 R. A. S. Paget, op.cit. 44.
33 R. A. S. Paget, op.cit., has a long ‘Note on the double-resonator theory of vowel sounds’ (Appendix I), signed by W. E. Benton, which uses a drastic simplification, yet the mathematics are very unwieldy.
34 A rough estimate: see §3.53.
35 The terms are borrowed from the communications engineers—telephone, radio, etc.—one of whom developed the phonetic applications of the terms quite fully for the benefit of the others and also very interestingly for us; see Homer Dudley, The carrier nature of speech, The Bell System Technical Journal 19 (1940).
36 Failure to inhibit the subliminal articulations properly is what seems to have underlain the reported cases of pathological ‘repeaters’—persons who repeat whatever they hear almost in step with the speaker, that is, a fraction of a second later.
37 J. van Ginneken, Terug naar Schleicher, Donum natalicium Schrijnen (Nijmegen, 1929); and R. A. S. Paget, op.cit. ch. 9.
38 The terms are mathematical. For example, a rubber glove remains topologically the same no matter what one does to it without damaging it.
38a Since cases of perfect language learning by an adult may apparently be cited as an argument against the thesis of this paragraph, it seems worth while to point out that these persons are a small minority, and to record the conviction that they can be explained individually without affecting the main conclusion. The explanation need not always be the same, and a single instance may be described without any implications concerning others. This man is a native speaker of a non-Indo-European language. After studying English for a year or so in a school at home, he came to this country and learned a local variety of American English perfectly in one year more. His acquaintances recognize this as another manifestation of a chameleon-like adaptability and unassertiveness which also blocked his success as a scholar in a field where he has extraordinary talents.
39 Otto Jespersen, Phonetische Grundfragen 80–4 (Leipzig & Berlin, 1904).
40 Potter, Kopp, and Green, Visible Speech 45, shows a typical contrast between adult male and adult female resonances.
41 The term decomposition is offered here, for the first time in phonetics as far as the writer is aware, in place of the tentative term ‘componential analysis’ used by C. F. Hockett in conversation in the summer of 1947. See Chapter 5.
42 Stanley Smith Stevens and Hallowell Davis, Hearing—its psychology and physiology (New York, 1937), deal with these matters exhaustively.
43 Stevens and Davis, op.cit. 89 (table).
44 In this paragraph, for lack of a better brief term, the word at is used in an approximate sense, not precisely, as in §1.22.
45 As a component of a complex phenomenon (wave) it is of course impossible for an 800~ oscillation to increase and decrease abruptly as proposed here. But by itself such a segment of ‘800~ oscillation’ can be generated in the laboratory, with a shape such as is shown in Fig. 31A, and when this alone is fed into a filter the filter's output will be as in Fig. 31B. The only objection would be that ‘a segment of 800~ oscillation’ is at best a rough shorthand way of describing Fig. 31A, which, considered as a phenomenon of infinite duration of which only a part is shown, contains power at an infinitude of frequencies. The difficulty here is that not even mathematical English has appropriate terminology for describing the situation both precisely and illuminatingly.
46 These are not the customary definitions of time constant and decrement, but readers who know the customary mathematical definitions will also be able to prove that our definitions are algebraically equivalent to them.
47 The rule as given is correct only when the filter profile is that of a simple resonator (Fig. 16 and §1.51). But it is very nearly correct for flat-topped filters also, and the error can safely be neglected. A more important point is that whenever the filter profile is anything but a simple resonance curve, the decrement does not correspond to the dissipation. See §2.30.
48 That is, for detailed analysis as a contribution to linguistic theory. On the other hand, the 300~ filter gives a record which is easier to learn to read swiftly for word-identification, which is why it is used exclusively for that purpose in the book Visible Speech (Potter, Kopp, and Green).
49 The acoustic psychologists use the term intensity for power per unit area of wave-front; for this we use the term power, neglecting the area factor because at every point in the discussion we can easily keep it from making any difference.
50 In current literature such Fourier Series are regularly presented as full analyses of one period. If the one period were the only one in question, the popular error might do no great harm. But it is also customary to analyse one period after another, and present a Fourier Series for each, and then the results are strictly false and illusory. This point cannot be too emphatically put. The purpose in mentioning it here is not to attack earlier work and other workers, but to enable readers to understand the present exposition without prejudice.
51 Outside this range the pitch acuity of the ear and brain deteriorates abruptly. One might speculate about the relationship between this perception fact and the fact that formants within this range are the principal vowel diacritics in all languages, but the ultimate answer would doubtless be ‘non liquet’. Which came first, the chicken or the egg?
52 Stevens and Davis, op.cit., show that there are no actual filters, but it is easy to show that the mathematics of their theory cannot escape the same laws of indeterminacy that apply to our supposed filters.
53 Stevens and Davis, op.cit. 220–4.
54 Stevens and Davis, op.cit. 287. This was found after §§2.121 f. and §§2.124 f. were written.
55 Testimony of foreign phoneticians would presumably be different, but we had better wait until we get some before we try to interpret it.
56 The single observation cannot be made with any such precision; what is given here is the average of many observations, presented as a single observation to clarify the argument.
57 It is necessary to avoid introducing extra noises; for example, if the fragment of vowel is secured by making and then breaking an electrical connection, there must be no direct-current component. A suitable apparatus in the University of Louisiana Speech Laboratory gave recognizable vowel fragments less than 1/200 second long.
58 Lawton M. Hartman, The segmental phonemes of the Peiping dialect, Lang. 20.31 (1944).
59 R. H. Stetson, The bases of phonology (Oberlin 1945).
60 For example, by transcribing fence as [hfhεε͂ǝndtcsz̥h].
61 Kenneth L. Pike, Phonetics (Ann Arbor, Mich., 1943).
62 The artificial speech of the Vocoder's Synthesizer does not preserve the abruptness of explosions, but it does preserve the frequency-spread, and the effect is fairly natural. The Vocoder is described by Homer Dudley, Remaking speech, Journal of the Acoustical Society of America 11.169 ff. (1939).
63 Cf. E. Sapir, Notes on the Gweabo language of Liberia, Lang. 7.30 ff. (1931).
64 The word component itself is avoided here because of its use in harmonic analysis. See Charles F. Hockett, Componential analysis of Sierra Popoluca, IJAL 13.258 ff. (1947), where component is used with substantially the meaning of layer, and componential analysis is our analysis, with emphasis upon decomposition. Cf. also Bloch, Lang. 24.19 fn. 20.
65 An example of calibration is given in §§3.53 ff.
66 The objective procedure typically uses arithmetic. But arithmetic proves only arithmetical facts; it can prove nothing about the real world. Engineers use arithmetic for building bridges. Their only justification is that the bridges usually stand firm. When, occasionally, a bridge nevertheless falls, that does not mean that the arithmetic is false; it means that the particular act of faith (§1.13) was unjustified.
67 Zellig S. Harris, Simultaneous components in phonology, Lang. 20.181 (1944).
68 Charles F. Hockett, Componential analysis of Sierra Popoluca, IJAL 13.258 ff. (1947).
69 Kenneth L. Pike, op.cit.
70 Illustrations of this point have already turned up, notably in §4.21; it will be developed in full beginning §5.60.
71 An extensive discussion of these influences of consonants upon vowel color will be found in Potter, Kopp, and Green, Visible Speech 38–51. Only formant 2 was considered there because, with a linear frequency scale, the position of formant 1 is not so noticeably variable, and the other resonances had already been dismissed as relatively uninformative. We cannot agree with the conclusion that consonants differ essentially in the degree of fixity of their ‘hubs’: it seems to be based on a misreading of the spectrograms. But on the whole the discussion there is quite sound.
72 These are the courses and records now available commercially from Henry Holt & Co.
73 See §2.40 and §3.54. A chart-distance of so many millimeters is reckoned as the same number of semitones no matter in what direction it extends, even on a slant.
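The semitone reckoning rests on the standard fact that a semitone is a fixed frequency ratio (the twelfth root of 2), so that equal distances on a logarithmic frequency scale are equal musical intervals in every direction. A sketch with illustrative frequencies:

```python
import math

# Interval in semitones between two frequencies: 12 semitones to the octave,
# so the interval depends only on the frequency ratio, not the absolute values.
def interval_semitones(f1, f2):
    return 12.0 * math.log2(f2 / f1)

octave = interval_semitones(440.0, 880.0)   # any doubling of frequency
# octave == 12.0: the same chart-distance wherever it lies on the scale.
```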
74 Dag is a trade-name for a valve-grinding compound which the speaker has used and has discussed with other workers while using it.
75 A refreshing exception, which has not had as much effect on current doctrine as it merits, is P. Menzerath and A. de Lacerda, Koartikulation, Steuerung und Lautabgrenzung (Berlin, 1933).
76 We must assume that the situation is no different in a VCV sequence, so that no proper consonant would be left in the middle after the glides had been subtracted. For simplicity in discussion, the wording applicable to the CVC sequence is used alone here.
77 True, by judicious sampling it is often possible to get perceptibly pure samples of quite a few phones, but seldom as many as half the phones in an utterance.
78 The three mechanical items which together completely determine how a thing responds to forces are inertia, elasticity, and viscosity (corresponding to electrical inductance, capacity, and resistance). The viscosity of live muscle is negligible; the elasticity is practically identical with the tonus, which is an innervation effect, so that it belongs to a neural not a mechanical explanation.
79 This German speaker does not use a glottal stop in such a situation, so that the vowels follow each other with no disturbance. — The record shows, a little later, another interesting phenomenon which is not entirely unrelated to the present question. This is a faint r-color in the vowel of ist, indicating approach of the tongue-tip to the alveolar ridge before an apical [s]; see §§4.31 ff. for the reasons why the position of the resonance bands on this German spectrogram can be taken as evidence of r-color. Now r-color resulting from apical articulation (that articulation which, when more extreme, is called retroflexion) is not considered a normal item of German phonology. And this speaker uses only uvular r and its vocalic weakenings for his /r/ phoneme. But there are excellent reasons why we should not be surprised to find r-color here. It is a situation exactly parallel to nasalization in English or German: these are slur phenomena, and they occur freely because the phonology of the language does not call for suppressing them. French is a language which does not freely nasalize vowels; French and German are languages in which front vowels are not freely rounded (as they are in American, e.g. Poor baby! with [bøby]); American English is a language in which vowels are not freely r-colored. Hence weak r-color can freely turn up as a slur phenomenon in French or German, but appears in American only when there is an /r/ phoneme in the context. This is the a-priori argument; the spectrograms confirm it. To avoid prolixity, this point is not mentioned in the development of the slur theory on the following pages; the reader can insert it if he chooses, and will find that it only confirms what is said there. It means that slur is not blindly mechanical, but must be managed by an educable organ, namely the brain and specifically (probably) the cerebellum.
80 See §5.26, and particularly the remark that ‘the speaker has ... a whole vowel-phone system for the context [d-d]’. That remark is not to be withdrawn; only we see now that we ought not to let it mislead us into setting up a theory of the usual phonemic sort, where the vowel-phoneme of dad is merely a fiction, a purely abstract construct. This is phonetics, not phonemics.
81 In §§5.40 ff. the innervation waves, here presented as one wave for each phone, will be decomposed. Here in Fig. 38, on the other hand, each is presented globally—is displayed as if simple—which is of course false. It is believed, however, that the argument is not vitiated by this device, which was adopted in order to achieve clarity in the argument.
82 Lawton M. Hartman, Lang. 20.31, lists two cases, /jew/ and /wej/, with higher vowel allophones in tones 3 and 4 than in tones 1 and 2.
83 Paul Passy, Kurze Darstellung des französischen Lautsystems, Phonetische Studien 1.115 (1888).
84 Except, of course, for trills. But these are not immediate responses to innervations: they are driven by the flow of air.
86 Not, of course, separately perceptible.
87 The emphasis with which this conclusion can now be stated, in contrast to the usual statements in phonetic literature, results from the fact that earlier work was mostly done with isolated stressed syllables—often even with voiced fortis obstruents rather than with the normal lenis ones—and not with natural speech.