4 Introduction to How We Talk About Language
This chapter offers a brief introduction to how we talk about language; specifically, we will look at the basic terms that linguists use to describe speech sounds and words. In order to talk about language, we first need a set of terms that will allow us to name and describe the sounds we find in language: terms from phonetics, the linguistic study of speech sounds. This chapter will introduce you to the symbols of the International Phonetic Alphabet (IPA) along with the terms you need to give technical descriptions of the sounds that the IPA symbols represent.
4.1 Sounds in Language
In order to represent the sounds of speech, linguists use a set of phonetic symbols, the IPA, which was created by nineteenth-century linguists in order to aid in the teaching of languages and in the preservation of languages considered to be endangered. A quick glance at English spelling gives us a strong indication as to why these linguists did not simply rely on existing writing systems to convey the sounds of speech – a distinct lack of a one-to-one correspondence between sound and symbol. In English spelling, the words route, shoot, and suit all contain the same vowel sound, yet employ different letters to indicate the “oooh” sound. In addition, the word route can be pronounced in two different ways, rhyming with shout or with shoot; in this case, we have the same letters, “ou,” being used to spell different sounds. Neither of these situations is helpful if one wants to capture the sounds of speech and pass those sounds on to someone else.
In IPA, there is a one-to-one correspondence between sound and symbol, with each IPA symbol representing a single speech sound. The symbols used in IPA are – helpfully or not for people familiar with English spelling – based on the Roman alphabet, with a few additional symbols added when needed. This familiarity often causes students some initial confusion in cases where a phonetic transcription looks like a familiar word, but a bit of practice does wonders. The key to reading IPA is to remember that it is not the same as reading written English. To that end, this text will distinguish between IPA and conventional English spelling by using angle brackets for spelling and square brackets for IPA, such that the word spelled <beet> would look like [bit] in IPA.
The reason that we need an IPA is that sounds as units are not a given in human speech. Different languages use different sounds in different ways (see section 4.2.4 below for a discussion of this), and so it is useful to make a language-neutral categorization of the possible human speech sounds. That is what the IPA does. Once we have a set of units like those in the IPA, we can count how often the units occur when we make frequency profiles for complex systems. The following sections describe how the IPA categorizations are made.
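The idea of counting units to build a frequency profile can be sketched in a few lines of Python. This is only an illustration, not a description of how any particular study counts: it takes one transcribed sentence from this chapter's Applications section, treats every character as a single unit (so a diphthong such as [eɪ] is split into two symbols, which is a simplification), and ranks the symbols from most to least frequent. The function name `frequency_profile` is a hypothetical helper introduced here.

```python
from collections import Counter

# One broadly transcribed sentence, copied from the Applications section.
transcription = (
    "ɑrtəfækts ɑr mətɪriəl rɪmeɪnz ðæt wɜr krieɪtəd juzd ɔr ʧeɪnʤd baɪ hjumənz"
)

def frequency_profile(text):
    """Count each symbol (ignoring spaces) and rank from most to least frequent.

    Simplifying assumption: one character equals one unit.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    return counts.most_common()

profile = frequency_profile(transcription)
for symbol, count in profile:
    print(symbol, count)
```

Even in a sample this small, a few symbols account for many of the tokens while most symbols occur only once or twice.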
4.2 English Speech Sounds
To start our discussion of English speech sounds, we need to distinguish between consonants and vowels. Both types of sound are created as air is expelled from the lungs, traveling up through the larynx and out of the mouth, but there’s a difference in the way that consonants and vowels are articulated. Consonant sounds are created when there is some sort of blockage (called a constriction) along this path; vowels are articulated with no blockage. The differences in consonant sounds are often due to where and how the constriction takes place, while different vowels are produced by variations in the shape of the mouth, much the same way that blowing into glass bottles of different shapes produces different musical notes. We have an IPA symbol for each speech sound and we also have a technical description for each sound that contains details about its articulation. In essence, each description answers a series of questions, with different questions for consonants than for vowels.
4.2.1 Consonants
For consonants, we have a three-part technical description that addresses three questions, having to do with voicing, place of articulation, and manner of articulation. The first question addresses the issue of voicing, whether or not the vocal folds vibrate during articulation. Consonants can be described as either voiced (when the vocal folds vibrate) or voiceless (articulated without the vibration of the vocal folds). For a quick example of the difference between a voiced and voiceless consonant, you can channel your inner kid and make some animal noises: Place your hand on your throat and hiss like a snake and then buzz like a bee. In theory, you should be able to feel the vocal folds vibrate with the bee sound <zzzzzzzz>, which is voiced, but not with the snake sound <ssssssss>, which is voiceless. Most English consonants come in voiced–voiceless pairs, where the main difference between the two sounds is that one is voiced and one is voiceless. The sounds represented by the IPA symbols [s] and [z] are one of those pairs.
The two remaining questions we have to answer in order to describe consonant sounds have to do with the constriction that occurs along the articulatory path; place of articulation tells us where the constriction takes place, while manner of articulation tells us how much constriction there is. Sounds that are created with the constriction at the front of the mouth, like the initial sounds of the words ball and pop, are bilabial sounds (bilabial meaning “two lips”), while sounds created with the top teeth and bottom lip, such as the first sounds in fox and van, are labiodentals. Next, we have interdentals, which are sounds made with the constriction occurring with the tongue behind or between the teeth, as in thank and the. Moving back in the mouth a bit further, we have alveolar sounds, which are articulated with the constriction at the alveolar ridge, the bony area behind the top teeth. There are a number of alveolar sounds in English, including the initial sounds in the words dog and top. Past the alveolar ridge are the postalveolar and palatal areas, points of constriction for the initial sounds of words such as cheese and jar. Back in the mouth further still is the velum, which raises and lowers as needed to block the opening of the nasal cavity. A few English sounds are articulated with constriction in this area; these sounds, called velar sounds, can be found in words such as kite and goat. At the back of the throat is the glottis; sounds articulated with constriction in this area are glottal sounds, such as the initial sound in the word horn.
In terms of how much constriction is taking place, we have a few options for characterizing the manner of articulation for consonants. There are some sounds that are produced with complete constriction at their point of articulation. These are called stops (or plosives), examples of which are at the beginning of the words pop, top, and goat. Other sounds are created by a good bit of constriction, when the tongue is very close to the teeth or alveolar ridge, for example, and these sounds are called fricatives, due to the “friction” that is created. Examples of fricatives are sounds such as those at the beginning of fox, thank, soda, and shop. Still other sounds are created when a stop and a fricative occur in close combination, and these are called affricates, including the first sounds in the words cheese and jar. For nasal consonants, there is a stoppage of the airflow through the mouth, but because the velum is lowered, the air is allowed to resonate in the nasal cavity, producing nasal sounds, such as those found at the beginning of money and note. Finally, there are a handful of consonants that are articulated with very little constriction, so little that they are referred to as semi-vowels (or approximants). These sounds can be liquids, such as the first sound in leg, or glides, as with the first sound in the word yarn. Table 4.1 is a chart containing all of the IPA symbols for English consonants, their technical descriptions, and some example words.
Table 4.1 IPA consonant chart
| Manner of articulation | Voicing | Bilabial | Labiodental | Interdental | Alveolar | Palatal | Velar | Glottal |
|---|---|---|---|---|---|---|---|---|
| Stops | voiceless | p (pat) | | | t (tag) | | k (cat) | |
| Stops | voiced | b (bat) | | | d (dog) | | g (goat) | |
| Fricatives | voiceless | | f (fat) | θ (thin) | s (sat) | ʃ (sheep) | | h (hat) |
| Fricatives | voiced | | v (vat) | ð (then) | z (zap) | ʒ (azure) | | |
| Affricates | voiceless | | | | | tʃ (cheap) | | |
| Affricates | voiced | | | | | dʒ (jeep) | | |
| Nasals | voiced | m (mat) | | | n (nap) | | ŋ (sing) | |
| Approximants: liquids | voiced | | | | l (lap), r (rap) | | | |
| Approximants: glides | voiced | w (win) | | | | j (yap) | | |

Note: Place of articulation runs across the columns. Voiceless consonants are marked in the Voicing column.
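The three-part description of a consonant lends itself to a simple lookup structure. The sketch below is one possible way to encode part of Table 4.1 in Python; the names `CONSONANTS`, `describe`, and `voicing_partner` are hypothetical and chosen only for illustration, and glides and liquids are left out to keep the example short.

```python
# Each symbol maps to (voicing, place, manner), following Table 4.1.
CONSONANTS = {
    "p": ("voiceless", "bilabial", "stop"),
    "b": ("voiced", "bilabial", "stop"),
    "t": ("voiceless", "alveolar", "stop"),
    "d": ("voiced", "alveolar", "stop"),
    "k": ("voiceless", "velar", "stop"),
    "g": ("voiced", "velar", "stop"),
    "f": ("voiceless", "labiodental", "fricative"),
    "v": ("voiced", "labiodental", "fricative"),
    "θ": ("voiceless", "interdental", "fricative"),
    "ð": ("voiced", "interdental", "fricative"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
    "ʃ": ("voiceless", "palatal", "fricative"),
    "ʒ": ("voiced", "palatal", "fricative"),
    "h": ("voiceless", "glottal", "fricative"),
    "tʃ": ("voiceless", "palatal", "affricate"),
    "dʒ": ("voiced", "palatal", "affricate"),
    "m": ("voiced", "bilabial", "nasal"),
    "n": ("voiced", "alveolar", "nasal"),
    "ŋ": ("voiced", "velar", "nasal"),
}

def describe(symbol):
    """Return the three-part technical description of a consonant."""
    voicing, place, manner = CONSONANTS[symbol]
    return f"{voicing} {place} {manner}"

def voicing_partner(symbol):
    """Return the consonant that differs from `symbol` only in voicing, or None."""
    voicing, place, manner = CONSONANTS[symbol]
    for other, (v, p, m) in CONSONANTS.items():
        if (p, m) == (place, manner) and v != voicing:
            return other
    return None

print(describe("s"))         # voiceless alveolar fricative
print(voicing_partner("s"))  # z
print(voicing_partner("m"))  # None: the nasals have no voiceless partner here
```

Looking up a partner by holding place and manner constant mirrors the observation that most English consonants come in voiced–voiceless pairs.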
4.2.2 Vowels
In order to provide a technical description of a vowel, we have to take a different route. All vowels are voiced, so the issue of voicing does not apply and, because constriction isn’t an issue, we don’t talk about “manner” or “place” of vowel articulation. Instead, we answer four questions to arrive at the technical description of vowel sounds. The first two questions have to do with how we arch the tongue inside the mouth to create the different shapes that then produce the different vowel sounds: tongue height and tongue position. The tongue can arch close to the roof of the mouth, creating high vowels; it can be kept closer to the bottom of the mouth, creating low vowels, or in the middle for mid vowels. Thus, the terms “high,” “low,” and “mid” indicate an answer to the question of tongue height. In order to address tongue position, we look at whether the arch of the tongue is more toward the front (near the teeth), center, or back of the mouth, and we use the terms front vowel, central vowel, and back vowel to describe this. The third question deals with tongue tension, whether the sound is produced as a tense vowel or a lax vowel. Finally, we answer a question about lip roundedness to determine if the vowel sound is rounded or unrounded. Table 4.2 contains a chart that is laid out in such a way as to cover all four of the vowel characteristics along with example words for each vowel sound.
Table 4.2 IPA vowel chart

Note: the shaded symbols represent tense vowels (non-shaded vowels are lax) and the four bracketed back vowels are rounded (all other vowels are unrounded).
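The four questions asked of a vowel can be captured in a small record type. The sketch below is an assumption-laden illustration rather than a transcription of Table 4.2: the feature values are the commonly taught descriptions for a handful of American English vowels, the example words in the comments are supplied here for convenience, and the names `Vowel`, `VOWELS`, and `describe_vowel` are hypothetical.

```python
from typing import NamedTuple

class Vowel(NamedTuple):
    height: str    # "high", "mid", or "low"
    position: str  # "front", "central", or "back"
    tense: bool    # True for tense, False for lax
    rounded: bool  # True for rounded, False for unrounded

# Assumed feature values for a few vowels (not copied from Table 4.2).
VOWELS = {
    "i": Vowel("high", "front", True, False),   # as in beet
    "ɪ": Vowel("high", "front", False, False),  # as in bit
    "ɛ": Vowel("mid", "front", False, False),   # as in bet
    "æ": Vowel("low", "front", False, False),   # as in bat
    "u": Vowel("high", "back", True, True),     # as in boot
    "ʊ": Vowel("high", "back", False, True),    # as in book
}

def describe_vowel(symbol):
    """Return the four-part technical description of a vowel."""
    v = VOWELS[symbol]
    tension = "tense" if v.tense else "lax"
    rounding = "rounded" if v.rounded else "unrounded"
    return f"{v.height} {v.position} {tension} {rounding} vowel"

print(describe_vowel("i"))  # high front tense unrounded vowel
print(describe_vowel("ʊ"))  # high back lax rounded vowel
```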
Because vowels are generally described using measures of tongue position (frontness/backness) and tongue height (high/mid/low), we can plot vowels two-dimensionally in what we refer to as vowel space. Figure 4.1 is a visualization of vowel space (note that this is simply a different way to present much of the same information found in Table 4.2).
Individual speakers’ articulation of vowels and consonants can vary. In order to capture minor variations in articulation of speech sounds, linguists can use diacritics, which are small marks added to IPA symbols, in order to describe more specifically an individual speaker’s pronunciation. Diacritics for vowels can indicate how front or back (or high or low) the tongue is during articulation or whether the vowel is more/less rounded. Figure 4.2 contains some examples of diacritics frequently used to give a more narrow (vs. a broad, or more general) transcription of vowel sounds. Full IPA charts (with diacritics) can be found at www.internationalphoneticassociation.org (including an interactive chart that lets you listen to the sounds represented by each IPA symbol).

Figure 4.2 Diacritics used for narrow transcription of vowel sounds.
4.2.3 Wells’ Standard Lexical Set
In order to facilitate comparisons between varieties of English, John C. Wells came up with a list of keywords intended to highlight differing pronunciations of the same vowel sounds across different English varieties. The keyword system is a complement to the IPA; whereas in the IPA there is a one-to-one correspondence between symbol and meaning, so that it can be used for all languages (and therefore all varieties of a language), the keyword system indicates what IPA vowel symbol would be used in the same word across English varieties. Wells’ (1982) standard lexical set can be found in Table 4.3; each keyword is accompanied by its realization in Received Pronunciation (RP) and North American (NA), in order to offer a comparison between the so-called “Standards” of British and American Englishes.
Table 4.3 Wells’ lexical set (keyword chart)
| Keyword | RP | NA | Example words |
|---|---|---|---|
| KIT | ɪ | ɪ | ship, rip, dim, spirit |
| DRESS | e | ɛ | step, ebb, hem, terror |
| TRAP | æ | æ | bad, cab, ham, arrow |
| LOT | ɒ | ɑ | stop, rob, swan |
| STRUT | ʌ | ʌ or ə | cub, rub, hum |
| FOOT | ʊ | ʊ | full, look, could |
| BATH | ɑː | æ | staff, clasp, dance |
| CLOTH | ɒ | ɔ | cough, long, laurel, origin |
| NURSE | ɜː | ɜr | hurt, term, work |
| FLEECE | iː | i | seed, key, seize |
| FACE | eɪ | eɪ | weight, rein, steak |
| PALM | ɑː | ɑ | calm, bra, father |
| THOUGHT | ɔː | ɔ | taut, hawk, broad |
| GOAT | oʊ | oʊ | soap, soul, home |
| GOOSE | uː | u | who, group, few |
| PRICE | aɪ | aɪ | ripe, tribe, aisle, choir |
| CHOICE | ɔɪ | ɔɪ | boy, void, coin |
| MOUTH | aʊ | aʊ | pouch, noun, crowd, flower |
| NEAR | ɪə | ɪr | beer, pier, fierce, serious |
| SQUARE | ɛə | ɛr | care, air, wear, Mary |
| START | ɑː | ɑr | far, sharp, farm, safari |
| NORTH | ɔː | ɔr | war, storm, for, aural |
| FORCE | ɔː | or | floor, coarse, ore, oral |
| CURE | ʊə | ʊr | poor, tour, fury |
Note that many of the differences between RP and NA have to do with vowel length, while others represent different articulations of the same vowel, such as BATH, which sounds like [bɑːθ] in RP British English, but [bæθ] in American English. Wells’ [e] sound for DRESS is really the same as the American [ɛ] sound, but Wells uses a different symbol. The same thing can occur for STRUT, where many Americans would represent the sound as [ə]. British and American English also typically differ in rhoticity, the pronunciation of the <r> after a vowel, and Table 4.3 shows American English pronunciations with an [r] (IPA includes several different pronunciations for the <r> sound, of which [r] is just one).
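The kind of comparison that Table 4.3 supports can be sketched as a lookup from keyword to a pair of realizations. The example below copies a few rows from the table; the names `LEXICAL_SET`, `differing_keywords`, and `differs_only_in_length` are hypothetical helpers, and the length test simply strips the IPA length mark [ː], which is a rough approximation of “differs only in vowel length.”

```python
# keyword -> (RP realization, NA realization), a subset of Table 4.3
LEXICAL_SET = {
    "KIT": ("ɪ", "ɪ"),
    "DRESS": ("e", "ɛ"),
    "TRAP": ("æ", "æ"),
    "LOT": ("ɒ", "ɑ"),
    "BATH": ("ɑː", "æ"),
    "FLEECE": ("iː", "i"),
    "START": ("ɑː", "ɑr"),
}

def differing_keywords(lexical_set):
    """Return the keywords whose RP and NA symbols are not identical."""
    return [kw for kw, (rp, na) in lexical_set.items() if rp != na]

def differs_only_in_length(rp, na):
    """True if removing the length mark from the RP symbol yields the NA symbol."""
    return rp != na and rp.replace("ː", "") == na

print(differing_keywords(LEXICAL_SET))
print(differs_only_in_length(*LEXICAL_SET["FLEECE"]))  # True
print(differs_only_in_length(*LEXICAL_SET["BATH"]))    # False
```

A symbol-by-symbol comparison of this kind flags DRESS as a difference even though, as noted above, the two symbols stand for the same sound, so the output still needs a linguist's interpretation.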
4.2.4 Phonemes and Allophones
Besides the one-to-one correspondence between a speech sound and an IPA symbol, each IPA symbol can also represent a phoneme, a collection of phonetic segments that we perceive and distinguish as being the same sound in a particular language. When IPA symbols are used to represent phonemes they are enclosed in slant lines (also called virgules), so that /b/ is the phoneme for the <b> sound, and [b] represents the speech sound for a <b>. The important thing to note here is that not only do different languages employ different subsets of speech sounds, but speakers of different languages may also have different perceptions of which sounds belong to the same phoneme. Different languages make different distinctions. For example, in English voicing is considered phonemic, because trading a voiced consonant for its voiceless counterpart would make a difference in meaning, for instance turning [bæt] into [pæt]. In Korean, however, voicing is not phonemic, which means that Korean speakers would perceive [bæt] and [pæt] as the same word. Conversely, in Korean, aspiration is phonemic. Aspiration occurs when there is a puff of air released in the articulation of a stop consonant, and is indicated in IPA by the addition of a small diacritic, in this case a small superscript h, to a transcription. What this means is that, in Korean, [pʰæt] and [pæt] are perceived as different words, while English speakers, if they perceived a difference at all, would simply hear them as different pronunciations of the same word. In English, the aspirated and unaspirated versions of [p] are allophones of the phoneme /p/. An allophone is a spoken variation of a phoneme. Allophones of the same phoneme are perceived as being the same sound, and they are not distinctive in the same way that phonemes are: They don’t make a difference in meaning.
Take, for example, the different articulation of the middle sound in words such as letter and butter; in American English [ɾ] (called a flap) and [d] are allophones of /t/ in this environment, i.e. between vowels. In some varieties of British English [ʔ] (a glottal stop) is an allophone of /t/ in those same environments but [ɾ] and [d] are generally not. Each of these sounds can be an allophone of the phoneme /t/ in the different English varieties, and whether you say [bʌtər] or [bʌɾər] or [bʌdər] or [bʌʔər], the meaning is the same.
Allophones for a phoneme in any variety of a language follow the same A-curve frequency profile that you learned about in Chapter 3. In any environment for use of a phoneme – say the /t/ between vowels in a word like butter – the possible allophones will be used more or less often, one or a few very often and most of them rarely. If you happen to be in England, you will hear the [t] most often in butter, sometimes the [ʔ], and more rarely the [d] or [ɾ]; in the United States it is the other way around, so you will hear [bʌɾər] or [bʌdər] most often, and less often the [t] or [ʔ]. Complexity science tells us that the variation in allophones is actually much greater than we generally think it is, and survey data from the LAP confirm it.
How do we figure out which sounds are (or are not) considered phonemes in a particular language? We look to minimal pairs (or minimal sets), which are pairs or sets of words that differ by only one sound, a sound that then makes a difference in meaning. In English, pat [pæt] and bat [bæt] constitute a minimal pair because of the single distinctive difference. The contrasting sound must be in the same place in each word: It can be an initial sound (as in the pat, bat example), a sound in the middle (pat, pit), or a sound at the end (pat, pad). Once distinctive sounds have been identified, we use slant lines /b, p/ to indicate that we are talking about these sounds as phonemes for a given language or variety. We stick with brackets to talk about the phonetic segments found in speech.
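The formal part of the minimal-pair test (same number of sounds, exactly one difference, in the same position) can be sketched as a short function. This is only an illustration under stated assumptions: the name `is_minimal_pair` is hypothetical, each transcription is passed as a list of segments, and the function checks form only. Whether the two forms actually differ in meaning still has to be judged by a speaker of the language.

```python
def is_minimal_pair(a, b):
    """Return True if two segment lists have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    differences = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(differences) == 1

pat = ["p", "æ", "t"]
bat = ["b", "æ", "t"]
pit = ["p", "ɪ", "t"]
pad = ["p", "æ", "d"]

print(is_minimal_pair(pat, bat))  # True: contrast in initial position
print(is_minimal_pair(pat, pit))  # True: contrast in the middle
print(is_minimal_pair(pat, pad))  # True: contrast at the end
print(is_minimal_pair(bat, pad))  # False: two sounds differ
```

Passing segments as lists rather than plain strings keeps multi-character symbols such as [tʃ] together as one unit.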
In addition to making statements about what is or is not a phonemic difference between languages, minimal pairs can also give us a way to talk about differences between dialects or varieties of a single language. For example, within American English some speakers find the sounds [ɑ, ɔ] to be distinctive, and some speakers don’t; so for some American English speakers the words Don and dawn, or cot and caught, sound exactly alike (which means the vowel sounds are not distinctive), while other American English speakers consider each of these to be a minimal pair. What this means is that there are at least two different sound systems within American English, just based on this one distinction. British English also has multiple sound systems, as many speakers of British English do not pronounce the [r] after vowels in words like bar and barn, while others (usually from the North or West) do pronounce it. The possibility that a large variety like British English or American English can have more than one sound system is exactly what we predict from complex systems: It makes sense to talk about big varieties, but at the same time we predict that the systems at lower levels of scale can be somewhat different.
Despite internal variations, we can still talk about “American English” and “British English”; we simply need to remember that treating any language as a single, unified system is a generalization. In fact, we are making a generalization when we talk about the characteristics of a dialect of a language as well, as it is the case that even people who are considered to be speakers of the same dialect may not use vowels and consonants in the same way. Even individual speakers are not entirely consistent in how they pronounce their vowels and consonants: Just as for large groups and sub-groups, individuals have A-curve frequency profiles for the units of their pronunciation. Basically, any claim we make for a sound system, whether it’s about an individual speaker or a group of people, will be a generalization. It is fine to make generalizations about sound systems if we need to do so for particular purposes, but it is important to remember that the boundaries and categories we use are of our own making. There are no “natural classes.”
The IPA offers those who know it an effective and reliable way to talk about the sounds of a particular language and about the pronunciation of specific words, not to mention the fact that having an agreed-on set of categories for speech sounds gives linguists something to count. In reality, though, the IPA does not represent any sort of linguistic “truth.” Instead it functions as an international agreement on how we talk about what are actually arbitrary segments of continuous physiological movements. In the context of natural speech, individual sounds are indistinct; even individual words taken out of a speech stream context are difficult to identify.
4.3 Phonology
If phonetics is the study of the articulation of speech sounds, then phonology studies changes in articulation that occur as sounds are strung together in natural speech. The articulation of a sound may change due to the influence of surrounding sounds, a phenomenon known as assimilation. For example, in the word “tan” [tæn], the [æ] vowel can become nasalized, which means that it is articulated with the velum partially lowered, in anticipation of the nasal consonant [n] that follows. Though referred to as phonological rules, statements about articulatory processes (such as vowel nasalization) are more generalization than “rule,” something that most speakers do most of the time. Assimilation accounts for many phonological processes; sounds can change to be more like a following sound (anticipatory assimilation, as in the vowel nasalization example above) or to be more like the preceding sound (perseverative assimilation, seen when “plural –s” becomes a [z] when added to a word that ends in a voiced consonant: [dɑgz] vs. [dɑts]). Changes due to assimilation can be in the form of a change in the place of articulation, a change in manner of articulation, or a change in voicing. There are also times when a sound will change to be less like a neighboring sound, a process called dissimilation. Examples of dissimilation include the pronunciation of the word “sixth” as [sɪkst] and “fifth” as [fɪft]; in each case the second fricative becomes a stop. Two other common phonological processes are insertion and deletion. Insertion is exactly what it sounds like; a sound is inserted into a word, often in order to facilitate articulation. Think about the way you say the words “something” and “youngster,” not as [sʌmθɪŋ] and [jʌŋstər] but as [sʌmpθɪŋ] and [jʌŋkstər], respectively. Deletion is the opposite of insertion; a deleted sound is one that is not articulated. Deletion is especially common in English with consonant clusters.
For example, most people articulate “most people” as [moʊspipəl] in conversational speech. In this case, the cluster [stp] is reduced to [sp]. All of these examples represent regular phonological phenomena; it’s not being a “lazy” speaker to articulate words according to the phonological processes common within a language. It’s simply part of real speech. And all of these phonological rules help to build the A-curve frequency profiles of variation in a language.
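The perseverative assimilation described for “plural –s” can be written out as a small function, which also shows why such statements are called rules. The sketch below is a simplification under stated assumptions: the names `VOICELESS` and `plural_form` are hypothetical, the set lists the voiceless consonants of Table 4.1, and only the two outcomes discussed in this section ([s] and [z]) are handled. Words that end in sounds like [s], [z], or [tʃ] add a vowel before the ending in English and are outside the scope of this example.

```python
# Voiceless consonants of English, as listed in Table 4.1.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "h", "tʃ"}

def plural_form(segments):
    """Append [s] after a voiceless final segment and [z] after a voiced one.

    Simplifying assumption: the word does not end in a sibilant.
    """
    suffix = "s" if segments[-1] in VOICELESS else "z"
    return segments + [suffix]

print("".join(plural_form(["d", "ɑ", "g"])))  # dɑgz
print("".join(plural_form(["d", "ɑ", "t"])))  # dɑts
```

As the section stresses, a function like this states what most speakers do most of the time; it is a generalization and does not capture the variation found in real speech.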
4.4 Looking Ahead
What we’ve seen in this chapter is the “traditional” view of speech sounds, with only a few mentions of complex systems. Tradition gives us a descriptive viewpoint, but it assumes the existence of natural categories within a system of speech sounds. In the next couple of chapters, we’ll go into more depth about speech sounds and how we study them, looking at acoustic phonetics, thinking a little more about the IPA, and looking at Linguistic Atlas data for empirical evidence of variation in speech sounds.
Keywords
Phonetics
IPA
Phonetic symbols
Consonant
Vowel
Constriction
Voicing
Place of articulation
Manner of articulation
Bilabial
Labiodental
Interdental
Alveolar
Palatal
Velar
Glottal
Stops
Fricatives
Affricates
Nasals
Semi-vowels
Approximants
Liquids and glides
Tongue height
Tongue position
Diacritics
Narrow vs. broad transcription
Phoneme
Allophone
Aspiration
Minimal Pair
Phonology
Phonological rule
Assimilation
Applications
(1) Transcribe the following sentences from the IPA to normal, written English.
(a) ɑrkiɑləʤi ɪz ðə stʌdi ʌv hjumən hɪstəri ænd prihɪstəri θru ði ɛkskəveɪʃən ʌv saɪts æn ði ənæləsəs ʌv ɑrtəfækts ænd ʌðər fɪzɪkəl rɪmeɪnz.
(b) ɑrkiɑləʤɪsts stʌdi ðə kʌlʧər ʌv prihɪstɔrɪk pipəlz θru ði ənæləsəs ʌv ɑrtəfækts ɪnskrɪpʃənz mɑnjəmənts ænd ʌðər sʌʧ rɪmeɪnz.
(c) ɑrtəfækts ɑr mətɪriəl rɪmeɪnz ðæt wɜr krieɪtəd juzd ɔr ʧeɪnʤd baɪ hjumənz.
(d) wʌn ʌv ðə moʊst feɪməs ɑrkiəlɑʤɪkəl saɪts ɪz ðæt ʌv kɪŋ tʌts tum wɪʧ kənteɪnd sɛvrəl θaʊzənd praɪsləs ɑbʤɛkts ɪnkludɪŋ ðə goʊld kɔfɪn ænd mʌmi ʌv ðə tineɪʤ kɪŋ.
(2) Try to transcribe the following passage into the IPA. Don’t worry if you and others working on the same application get different transcriptions! It takes extensive training to be able to transcribe accurately, but it is definitely worth giving transcription a try.
The academic science of linguistics has not yet achieved the consensus about its basic principles. At the same time, the popular view of language (at least for English speakers in Britain and America) has indeed arrived at something like consensus. However, that popular view is quite different from what academic linguists think.
Note: You will get different transcriptions if you try to put down the way that you would read the words in the passage, as opposed to putting down transcriptions for separate words.
(3) As mentioned in this chapter’s discussion of speech sound articulation, different aspects of articulation are phonemic in different languages. Aspiration, for example, creates a different phoneme in Korean while voicing does not. The opposite is true for English, where voicing is phonemic and aspiration is not. Why do you think that voiced and voiceless consonants have unique IPA symbols but aspirated consonants are indicated by a diacritic (the superscript “h”)? (Hint: A quick look at the history of who fashioned the IPA might be in order.)



