13.1 Introduction
Many researchers have described a relationship between input and the outcome of heritage language (HL) grammars. If a specific feature occurs less frequently in the input, heritage speakers (HSs) may be less likely to acquire and produce this feature. Similarly, in monolingually oriented studies of sound change, we find a concept called the functional load hypothesis (FLH), which says that phonemic contrasts that occur less frequently are more likely to be lost (i.e., merged) over time.
This chapter addresses the FLH in an HL context by using quantitative measures of input. The specific features discussed are two Toronto Cantonese vowel pairs: /y/~/u/ and /a/~/ɔ/. The contrasts involved are illustrated in the following minimal pairs, which are represented in Jyutping Romanization (in italics), in IPA (in brackets, where 25 represents a high-rising tone and 55 a high level tone, using the Chao tone numbering system; Chao, 1930), and in traditional Chinese characters (in parentheses): gyun2 [kyn25] ‘roll’ (捲) versus gun2 [kun25] ‘building’ (館); saa1 [sa55] ‘sand’ (沙) versus so1 [sɔ55] ‘comb’ (梳). This chapter will ultimately show that /y/~/u/ has a lower functional load (FL) and that this corresponds to greater vulnerability to change among English-dominant HSs. The broader contribution is in presenting one of the few quantitatively based studies of the FLH in an HL context.
13.2 Background and Motivation
13.2.1 The Functional Load Hypothesis (FLH)
Hockett (1967) has described FL as the load carried by “the function of a phonemic system,” which “is to keep the utterances of a language apart” (p. 300). Different phonemic contrasts carry different loads. For example, the English phonemes /p/ and /b/ have the function of distinguishing minimal pairs such as cap and cab. The phonemes /ʃ/ and /ʒ/ also have minimal pairs (e.g., mesher [mɛʃɹ̩] versus measure [mɛʒɹ̩]), but not as many as /p/~/b/, according to Hockett. Therefore, the /p/~/b/ contrast has a higher FL than the /ʃ/~/ʒ/ contrast. The FLH says that contrasts with lower FL are more likely to merge over time.
Wedel et al. (2013) have described the FLH as an idea that “has held great intuitive appeal for language-change researchers over the last century” (p. 179). In spite of this intuitive appeal, providing empirical support has been a challenge because of a multitude of considerations and complications involved in testing the FLH. For example, Surendran and Niyogi (2006) mention word token frequency, direction of merger, syllable structure, word structure, suprasegmental features such as lexical tone, and system entropy (i.e., how minimal pair or phoneme counts change before and after a sound change) as factors that need to be taken into account in assessing the FLH. Another complication is that languages with high average word lengths may lack minimal pairs for many phonemes (cf. Ceolin, 2020).
Wedel et al. (2013) addressed some of these issues through computational modeling in one of the largest corpus-based studies addressing the FLH. The analysis was based on 634 phoneme pairs and more than 1 million word forms across nine different languages, including Cantonese. The results showed a greater likelihood of merger for phonemic oppositions with lower minimal pair counts. For phonemes that lack minimal pairs, a higher phoneme probability predicted a greater likelihood of merger.
To date, most FLH research has taken a monolingual orientation, but just as the FLH has held intuitive appeal in monolingual contexts, the same is true for bilingual contexts, especially with respect to HLs. This connects with Polinsky’s (2018) discussion of amount and type of input as one of the main sources of divergent attainment in HL grammars. She says that if an HS does not receive enough input for a given feature, they either do not acquire the feature or they reanalyze the feature. Either way, the outcome is a feature that diverges from the grammar of homeland speakers. Similarly, Aalberse et al. (2019) have discussed low-frequency features as most vulnerable to change in HLs. Since FL essentially involves frequency of occurrence of phonemic contrasts in language use, which is broadly part of “input,” we might expect HSs to be more likely to diverge from homeland speakers in their production of low-FL contrasts than in their production of high-FL contrasts.
The FLH could also complement accounts based on language transfer. Unlike monolingual speakers, bilingual speakers have access to an additional phonemic system. This creates the possibility of language transfer, which could also lead to phoneme merger if the dominant language has a single phoneme that corresponds to two phonemes in the HL. Contact-induced change becomes a possibility. Dominant language transfer, however, does not develop for every cross-linguistically distinct feature. The FLH, thus, may account for which phonemes are more likely to merge as a result of dominant language transfer.
Amengual and Simonet (2020) illustrate this point by focusing on Catalan–Spanish early bilinguals and their acoustic production of several Catalan features absent from Spanish. These features included a phonemic contrast between /e/ and /ɛ/, a phonemic contrast between /o/ and /ɔ/, and unstressed vowel reduction. The results showed that linguistic dominance (i.e., Catalan dominant, Spanish dominant, or balanced bilingual) had an effect only on the mid-vowel contrasts. Amengual and Simonet (2020) attributed these results to their low FL, with unstressed vowel reduction being a phonological process described as easier for all speakers to acquire because it occurs more frequently in spontaneous speech than mid-vowel phonemes. Similarly, Bullock and Gerfen’s (2004) study of heritage French speakers in Frenchville, Pennsylvania, showed the loss of a phonemic distinction between /ø/ and /œ/, which has low FL.
While both Amengual and Simonet (2020) and Bullock and Gerfen (2004) mention FL differences in their discussion, neither study quantified the purported FL differences. Thus, a major goal of this chapter is to provide quantitative support for FL differences in Cantonese and to show how these differences could contribute to our understanding of HL phonetics and phonology.
13.2.2 Previous Studies of Heritage Cantonese Vowels
While Cantonese has been considered in previous studies of FL (Ceolin, 2020; Surendran & Niyogi, 2006; Wedel et al., 2013), these studies have focused on consonantal or tonal merger rather than on vowel merger. This is due to either a lack of ongoing vowel mergers or a lack of documentation of vowel mergers in Hong Kong Cantonese. Whatever the case, recent studies of Cantonese have identified evidence for one merger absent in Hong Kong Cantonese but present in Toronto Cantonese. The discussion below provides relevant background.
Figure 13.1 shows the monophthong systems of both Cantonese and Toronto English. Cantonese has eleven monophthongs (following the description in Zee, 1999), while Toronto English has eight. Ellipses indicate the parts of the vowel systems that are the focus of the current chapter.

Figure 13.1 Cantonese and Toronto English monophthongs.
At the top of the Cantonese vowel quadrilateral is an ellipse that includes the two high round tense vowels of Cantonese: /y/ and /u/ (represented as yu and u, respectively, in Jyutping Romanization). The corresponding space in the Toronto English vowel quadrilateral, however, contains only one vowel: a phonetically fronted /u/. The second part of the Cantonese vowel system highlighted in Figure 13.1 includes the vowels /a/ and /ɔ/ (represented as aa and o, respectively, in Jyutping Romanization). Again, the corresponding part of the Toronto English vowel quadrilateral contains only one vowel: /ɑ/.
The focus on /y/~/u/ is motivated by several previous studies. Tse (2019b) presented a comprehensive comparison of Hong Kong and Toronto speakers in their F1/F2 production of eleven monophthongs. The only difference found between the two groups was in the F2 production of /y/, with /y/ being significantly retracted for second-generation Toronto speakers. This suggests a merger of /y/ with /u/ rather than with other vowels. Tse (2022) provided further evidence for a /y/~/u/ merger by showing more /y/ retraction and /u/ fronting among those with lower Cantonese proficiency proxy scores. Tse (2019a) presented metalinguistic commentary about these two sounds, suggesting speaker awareness about the merger. The identification of a source structure (i.e., the lack of two contrasting high round tense vowels in Toronto English) and the lack of the same change in Hong Kong support a language transfer explanation.
The second vowel pair under investigation is /a/~/ɔ/. The hypothesis that these two vowels could merge is motivated by a large body of sociolinguistic studies of the low back merger shift (LBMS) across dialects of North American English (see Becker, 2019 for an extensive discussion). As in many North American English dialects, Toronto English lacks a distinction between the vowels in lot (i.e., /ɑ/) and thought (i.e., /ɔ/). Thus, there is only a single vowel in this part of the vowel space. Toronto English influence would mean a merger of the Cantonese /a/~/ɔ/ contrast. While the LBMS has clearly spread across English dialects, cross-linguistic spread from English to an HL has not been previously addressed.
The specific questions addressed in this study are as follows: (1) Is there evidence of merger of /y/~/u/ or /a/~/ɔ/ based on generational group or dominant language? (2) How do these vowel pairs compare with each other in terms of numbers of minimal pairs and phoneme token frequency? (3) What are the implications of these results for the FLH in an HL context?
13.3 Methodology
13.3.1 The Heritage Language Variation and Change Corpus
The source of data for this chapter (and for Chapters 14 and 15 of this volume) is the Heritage Language Variation and Change (HLVC) in Toronto Corpus (Nagy, 2011; Nagy et al., 2009). This paragraph and the one that follows serve as an introduction to the corpus for all three chapters of this volume. The corpus includes spontaneous speech samples from ten HLs, including Cantonese. Each language is represented by a sample of forty HSs and twelve to sixteen homeland speakers (i.e., Gen0). The ‘homeland’ is defined as the city of origin of the HSs’ families (i.e., Hong Kong in the current study). The HSs from Toronto include the immigrant generation (i.e., Gen1) and two generations of their descendants (i.e., Gen2 and Gen3). Each speaker in the corpus contributed three types of data: an hour-long sociolinguistic interview, which involves a spontaneous conversation with topics selected based on participant interest (cf. Labov, 1984), responses to an ethnic orientation questionnaire (EOQ) that included questions about language practices and attitudes, and a picture-naming task that involved participants identifying a set of drawings from a children’s picture book (Amery & Cartwright, 1987).
The HLVC data are available to other researchers. The goals of the HLVC Project include documenting and describing HLs as spoken by multiple generations of speakers, comparing HLs to homeland varieties, and pushing variationist research beyond its monolingually oriented core and its majority language focus (Meyerhoff & Nagy, 2019; Nagy & Meyerhoff, 2008; Ravindranath, 2015; Smakman & Heinrich, 2015; Stanford & Preston, 2007). This allows researchers to test whether the theoretical generalizations that have emerged from over half a century of variationist sociolinguistics focused on English (and, to a much lesser degree, on Spanish, French, and Portuguese) are upheld in a typologically and culturally broader set of languages. The existing limitations in variationist sociolinguistics may be attributed to the narrow range of native languages spoken by most variationist sociolinguists. Thus, expanding the number of HL-speaking sociolinguists through research, training, and knowledge mobilization will promote HL vitality. Accomplishing this involves, on the one hand, changing attitudes and language practices of student-researchers and, on the other, sharing HLVC research findings, which generally reflect a more robust status of heritage languages than is seen in experimental studies, with heritage language communities.
The current study focused on a subset of HLVC data that was processed as part of a previous study of Cantonese vowels (Tse, 2019a, 2019b). As shown in Table 13.1, the current study included eight Gen0, twelve Gen1, and twelve Gen2 speakers (n = 32). The data were processed in two different ways. The first method (which also formed the basis for Chapter 15 of this volume) addressed vowel merger (Section 13.3.2), while the second method addressed FL (Section 13.3.3).
Table 13.1 Cantonese speakers analyzed.
| Group | Description | City | Dominant language | Age range | Male | Female | Total |
|---|---|---|---|---|---|---|---|
| Gen0 (homeland) | Lifelong Hong Kong (HK) resident | HK | Cantonese | 16–77 | 3 | 5 | 8 |
| Gen1 | Arrived from HK as adult, lived in Toronto for 20+ years | Toronto | Cantonese | 46–87 | 6 | 6 | 12 |
| Gen2 | Grew up in Toronto, parents meet criteria for Gen1 group | Toronto | English | 20–44 | 6 | 6 | 12 |
| n = 32 |
13.3.2 Vowel Analysis Procedures
The data include ELAN (Sloetjes & Wittenburg, 2008) transcriptions of interview speech using Jyutping Romanization and traditional Chinese characters. In the Chinese writing system, each character corresponds to a single syllable, which often corresponds to a single word. Matching up with each individual character transcription is the Jyutping transcription of the corresponding syllable, separated by a space from the next syllable. Thus, both the traditional character system and the Jyutping transcriptions treat all words as monosyllabic. This facilitated data processing.
Prosodylab Aligner (Gorman et al., 2011) was used to force align each ELAN transcript file with its accompanying .WAV file. The forced-aligned textgrids were manually reviewed and corrected as appropriate. A Praat (Boersma & Weenink, 2016) script was run on these corrected textgrids to extract midpoint F1 and F2 measurements for vowel tokens from each of the eleven monophthongs of Cantonese. These formant measurements were then normalized using the Lobanov technique (Thomas & Kendall, 2007). Finally, the normalized measurements were used to plot the vowel space for the thirty-two speakers. This included a total of 33,179 tokens for eleven monophthongs. Table 13.2 shows the distribution of the subset of vowels analyzed in this chapter.
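The normalization step can be illustrated with a minimal Python sketch. This is not the script used in the study. It assumes that Lobanov normalization is a z-score transformation of each formant within each speaker and that the sample standard deviation is used; the function name `lobanov_normalize` and the tuple layout are hypothetical.

```python
from statistics import mean, stdev


def lobanov_normalize(tokens):
    """Z-score F1 and F2 within each speaker (assumed Lobanov procedure).

    tokens: list of (speaker, vowel, f1_hz, f2_hz) tuples.
    Returns a list of (speaker, vowel, z_f1, z_f2) tuples in the same order.
    Each speaker needs at least two tokens with non-identical formant values.
    """
    # Collect every F1 and F2 value per speaker.
    by_speaker = {}
    for speaker, _vowel, f1, f2 in tokens:
        f1s, f2s = by_speaker.setdefault(speaker, ([], []))
        f1s.append(f1)
        f2s.append(f2)

    # Per-speaker mean and sample standard deviation for each formant.
    stats = {
        speaker: (mean(f1s), stdev(f1s), mean(f2s), stdev(f2s))
        for speaker, (f1s, f2s) in by_speaker.items()
    }

    normalized = []
    for speaker, vowel, f1, f2 in tokens:
        m1, s1, m2, s2 = stats[speaker]
        normalized.append((speaker, vowel, (f1 - m1) / s1, (f2 - m2) / s2))
    return normalized
```

Because each speaker is centered and scaled separately, the normalized values can be pooled across speakers with different vocal tract sizes.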
Table 13.2 Vowel token distribution.
| Vowel (IPA) | Jyutping Romanization | Gen0 | Gen1 | Gen2 | Total |
|---|---|---|---|---|---|
| /a/ | aa | 1,485 | 3,172 | 1,896 | 6,553 |
| /ɔ/ | o | 1,250 | 2,295 | 1,723 | 5,268 |
| /y/ | yu | 276 | 623 | 351 | 1,250 |
| /u/ | u | 205 | 435 | 165 | 805 |
| n = 13,876 |
To address vowel merger, Pillai scores (PSs) were calculated for each individual speaker. The name of this score was given by Hay et al. (2006) to refer to the Pillai-Bartlett statistic, which is one of several MANOVA (Multivariate Analysis of Variance) tests. Hay et al. (2006) have described this score as a “summary [statistic] of the degree to which two distributions are kept distinct” (p. 467). A separate PS was calculated for each individual speaker by using SPSS to run a MANOVA model with F1 and F2 as the dependent variables and vowel class (i.e., /y/ versus /u/ or /a/ versus /ɔ/) as the independent factor. The output is a score on a continuous scale ranging from 0 to 1, with a higher score indicating greater F1/F2 distinction between the two vowel classes.
A major advantage of the PS is the ability to model F1 and F2 variation simultaneously on a continuous scale, thus making it possible to compare the extent of merger between speakers and speaker groups in gradient terms (Nycz & Hall-Lew, 2015). For example, we can say that a speaker with a /y/~/u/ PS of 0.401 is more merged than a speaker with a PS of 0.499. What 0.401 means, however, depends on the part of the vowel space measured. For example, 0.401 for /y/~/u/ would be more merged (at least in production) than 0.401 for /a/~/ɔ/ because of the smaller articulatory space available for low vowels (cf. Hall-Lew, 2009, p. 143). Speaker group comparisons can also be made by comparing average PSs for each group. One-way ANOVA tests with ‘PSs’ as the dependent variable and ‘group’ as the independent factor can be run to determine the statistical significance of differences.
The current study considers two factors that could affect PSs: generational group and group-level dominant language. Both factors are proxies representing different amounts of Cantonese input (see Table 13.3). For generational group, Gen0 represents the highest amount of Cantonese input, followed by Gen1, and then Gen2. Since both Gen0 and Gen1 are groups characterized by dominance in Cantonese, Gen0 and Gen1 were also put together into a Cantonese-dominant group, while Gen2 represents the English-dominant group. Because of the collinearity of these factors, each factor is modeled separately.
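To make the calculation concrete, the following Python sketch computes a PS for one speaker and one vowel pair. The study itself used SPSS; this sketch assumes the standard definition of the Pillai-Bartlett trace for a one-way MANOVA, trace(H(H + E)⁻¹), where H is the between-class and E the within-class sum of squares and cross-products matrix. The function name `pillai_score` and the input layout are hypothetical.

```python
def pillai_score(class_a, class_b):
    """Pillai-Bartlett trace for two vowel classes measured on F1 and F2.

    class_a, class_b: lists of (f1, f2) pairs for one speaker.
    Returns a value between 0 (identical class means) and 1 (no overlap).
    """

    def mean_vec(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def sscp(points, center):
        # 2x2 sum of squares and cross-products about the given center.
        sxx = sum((p[0] - center[0]) ** 2 for p in points)
        sxy = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points)
        syy = sum((p[1] - center[1]) ** 2 for p in points)
        return [[sxx, sxy], [sxy, syy]]

    pooled = list(class_a) + list(class_b)
    total = sscp(pooled, mean_vec(pooled))            # T = H + E
    within_a = sscp(class_a, mean_vec(class_a))
    within_b = sscp(class_b, mean_vec(class_b))
    # Between-class matrix H = T - E, with E = within_a + within_b.
    between = [
        [total[i][j] - within_a[i][j] - within_b[i][j] for j in range(2)]
        for i in range(2)
    ]

    # Invert the 2x2 total matrix directly.
    det = total[0][0] * total[1][1] - total[0][1] * total[1][0]
    total_inv = [
        [total[1][1] / det, -total[0][1] / det],
        [-total[1][0] / det, total[0][0] / det],
    ]

    # trace(H @ T^-1)
    return sum(between[i][j] * total_inv[j][i] for i in range(2) for j in range(2))
```

The statistic does not depend on the measurement scale of either formant, so the same function can be applied to raw or normalized values.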
Table 13.3 Factors in the Pillai score analysis.
| Factor | Higher Cantonese input | → | Lower Cantonese input |
|---|---|---|---|
| Generational group | Gen0 | Gen1 | Gen2 |
| Dominant language | Cantonese | Cantonese | English |
13.3.3 Two Measures of Input
Wedel et al. (2013) show minimal pair counts and phoneme probability to be the best predictors of mergers. For the current study, minimal pair count and phoneme token frequency are used.
13.3.3.1 Minimal Pair Counts
The minimal pair count is simply the number of minimal pairs involving either the /y/~/u/ or /a/~/ɔ/ contrasts. Only monosyllabic minimal pairs were counted. The rationale for this was based on the way the HLVC Corpus was transcribed (see Section 13.3.2). Tonal categories were treated as segments when counting minimal pairs. For example, pairs with the same (C)V(C) sequence but different tone categories, such as gyun1 (捐, ‘to donate’) versus gyun2 (捲, ‘roll’), were considered minimal pairs. Similarly, pairs with the same tone category that differ only in one vowel segment, as in gyun1 (捐, ‘to donate’) versus gun1 (觀, ‘to observe’), would also be considered minimal pairs.
The generally monosyllabic nature of Cantonese also means a large number of homophones. Semantically distinct homophones were all treated as single words when counting minimal pairs. For example, gun1 can mean either a ‘government official’ (官) or a ‘coffin’ (棺). Furthermore, gun1 contrasts with gyun1, which can mean a ‘cuckoo bird’ (鵑), a ‘small stream’ (涓), or ‘to donate’ (捐). These five words would, thus, be counted as only one minimal pair: gun1 versus gyun1.
The minimal pair count was based on two sources of data. The first is the HLVC Corpus. This involved exporting a spreadsheet from the ELAN files to identify each distinct syllable uttered by speakers in the corpus. Since the conversational speech represented in the corpus may lack representation of the full lexicon of individual speakers, Yue-Hashimoto’s (1972) syllabary was included as a supplemental data source. Syllabaries are a commonly used tool in the study of Chinese phonology. A syllabary is a table showing all possible onset+rime+tone combinations that occur in the language. This includes Chinese characters indicating which syllables are attested/non-attested monosyllabic words in the language. The organization of Yue-Hashimoto’s (1972) syllabary, thus, facilitated the counting of monosyllabic minimal pairs.
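The counting criteria above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: it assumes that syllables are given as Jyutping strings with a final tone digit, that a simplified regular expression is adequate for splitting them into onset, nucleus, coda, and tone, and that syllables with a glide coda (i or u) are diphthongs and therefore excluded. The names `parse_syllable` and `count_minimal_pairs` are hypothetical.

```python
import re

# Simplified Jyutping syllable shape: optional onset, nucleus, optional coda, tone.
_SYLLABLE = re.compile(
    r"^(ng|gw|kw|[bpmfdtnlgkhwzcsj])?"   # onset
    r"(aa|yu|oe|eo|a|e|i|o|u)"           # nucleus
    r"(ng|[iumnptk])?"                   # coda
    r"([1-6])$"                          # tone
)


def parse_syllable(syllable):
    """Return (onset, nucleus, coda, tone), or None if the shape is not covered."""
    match = _SYLLABLE.match(syllable)
    return match.groups() if match else None


def count_minimal_pairs(syllables, vowel_a, vowel_b):
    """Count monosyllabic minimal pairs that differ only in vowel_a vs. vowel_b.

    Homophones collapse automatically because they share one Jyutping form,
    and tone is part of the frame, so both members must carry the same tone.
    """
    frames_a, frames_b = set(), set()
    for syllable in set(syllables):
        parsed = parse_syllable(syllable)
        if parsed is None:
            continue
        onset, nucleus, coda, tone = parsed
        if coda in ("i", "u"):
            continue  # assumed diphthong, not a monophthong contrast
        frame = (onset, coda, tone)
        if nucleus == vowel_a:
            frames_a.add(frame)
        elif nucleus == vowel_b:
            frames_b.add(frame)
    return len(frames_a & frames_b)
```

For instance, a list containing gun1 (for both 官 and 棺) and gyun1 (for 鵑, 涓, and 捐) yields a single /y/~/u/ minimal pair when called with `"yu"` and `"u"`.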
13.3.3.2 Phoneme Token Frequency
Wedel et al. (2013) showed that phoneme probability predicted mergers in cases in which two phonemes lack minimal pairs. Since it was known a priori that this is not the case for Cantonese /y/~/u/ and /a/~/ɔ/, phoneme token frequency was chosen as an alternative measure of input. This was calculated simply as the number of times each phoneme occurs in the HLVC Corpus. All monophthong vowels were included so that the relative occurrence of each phoneme could be compared with that of the other relevant phonemes. Since the syllable is the unit of analysis, the number of phonemes is equal to the number of syllables. This facilitated the counting of phoneme tokens.
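A token count of this kind might look like the following Python sketch. It assumes a space-separated Jyutping transcript, tallies nuclei by their Jyutping spelling rather than by IPA category, and skips syllables with a glide coda (i or u) on the assumption that these are diphthongs. The function name `nucleus_frequencies` is hypothetical.

```python
import re
from collections import Counter

# Simplified Jyutping shape; group 1 is the nucleus, group 2 the coda.
_NUCLEUS = re.compile(
    r"^(?:ng|gw|kw|[bpmfdtnlgkhwzcsj])?"
    r"(aa|yu|oe|eo|a|e|i|o|u)"
    r"(ng|[iumnptk])?"
    r"[1-6]$"
)


def nucleus_frequencies(transcript):
    """Map each monophthong spelling to (token count, share of all counted tokens)."""
    counts = Counter()
    for syllable in transcript.split():
        match = _NUCLEUS.match(syllable)
        # Skip unparsable syllables and assumed diphthongs.
        if match and match.group(2) not in ("i", "u"):
            counts[match.group(1)] += 1
    total = sum(counts.values())
    return {vowel: (n, n / total) for vowel, n in counts.items()}
```

Running the function separately on the transcripts of each generational group would give per-group counts that can be compared across vowels.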
13.4 Results
13.4.1 Vowel Merger Analysis
First are results for /y/~/u/ (Section 13.4.1.1), followed by results for /a/~/ɔ/ (Section 13.4.1.2). The results below include boxplots. In all boxplots (Figures 13.2, 13.3, 13.4, and 13.5), the dark-shaded part indicates values below the mean while the light-shaded part indicates values above the mean. The braces indicate statistical significance based on results from Bonferroni post-hoc tests. A complete list of all the PSs calculated for each individual speaker is shown in the Appendix.

Figure 13.2 Boxplots showing /y/~/u/ Pillai scores by generational group.

Figure 13.3 The /y/~/u/ Pillai scores at the group level based on dominant language.

Figure 13.4 Boxplots showing /a/~/ɔ/ Pillai scores by generational group.

Figure 13.5 The /a/~/ɔ/ Pillai scores at the group level based on dominant language.
13.4.1.1 The /y/~/u/ Pair
As shown in both Table 13.4 and Figure 13.2, there is an inter-generational decrease in average /y/~/u/ PSs, with Gen0 having the highest average, followed by Gen1. The Gen2 group has the lowest average, and hence, the least distinct acoustic productions of /y/~/u/. This decrease in average PSs is also accompanied by a decrease in the highest and lowest scores within the range of each group. The speaker with the highest PS (0.944) is in the Gen0 group, while the speaker with the lowest PS (0.565) is in the Gen2 group. There is also an increase in variation within each group. For the Gen0 group, the difference between the highest and lowest PS is 0.081. The difference goes up to 0.137 for Gen1 and up to 0.361 for Gen2. The standard deviation also increases when moving from Gen0 to Gen2. As shown by the braces in the boxplots in Figure 13.2, the difference between Gen0 and Gen2 is significant (p < 0.05), while the differences between Gen0 and Gen1, as well as between Gen1 and Gen2, are not significant. When combining Gen0 and Gen1 together into a Cantonese-dominant group, as shown in Figure 13.3, we find a statistically significant difference, with the English-dominant group (i.e., Gen2) having a lower /y/~/u/ PS (p < 0.05).
Table 13.4 Summary statistics of /y/~/u/ Pillai scores.
| Group | Range | Difference | Average ± 1 SD |
|---|---|---|---|
| Gen0 | 0.863–0.944 | 0.081 | 0.921±0.030 |
| Gen1 | 0.798–0.935 | 0.137 | 0.871±0.046 |
| Gen2 | 0.565–0.926 | 0.361 | 0.823±0.108 |
13.4.1.2 The /a/~/ɔ/ Pair
For /a/~/ɔ/ (Table 13.5 and Figure 13.4), we see a pattern resembling a V-shape rather than an incremental decrease in average scores moving from left to right. The Gen1 group has the lowest average PSs, while Gen2 is in between Gen0 and Gen1. Also, unlike the case for /y/~/u/, Gen0 rather than Gen2 has the widest range of scores as well as the highest standard deviation. Results from Bonferroni post-hoc tests, which are indicated in the braces in Figure 13.4, show that only the difference between Gen0 and Gen1 is significant (p < 0.01). When grouping by dominant language (Figure 13.5), however, the difference between Cantonese- and English-dominant speakers is not significant.
Table 13.5 Summary statistics of /a/~/ɔ/ Pillai scores.
| Group | Range | Difference | Average ± 1 SD |
|---|---|---|---|
| Gen0 | 0.600–0.816 | 0.216 | 0.750±0.070 |
| Gen1 | 0.591–0.751 | 0.160 | 0.665±0.051 |
| Gen2 | 0.617–0.776 | 0.159 | 0.690±0.053 |
13.4.2 Functional Load Analysis
13.4.2.1 Minimal Pair Count
Using the criteria presented in Section 13.3.3.1, a total of five monosyllabic /y/~/u/ minimal pairs were identified in the Yue-Hashimoto (1972) syllabary. Table 13.6 shows examples of these minimal pairs. This list includes homophones, which are not treated as distinct words. Three out of the five minimal pairs identified in Yue-Hashimoto (1972) also occur in the HLVC Corpus. Yue-Hashimoto (1972) also discussed a near complementary distribution relationship, with /y/ occurring exclusively following non-labial onsets and /u/ almost exclusively with labial (i.e., bilabial, labio-dental, labio-velar) onsets. The exception is with velar onsets. These minimal pairs reflect these inventory gaps.
Table 13.6 Monosyllabic /y/~/u/ minimal pairs.
| /u/ syllable | /u/ sample word | /u/ # of other homophones | /y/ syllable | /y/ sample word | /y/ # of other homophones | Occurs in HLVC? |
|---|---|---|---|---|---|---|
| gun1 | 觀 (‘to observe’) | 3 | gyun1 | 捐 (‘donate’) | 2 | Yes |
| gun2 | 館 (‘public building’) | 3 | gyun2 | 捲 (‘roll’) | 1 | Yes |
| gun3 | 罐 (‘a can or metal container’) | 6 | gyun3 | 券 (‘ticket’) | 1 | Only gun3 occurs |
| gut6 | 嗗 (‘sound of swallowing’) | 0 | gyut6 | 橛 (‘chunk’, ‘section’) | 0 | Only gyut6 occurs |
| kut3 | 括 (‘include’) | 1 | kyut3 | 決 (‘decide’) | 6 | Yes |
For /a/~/ɔ/, there are fewer phonotactic restrictions and fewer inventory gaps. Table 13.7 presents this information by showing different onsets in each row and examples of monosyllabic words, including each onset combined with either /a/ (second column) or /ɔ/ (third column). This is similar to the organization presented in Yue-Hashimoto’s (1972) syllabary. The total number of minimal pairs identified in Yue-Hashimoto is 108, while the total in the HLVC Corpus is 77.
Table 13.7 Monosyllabic /a/~/ɔ/ minimal pairs.
| Possible onsets | Sample word with /a/ | Sample word with /ɔ/ | Minimal pairs in Yue-Hashimoto (1972) | Minimal pairs in HLVC Corpus |
|---|---|---|---|---|
| Bilabial | baa1, 爸 (‘father’) | bo1, 玻 (‘glass’) | 21 | 15 |
| Labio-dental | faa1, 花 (‘flower’) | fo1, 科 (‘class’) | 4 | 3 |
| Labio-velar | waa1, 哇 (‘crying sound’) | wo1, 鍋 (‘pot, pan’) | 9 | 6 |
| Alveolar | saa1, 沙 (‘sand’) | so1, 梳 (‘comb’) | 39 | 26 |
| Palatal | jaa1, 吔 (‘cry of pain’) | jo1, 喲 (‘oh’) | 1 | 0 |
| Velar | gaa1, 家 (‘family’) | go1, 哥 (‘elder brother’) | 17 | 12 |
| Glottal | haan6, 限 (‘limit’) | hon6, 汗 (‘sweat’) | 12 | 10 |
| Zero-onset | aan3, 晏 (‘late’) | on3, 按 (‘press’) | 5 | 5 |
| TOTAL | | | 108 | 77 |
13.4.2.2 Phoneme Token Frequency
Figure 13.6 shows the token frequency of each of the eleven monophthongs in Cantonese across each generational group and based on a total of 253,861 phoneme tokens from the HLVC Corpus. The three most frequently occurring monophthongs are /a/, /ɔ/, and /ɐ/, while the three least frequent monophthongs are /y/, /u/, and /ɵ/.

Figure 13.6 Phoneme token frequency in the HLVC Corpus.
13.5 Discussion and Conclusion
The first research question is about whether there is a decrease in PSs based on generational group or dominant language. For /y/~/u/, we do, in fact, see the Gen2 group with significantly lower PSs than the Gen0 group. The lack of a significant difference between Gen0 and Gen1 (n = 20) and between Gen1 and Gen2 (n = 24) could be due to small sample sizes. When grouping instead by dominant language (n = 32), the English-dominant group has significantly lower PSs than the Cantonese-dominant group. This is exactly as expected, given the hypothesis that decreased input favors vulnerability to change.
For /a/~/ɔ/, Gen1 speakers have lower PSs than Gen0 speakers. The differences between Gen0 and Gen2 (n = 20) and between Gen1 and Gen2 (n = 24) are not significant. As was the case for /y/~/u/, this could be due to small sample sizes. When grouping by dominant language (n = 32), we also find no significant difference. Thus, from these results, we can conclude that there is stronger evidence for a decrease in phonetic distinctiveness for /y/~/u/ based on linguistic dominance than there is for /a/~/ɔ/.
The second research question is about FL differences between the two vowel pairs. The pair /a/~/ɔ/ has far more minimal pairs than /y/~/u/. We get the same results whether we use the HLVC Corpus or Yue-Hashimoto (1972). In the HLVC Corpus, there are seventy-seven /a/~/ɔ/ minimal pairs, compared to just three /y/~/u/ minimal pairs. This is more than a twenty-five-fold difference. In Yue-Hashimoto, there are 108 /a/~/ɔ/ minimal pairs, compared to just five /y/~/u/ minimal pairs. This is also more than a twenty-fold difference. The vowels /a/ and /ɔ/ are also among the most frequently occurring in the HLVC Corpus, while /y/ and /u/ are both among the least frequent. Thus, from these results, we can conclude that /a/~/ɔ/ has a much higher FL than /y/~/u/ in terms of type frequency. Both /a/ and /ɔ/ are also far more common in terms of token frequency.
The final research question is about the implications of these findings for the FLH. The results support the FLH based on the current analysis of two vowel pairs. Both pairs could potentially merge due to dominant language transfer because of a two-to-one correspondence between the HL and the dominant language. Yet the results show English dominance as a significant factor only for the lower FL pair: /y/~/u/. We do not see the English-dominant group producing significantly less distinct variants of /a/~/ɔ/, the higher FL pair (or at least this is the case based on F1/F2 values). These results support the FLH in an HL context by showing more evidence for vulnerability to change for the lower FL pair.
These results also complement previous studies of Toronto Cantonese that suggest that /y/~/u/ merger is a contact-induced change (e.g., Tse, 2022). They suggest that FL may be a more specific mechanism facilitating contact-induced change. In this case, the English-dominant group produced less F1/F2 contrast between /y/~/u/ than the Cantonese-dominant group. Yet the two groups do not differ significantly in distinguishing between /a/~/ɔ/, which has far higher FL. This is the case even though neither contrast exists in Toronto English. Thus, these results show that, given two mergers that could potentially develop based on cross-linguistic differences, FL helps predict which of the two is more likely to develop.
The results highlight an important difference between a monolingual context and an HL context. Cantonese-dominant (and Cantonese monolingual) speakers have access to a greater amount of Cantonese input than English-dominant speakers. We see this effect in the maintenance of the /y/~/u/ contrast in the Cantonese-dominant group and in the decreased F1/F2 contrast in the English-dominant group. Since Cantonese-dominant speakers use more Cantonese, they are more likely to encounter situations in which maintaining the /y/~/u/ distinction is important than is the case for English-dominant speakers. Thus, even though /y/~/u/ has a very low FL, Cantonese-dominant speakers use Cantonese often enough to maintain the distinction. It could be the case that it is only when input goes below a certain threshold that this would lead to a merger between these two sounds. Since the English-dominant group is less likely to receive enough input to stay above this threshold, we see a stronger trend towards merger within the English-dominant group.
We see further input effects when examining variation within the Gen2 group, which has the widest range of /y/~/u/ PSs. Figure 13.7 shows a /y/~/u/ plot from the Gen2 speaker with the highest PS, while Figure 13.8 is a plot from the speaker with the lowest PS. For the former, tokens of /y/ and /u/ are clearly separated from each other, while for the latter, there are three tokens of /y/ that have a lower F2 than the two most-fronted tokens of /u/. Such inter-speaker variation is likely related to variable levels of Cantonese linguistic dominance and proficiency within the Gen2 group, as suggested by Tse (2022), which showed that lower proficiency proxy scores were associated with /y/-retraction and /u/-fronting. Weaker proficiency in Cantonese translates to greater vulnerability to change for low FL contrasts. Thus, an account based on FL goes hand-in-hand with an explanation based on language transfer. While cross-linguistic differences set the basis for potential language transfer effects, the actual language transfer that develops is mediated by FL differences.

Figure 13.7 Plot for C2F21B, the Gen2 speaker with the highest Pillai score. Squares represent tokens of /y/ and circles represent /u/.

Figure 13.8 Plot for C2M22A, the Gen2 speaker with the lowest Pillai score. Squares represent tokens of /y/ and circles represent /u/.
This study has several limitations. First, it involved only two pairs of phonemes. To provide further support for the FLH, it would be helpful to show how the FL of /y/~/u/ compares to that of other sounds involved in known mergers that are taking place in Cantonese, such as consonantal and tonal mergers. It would also be helpful to distinguish between type and token frequency. In the current study, the merging pair had both low type and low token frequency. It is unclear which is more important in facilitating mergers. Another limitation is that potential mergers were assessed based only on F1 and F2 values. There could be changes in other vowels in Toronto Cantonese that involve different acoustic features. For example, /ɵ/ was shown to have a lower phoneme frequency than either /y/ or /u/. This vowel did not show any inter-generational differences based on F1/F2 (Tse, 2019b), but it could potentially be changing based on other features, such as F3.
To conclude, while research on FL has largely focused on monolingual settings, this chapter shows that the FLH is also supported in an HL context. Quantitative metrics of FL may be helpful in understanding variation across HL speakers. The results show stronger evidence for cross-linguistic convergence where there is a low FL. Thus, FL is a concept that is applicable in monolingual as well as HL settings.







