Introduction
Unfair treatment based on how a person sounds is pervasive, especially in professional settings, where speakers of stigmatized varieties, such as second language (L2) speakers and gay men, often face unequal treatment (Maindidze et al., Reference Maindidze, Randall, Martin-Raugh and Smith2025). These biases, which can manifest in discriminatory outcomes (e.g., microaggressions, barriers to employment), rely on socially learned associations between vocal cues and personality traits reinforced through cultural representations that link certain ways of speaking with warmth, competence, or authority (Fasoli et al., Reference Fasoli, Maass, Paladino and Sulpizio2017; Spence et al., Reference Spence, Hornsey, Stephensen and Imuta2024). Whereas listener reactions to gay-sounding and accented L2 speakers are well documented, there is little understanding of how listeners perceive speakers with a double minority status, such as gay-sounding L2 speakers, particularly in workplace-relevant contexts. Similarly, it is unclear how contextual variables, such as a job’s communication demands (e.g., barista vs. cleaner) or sexual orientation stereotypicality (e.g., interior decorator vs. engineer), influence the treatment of speakers with multiple voice-cued identities. Finally, there are no insights into how listeners’ processing fluency (Dragojevic et al., Reference Dragojevic, Giles, Beck and Tatum2017) shapes evaluations of speakers with intersecting identities. We therefore examined how listeners evaluate the employability of gay- and straight-sounding first language (L1) and L2 speakers applying for jobs considered high or low in communication demands, gay- or straight-typed, and varying in prestige.
Background literature
Listener bias against gay-sounding and L2-accented speakers
Listeners are quite skilled at sorting various speech patterns into categories (Kinzler, Reference Kinzler2021), which includes labeling a speaker as sounding gay, such as through an association between a speaker’s vocal cues and culturally circulating stereotypes of gay men (Podesva et al., Reference Podesva, Roberts, Campbell-Kibler, Campbell-Kibler, Podesva, Roberts and Wong2002). Being categorized as gay, however, often comes with judgment, especially in workplace-relevant evaluations. Gay and gay-sounding individuals are frequently considered warm, organized, and pleasant to talk to, along with having excellent social skills (Fasoli & Maass, Reference Fasoli and Maass2020; Hajek & Giles, Reference Hajek and Giles2005; Niedlich & Steffens, Reference Niedlich and Steffens2015; Taylor & Raadt, Reference Taylor and Raadt2021). However, compared to straight-sounding speakers, they are perceived as less competent and less moral (Fasoli et al., Reference Fasoli, Maass, Paladino and Sulpizio2017), are downgraded on measures of leadership, management, and maturity (Taylor & Raadt, Reference Taylor and Raadt2021), and are believed to lack the qualities of a successful manager (Liberman & Golom, Reference Liberman and Golom2015). Importantly, men who sound gay are considered less employable than those who sound straight, partly because they are perceived as less competent (e.g., Fasoli & Formanowicz, Reference Fasoli and Formanowicz2024; Fasoli & Hegarty, Reference Fasoli and Hegarty2020).
Similarly damaging for gay and gay-sounding individuals is that their personal and professional qualities are often stereotyped in relation to specific occupations (Kite & Deaux, Reference Kite and Deaux1987; Li & Wei, Reference Li and Wei2024). For instance, gay and feminine-presenting men are perceived as more suitable for gay-typed professions, such as a nurse or a makeup artist, than for stereotypically straight occupations, such as an auto mechanic or an engineer (Fasoli & Teasdale, Reference Fasoli and Teasdale2025; Rule et al., Reference Rule, Bjornsdottir, Tskhay and Ambady2016; Steffens et al., Reference Steffens, Niedlich, Beschorner and Köhler2019). Influenced by the gendered stereotype of a workplace, gay men are considered less effective in male-typed organizations, such as an auto repair shop, but more effective in female-typed establishments, such as a spa, even when they perform the same job as straight men (Pellegrini et al., Reference Pellegrini, De Cristofaro, Giacomantonio and Salvati2020). Clearly, as with the indexicality of a speaker’s pronunciation, contextual (e.g., job-related) stereotypes are socially constructed through expectations tied to specific gender and sexuality roles. As a result, judgments about who fits and who does not fit a certain occupation reflect not only gender and sexuality stereotypes but also socially imposed ideas such as which jobs are considered high or low in status or which jobs are perceived as masculine versus feminine or straight- versus gay-typed.
Like stereotypically gay-sounding men, accented L2 speakers similarly elicit biased judgments in workplace-relevant evaluations (Lippi-Green, Reference Lippi-Green2012). L2 speakers whose accents are more nativelike and easier to understand are generally preferred to those with heavier, harder-to-understand accents (Derwing & Munro, Reference Derwing and Munro2009; Tsunemoto et al., Reference Tsunemoto, McAndrews, Trofimovich and Friginal2023). In the workplace, L2 accents are frequently targeted by negative stereotypes, with different standards applied in the evaluation of accented speakers (Russo et al., Reference Russo, Islam and Koyuncu2017). These stereotypes often reflect the perception that L2 speakers are less competent and intelligent than L1 speakers (Dragojevic et al., Reference Dragojevic, Giles, Beck and Tatum2017; Hosoda et al., Reference Hosoda, Stone-Romero and Walter2007) and the commonly held belief that speaking without an L2 accent is a desired, prized skill (Creese & Wiebe, Reference Creese and Wiebe2012). In Canada, for example, physicians who speak with a Chinese accent are downgraded in professional evaluations relative to those speaking with Canadian accents (Baquiran & Nicoladis, Reference Baquiran and Nicoladis2020). In the United States, Black African-born nurses get belittled by colleagues for their accent and receive complaints that they are inadequately skilled (Iheduru-Anderson, Reference Iheduru-Anderson2020).
As with the treatment of gay-sounding individuals, workplace manifestations of L2 accent bias become nuanced in light of job-relevant, contextual variables such as a job’s prestige and communication demands. In terms of job prestige, L2 speakers are preferred for low- versus high-status positions (Kalin & Rayko, Reference Kalin and Rayko1978) and for semi-skilled over supervisory roles (De La Zerda & Hopper, Reference De La Zerda and Hopper1979; Iheduru-Anderson, Reference Iheduru-Anderson2020). In terms of job-relevant communication, accented L2 speakers are downgraded relative to L1 speakers in employment evaluations for customer-facing jobs (Hosoda & Stone-Romero, Reference Hosoda and Stone-Romero2010; Timming, Reference Timming2017), consistent with how ethnically minoritized groups are typically found most suitable for non-customer-facing positions (Derous et al., Reference Derous, Pepermans and Ryan2017).
Sources of listener bias
According to several perspectives, listener bias originates in social categorization, where, broadly speaking, listeners first use vocal cues to define a speaker’s social belonging and then extend the stereotypical views associated with that group to all its members. Conceptually, listeners’ evaluations are shaped by a combination of cognitive and social factors, including coarse categorical judgments such as “native” versus “nonnative” (Fiske & Neuberg, Reference Fiske, Neuberg and Zanna1990), individuating information such as a speaker’s traits and behaviors (Kunda & Thagard, Reference Kunda and Thagard1996), and context-specific linguistic and sociocultural expectations (Ryan, Reference Ryan1983). To infer a speaker’s sexual orientation, listeners rely on several voice cues such as vowel quality and pitch (Munson, Reference Munson2007; Suire et al., Reference Suire, Tognetti, Durand, Raymond and Barkat-Defradas2020). Although the relevant acoustic cues may vary, listeners reliably and efficiently identify gay-sounding voices across multiple languages (Sulpizio et al., Reference Sulpizio, Fasoli, Maass, Paladino, Vespignani, Eyssel and Bentler2015, Reference Sulpizio, Fasoli, Lapomarda and Vespignani2025). Similarly, to infer a speaker’s L2 status, listeners detect even the slightest departure from the expected speech pattern, using fine-grained phonetic cues in segments or prosody (Munro et al., Reference Munro, Derwing and Burgess2010).
Clearly, group-focused stereotyping (e.g., gay persons lack morals, L2 speakers are less skilled) is a learned association, given that listeners internalize these views through life experiences or from the media and popular culture (Kinzler, Reference Kinzler2021). These beliefs tend to interact with contextual factors, with consequences for how speakers are evaluated. According to the role congruity and lack-of-fit perspectives, bias arises when the traits ascribed to a person (or an entire group) are perceived as being incongruent with the expected requirements of a given role or occupation (Eagly & Karau, Reference Eagly and Karau2002; Heilman, Reference Heilman2012). For example, when listeners hear a gay-sounding job applicant, they might generate trait inferences about the speaker (e.g., he is warm but lacks agency or authority). If these assumed traits mismatch the expected role requirements, such as for a job considered masculine or communication-heavy, listeners might penalize the speaker in their evaluations (e.g., Fasoli & Hegarty, Reference Fasoli and Hegarty2020; Spence et al., Reference Spence, Hornsey, Stephensen and Imuta2024). Conversely, listeners can reverse this pattern when they perceive congruity or fit between the presumed speaker traits and job requirements, such as for stereotypically feminine occupations or those emphasizing interpersonal warmth, occasionally leading to an advantage for gay-sounding applicants whose voices signal sociability or empathy (e.g., Fasoli & Teasdale, Reference Fasoli and Teasdale2025).
Another pathway for listeners to engage in evaluative behaviors, leading to speakers being downgraded or undervalued in assessments, is experiential (Dragojevic & Giles, Reference Dragojevic and Giles2016; Dragojevic et al., Reference Dragojevic, Giles, Beck and Tatum2017). According to this view, listeners might experience processing disfluency, finding it more difficult to understand speakers who sound unfamiliar to them or whose speech is uncommon in their environment. The experience of difficulty might therefore create feelings of frustration or general negativity for listeners, resulting in unfavorable evaluations. Listeners essentially seem to blame the speaker for their communication difficulty or feeling of discomfort, for instance, by downgrading the speaker whose speech feels harder to understand in ratings of competence. Importantly, ease of understanding (also referred to as comprehensibility) and accentedness (degree of L2 accent) are distinct perceptual dimensions of L2 speech (Nagle & Huensch, Reference Nagle and Huensch2020), though a speaker’s accentedness can also influence listener ratings (Dragojevic et al., Reference Dragojevic, Giles, Beck and Tatum2017; Teló et al., Reference Teló, Trofimovich and O’Brien2022, Reference Teló, Silveira, Marcelino and O’Brien2024). The role of processing fluency in evaluations is salient for accented L2 speech (Dragojevic & Dayton, Reference Dragojevic and Dayton2025; Dragojevic et al., Reference Dragojevic, Giles, Goatley-Soan and Dayton2025), which is often more effortful to understand than L1 speech (Nagle & Huensch, Reference Nagle and Huensch2020), but may also extend to other nonstandard varieties such as stereotypically gay-sounding speech (Sumner et al., Reference Sumner, Kim, King and McGowan2014).
Current research gaps
Although listener bias is well documented, there is little understanding of how listeners perceive individuals belonging to multiple, intersecting social categories. Conceptually speaking, additive and double-jeopardy approaches suggest that biases associated with each category accumulate, whereby double-minority identities such as gay-sounding L2 speakers are more dissimilar to normative groups and elicit compounded negative evaluations (e.g., Beale, Reference Beale and Bambara1970; King, Reference King1988). In turn, category-based perspectives consider how a single category, such as being a gay man or an L2 speaker, dominates other minoritized identities (e.g., Brown & Turner, Reference Brown and Turner1979; Urada et al., Reference Urada, Stenstrom and Miller2007). Regardless of its source, people with intersecting stigmatized identities often experience some form of compounded discrimination (e.g., Remedios & Snyder, Reference Remedios and Snyder2018), with most evidence thus far emerging from groups whose identities are cued visually, such as through gender, race, or age.
To date, only a handful of studies have focused on individuals whose voice signals intersecting identities. This research has primarily examined associations between specific linguistic features (e.g., pitch, realizations of –ing, spectral characteristics of /s/) and their socially indexical meanings (e.g., regional accent, social class, gender, sexual orientation), revealing voice-based crossed categorization patterns such as when speakers with an urban accent tend to be perceived as gay (Campbell-Kibler, Reference Campbell-Kibler2007, Reference Campbell-Kibler2011; Levon, Reference Levon2014). When linguistic features stereotypically associated with gay men (e.g., a sharp, bright /s/) co-occur with cues signaling class or gender (e.g., a careful –ing pronunciation), listeners may reorganize their judgments, yielding recognizable profiles like the “smart but effeminate gay man” versus the “masculine but unintelligent straight man” (Campbell-Kibler, Reference Campbell-Kibler2011). More recently, Fasoli et al. (Reference Fasoli, Dragojevic, Rakić and Johnson2023b) had British listeners evaluate brief utterances spoken by British and Italian speakers who self-identified as gay or straight. In terms of social categorization, gay-sounding Italian speakers were identified more often as gay than gay-sounding British speakers and were also labeled more often as L2 speakers than straight-sounding Italian speakers. In terms of perceptions, gay-sounding L2 speakers were judged as the least competent of all speakers, revealing a penalty for belonging to two stigmatized groups. Fasoli et al. (Reference Fasoli, Dragojevic and Rakić2023a) explored perceived stigmatization among gay L2 speakers. Unexpectedly, there was little evidence that intersectionality led to double stigmatization; instead, speakers perceived their voice as a cue to one social category at a time.
Considering the important role of context in social evaluations (Wigboldus et al., Reference Wigboldus, Spears and Semin2005), there is similarly a lack of understanding of how specific communicative settings of private or workplace interaction moderate listener evaluations of speakers with intersecting voice-cued identities. For example, even though high-skilled immigrants are generally preferred over low-skilled immigrants (Hainmuller & Hiscox, Reference Hainmuller and Hiscox2010), newcomers are often relegated to low-paying, low-skilled jobs and preferred for jobs with low communication demands (Spence et al., Reference Spence, Hornsey, Stephensen and Imuta2024). Relatedly, gay-sounding men often face challenges in employment and career advancement (Drydakis, Reference Drydakis2022; Hoel et al., Reference Hoel, Lewis, Einarsdóttir, D’Cruz, Noronha, Caponecchia, Escartín, Salin and Tuckey2021), with gay and gay-sounding individuals consistently assigned lower suitability ratings for high-status, leadership positions (Fontenele et al., Reference Fontenele, de Souza and Fasoli2023; Gerrard et al., Reference Gerrard, Morandini and Dar-Nimrod2023) and also receiving nearly half the number of call-backs for blue-collar jobs compared to heterosexual men (Dilmaghani & Robinson, Reference Dilmaghani and Robinson2024). With some exceptions (e.g., Fasoli & Hegarty, Reference Fasoli and Hegarty2020), though, research on gay-sounding speakers has mostly examined high-status, managerial jobs, which limits the generalizability of findings given that managerial positions are typically high-status, masculine roles.
Lastly, there is presently little research exploring the role of comprehensibility (a measure of processing fluency) in voice-based evaluations of gay-sounding L2 speakers. Most work on processing fluency has centered on L2 speakers, where listeners tend to downgrade the speakers they find difficult to understand in ratings of competence, intelligence, and success, and also attribute more feelings of annoyance and irritation to them, compared to speakers who are easier to understand (Dragojevic et al., Reference Dragojevic, Giles, Beck and Tatum2017). A similar mechanism may extend to gay-sounding voices. In this context, fluency has been examined as categorization fluency—the ease with which listeners identify a speaker’s social category—rather than as processing or comprehension-focused fluency (Masi & Fasoli, Reference Masi and Fasoli2022). Admittedly, gay-signaling voice cues do not inherently impair linguistic decoding (Sulpizio et al., Reference Sulpizio, Fasoli, Lapomarda and Vespignani2025; but see Fasoli et al., Reference Fasoli, Maass, Karniol, Antonio and Sulpizio2020); nevertheless, they might disrupt processing fluency by activating gender or sexuality expectations and stereotypes (Mack & Munson, Reference Mack and Munson2012; Sulpizio et al., Reference Sulpizio, Fasoli, Lapomarda and Vespignani2025; Sumner et al., Reference Sumner, Kim, King and McGowan2014), similar to how expectations about a speaker’s gender, place of origin, or L2 status influence listener judgments even when no such cues are present in the actual speech signal (Lindemann & Subtirelu, Reference Lindemann and Subtirelu2013; Niedzielski, Reference Niedzielski1999; Strand, Reference Strand1999; Taylor Reid et al., Reference Taylor Reid, Trofimovich and O’Brien2019; Tripp & Munson, Reference Tripp and Munson2022). 
For instance, when the social persona projected through speech is incongruent with the communicative context (e.g., a gay-sounding job applicant considered for a stereotypically masculine job), listeners may experience a reduction in processing fluency, akin to how expectation mismatches are known to affect or otherwise slow down language processing (Jiang & Hong, Reference Jiang and Hong2014; Mack & Munson, Reference Mack and Munson2012; Walker & Hay, Reference Walker and Hay2011). Ultimately, these processing fluency costs could contribute to less favorable social and professional judgments of gay-sounding speakers.
The current study
We addressed these research gaps by investigating listener reactions to gay- and straight-sounding L1 and L2 English speakers. We focused on the listener-rated professional dimension of speaker employability, which is a high-stakes decision with consequences for stigmatized groups (Fontenele et al., Reference Fontenele, de Souza and Fasoli2023; Teló et al., Reference Teló, Trofimovich, O’Brien, Le and Bodea2025). We specifically examined whether the dual voice-signaled identity of a gay-sounding L2 speaker would be associated with compounded negativity in listener evaluations and whether processing fluency mediated this process, extending prior work on listener attitudes toward speakers with these intersecting voice-cued identities (Fasoli et al., Reference Fasoli, Dragojevic and Rakić2023a, Reference Fasoli, Dragojevic, Rakić and Johnson2023b).
Unlike Fasoli et al. (Reference Fasoli, Dragojevic, Rakić and Johnson2023b), who asked listeners to evaluate brief, decontextualized sentences (e.g., “the dog runs in the park,” “the English course starts on Monday”), we elicited listener evaluations in response to 40-second audio recordings of speakers responding to a job interview question. To contextualize potential listener bias in the workplace, where variables such as the type of job performed by the speaker can affect listener judgments (Spence et al., Reference Spence, Hornsey, Stephensen and Imuta2024), our speakers took on the role of job candidates applying to occupations that differed in communication demands (high vs. low) and sexual orientation stereotypicality (gay- vs. straight-typed), while accounting for job prestige. Finally, we modeled the role of processing fluency in speakers’ employability by asking listeners to estimate how easy or difficult it was for them to understand each speaker. Following the practice from previous processing fluency research (e.g., Dragojevic & Giles, Reference Dragojevic and Giles2016) and evidence that both speaker status and sexual orientation can influence language processing (e.g., Nagle & Huensch, Reference Nagle and Huensch2020; Sulpizio et al., Reference Sulpizio, Fasoli, Lapomarda and Vespignani2025), we tested a mediation model in which speaker status and sexual orientation predicted employability indirectly through processing fluency.
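The logic of such a mediation model can be illustrated with a minimal sketch, assuming a single binary predictor (0 = L1, 1 = L2), one mediator (comprehensibility), and ordinary least squares without random effects. This is not the analysis reported in the study, which may differ in estimator and model structure. The function name `simple_mediation` and the toy scores are hypothetical and serve only to show how an indirect effect (path a multiplied by path b) relates to the total effect.

```python
from statistics import fmean


def _cov(u, v):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def simple_mediation(x, m, y):
    """Estimate a single-mediator model with ordinary least squares.

    a       : effect of predictor x on mediator m
    b       : effect of mediator m on outcome y, controlling for x
    c_prime : direct effect of x on y, controlling for m
    indirect: a * b
    total   : effect of x on y without the mediator
    """
    vx, vm = _cov(x, x), _cov(m, m)
    cxm, cxy, cmy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = cxm / vx
    denom = vx * vm - cxm ** 2  # zero only if x and m are perfectly collinear
    b = (cmy * vx - cxm * cxy) / denom
    c_prime = (cxy * vm - cxm * cmy) / denom
    return {"a": a, "b": b, "c_prime": c_prime,
            "indirect": a * b, "total": cxy / vx}


# Hypothetical toy scores (NOT study data): 0 = L1 speaker, 1 = L2 speaker.
status = [0, 0, 0, 0, 1, 1, 1, 1]
comprehensibility = [90, 85, 95, 80, 60, 70, 55, 65]
employability = [80, 78, 88, 70, 55, 66, 50, 62]

paths = simple_mediation(status, comprehensibility, employability)
print(paths)
```

In this least-squares setting the total effect decomposes exactly into the direct effect plus the indirect effect, which is why a reliable indirect path is read as evidence that the predictor operates through the mediator. Inference on the indirect effect (e.g., bootstrapped confidence intervals) is omitted from the sketch.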
Our research question was the following: How are listener evaluations of employability for gay- and straight-sounding L1 and L2 speakers associated with speaker (language status, sexual orientation), listener (comprehensibility), and context (job communication demands, stereotypicality, prestige) variables? We anticipated that L1 speakers would elicit higher employability ratings than L2 speakers (Maindidze et al., Reference Maindidze, Randall, Martin-Raugh and Smith2025; Spence et al., Reference Spence, Hornsey, Stephensen and Imuta2024) and that gay-sounding L2 speakers would receive the lowest evaluations of all speakers (Fasoli et al., Reference Fasoli, Dragojevic, Rakić and Johnson2023b). Considering the importance of contextual, job-relevant variables, we further expected that L1 speakers would receive particularly high employability ratings when considered for high-communication jobs (Spence et al., Reference Spence, Hornsey, Stephensen and Imuta2024). In contrast, we expected gay-sounding speakers to receive higher employability ratings when considered for stereotypically gay- than straight-typed jobs (Fasoli & Teasdale, Reference Fasoli and Teasdale2025; Rule et al., Reference Rule, Bjornsdottir, Tskhay and Ambady2016). We did not make specific predictions regarding the interaction between speakers’ sexual orientation and the job’s communication demands. Whereas gay men can be perceived as having better communication skills (Hajek & Giles, Reference Hajek and Giles2005; Niedlich & Steffens, Reference Niedlich and Steffens2015), they might also be preferred for non-customer-facing jobs (Derous et al., Reference Derous, Pepermans and Ryan2017). Finally, with respect to comprehensibility, we predicted that a speaker’s language status (L1 vs. L2) and sexual orientation (gay vs. straight) could influence processing fluency, such that cues associated with L2-accented and stereotypically gay-sounding speech would reduce listeners’ ease of understanding.
In line with previous processing fluency research (Dragojevic et al., Reference Dragojevic, Giles, Goatley-Soan and Dayton2025; Teló et al., Reference Teló, Trofimovich, O’Brien, Le and Bodea2025), we also anticipated that employability ratings would be higher for more comprehensible than for less comprehensible speakers.
Method
Audio recordings
There were four scripted scenarios featuring a male job candidate for one of four jobs responding to the question “What makes you a good candidate for this job?” (see Appendix A in Supplementary Information). The jobs differed in communication demands (high vs. low) and occupational stereotypicality (gay vs. straight): bus driver (low communication, straight-typed), school principal (high communication, straight-typed), fashion designer (low communication, gay-typed), and flight attendant (high communication, gay-typed). They were selected from 16 occupations (Hancock et al., Reference Hancock, Clarke and Arnold2020) evaluated by 12 raters from the University of Calgary’s linguistics participant pool for communication demands (1 = very low, 100 = very high) and stereotypicality (1 = very straight, 100 = very gay) on 100-point scales. According to Tukey tests, flight attendant (M = 75) and fashion designer (M = 80) were perceived as stereotypically gay (p = .999) but differed in communication demands (M = 87 vs. 47, p < .001). School principal (M = 29) and bus driver (M = 22) were considered straight (p = .999) but differed in communication demands (M = 90 vs. 49, p < .001). The scenarios were similar in length (M = 120 words, SD = 2.22, range = 118–123) and included similar content details (i.e., job-relevant skills, education, and local experience). According to an ANOVA, which compared the ratings of response quality from 12 additional raters (1 = very poor response, 100 = very good response), the scenarios elicited similar ratings (M = 79–88), F(3) = 0.93, p = .432, suggesting that all interview responses, as intended, were of good quality.
The target scenarios were recorded by eight men: two straight- and two gay-sounding L1 English speakers, and two straight- and two gay-sounding L1 Spanish speakers (all with L2 English). The four L1 English speakers (M age = 29, SD = 7, range = 24–40) were born in Canada or the United States but had spent most of their lives in Calgary (Canada), where this study was conducted. The four L1 Spanish speakers (M age = 29, SD = 4, range = 26–34) were born in Mexico or Spain; two resided in Calgary and two in their respective home countries. Spanish was chosen as the L2 speakers’ L1 because it is the fourth most common non-official mother tongue in Calgary (Statistics Canada, 2023), making it socially relevant. Moreover, several of the acoustic correlates of gay-sounding speech are similar in Spanish and English, including a speaker’s use of pitch and pronunciation of /s/ (e.g., Mack, Reference Mack, Levon and Mendes2016; Munson, Reference Munson2007). Cross-linguistic similarity in acoustic correlates of gay-sounding speech was expected to help listeners detect a speaker’s sexual orientation regardless of the speaker’s or listener’s language background.
Sixteen volunteers responded to a recruitment call for L1 and L2 speakers who believed they sounded stereotypically gay or straight. The call was distributed on campus and in Hispanic linguistics Facebook groups. All volunteers received the scenarios in advance and met individually with a researcher via Zoom for recording. They practiced reading the scenarios naturally several times and then recorded them in a conversational tone. If the speakers misspoke or hesitated unnecessarily, they re-recorded a specific scenario until all sounded comparable in pace and naturalness, as judged by the researcher who oversaw the recordings. Recordings from the eight speakers (all judged by the researchers to sound the most stereotypically straight or gay in English and Spanish) were selected for further evaluation. Twelve additional raters from the same participant pool rated the speakers for accentedness (1 = heavily accented, 100 = no accent at all) and perceived sexual orientation (1 = exclusively heterosexual, 7 = exclusively gay), and provided categorical classifications for each speaker’s language background, nationality, and sexual orientation. According to Tukey tests, the raters perceived the four L1 speakers (M = 86–89) as less accented than the four L2 speakers (M = 26–35, p < .001). The raters labeled the L1 Spanish speakers as “nonnative English speaker” in all but three instances and mentioned Spanish-speaking places in 68% of nationality guesses. Considering that listeners rarely differentiate among Spanish varieties when speakers use English (e.g., Hayes, Reference Hayes2022), identifications were generally broad (e.g., “Hispanic”), with no evidence that listeners distinguished among Spanish accents (e.g., Teló et al., Reference Teló, Silveira, Marcelino and O’Brien2024). The raters labeled the L1 English speakers as “native English speakers” in all but one instance and located them in North America in all but one response. 
As expected, the four gay-sounding speakers pre-selected by the researchers (M = 5–6) were judged as more stereotypically gay than the four straight-sounding speakers (M = 2–3, p < .001). In the binary classification task, the gay-sounding speakers were labeled as “gay” in 78% of the trials, and the straight-sounding speakers were labeled as “straight” in 89% of the trials, both proportions significantly above chance according to binomial tests (ps < .001). The mean duration of the final recordings was 41 seconds (SD = 6, range = 32–56).
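To illustrate what an above-chance binomial test involves, the following minimal sketch computes an exact one-sided p-value against a chance level of .50. The trial counts are hypothetical because the number of classification trials is not reported here, and the sketch assumes a one-sided test, which may differ from the test used in the study. The function name `binomial_p_above_chance` is likewise an illustrative assumption.

```python
from math import comb


def binomial_p_above_chance(successes, trials, chance=0.5):
    """Exact one-sided p-value: P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance ** k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts (NOT study data): 75 "gay" labels out of 96 trials (about 78%).
p_value = binomial_p_above_chance(75, 96)
print(f"p = {p_value:.2e}")
```

With a proportion near 78% and a trial count of this order, the probability of observing such a result under pure guessing is far below .001, which matches the direction of the reported finding.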
Listeners
We recruited 192 listeners (M age = 34, SD = 12, range = 18–71) through local Facebook groups and Reddit/Nextdoor boards, posters placed on and off campus, and word of mouth. They had to meet the following two criteria: age (18 or older) and residence (Calgary). We additionally used participation quotas to create a sample matching Calgary’s population (Statistics Canada, 2023) in the proportion of residents born in Canada (64% in our sample, 65% in Calgary) and outside of Canada (36% in our sample, 33% in Calgary). The sample was otherwise self-selected, meaning that all participants volunteered to participate. Of the 192 listeners, 62% self-identified as women, 35% as men, and 3% as non-binary; 70% reported being heterosexual, 3% chose not to disclose, while 27% described themselves as asexual, bisexual, gay, lesbian, or queer, with the most common response being bisexual (16%). Two listeners were transgender, while the remainder (save one who chose not to disclose) were cisgender. Of the 192 listeners, 37% reported Calgary as their hometown, 27% hailed from another Canadian city, while the remaining 36% were born outside Canada.
In terms of their ethnolinguistic background, all but two listeners answered a multiple-choice (select all that apply) question, most commonly reporting “European” (52%), followed by “South Asian” (8%), “Indigenous” (7%), and nine other ethnicities or their combinations (33%). English was the most common L1 (69%), followed by English and another language (4%), Portuguese (4%), Spanish (4%), and 21 other languages (19%). Listeners had previously completed various degrees (27% high school, 49% undergraduate, 24% postgraduate), and most (57%) indicated being proficient in one language, followed by two (31%), three (9%), and four (3%) languages. They mainly used English for daily communication (M = 90%, SD = 15, range = 11–100) and were familiar with L2-accented English (M = 7.81, SD = 1.53, range = 1–9, where 1 = not familiar at all, 9 = very familiar).
To characterize listeners in terms of potential underlying group-based biases, we asked them to respond to statements targeting racial prejudice (four items, α = .74, adapted from Akrami et al., Reference Akrami, Ekehammar and Araya2000) and homonegativity (four items, α = .68, adapted from Morrison & Morrison, Reference Morrison and Morrison2002) on a 5-point scale (1 = strongly disagree, 5 = strongly agree). In terms of racial prejudice (e.g., “Discrimination against immigrants is still a problem in Canada,” “A multicultural Canada would be good”), listeners tended to reject ethnoracial prejudice (M = 4.29, SD = 0.72, range = 1.50–5.00). As for homonegativity (e.g., “Gay men don’t have all the rights they need,” “Even in today’s tough economic times, Canadians’ tax dollars should be used to support gay men’s organizations”), they were supportive of gay men (M = 3.82, SD = 0.88, range = 1.50–5.00). As a group, listeners therefore expressed positive perceptions of ethnoracial diversity and gay men, which they potentially brought to the listening task.
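For readers less familiar with the reliability coefficient reported for these scales, the sketch below shows the standard computation of Cronbach’s α from a listener-by-item matrix of scores. The function name `cronbach_alpha` and the example responses are hypothetical and not drawn from the study. The sketch also assumes that all items are already scored in the same direction.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-respondent item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])                      # number of items
    item_columns = list(zip(*responses))       # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point responses (NOT study data): rows = listeners, columns = items.
example = [
    [5, 4, 5, 4],
    [4, 4, 3, 4],
    [2, 3, 2, 1],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
]
print(round(cronbach_alpha(example), 2))
```

Values closer to 1 indicate that the items covary strongly, so the summed or averaged score can be treated as a single index.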
Materials and procedure
Listeners evaluated the recordings remotely through an online Qualtrics interface. The study was initially described as an experiment investigating how Calgarians evaluate fellow city residents in job interviews. The interface presented five recordings (one practice trial featuring a comparable scenario with a female applicant plus four experimental trials) one at a time, followed by several 100-point sliding scales, along with a reminder of the job being applied for. After the first playback, listeners rated speaker accentedness, defined as how much a speaker’s speech is influenced by their native language and/or other non-native features (This speaker is… heavily accented–not accented at all), and comprehensibility, defined as how much effort it takes to understand what someone is saying (This speaker is… hard to understand–easy to understand). Following the second playback, listeners rated each speaker’s employability (I would likely hire this candidate… not at all–very much). Additional questions targeted the perceived stereotypicality of the speaker’s sexual orientation, through a 7-point Kinsey-like scale (This speaker is… exclusively straight–exclusively gay), and the prestige level of the targeted job, through a 100-point scale (The job that the candidate is applying to is… very low prestige–very high prestige). Listeners also evaluated each speaker on several personality traits expected to provide additional insight into listeners’ judgments. Listeners could initiate each recording at their convenience, but could not skip it, and could proceed to the next recording only after listening to the current one in its entirety.
The final recordings included 32 files (4 scenarios × 8 speakers) organized in 16 balanced experimental lists, each containing four target recordings for a within-participant, sparse rating design, with 8–16 unique listeners per list (see Appendix B). Across all lists, listeners heard all speakers applying to all jobs, but within each list, each set of listeners experienced a subset of the recordings with an equal distribution of speakers’ L1 (two English and two Spanish), speakers’ voice-signaled sexual orientation (two gay- and two straight-sounding voices), and job type (four occupations). Listeners first filled out a consent form and background questionnaire. After completing the practice ratings, they proceeded to the experimental trials, presented in random order. Finally, listeners completed the racial prejudice and homonegativity questionnaires and watched a short post-experiment debrief video. They completed all tasks within about 30 minutes and received $30 CAD as compensation.
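The balancing constraints on each experimental list can be expressed as a simple check. The sketch below is illustrative only and is not the procedure used to build the lists in Appendix B: the speaker labels (S1–S4) and the particular pairing of speakers with jobs are hypothetical, whereas the four job titles and the two-by-two balance of L1 and voice-signaled sexual orientation follow the description above.

```python
from collections import Counter

# The four occupations used in the study.
JOBS = {"bus driver", "fashion designer", "flight attendant", "school principal"}


def is_balanced(trials):
    """Return True if one four-trial list meets the stated balance constraints.

    Each trial is a dict with the keys "speaker", "l1", "voice", and "job".
    """
    if len(trials) != 4:
        return False
    l1_counts = Counter(t["l1"] for t in trials)
    voice_counts = Counter(t["voice"] for t in trials)
    jobs = {t["job"] for t in trials}
    return (
        l1_counts == Counter({"English": 2, "Spanish": 2})
        and voice_counts == Counter({"gay": 2, "straight": 2})
        and jobs == JOBS  # four distinct occupations, one per trial
    )


# Hypothetical list: speaker labels and job pairings are placeholders.
example_list = [
    {"speaker": "S1", "l1": "English", "voice": "gay", "job": "bus driver"},
    {"speaker": "S2", "l1": "English", "voice": "straight", "job": "fashion designer"},
    {"speaker": "S3", "l1": "Spanish", "voice": "gay", "job": "flight attendant"},
    {"speaker": "S4", "l1": "Spanish", "voice": "straight", "job": "school principal"},
]

print(is_balanced(example_list))
```

A check of this kind only verifies balance within a single list; it does not address how recordings are distributed across the 16 lists.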
Data analysis
We first screened the data using Qualtrics’ built-in controls to identify fraudulent responses (e.g., completed by bots) and geolocate each listener’s IP address to Calgary. All responses passed these quality checks. The ratings were then checked for inter-rater reliability using two-way, consistency, average-measure intraclass correlations, with satisfactory values for employability (.96), comprehensibility (.99), and sexual orientation (.99). We modeled the data using generalized linear mixed-effects models with the lme4 package (version 1.1-37; Bates et al., 2015). Because the outcome measure (employability on a 100-point scale) was an integer between fixed lower and upper bounds, we transformed it into proportions using min–max normalization such that all values fell between 0 and 1. For all models, we used a binomial distribution with a logit link function (Baum, 2008), with 100,000 iterations performed through the BOBYQA optimizer and the number of adaptive Gauss–Hermite quadrature points (nAGQ) set to 1.
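As a minimal illustration of the outcome transformation, the sketch below rescales bounded ratings to proportions and shows the logit link on which the models operate. It is not the analysis code (the models were fit in R with lme4); the scale bounds of 0 and 100 and the example ratings are assumptions made for illustration.

```python
import math


def min_max_normalize(ratings, lower=0.0, upper=100.0):
    """Rescale bounded ratings to proportions in [0, 1].

    The bounds 0 and 100 are assumed here for a 100-point slider.
    """
    return [(r - lower) / (upper - lower) for r in ratings]


def logit(p):
    """Log-odds of a proportion; undefined at exactly 0 or 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


ratings = [0, 25, 50, 80, 100]  # hypothetical ratings
proportions = min_max_normalize(ratings)
print(proportions)
print(logit(0.5), inv_logit(0.0))
```

Because the logit is undefined at the scale endpoints, the link is applied to the model's expected proportion rather than to each observed rating.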
We ran models of increasing theoretical and analytical complexity. Employability was the outcome variable. Fixed-effects predictors included the speaker’s language status and sexual orientation (both categorical); comprehensibility (continuous); and the job’s communication demands, stereotypicality (both categorical), and prestige (continuous). We also modeled random intercepts for listeners and speakers, and allowed by-listener slopes for comprehensibility and sexual orientation to vary. Although the random-slope models converged, yielding findings similar to those in the fixed-slope models, these more complex models violated the assumptions of dispersion and distribution, so slopes were kept constant. For all models, we obtained diagnostics with the DHARMa package (version 0.4.7; Hartig, 2024). Dispersion and outliers were not problematic. A Kolmogorov–Smirnov goodness-of-fit test revealed some distribution violations in the models, but all cases appeared acceptable.
To facilitate the interpretation of the listener-rated variable of the speaker’s sexual orientation (captured through a 7-point Kinsey-like scale) and to make it comparable to the binary variable of the speaker’s self-reported language status, we recoded it into a two-level categorical predictor: straight (scores 1–3, with 297 datapoints) and gay (scores 5–7, with 324 datapoints). Considered ambiguous, midscale ratings of 4 (142 datapoints) were excluded from the analyses. To verify that the binary coding was likely to lead to statistically and substantively similar inferences as the original 7-point scale, we fit linear mixed-effects models predicting employability through the sexual orientation variable only but coded either as a numeric or a binary variable. Model fit comparison (ΔAIC = 0.08) and correlation between the model-predicted values (r = .92) indicated that the binary version of the 7-point scale would simplify model interpretability without affecting the findings.
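The recoding rule can be stated compactly. The following sketch is an illustration rather than the study's analysis code; the function name is hypothetical, while the cut-offs and datapoint counts are those reported above.

```python
def recode_orientation(score):
    """Recode a 7-point Kinsey-like rating into a binary category.

    Returns "straight" for 1-3, "gay" for 5-7, and None for the
    midscale rating of 4, which was excluded as ambiguous.
    """
    if score not in range(1, 8):
        raise ValueError("score must be an integer from 1 to 7")
    if score <= 3:
        return "straight"
    if score >= 5:
        return "gay"
    return None


# Datapoint counts reported in the text.
counts = {"straight": 297, "gay": 324, "excluded": 142}
valid = counts["straight"] + counts["gay"]
print(valid)  # number of datapoints retained for analysis
```

The retained total of 621 corresponds to the number of valid datapoints entering the models.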
To summarize our data, we obtained model-estimated marginal means (as measures of central tendency) and associated standard errors (as measures of uncertainty around those means) expressed on each relevant response scale (see Appendix C for raw means and standard deviations). To test the statistical significance of each parameter, we checked p values and confirmed that the associated confidence intervals (CIs) did not cross zero. To explore main and interaction effects, we performed post hoc tests with Tukey-adjusted p values using the emmeans package (version 2.0.0; Lenth, 2024). Our study was exploratory, so no comparisons were planned a priori, as we intended to explore all possible effects (up to two-way interactions) among a combination of fixed-effects predictors. We set the upper threshold at a two-way interaction because higher-order effects would be difficult to defend analytically (e.g., avoiding model overfit) given our sample size (621 valid datapoints) and the number of predictors. All data and analysis code are publicly available through the study’s Open Science Framework profile (https://doi.org/10.17605/OSF.IO/RTYAV).
Results
Speaker and job variables
Following prior research (Fasoli et al., 2023b), we first focused on the binary variables of speaker status (L1 vs. L2) and sexual orientation (gay vs. straight) as predictors of employability. As in Fasoli et al. (2023b), we initially did not include pronunciation-related predictors (i.e., accentedness, comprehensibility), on the assumption that a speaker’s L1/L2 status alone might account for listeners’ judgments. To capture job-related effects, we included fixed effects for job communication demands, stereotypicality, and prestige. Because stigmatized speakers are often stereotyped into particular jobs, we also modeled two-way interactions between language status and sexual orientation and each job variable. Averaging across sexual orientation, L1 speakers (M = 85.9, SE = 1.54) had nearly 60% greater odds of being rated as more employable than L2 speakers (M = 79.5, SE = 2.07), OR = 1.58, SE = 0.19, z = 3.78, p < .001. Averaging across speaker status, straight-sounding speakers (M = 83.7, SE = 1.54) had 11% greater odds of being rated as more employable than gay-sounding speakers (M = 82.2, SE = 1.65), OR = 1.11, SE = 0.03, z = 3.60, p < .001. A significant interaction between speaker status and sexual orientation (illustrated in Figure 1) was driven by straight-sounding L1 speakers (M = 87.6, SE = 1.41), who were rated more employable than all other speakers, including gay-sounding L1 speakers (M = 84.0, SE = 1.72), OR = 1.35, SE = 0.06, z = 6.73, p < .001, gay-sounding L2 speakers (M = 80.2, SE = 2.04), OR = 1.75, SE = 0.22, z = 4.48, p < .001, and straight-sounding L2 speakers (M = 78.8, SE = 2.14), OR = 1.91, SE = 0.24, z = 5.21, p < .001. In contrast, gay-sounding L1 speakers were only rated more employable than straight-sounding L2 speakers, OR = 1.42, SE = 0.09, z = 2.85, p = .022.
Gay-sounding L1 and L2 speakers received comparable evaluations, OR = 1.30, SE = 0.16, z = 2.13, p = .143, as did gay- and straight-sounding L2 speakers, OR = 1.09, SE = 0.03, z = 2.34, p = .089.
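The percentage statements above follow directly from the reported odds ratios. As a brief worked illustration (the helper names are hypothetical and this is not the analysis code), an odds ratio converts to a percentage difference in odds as (OR − 1) × 100, and its logarithm gives the coefficient on the logit scale.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percentage difference in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def log_odds(odds_ratio):
    """Coefficient on the logit scale corresponding to an odds ratio."""
    return math.log(odds_ratio)


# Odds ratios reported in the text for the two main effects.
print(round(percent_change_in_odds(1.58)))  # L1 vs. L2 speakers
print(round(percent_change_in_odds(1.11)))  # straight- vs. gay-sounding speakers
print(round(log_odds(1.58), 2))
```

These values describe differences in odds, not in probabilities or rating points, which is why a 58% difference in odds corresponds to a difference of only a few points in the estimated means.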
Figure 1. Estimated means for speaker employability as a function of the speaker’s language status and sexual orientation (SO). Whiskers around estimated means enclose 95% CIs.

In terms of job variables, employability ratings tended to be higher for low- than high-communication jobs, gay- than straight-typed jobs, and higher- than lower-prestige jobs. More importantly, both speaker status and sexual orientation interacted with communication demands in predicting employability (illustrated in Figure 2). Whereas L2 speakers were rated similarly employable in both low- and high-communication jobs (M = 79.8, SE = 2.06 vs. M = 79.2, SE = 2.10), OR = 1.03, SE = 0.03, z = 1.13, p = .671, L1 speakers were rated more employable in low-communication jobs (M = 87.5, SE = 1.40) than in high-communication jobs (M = 84.2, SE = 1.70), OR = 1.32, SE = 0.03, z = 8.10, p < .001. Similarly, while gay-sounding speakers were rated comparably employable in both low- and high-communication jobs (M = 82.6, SE = 1.63 vs. M = 81.8, SE = 1.70), OR = 1.06, SE = 0.03, z = 1.73, p = .309, straight-sounding speakers were rated more employable in low-communication jobs (M = 85.3, SE = 1.44) than in high-communication jobs (M = 81.9, SE = 1.69), OR = 1.28, SE = 0.03, z = 6.67, p < .001. Finally, speakers’ sexual orientation also interacted with job prestige in predicting employability (Figure 2). As job prestige increased, gay-sounding speakers tended to be perceived as more employable, with ratings approaching those of straight-sounding speakers, b = 0.81, SE = 0.11, z = 6.91, p < .001. In the full model (summarized in Appendix D), fixed-effects predictors explained 6% of the variance in employability (marginal R² = .06, conditional R² = .97).
Figure 2. Estimated means for speaker employability as a function of the speaker’s language status, sexual orientation (SO), and significant job characteristics. Whiskers and shaded areas around estimated means enclose 95% CIs.

Comprehensibility
We then modeled speaker comprehensibility (a measure of listeners’ processing fluency) as a predictor of employability. Processing fluency, which reflects listeners’ experience with speech, has been shown to mediate speaker evaluations, insofar as listeners downgrade L2 speakers whom they perceive as difficult to understand (Dragojevic & Giles, 2016). Because a comprehensibility-based mechanism could similarly account for speaker employability in our dataset, we used conditional mediation analyses to estimate direct (controlling for processing fluency), indirect (via processing fluency), and total effects of language status (L1 vs. L2) and sexual orientation (gay vs. straight) on employability. To control for differences in accent that are independent from comprehensibility (Dragojevic & Giles, 2016; Teló et al., 2022, 2024), we included the speaker’s accentedness ratings as a covariate. We fit mixed-effects models as building blocks within the mediation package (version 4.5.1; Tingley et al., 2014), which computed quasi-Bayesian CIs through 1,000 simulations of the fitted models. We used the same model parameters as described above, except in the random structure of the models, which included intercepts for listeners only (rather than for both listeners and speakers) because the mediation package accommodates only one random effect (Tingley et al., 2014). In the best-fitting outcome model, which explained 19% of variance in employability (marginal R² = .19, conditional R² = .97), we included job communication demands, stereotypicality, and prestige as main effects because interaction terms involving these variables did not improve model fit.
Because the component equations included all relevant predictors, the final models were saturated, so no assessment of global model fit was required (see Appendix E for full model specifications and outputs).
In our initial models (illustrated in Figures 1 and 2), we treated language status as a general indicator of accent-related differences, so any variance in employability associated with speaker accentedness and comprehensibility was captured through language status. In contrast, in the mediation analyses, we modeled speaker accentedness and comprehensibility as separate predictors, so the estimated effect of language status represented only what is uniquely attributable to it after accounting for speakers’ accentedness and comprehensibility. This approach allowed us to clarify the interaction between language status and sexual orientation (previously shown in Figure 1). As illustrated in Figure 3, straight-sounding L1 speakers (M = 85.9, SE = 1.10) continued to be rated highest, now followed by straight-sounding L2 speakers (M = 82.7, SE = 1.27), gay-sounding L1 speakers (M = 81.6, SE = 1.32), and gay-sounding L2 speakers (M = 80.8, SE = 1.38). Groupwise contrasts indicated that the previously observed difference between gay-sounding L1 speakers and straight-sounding L2 speakers was attenuated when pronunciation variables were controlled, OR = 0.93, SE = 0.04, z = –2.06, p = .168, suggesting that what previously looked like a language status bias against L2 speakers was partly a pronunciation-based penalty. By contrast, a sexual orientation difference emerged among L2 speakers, OR = 1.13, SE = 0.04, z = 3.54, p = .002, where gay-sounding L2 speakers were placed at the bottom of the employability hierarchy when holding their comprehensibility and accentedness comparable to those of straight-sounding L2 speakers.
Figure 3. Estimated means for speaker employability as a function of the speaker’s language status and sexual orientation (SO), with speaker accentedness and comprehensibility controlled. Whiskers around estimated means enclose 95% CIs.

Mediation analyses (summarized in Figure 4) replicated and extended these results. As indicated by the direct and total effects of speaker status on employability (Figure 4a and b), L2 speakers were rated as less employable than L1 speakers, although this difference was more pronounced for straight- than gay-sounding speakers (b_total = –0.06 vs. –0.02). As shown through the direct and total effects of sexual orientation on employability (Figure 4c and d), gay-sounding speakers were rated as less employable than straight-sounding speakers, although more clearly so among L1 speakers than L2 speakers (b_total = –0.05 vs. –0.02). Considering the effects of speaker status and sexual orientation on comprehensibility, L2 speakers and gay-sounding L1 speakers were perceived as less comprehensible (Figure 4a–c), but there did not seem to be a compounded comprehensibility cost for gay-sounding L2 speakers as individuals holding two stigmatized identities (Figure 4d).
Figure 4. Mediation diagrams illustrating comprehensibility (processing fluency) as a mediator of the relationship between employability and speaker status (a and b) and sexual orientation (c and d).

Turning to indirect effects, comprehensibility mediated the relationship between speakers’ language status and employability (Figure 4a and b), indicating that L2 speakers were rated less employable than L1 speakers when they were more difficult to understand. This effect was more clearly evident for straight-sounding speakers (b_indirect = –0.03, p = .002) than for gay-sounding speakers (b_indirect = –0.02, p = .058). In contrast, there was no evidence of comprehensibility mediating the relationship between speakers’ sexual orientation and employability (Figure 4c and d), where gay-sounding speakers were perceived as less employable, independent of how easy or difficult they were to understand (b_indirect = –0.01 vs. 0.01). To sum up, when evaluating speaker employability, listeners likely experienced processing disfluency, which may have informed their ratings. However, this processing disfluency likely reflected the aspects of speakers’ pronunciation that stem from their status as L2 speakers rather than their identity as gay-sounding speakers.
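To make the size of the mediated path concrete, the share of a total effect carried by the mediator can be approximated as the ratio of the indirect to the total effect. The sketch below applies this ratio to the rounded coefficients reported above for straight-sounding speakers; it is an illustration under that rounding, not the simulation-based estimate produced by the mediation package, and the function name is hypothetical.

```python
def proportion_mediated(indirect, total):
    """Share of the total effect that runs through the mediator."""
    if total == 0:
        raise ValueError("total effect must be nonzero")
    return indirect / total


# Rounded coefficients reported for the effect of language status
# on employability among straight-sounding speakers.
b_indirect = -0.03
b_total = -0.06

share = proportion_mediated(b_indirect, b_total)
print(f"{share:.0%}")
```

Because the inputs are rounded to two decimals, the resulting share should be read as approximate.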
Discussion
We examined how a speaker’s language status, sexual orientation, and comprehensibility (as a measure of processing fluency) shape listeners’ judgments of employability for speakers applying for jobs considered high or low in communication demands, gay- or straight-typed, and higher or lower in prestige. A speaker’s language status and voice-cued sexual orientation interacted such that straight-sounding L1 speakers received the highest ratings overall. Speakers who sounded gay or L2-accented were comparatively downgraded. A job’s communication demands moderated the role of speaker variables, where L1 speakers and straight-sounding men considered for low-communication jobs generally elicited the most favorable evaluations. Similarly, a job’s prestige moderated the effect of sexual orientation, where gay-sounding speakers—more so than straight-sounding speakers—tended to receive higher employability ratings as perceived job prestige increased. Processing fluency mediated the relationship between language status and employability (but not between sexual orientation and employability), where L2 speakers were perceived as less employable in part because they were more difficult to understand.
Language status and sexual orientation
Following prior research on stereotyping of gay-sounding men and L2 speakers (Fasoli et al., 2023b), we first examined how listeners evaluate the employability of speakers belonging to these often-stigmatized social categories using binary variables capturing speakers’ language status and sexual orientation. Employability was predicted by the interaction between a speaker’s L1/L2 status and sexual orientation, with the magnitude of significant differences (M_diff = 3.6–8.8 on a 100-point scale) comparable to that observed by Fasoli et al. (2023b) for the ratings of speaker competence (M_diff = 0.16–0.45 on a 7-point scale, equivalent to 2.3–6.4 on a 100-point scale). As shown in our initial modeling (Figure 1), which paralleled Fasoli et al.’s (2023b) analysis in that a speaker’s language status but no pronunciation variables were accounted for, straight-sounding L1 speakers elicited the highest ratings, followed by gay-sounding speakers (regardless of their L1/L2 status), and finally by straight-sounding L2 speakers. Once speaker comprehensibility and accentedness were statistically controlled, the employability hierarchy for speakers whose voice signaled at least one stigmatized identity changed. When holding speakers’ comprehensibility and accentedness constant, gay-sounding L1 speakers were no longer more employable than L2 speakers (gay- or straight-sounding), and straight-sounding L2 speakers were preferred over gay-sounding L2 speakers (Figure 3). The speakers with zero stigmatized identities (i.e., straight-sounding L1 speakers) continued to receive the highest employability ratings.
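The cross-scale comparison above can be reproduced with simple proportional rescaling. The sketch below assumes that a difference on a k-point scale is converted by multiplying by 100/k, which appears to match the reported equivalents; the function name is hypothetical, and the estimated means are those reported in the Results.

```python
def rescale_difference(diff, source_points, target_points=100):
    """Convert a mean difference between scales by proportional rescaling.

    Assumes the conversion divides by the number of scale points
    (here 7), not by the scale range (6).
    """
    return diff * target_points / source_points


# Range of differences on the 7-point competence scale.
low = round(rescale_difference(0.16, 7), 1)
high = round(rescale_difference(0.45, 7), 1)
print(low, high)

# Range of significant differences in the present study, computed from
# the estimated means for straight-sounding L1 speakers (87.6),
# gay-sounding L1 speakers (84.0), and straight-sounding L2 speakers (78.8).
print(round(87.6 - 84.0, 1), round(87.6 - 78.8, 1))
```

Dividing by the scale range instead would yield somewhat larger equivalents, so the comparison is sensitive to this choice.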
This hierarchy, with heteronormative-sounding L1 speakers positioned above others, aligns with prior evaluations of stigmatized individuals, where the speaker’s L1 status (Hosoda et al., 2007) and masculinity (Vandello & Bosson, 2013) independently signal greater competence. As in Fasoli et al. (2023b), it was ultimately gay-sounding L2 speakers (i.e., those with a double minority status) who received the lowest ratings. This finding is consistent with double jeopardy perspectives of social categorization and stereotyping, which predict that membership in two minoritized groups is associated with a compounded detriment to a speaker’s evaluation (Nicolas et al., 2017; Remedios & Akhtar, 2019). Crucially, this compounded penalty against gay-sounding L2 speakers was unclear until comprehensibility and accentedness were controlled, confirming that the disadvantage associated with sounding both L2-accented and gay cannot be fully explained by pronunciation issues. Likewise, the initial advantage that gay-sounding L1 speakers had over straight-sounding L2 speakers no longer held once comprehensibility and accentedness were controlled, again suggesting separate penalties for L2 speakers (when they are perceived as accented and difficult to understand) and for gay-sounding speakers (when they are categorized as gay).
These results point to listeners not treating language status and sexual orientation as equivalent categories, as previously shown for gender and ethnic categories (Ghavami & Peplau, 2012). In particular, our findings imply that listeners appear to use a speaker’s L1/L2 status as a diagnostic cue related to expected or real differences in the speaker’s pronunciation, as captured through listener-rated comprehensibility and accentedness. When those performance correlates were statistically controlled, the dichotomous category of language status lost much of its evaluative force. Indeed, mediation analyses showed that approximately 50% of the effect of language status on employability was explained by differences in comprehensibility. In contrast, gay-signaling vocal cues, which were unrelated to speakers’ ability to convey their message (Figure 4c and 4d), appeared to operate as trait-like indicators of gender and heteronormative conformity or deviance (Campbell-Kibler, 2011; Fasoli et al., 2017). While sounding like an L1 speaker may be a common expectation for speakers in general, among men, sounding stereotypically straight is the norm (Fasoli et al., 2017, 2018; Klysing, 2023). As such, gender- or sexuality-atypical cues and their related social categories might draw disproportionate attention because they violate prescriptive norms for men and undermine assumptions of heterosexuality as the default (Kite & Deaux, 1987; Lavan & McGettigan, 2023; Vandello & Bosson, 2013), even though sexual orientation may be a less perceptually salient category in comparison to other categories such as race (e.g., Remedios et al., 2011).
In line with this interpretation, once pronunciation performance dimensions were held constant, gay-sounding men were consistently downgraded relative to straight-sounding men, regardless of their language background. From this vantage point, the effect of sounding gay became dominant over language status once all speakers were assumed to be similarly comprehensible and accented. The resulting pattern reflects an underlying prototypical structure in which straight-sounding L1 speakers function as the implicit benchmark for professional identity, and deviations from this prototype—whether through sounding accented or gay—produce systematic and, in the case of gay-sounding L2 speakers, compounded penalties.
Comprehensibility
We examined the role of speaker comprehensibility (a measure of listeners’ processing fluency) in evaluations of employability because speakers of stigmatized varieties are often penalized partly because they are perceived as harder to understand. Consistent with our expectation, comprehensibility emerged as the strongest predictor of employability, with more comprehensible speakers rated as more employable. Because comprehensibility targets listeners’ experience with speech (for instance, in terms of the vocal setting, the quality of segments and prosody, or the flow of speech), it can capture subtle differences in speakers’ pronunciation, with evaluative consequences for them (Trofimovich et al., 2024). In our data, comprehensibility mediated the relationship between speakers’ language status and employability (Figure 4a and 4b), such that L2 speakers were rated less employable than L1 speakers when they were more difficult to understand. These findings extend prior research in which processing fluency was shown to mediate listener judgments of speaker competence (e.g., Dragojevic & Giles, 2016; Dragojevic et al., 2017).
Our results also clarify the role of comprehensibility in listeners’ evaluations of men whom they also perceive to be gay. Comprehensibility did not explain evaluations tied to a speaker’s sexual orientation, in the sense that gay- and straight-sounding men were rated equally comprehensible (Figures 4c and 4d). We had initially anticipated that gay-signaling voice cues could disrupt processing fluency by activating gender or sexuality expectations and stereotypes (Fasoli et al., 2020; Mack & Munson, 2012; Sulpizio et al., 2025). While it seems likely that gay-sounding speakers in our study were subject to negative stereotyping, it appeared unrelated to speakers’ relative ease of understanding. Nevertheless, our findings for comprehensibility broadly align with theoretical perspectives that emphasize the interplay between social categorization and cognitive processing (Dragojevic & Giles, 2016; Dragojevic et al., 2017, 2025; Fiske & Neuberg, 1990), where listeners’ evaluative reactions are determined through top-down information derived from the speaker’s presumed identity (e.g., as a gay man) and bottom-up cues associated with the listener’s processing fluency (e.g., for an L2 speaker who is difficult to understand). As our findings illustrate, the two processes might unfold for listeners in parallel.
Job characteristics
We also explored the role of job variables in evaluations of speakers with intersecting voice-cued identities. We showed that job characteristics moderated the relationship between speaker variables and employability ratings, where listeners evaluated both L1 speakers (irrespective of their sexual orientation) and straight-sounding men (irrespective of their L1/L2 status) as more employable in low- than high-communication jobs, with no such differentiation by job type for L2 speakers or gay-sounding men. For one, this finding is consistent with the general bias against L2 speakers, who are penalized in professional settings regardless of job requirements (Hosoda & Stone-Romero, 2010; Timming, 2017), and against gay-sounding men, who often receive high assessments of social skills but low evaluations of professional competence (Fasoli et al., 2017; Fasoli & Teasdale, 2025). Alternatively, listeners might have purposely prioritized members of the two in-groups (L1 speakers, straight-sounding men) for low-communication occupations, insofar as listeners recognized and wished to undo the practice that members of stigmatized groups, including L2 speakers and (to a lesser extent) gay men, are disproportionately considered for low-communication, non-customer-facing positions (Derous et al., 2017; Gerrard et al., 2023; Spence et al., 2024). For gay-sounding speakers specifically, this finding might also imply a form of benevolent prejudice, where listeners capitalized on the stereotype of gay men as good communicators (Niedlich & Steffens, 2015; Steffens et al., 2019) and therefore did not strongly endorse them for low-communication jobs.
Gay-sounding speakers also received higher employability ratings as job prestige increased, with ratings approaching those of straight-sounding speakers, whose evaluations were consistently high irrespective of job prestige. At first glance, this finding is surprising, given that gay-sounding speakers often face particular barriers in high-status, leadership roles (Fasoli & Hegarty, 2020; Fontenele et al., 2023). However, this finding likely reflects the specific jobs included in our design, which the listeners evaluated differently in terms of their perceived prestige, with school principal rated highest (M = 77.2, SD = 16.3), followed by fashion designer (M = 68.5, SD = 17.7) and flight attendant (M = 64.7, SD = 20.0), and finally bus driver perceived as the lowest (M = 49.4, SD = 24.6). While these jobs were selected to vary in sexual orientation stereotypicality and communication demands rather than to maximize prestige differences, it is possible that within this set, higher-prestige jobs were perceived as more meritocratic (in the case of school principal) or stereotypically gay (in the case of fashion designer and flight attendant), whereas the clearly lower-prestige job (bus driver) was perceived as straight-typed. Put simply, job prestige was not a dimension that varied independently of each occupation’s stereotypicality or communication demands. Regardless of their origins, the effects of the job-relevant variables were weak in the amount of variance they explained (1%), compared to the role of the speaker’s status and sexual orientation (5%), as observed during model fitting.
Implications, limitations, and future research
Conceptually speaking, our work highlights the importance of adopting a multidimensional approach to listener judgment, integrating, among other variables, different speaker identities, professional contexts, and listeners’ real-time experience understanding the speaker. The change in employability hierarchy after controlling for pronunciation variables (cf. Figures 1 and 3) and the observed mediated effects of comprehensibility support the conclusion that what appears to be a language-status penalty may partly reflect listeners’ processing (dis)fluency stemming from a speaker’s pronunciation performance (Dragojevic & Giles, 2016; Dragojevic et al., 2017). Practically speaking, our findings highlight the persistence and complexity of voice-based discrimination in workplace-relevant contexts. The fact that straight-sounding L1 speakers received the highest ratings while gay-sounding L2 speakers received the lowest, even in a sample of listeners who explicitly endorsed low racial prejudice and strong support for gay men, suggests that voice-based biases operate at a level that may be difficult to consciously monitor or control.
It would therefore be important to engage both lay and professional communities, including hiring managers, human resources personnel, and training and development professionals, to raise awareness of how a person’s pronunciation can subtly influence employment outcomes in highly nuanced ways. To begin breaking down these stereotypes, outreach efforts might include training sessions and workshops to familiarize listeners with diverse vocal profiles, especially those associated with speakers from groups minoritized on the basis of one or more voice-cued identities. For L2 accent bias specifically, strategies that improve listeners’ understanding (e.g., accent familiarization programs) may prove particularly helpful, given the important role of processing fluency in listener judgments. Leveraging social media platforms could also help disseminate prejudice-reduction efforts and invite public dialogue. Given that stereotypes are reinforced not only through media representation but also through daily interactions in the workplace, these interventions must challenge the perceived notion of “fit” in professional contexts, which requires both raising awareness and building long-term strategies to normalize vocal diversity across social and professional spaces.
At the same time, several limitations temper the generalizability of our findings. Our study focused exclusively on speakers who identified as men. Even though listeners also make sexual orientation judgments about women’s voices (Fasoli et al., Reference Fasoli, Maass, Paladino and Sulpizio2017), the present results likely do not extend to other gender groups. In addition, our sample was limited to L1 English and Spanish speakers, which constrains the extent to which the findings can be generalized to speakers of other language backgrounds, particularly those occupying different positions within global linguistic hierarchies (Dragojevic & Goatley-Soan, Reference Dragojevic and Goatley-Soan2022). We also used single-item scales to capture listeners’ ratings and scripted scenarios to control the quality and scope of the speakers’ audio recordings. However, scales and scenarios may not have been representative of spontaneous, interactive interview assessment and content. We also did not manipulate the quality of the interview responses, so the overall employability evaluations were quite high (70–90 on average). It is therefore unclear how listener reactions to gay- and straight-sounding speakers differ for low- versus high-quality responses. Similarly, whereas we established each speaker’s sexual orientation through listener judgments, which should broadly reflect how listeners categorize their interlocutors in real life, we determined the speaker’s L1/L2 status through their self-reports, with support from the norming study that revealed distinct levels of accentedness for L1 and L2 speakers. Relatedly, because we recruited individuals from the community rather than trained actors to serve as speakers, their recordings inevitably contained idiosyncratic pronunciation differences beyond those related to language background and sexual orientation. Although these differences were not examined directly, they may have influenced listeners’ perceptions (cf. research on voice quality). To account for this known limitation, our models included random intercepts for speakers whenever possible, which accounted for stable between-speaker differences.
In terms of design limitations, although the within-participants design and our modeling of the random-effects structure allowed us to control some interindividual differences across listeners, repeated measurements may nevertheless have heightened their sensitivity to contrasts among speakers. To assess the role of potential individual listener differences, we computed post hoc exploratory Spearman correlations (with a Benjamini-Hochberg correction) between listeners’ employability ratings and their ethnoracial diversity and homonegativity profiles. There were only two associations that surpassed the .25 benchmark for a weak effect (Plonsky & Oswald, Reference Plonsky and Oswald2014), where more support for ethnoracial diversity was associated with higher employability ratings for straight-sounding L2 speakers, ρ = .29, p = .002, and L2 speakers in straight-typed jobs, ρ = .29, p = .001. Thus, although listeners’ positive attitudes toward ethnic diversity might mitigate some of their bias against L2 speakers, it is still unclear how listener attitudes toward gay men inform their perceptions of gay-sounding speakers (e.g., Fasoli & Teasdale, Reference Fasoli and Teasdale2025). Researchers should therefore examine the role of individual listener experience more closely, focusing on how voice-cued social categories such as age, race, gender, and sexual orientation interact with listeners’ attitudes and processing fluency as they engage in evaluative judgments. Lastly, this work would benefit from a replication in more naturalistic or professional settings (e.g., with actual hiring committees), taking into consideration the time-sensitive, experience-driven nature of processing fluency.
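For readers less familiar with the exploratory procedure described above, the following minimal Python sketch illustrates the two steps involved: computing a Spearman rank correlation and applying a Benjamini-Hochberg adjustment to a family of p-values. It is an illustration only and not the analysis code used in this study. The function names and the numeric values in the demonstration are hypothetical, and the step that converts each ρ into a p-value is omitted.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical toy values, not data from this study.
    attitude_scores = [1, 2, 3, 4, 5]
    employability = [2, 4, 5, 4, 5]
    print(spearman_rho(attitude_scores, employability))

    # Hypothetical raw p-values from a family of four correlations.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under this procedure, an association is treated as reliable only if its adjusted p-value remains below the chosen threshold, which controls the expected proportion of false discoveries across the whole family of correlations rather than for each test separately.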
Conclusion
We examined how employability ratings in jobs varying in communication demands, sexual orientation stereotypicality, and prestige are explained through a speaker’s language status, sexual orientation, and comprehensibility (as processing fluency). Overall, employability was most strongly explained by listeners’ experience understanding the speakers, suggesting that processing fluency may explain language attitudes and accent bias to a greater extent than social categories (e.g., Dragojevic & Dayton, Reference Dragojevic and Dayton2025). In fact, the effect of language status on employability was mediated by comprehensibility, with L2 speakers downgraded relative to L1 speakers in part because they were more difficult to understand. Controlling for speakers’ comprehensibility and accentedness revealed an expected evaluative hierarchy at the intersection of language status and sexual orientation. Straight-sounding L1 speakers, whose voices signaled no stigmatized identities, received the highest ratings, whereas all speakers with at least one voice-cued stigmatized identity (L2 accent or gay-sounding voice) were rated significantly less employable. Gay-sounding L2 speakers, who carried both stigmatized identities, received the lowest employability ratings. A job’s communication demands moderated the effect of language status and sexual orientation, with L1 speakers and straight-sounding speakers advantaged particularly in low-communication jobs. Finally, a job’s prestige moderated the effect of sexual orientation, where gay-sounding speakers were perceived as more employable as job prestige increased, with ratings approaching those of straight-sounding speakers, whose evaluations were consistently high irrespective of job prestige. 
Together, our results highlight the complex, context-sensitive nature of language attitudes and support the view that processing fluency and social categorization jointly shape speaker evaluations, offering novel insight into how listeners perceive speakers at the intersection of often-stigmatized identities, such as gay-sounding L2 speakers.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0142716426100563.
Acknowledgments
This work was supported through a research grant awarded to the second and third authors by the Social Sciences and Humanities Research Council of Canada (grant number 435-2021-0069).
Use of AI
Claude Opus (version 4.0) was used to look up R code references during data analysis. No AI tools were used for text or image generation, data analysis or data interpretation.

