1. Introduction
There is something paradoxical about the human voice. On the one hand, we frequently recognize people by their voices, perceiving the sounds they make as distinctive and unique. On the other hand, we know that voices can be modulated, imitated, misattributed, misinterpreted, and stereotyped based on pitch, timbre, inflection, or accent. Often, stereotyping of vocal expression results in discrimination. Other times, adjusting one’s voice to pre-existing vocal scripts is a self-affirming, empowering phenomenon, as demonstrated by the experiences of transgender persons seeking vocal coaching (Conroy et al. 2022; Adessa et al. 2023). We tend to draw comparisons between the voice we hear and other voices we have encountered, making associations and assumptions. This dual, ambivalent, and paradoxical quality of voice—revealing yet obscuring, unrepeatable yet repeating, singular yet plural, oppressive yet liberating—is bound to lead to divides in how voice is theorized and studied.
Some researchers theorize voice in a logic similar to how body and embodied subjectivity are approached in the Foucauldian-Butlerian tradition. They focus on the processes of subjectivation stemming from external, socially conditioned forces of power that define norms and transgressions, affirming the former and punishing the latter. They may explore how the musical qualities of voice are interconnected with intersecting aspects of identity and how these dimensions of identity are susceptible to stereotyping, which can ultimately result in oppression or subversion (e.g., Devers and Meeks 2024; Fasoli et al. 2017, 2023, 2024). In research on voice, these approaches are often reflected in sociologically and psychologically oriented empirical studies, usually conducted in societies with long traditions of feminist, queer, and anti-racist movements and with racially and ethnically diverse populations. However, emerging from outside the Anglophone academic context, a distinctively philosophical approach positions voice as a phenomenological marker of embodied uniqueness. In this approach, voice expresses the “who” and not the “what” someone is, demanding respect for the vocal subject as a political agent whose embodied uniqueness makes them equal to all other vocal subjects (Cavarero 2005). Both approaches¹ can be seen as pursuing the goal of unearthing and problematizing vocal discrimination and, ideally, providing tools to fight it. However, their distinctiveness—and their respective limitations—create the possibility of each of them viewing the other as dissonant with its own aims and arguments.
Examining the tensions and similarities between the two can pave the way for yet another approach that captures the inherent ambiguity and paradoxicality of voice, along with the implications this ambiguity has for socially conscious and inclusive perspectives on subjectivity.
To find a theoretical position that accounts for the ambiguities of vocal expression, I begin by juxtaposing the works of Nina Sun Eidsheim, a musicologist who studies voice and sound in the context of race, with those of Adriana Cavarero, an Italian feminist philosopher who explores the gendered logic behind vocal expression. Eidsheim’s work readily lends itself to philosophical discussion due to its theoretical orientation and indebtedness to Foucault (Eidsheim 2019, 42), but it is primarily Eidsheim’s brief yet firm renouncement of Cavarero’s vocal ontology of uniqueness that calls for an in-depth exploration. After introducing both thinkers and their respective projects, I trace similarities and tensions, expanding on Eidsheim’s critique of Cavarero while pointing to its limitations. I examine Hannah Arendt’s influence on Cavarero’s vocal ontology of uniqueness and incorporate Sophie Loidolt’s phenomenological insights into Arendtian plurality to argue that Eidsheim’s implicit accusations of essentialism against Cavarero are not entirely justified, even if the Cavarerian framework might, indeed, stem from an intellectual tradition that does not try hard enough to dismantle systemic racism. Seeking a third theoretical inspiration that would bridge the two perspectives while contributing to an exploration of vocal ambiguity, I use insights from Karen Barad, one of the key figures of feminist new materialism. While Barad does not explicitly focus on voice, their quantum physics-inspired theories challenge binary divides and enable the coexistence of diverse approaches for understanding complex material-discursive phenomena. Barad’s framework presumes that no phenomenon is ever purely material or purely discursive; rather, it is always entangled in both dimensions, transcending duality.
All phenomena are intra-actively (rather than interactively) connected; their boundaries are not stable and clearly defined but emerge through their mutual relationships. The exploration of intra-action in my discussion is invited by Eidsheim’s take on voice as action, presented in an early paper (2012) that preceded her 2019 book The race of sound, which also revolves heavily around the theme of the materiality of voice.
My goal for this paper is to explore the tensions between voice and identity from an interdisciplinary perspective that invites an intersectional framing but does not reduce voice to a mechanically assembled, stable, and measurable collection of clearly and uniformly defined identity categories. At the same time, I account for both vocal and ontological uniqueness without reducing them to some internal truth that remains untouched by the external world. While an inner reality unbothered by identity categories and systems of power relations that define those categories seems problematic and implausible, I nevertheless strive to acknowledge an interrelated but distinct kind of fantasy: that we can be heard and seen as singular and one of a kind, that we crave this recognition from other people and yearn to reciprocate it, and that this yearning is not only part of our shared humanity but also of how we make sense of the world. By complicating the relationship between singularity and plurality, uniqueness and performativity, I wish to develop a framework that allows us to capture the yearning for the unrepeatable, the authentic, and the singular while also recognizing the lived experiences of repeatable social scripts and power dynamics beyond our control—including but not limited to racism. Feminist new materialism has been applied to the considerations of sound (Fairbairn 2022) and race, racism, and racialized identities (Hames-García 2011; Rosiek 2016, 2019), yet a comprehensive synthesis of voice as uniqueness, voice as racialized difference, and voice as a material-discursive, intra-active phenomenon remains to be achieved. While this article by no means promises a complete and finalized framework, it outlines a path toward such a synthesis by critically engaging with existing theories and identifying key intersections and divergences.
The stake of this intellectual exercise is, of course, a political one. In a world that perpetuates marginalization and erasure of certain groups based on common denominators such as gender, sexuality, race, ethnicity, nationality, or economic status, and where auditory interpretations of vocal expression can contribute to that marginalization (and where, one might add, white, straight, cisgender, affluent Europeans and descendants of Europeans have dominated intellectual academic production), a theorization of voice that does not take into account intersectionality cannot be easily afforded.
2. The problem with the vocal ontology of uniqueness
In her book The race of sound: Listening, timbre and vocality in African American music (2019), Nina Sun Eidsheim begins by introducing the acousmatic question “Who is it?” We often hear this question in phone conversations, but in a broader sense the term “acousmatic,” first used by composer and musicologist Pierre Schaeffer (1966), pertains to all instances in which the source of a sound cannot be seen. As Eidsheim points out, “the acousmatic question arises from the assumption that, in asking, it is possible to elicit an answer” (2019, 2), as if it were possible to know a sound and, analogously, to know a person. The acousmatic question thus implicitly expresses the idea that voice is an identifying sound that comes from the human body and reveals something about the person using that voice. As I will go on to show, the logic behind the acousmatic question “Who is it?” is analogous to the logic behind the Arendtian question “Who are you?”—a question whose echoes we hear in Cavarero’s approach to voice. Both questions are driven by a logic that raises Eidsheim’s concerns, and for valid reasons. The idea that voice can be a uniquely identifying sound might seem innocuous, but only until we ask what that means for a non-white body in a world where people of color experience racism, discrimination, and violence.
In her works, Eidsheim often explores the phenomenon of the so-called “Black voice,” that is, a voice that is constructed as Black by being interpreted as sounding Black and belonging to a Black body, racialized as distinct from a white body and its vocality. The concept of “Black voice” can be seen as harking back to such essentializing practices as craniometry (Eidsheim 2019, 40) or the historical classification of a slave’s voice as “alien noise” (75), which was used as evidence of Black people’s subhuman status (18). The impulse to “measure” voices has continued well into the present day, fueling racial stereotypes and creating the conditions for “racialized listening” (Eidsheim 2019, 33). While racialized listening does not always have to stem from racism, racism stems from the assumptions about an essence revealed through those acts of measurement. In Eidsheim’s words: “While most racial essentialization of physical characteristics has been critically confronted (if far from eradicated), the West’s long history of entwining voice and vocal timbre with subjectivity and interiority has contributed to such truth claims remaining stagnant” (2019, 5). Objecting to the paradigm that links voice to ideas of interiority and authenticity—in what she calls a “cult of fidelity” (2019, 22)—Eidsheim’s project presumes that there is no stable answer to the acousmatic question “Who is it?” The very possibility of asking that question testifies to voice’s inability to readily reveal any kind of truth about the speaker. Rather, Eidsheim asserts that the voice is produced by the listener; if it reveals anything, it is the listener, and not the speaker.
Eidsheim lists three theses as central to her work: (1) voice is collective, (2) voice is cultural, and (3) the source of the voice is the listener (2019, 9). However, I would suggest that, from a philosophical standpoint, what is most compelling is her diagnosis about Western thought’s preoccupation with the notion that (a) voice has the ability to reveal something about the person who emits it, and (b) there is something that can be revealed about the speaker through their voice, be it interiority, identity, truth, uniqueness, or however we choose to call it. It is precisely this diagnosis that makes Eidsheim critical of Cavarero’s vocal ontology of uniqueness as famously put forward in For more than one voice: Toward a philosophy of vocal expression (2005). Since Eidsheim does not devote more than one page to Cavarero, it is my task to reconstruct and expand on the reasons behind her critique.
What Eidsheim does highlight in the few sentences dedicated to Cavarero indicates an interpretation of uniqueness as an essentialist, singular, and stable entity that can be identified through an “indexical” relation between the voice and the body (Eidsheim 2019, 33). Eidsheim’s choice of the word “indexical” is unfortunate and potentially confusing. After all, in linguistics and philosophy, indexicality is associated with shifting, ambiguous content, rather than stable content, with indexicals being defined as context-sensitive expressions whose reference varies from use case to use case (see Braun 2017). Given that Eidsheim assigns a pejorative connotation to the terms “indexical” and “indexed,” positioning them in contrast to her own project (which, actually, could be complemented by indexicality in the former, linguistic sense), her use of “indexical” most likely means simply “pointing to” or “used as an indicator or measure.” As such, the main point of Eidsheim’s critique against Cavarero seems to assume that vocal ontology of uniqueness treats embodied voice as unambiguously pointing to a specific body and, vice versa, that a truth about a body can be heralded by its voice. To quote from Eidsheim:
While Cavarero ties the sound of the voice to the uniqueness of the vocalizer’s body in order to offer a relational ontology and politics, I advance the micropolitics of listening, a process that does not assume any indexical connection between voices and bodies. In fact I began by noting that racialized listening does not necessarily stem from racism, and I can now show that, (most likely) inadvertently, Cavarero’s “vocal ontology of uniqueness” assumes the very same logic that supports racialized perception of vocal timbre. (2019, 33)
Before I proceed to demonstrate that attributing an endorsement of such an “indexical connection” to Cavarero is not entirely warranted, it is worth establishing what the vocal ontology of uniqueness is, and which of its aspects calls for a critique.
As we might remember, Cavarero’s project in For more than one voice aims to subvert what she describes as the “devocalization of logos.” For Cavarero, this devocalization manifests itself in Western metaphysical thought’s tendency to erase voice from the realm of rational discourse. The erasure was facilitated by a patriarchal logic that positioned the rational mind and meaningful speech as masculine while relegating the body and its embodied voice to a subordinated realm of the feminine. Joining the Derridean goals of critiquing the metaphysics of presence and challenging the solipsistic model of subjectivity, Cavarero nevertheless refuses Derrida’s take on phonocentrism (Derrida 1998). For her, Western philosophy has never properly paid attention to the voice, understood as the musical sound that comes from the human body by means of lungs, larynx, and the oral cavity. It could not pay attention to voice because it did not pay attention to women. In the patriarchal heteronormative schema, cisgender men speak to do politics, art, philosophy, and science. Meanwhile, cisgender women are reduced to mere sonorousness: nothing they say is of any substance, though they occasionally make a sound that excites the heterosexual cis male listener.
Cavarero’s aim in For more than one voice is thus to expose and subvert the patriarchal vocal schema. One way she accomplishes this is by noting that what makes speech political is not its content, but its ability to expose the embodied, gendered uniqueness of the speakers through their voices. This exposure is always relational, because according to Cavarero, voice is “destined for the ear of another” (2005, 7), just as the cry of a newborn is destined for the ear of a mother.² Traditionally, speech—understood as an articulation of silent, contemplative thought—was what happened in the political arena between men while the musical, implicitly feminine voice was equated with unsubstantial chatter, an enjoyable song, and other vocalizations of pleasure and displeasure that fill the household, the only realm where a woman could rule. The reinstitution of voice into our thinking about speech creates “an interactive space of reciprocal exposure” (Cavarero 2005, 190) which allows us to leave the ideology of separate spheres behind. Cavarero calls this reciprocal space locale assoluto, the absolute local, an Arendtian polis transposed into the global era (Cavarero 2005, 204). According to Cavarero, the absolute local allows us to think about politics in terms of a relation that can appear anytime, anywhere, as this type of locality does not appeal either to territories or to identity myths (2005, 209).³ What it appeals to is solely the uniqueness of a person who speaks in their voice.
This is where the problems with vocal ontology of uniqueness begin. According to Cavarero, only when people strip themselves of their “western, eastern, Christian, Muslim, Jew, gay, straight, poor, rich, ignorant, learned, cynical, sad, happy—or even guilty or innocent—being” (2005, 205), can they create the absolute local. What they are supposed to leave at the threshold are all identities that answer the easier of the two Arendtian questions, that is, “what are you?” rather than “who are you?” The question “what are you?” refers to personality traits, social roles, ideological views, or even emotional states. It defines a person’s affiliation with a given group of people or with a shared experience by means of a describable, common variable. But it is the “who are you?” that seeks the answer to a person’s unrepeatable uniqueness. As Arendt puts it: “This disclosure of ‘who’ in contradistinction to ‘what’ somebody is—[their] qualities, gifts, talents, and shortcomings, which [they] may display or hide—is implicit in everything somebody says and does” (1998, 179). And since the verbal cannot fully communicate this “who” and keeps slipping into “what,” this uniqueness can be mediated by the asemantic component of embodied speech, that is, the voice. In the English translation of Cavarero’s For more than one voice, embodied uniqueness is often accompanied by another adjective: gendered.
The problem with the vocal ontology of uniqueness is therefore the assumption that there is a “who” that can be discovered through voice, and that the only “what” that somehow is nevertheless attributed to the “who” is gender, as mediated through the acknowledgment of gendered uniqueness. The latter is especially puzzling. Reading Cavarero, it could seem that gender is the only identity variable that can enter the absolute local. The presence and acknowledgment of musical voice (coded as feminine) in the political arena (coded as masculine) reinstates what has been previously barred from politics. However, if gender has the power to enter the absolute local, it is not entirely clear why other identity markers are forbidden from entering it. Without an extended commentary, this can be perceived as an oversight or perhaps even a manifestation of the privilege of a white cisgender European woman. It should be therefore acknowledged that Cavarero’s thought stems from the continental tradition of sexual difference theory where sexual difference is understood both as the principle that organizes the patriarchal order and a force that disrupts it when reclaimed with feminist tools and strategies. Perhaps something is lacking from the English translation of Cavarero’s work, and the gendered embodiment that enters the absolute local is not what Anglophone feminist thought would recognize as the performative, socially constructed gender. It is also not the essentialized biological sex that patriarchal discourses tend to perceive as a prediscursive basis of gender.
While it is beyond the scope of this paper to fully delineate the various meanings and theoretical contributions from thinkers like Julia Kristeva, Hélène Cixous, or Luce Irigaray, let us first broadly assume that sexual difference is the principle that organizes the symbolic order, delineating what is masculine and what is feminine, and subordinating the latter to the former. Then, let us assume that the same symbolic order uses the divides set up by sexual difference as a matrix that inscribes a similar power dynamic within other conceptual pairs that expand beyond the idea of masculinity and femininity but still implicitly evoke it. These pairings would include subject (man) and object (woman), mind (man) and body (woman), colonizer (active, masculine, penetrating force) and colonized (passive, quasi-feminine object of penetration), etc. In this setup, it would be logical for sexual difference to be an intrinsic part of the embodied, ontological uniqueness as communicated by voice. Only then could sexual difference be accessed as the lever that uproots the oppressive order that constrains the voice (and, at the same time, subjectivity) within harmful boundaries. Of course, it would be easy to contradict the assumption on which this premise is built. First, it remains unclear how non-binary voices would fit into Cavarero’s schema, or what insights transgender voices—as voices that transgress the initially assigned gender binary to then reassign it in a self-affirming fashion—might reveal about the very concept of sexual difference. Second, by adopting a decolonial angle, we could argue that it is not sexual difference but the three-fold aspect of knowledge, racism, and capital that is the organizing principle of oppression, including racial, gendered, epistemic, and economic oppression (Quijano 2000; Mignolo 2007).
In that configuration, it would be racialized difference, and not sexual difference, that attains the power to dismantle the hegemonic symbolic order, which would now be understood as the colonial matrix of power (Quijano 2000).
However, as can already be inferred from the discussion on Eidsheim, racialized difference on its own cannot (and perhaps should not) be adopted as a principle for the vocal ontology of uniqueness due to the magnitude of historical experiences of racism in light of the intersectional nature of oppression or, for that matter, the coloniality of gender (Lugones 2010). With all this in mind, it is not surprising that Eidsheim’s position is distrustful toward Cavarero’s. This is further exacerbated by cross-disciplinary differences. Before I go on to show in what ways Eidsheim’s concerns about the Cavarerian paradigm are valid and how she herself can contribute to our understanding of voice, I would like to expose some potential misconceptions that she might have about Cavarero’s theory of vocal expression due to the differences between their academic disciplines.
3. The paradoxical plurality of unique voices
As we have established, Eidsheim’s distrust of Cavarero’s project is raised by the category of uniqueness. However, it is worth noting that Cavarero’s uniqueness is the Arendtian “paradoxical plurality of unique beings” (Arendt 1998, 176). As Sophie Loidolt points out in her phenomenological reading of Arendt, this plurality is not the multitude of properties that can be described from a third-person perspective and confronted as an object (2016, 44). Rather, it is the plurality of first-person perspectives that elude objectification and remain intangible and difficult to pinpoint (Loidolt 2016, 45). In other words, plurality for Arendt does not mean that unique beings are composed of a plurality of properties. Rather, plurality is understood here as a shared condition of human life that is always lived among others and binds multiple distinct humans in being human “in such a way that nobody is ever the same as anyone else who ever lived, lives, or will live” (Arendt 1998, 8). It thus combines the “twofold character of equality and distinction” (Arendt 1998, 175). Indeed, Arendt herself states that the manifestation of who someone is “retains a curious intangibility that confounds all efforts toward unequivocal verbal expression” (1998, 181). As such, contrary to Eidsheim’s concern that uniqueness, vocal or otherwise, forms “a unified locus that can be unilaterally identified” (Eidsheim 2019, 3), I would argue that the “who”—also as it appears in the acousmatic question “Who is it?”—is never stable and unified. Rather, it is closely tied with instability and intangibility.
To quote Loidolt: “Instead of the rather unsupported claim that we are unique simply because we belong to the human species, it turns out that ‘uniqueness’ is the result of an active encounter of singular accesses in the plural, by speaking with one another and by acting together” (2016, 46). Speech and action are the two actualities of Arendtian vita activa, a way of political life that sets itself apart from vita contemplativa, the way of philosophical life in which the mind reflects itself in language like an image is reflected in water (Arendt 1998; Cavarero 2005; Plato 1987). Speaking and acting constitute a mode of how bodies appear in the world and their appearance escapes both the descriptive and narrative capacities of language. In Loidolt’s phenomenological interpretation:
I encounter others in the world as living bodies with objective properties. But what makes them truly unique is not the possible uniqueness of these properties or of a combination of properties, but rather that each of them is a singular access to the world and presence of the world, which is neither reducible to an inner-worldly object nor directly accessible by me. (2016, 46)
This is an assumption that, in my view, Cavarero shares. Through its appearance in the world, the body announces itself as an unrepeatable “I” among other unrepeatable beings, and the musical, asemantic dimension of human voice is part of that appearance. In that light, we cannot read Cavarero’s thesis on vocal uniqueness as a laudation dedicated to singularity understood as individualism. Eidsheim makes the assertion that “voice is not singular; it is collective” (2019, 9), but Cavarero’s voice is also collective in the sense that it makes relationality the inherent condition of subjectivity and political agency. This also marks Arendt’s influence on Cavarero’s philosophy. As Loidolt underscores, when the Arendtian subject appears in front of others, they are related to “the others and the world as a shared space of appearance” (2016, 48), thus constituting themselves as a dynamic, intersubjective, and hence, in a sense, collective event; the “who” can appear and develop only with others, in a web of relationships.
Taking it a step further, I would argue that the belief that each voice is a unique entry point into the world—one that cannot simply be reduced to an inner-worldly object or solely belong to “me,” as Loidolt has pointed out (2016, 46)—is a notion shared not only by Cavarero but also by Eidsheim herself. This suggests that her theorizations may be more aligned with Cavarero’s than anticipated. In an autobiographical story near the end of The race of sound, Eidsheim—born in South Korea, raised in Norway by adoptive parents, living in the US—recounts how she found herself feeling offended when someone asked her why she and her Norwegian Colombian son “spoke Norwegian when [they] didn’t look Norwegian” (2019, 185). This defensive feeling arose from the other person’s inability to recognize something that Eidsheim has learned to recognize as a certain “truth” about herself, together with other labels and identifications related to how she presents to the world, vocally and otherwise. Eidsheim goes on to rationalize this situation by criticizing the solipsistic logic that governed her reaction. Her pain was caused not by the question but by her holding onto a fixed and seemingly clear category of self-identification, focusing on her own viewpoint, and disavowing the possibility of other perspectives (2019, 188). Ironically, though, that feeling testifies to the longing for something distinct, unrepeatable, and deep within; something that differentiates the “I” from all others. This longing cannot be easily ignored. Eidsheim attempts to address it through quasi-mathematical formulas, positioning voice as a collection of styles and techniques and pointing to an infinite potentiality in which “the concept of ‘me’ as a separate identity and the recognition that there is something that is not-me coexist. 
The effect is something that is not me or you, yet it includes both of us—and each of us is only one formulation of infinite possibilities” (2019, 193). This is not as far from the Cavarerian schema as it might seem. When read through a philosophical lens, vocal ontology of uniqueness does not entail a solipsistic vision of subjectivity; quite the contrary, it seeks to overturn its reign.
When approached from this standpoint, the tensions between Eidsheim’s and Cavarero’s projects begin to recede. But this does not yet mean that we should dismiss Eidsheim’s concern—including the premonition that Cavarero’s theory stems from the same logic that “supports racialized perception of vocal timbre” (Eidsheim 2019, 33)—nor, for that matter, does it mean that we should excuse Arendt, whose infamous stance on segregation supports an inexcusably racist logic (1959; see also Burroughs 2015). In her project on the vocal ontology of uniqueness, Cavarero does not consider race, ethnicity, or dimensions of oppression and privilege other than gender. Even her approach to voice at the intersection with gender creates a kind of confusion. Voice arises from a gendered body that lives in a world that still stubbornly reads it through a binary lens of male and female sex, construing voice as sexually dimorphic. However, contrary to Cavarero’s conviction, we know that embodied voice does not have the immediate and infallible capacity to reveal anything about the body’s gender. It does not take an opera or a drag performance to prove this kind of vocal ambiguity. Everyday lives are filled with vocal misunderstandings and misrecognitions. This is evident not only in approaches to transgender voices (e.g., Lagos 2019) but also pertains to, for instance, the experiences of cisgender men with high-pitched voices and cisgender women with low-pitched voices who also get misgendered based on their vocal expression.
This is precisely a limitation of Cavarero’s project that Eidsheim’s project can address. In her 2012 article on voice as action (which I will discuss in more detail in the next section), Eidsheim emphasizes that voice does not exist in vacuo. Non-vocal information generates a filter through which we listen to voices and interpret them. Voice is never experienced in “a purely sonorous realm, divorced from contextual information. Rather, non–sonic aspects, including preconceptions of race [or gender, for that matter], tend to influence how sound is perceived” (Eidsheim 2012, 10). In other words, our preconceptions about race and ethnicity dictate our preconceptions about human voices. Perceived correspondence between a person’s voice and their race does not mean that there is one single essence of a racialized voice (e.g., Black voice), just like a perceived correspondence between a person’s voice and their gender does not signal an inherently gendered nature of voice. A particular vocal timbre that we perceive is not the only one that a person can emit. A voice is not always a quintessential and undeniably recognizable marker of racialized embodiment or gender, and even when it does mark it, the relation is not causal. As Eidsheim notices: “when we hear a voice that happens to align with our preconceived ideas of racial differences, this correlation is not to be confused with causality” (2012, 11). The same could be said about gender. A voice that aligns with our preconceptions about masculine- and feminine-sounding voices does not infallibly reveal either sex or gender, and it certainly does not point to an immediate relation between sex and gender.
Even if it reveals something about a person’s gender identity or, for that matter, sexual preference by means of intentional modulation—which might happen when, for instance, a transgender woman strives for a more conventionally feminine vocal expression, a cisgender lesbian vocally presents as masc, or a cisgender man modulates his voice to present as gay—the relation is not stable, universal, unambiguously causal, or given once and for all.
Of course, Cavarero could perhaps argue that these preconceived ideas cannot enter the locale assoluto because the absolute local is a space where individuals encounter each other as equal but not equivalent. Unfortunately, such a space might exist only within the realm of philosophical imagination.
4. Toward a theory of voice as (intra)action
Despite their mutual focus on materiality, references to physicist and philosopher Karen Barad do not appear in Eidsheim’s works. As I will try to demonstrate, integrating these two perspectives could significantly enhance our discussion on voice. To achieve this efficiently, I will draw upon an earlier, more concise text by Eidsheim that already addresses voice as action; from there, moving towards intra-action, a Baradian concept, seems like a logical next step. But first, let me briefly recapitulate some of Barad’s main contributions that are relevant here.
The framework of agential realism, as put forward in Barad’s 2007 book Meeting the universe halfway: Quantum physics and the entanglement of matter and meaning, seeks to explore the onto-ethico-epistemological relationship between the material world and discursive practices in a manner that moves beyond binary distinctions. These divides often arise when discussing matter, language, identity, and embodiment, starting from subject/object, mind/body, nature/culture and expanding to broader theoretical rifts between constructivism and realism, or materialism and idealism, among others (Barad 2007, 26). The divides between passive, unknowing objects and active, knowing subjects also fuel the trifurcation between ontology, ethics, and epistemology, which Barad’s project aims to transgress. Barad implements their framework through what they term a diffractive methodology. Diffraction, a physical phenomenon in which waves bend and spread around obstacles, interacting with one another and producing intricate patterns of light and darkness, serves as a theoretical blueprint for understanding reality not as a series of distinct boundaries, but as a network of complex, interwoven relationships. Diffraction was initially proposed by Donna Haraway as an alternative to reflexivity and reflection, the dominant, patriarchal metaphor for thinking. While reflexivity fosters a preoccupation with similarity, sameness, and mimesis, “setting up the worries about copy and original and the search for the authentic and really real, [… d]iffraction is an optical metaphor for the effort to make a difference in the world” (Haraway 1997, 16). More broadly, diffraction is a “critical and difference-attentive mode of consciousness and thought” (Geerts and van der Tuin 2016) applied by contemporary feminist theorists working in the new materialist tradition.
Considering Eidsheim’s concerns about voice and the “cult of fidelity” (2019, 22), a diffractive methodology would lend itself to a conceptualization of vocal expression beyond the paradigm in which voice merely reflects an internal state and its supposed authenticity is derived from its interiority. Before examining the potential overlaps with Barad’s approach, let us first revisit Eidsheim’s article on voice as action to explore how she engages with materiality in her own work.
In “Voice as action: Towards a model for analyzing the dynamic construction of racialized voice” (2012), Eidsheim explores the voice as a material, multisensory, multiple event that is a product of individual, material articulation and social conditioning that shapes and constrains the materiality of the body and its embodied voice. In a logic that echoes both Michel Foucault (1995) and Judith Butler (1990), Eidsheim centers her article on the argument that, just as a person’s voice can be changed by applying pressure to different parts of their body, social expectations and racial stereotypes about voice also restrict and shape it, making it fit into preconceived audial codes. “The material that is the singing voice, i.e., the body in its material dimension, never exists in a pre-cultural state. The vocal, material body is always already formed by the cultural and social context within which it vocalizes,” Eidsheim writes (2012, 12). To drive her point home, Eidsheim describes The Voice Box, a project in vocal pedagogy and contemporary dance that she conducted from 1999 to 2012. In collaboration with fashion and object designer Elodie Blanchard, Eidsheim created devices that restricted the human body, exposing the connection between the quality of voice and the material conditions of the vocal apparatus.Footnote 4 Some devices restricted the neck and the throat with the help of a textile sleeve with pockets where various objects—plates, balls, sticks—were inserted to change the shape and the position of the neck. Others restricted the whole body, forcing it into stances or adding pressure to it, since Eidsheim is adamant that the vocal tract is not the only “instrument” that produces the voice; rather, the whole human body makes up the vocal apparatus. Each such interference resulted in significant qualitative changes to a person’s voice.
Through those staged vocal restrictions, The Voice Box project was meant to highlight the materiality of the vocal instrument and show how physical alterations limit vocal production. It was meant to materially demonstrate how everyday expectations reinforce certain vocal modes and curb others (Eidsheim 2012, 17). As Eidsheim claims: “The ways in which a body is physically shaped by vocal expectations and restrictions, affirmative as well as restrictive, are no less violent or intrusive than our Voice Boxes … Our voices are physically and metaphorically bent into certain corporeal and sonic molds so that they may fit certain preconceived sonic identities” (2012, 17). A real-world example could include the stereotyping of Black women and their voices as loud and argumentative, which does not align with the dominant sonic mold for feminine voices per white colonial standards. This perception often relegates them to the “angry Black woman” sonic stereotype, carrying negative consequences in both professional and educational settings. As a result, many Black women feel pressured to modify their vocal expression, which may involve code switching or adjusting the pitch of their voices (see, e.g., Eason et al. 2023).
Finally, through observing modern dance classes, Eidsheim began to think of singing not as a sound but as an action. Instead of focusing deliberately on how to sing a certain tune, “thereby imprisoning [her] body within ideals” (2012, 17), Eidsheim started to think about the act of singing in terms of a bodily action that results in a sound, just like an elaborate movement of a dancer’s body can produce an involuntary sound, like a foot landing on the floor with a thump. The conceptual link between singing and dancing led Eidsheim to propose that, when a voice appears to conform to stereotypical racial norms, it is not due to any inherent quality or the body’s limitations in producing different sounds; instead, the voice acts out a choreography prescribed by sociocultural conceptions of race (Eidsheim 2012, 19). As such, voice is not a sound “but a dynamic interaction, a co-creation of action and what we typically think of as ‘sonic material’” (Eidsheim 2012, 18). The term “sonic material” remains, curiously enough, undefined. As such, I would suggest that Eidsheim uses it to denote the process of emitting a sound as a sort of physical “substance” in a collaborative, collective act that also presumes a listener hearing and contributing to the interpretation, and hence co-creation, of this material (cf. Eidsheim 2012, 24), with the noun “material” alluding to its dictionary definition of a “substance out of which a thing is or can be made” (with “substance,” of course, being a rather unfortunate term for auditory stimuli).
While it would be quite straightforward to link Eidsheim’s argumentation with the poststructuralist tradition in the Foucauldian-Butlerian vein, the (underdefined) framing of voice as a “sonic material” invites an interpretation of voice as a Baradian phenomenon, that is, a dynamic topological reconfiguring, entanglement, relationality, and (re)articulation of matter (Barad 2008, 135). For Barad, matter is a dynamic entity: it is active and changing, simultaneously historic and cultural; it is in a continual process of becoming, as it gains various meanings through diverse cultural, social, linguistic, and scientific practices, while simultaneously shaping these practices through its own materiality. In Barad’s approach, discourse and matter are not disjunctive categories. Phenomena are always material-discursive, encompassing a motley of physical and psychical, tangible and abstract, organic and fabricated, objective and subjective, constructed and found objects, sensations, concepts, circumstances, and occurrences. Reality is composed of material-discursive phenomena, and every phenomenon comes with various components that exist in intra-active relations with other phenomena and their components. Contrary to interaction, which “presumes the prior existence of independent entities/relata” (Barad 2008, 133), intra-action positions phenomena as complex, material-discursive ebbs and flows, rather than as passive nonhuman objects and active human subjects. Intra-action defines the boundaries of phenomena, because only in intra-action can an act of observation delineate the object of observation. These boundaries establish properties and characteristics, giving rise to meaning. It is only at this point that a phenomenon departs from its “inherent ontological indeterminacy” (Barad 2008, 133) and goes on to reach determinacy.
What conditions the possibility of determinacy is what Barad refers to as the agential cut—an act that locally delineates and differentiates a specific phenomenon during the process of observation. During that process, a phenomenon is iteratively reconfigured as a causal structure “with determinate boundaries, properties, meanings, and patterns of marks on bodies” (Barad 2008, 135). This is precisely what distinguishes the agential cut from the traditional Cartesian cut between subject and object. Agency is not an attribute of the thinking, speaking human subject, but rather the dynamic quality of all material-discursive phenomena that extend beyond the human, while still encompassing human beings.
An agential realist framework could enhance the dynamism and motility that Eidsheim seeks but somewhat obscures with the imagery of rigid vocal molds. This rigidity might create a misconception that vocal molds in themselves are static, unchangeable, and consistently negative. Rather, conceptions of voice can morph and evolve, encompassing a spectrum of micro- and macroaggressions, as well as strategies of resistance and self-empowerment in which fitting into a vocal mold is a subversive gesture of reclaiming one’s own identity, be it racial, ethnic, gender, or sexual, self-constructing vocal expression just like speech (e.g., Davis 2018). Intra-action would productively complicate the notion of causality, reframing it from Eidsheim’s reading as an oppressive, unequivocal relation into a more dynamic and fluid category and thus allowing for an examination—rather than a disavowal—of those apparatuses that produce the impression of a seemingly straightforward relationship between body and voice, interior and exterior, including those that produce the impression of (and the longing for) the uniqueness of vocal expression. To cite Barad at length:
traditional conceptions of causation are concerned with the causal relationship between distinct sequential events. In my agential realist account, causality is rethought in terms of intra-activity. Intra-actions do not simply transmit a vector of influence among separate events. It is through specific intra-actions that a causal structure is enacted. Intra-actions effect what’s real and what’s possible, as some things come to matter and others are excluded, as possibilities are opened up and others are foreclosed. And intra-actions effect the rich topology of connective causal relations that are iteratively performed and reconfigured. (2007, 393)
Of course, it can be argued that Eidsheim is already accomplishing much of what can be achieved through Baradian tools. For instance, in The race of sound, Eidsheim traces the tension-ridden interplay of what she calls the symbolic, the measurable, and the material. The symbolic pertains to the meanings and interpretations of vocal expression shaped by power relations, thus revealing the contextual specificity of vocal expression, while the measurable evokes the qualities of voice that can be quantified and explained in evolutionary terms, enabling a problematic claim to universality (2019, 15). The material remains curiously underdefined by Eidsheim, but most likely pertains to the biological and physical mechanics of both vocal production and auditory processing. As such, concepts and definitions already developed by Barad can be employed to systematize Eidsheim’s project in a way that aims to integrate the Cavarerian schema without reducing or totalizing uniqueness as an essence.
Translating it into Baradian terms, I define voice as a phenomenon that is material-discursive; it comes from the human body as mediated by its organs, arising as a sound that operates according to the laws of physics—but it is simultaneously shaped by and interpreted through socially constructed factors, with meanings attributed to vocal expression through purely linguistic means that are encapsulated within a historically arising order that is as material as it is conceptual. If a voice is an intra-active phenomenon, it would mean it is always already in relation, always in a position of agency that simultaneously acts upon and is being acted upon. If the meanings of this phenomenon are delineated in an act of an agential cut through potentially varying apparatuses that might give it a different valence depending on when, where, how, and by whom the agential cut is made, I suggest that the uniqueness we perceive in the voice is also locally, intra-actively produced. I propose we think of this uniqueness not as a stable, unchanging locus but rather as a moving topological knot. We can delineate its features in the act of observation, but the said act of observation does not necessarily disclose any kind of unmoving, unmistakable truth about the object of its observation. If we think about uniqueness as a topological knot (rather than a mere topographic intersection), it would be easier to perceive vocal uniqueness in tandem with repeatability and in relation to material-discursive forces of power that operate on the subject.
Through that lens, what Cavarero pursues as uniqueness is reframed in a logic that Eidsheim’s project would find complementary to its own. The uniqueness mediated by an embodied voice now becomes an unnamable event that unfolds and that is, paradoxically enough, repeatable and unrepeatable at the same time, constantly shifting between the “who” and the “what,” with its diffractive movement being its intrinsic part that should be embraced rather than (unsuccessfully) eradicated. Through that framing, we arrive at both a methodology and an ethical commitment to theorize and study the voice not as a mechanically collated sum of all possible intersectional identities and not as a manifestation of something intimately our own and completely untouched by the external world. Rather, we can perceive voice as an indeterminate flux of changing affiliations and affinities, with that flux still being something more and something yet unlike those shareable characteristics and historically conditioned powerplays of oppression and privilege. What makes this framework an ethical commitment is a dedication to identifying and acknowledging all possible ways in which a person’s voice might differ from our own; simultaneously, we would commit ourselves to understanding the reasons behind those differences while refraining from trying to confine them to pre-given definitions and categories. By means of the theoretical synthesis that I have proposed in this article, we can conduct an exploration of voice that does not prioritize sexual difference over other possible differences. As such, we can open a new feminist philosophical arena of dialogue on voice that would allow us to consider the vocal realities of racialized bodies, transgender or non-binary bodies, gender non-conforming bodies, to name just a few possible but commonly overlooked configurations.
5. Conclusion
This paper began by juxtaposing Nina Sun Eidsheim’s and Adriana Cavarero’s approaches to voice with the intention of delineating both the problem with Cavarero’s vocal ontology of uniqueness and the problem with Eidsheim’s critique of it. Cavarero’s and Eidsheim’s positions can be seen as conflicted because it is the “what-ness” and not the “who-ness” that is often the stake of identity politics in a world increasingly polarized on issues of gender, sexuality, race, ethnicity, nationality, or economic status. The refusal to address the “what-ness” in a way that would acknowledge the intersectionality of experience is, as I argue, the main problem with the idea of vocal ontology of uniqueness. At first glance, Cavarero’s schema seems to follow the paradigm that Eidsheim recognizes as the founding principle of racialized listening (or, more broadly, any model of listening that leads to stereotyping based on vocal cues and assumptions about bodies that make those sounds). This paradigm presumes that (1) there is an indicative, two-way, mutual relation between body and voice, (2) voice can reveal something about the body and about the subject as such, and (3) there is always a kind of interiority that can and needs to be revealed if an answer to the question “who are you?” is meant to be found. Additionally, Cavarero’s assertion that intersectional axes of identity are not part of that answer is equally problematic. It is highly debatable if one can strip themselves of identities to arrive at an untouched “who,” or, for that matter, if we can afford renouncing intersectional considerations in contemporary societies that are increasingly polarized on identity matters. However, despite its limitations, Cavarero’s project is far from assuming that uniqueness is a stable, uniform, and solipsistic essence, as Eidsheim seems to be implying. 
Quite the opposite, a phenomenological reading of the Arendtian influences in Cavarero reveals that uniqueness is a dynamic, collective event that signals a unique point of entry to the world, a distinct way of experiencing and relating to the world that coexists with and inherently presumes the plurality of different perspectives. By exposing the nonobvious similarities between the two projects and acknowledging their respective strengths, I proposed a synthesis that would further enable a productive dialogue between the two positions. By introducing the new materialist Baradian angle, I achieved this synthesis by framing voice as a material-discursive phenomenon, an “unfolding event articulated through a particular sensing and sensed body” (Eidsheim 2012, 9) that has the ability to simultaneously reveal the uniqueness of a given body and to obscure that uniqueness by repeatable, performative scripts that constrain the body and its voice via various apparatuses vis-à-vis various human and nonhuman agents in dynamic intra-active contexts.
By conceptualizing voice as a complex, fluid, motile phenomenon that points to the entanglement of individual vocal properties with the physical, economic, sociocultural, institutional aspects of its production and reception across various historical and national contexts, as well as academic traditions, we can develop a more nuanced philosophy of voice. My approach allows for a critical examination of the intertwined ethical, ontological, and epistemological dimensions associated with each aspect that we analyze in a moment of focused inquiry (i.e., through an agential cut), while remaining cognizant of the ongoing interplay of other material-discursive entanglements that may affect the understanding of vocality. For example, while this discussion aims to engage with dominant academic discourses on race from a relatively equal standpoint, a parallel exploration could (and should at some point) delve into the complex material relationships that shape the dynamics of oppression and self-empowerment related to ethnicity and accent in racially homogeneous regions, where intersectionality manifests in ways that differ significantly from those recognized in Anglo-American academia. This article thus serves as a preliminary exploration that sets the stage for future inquiries into the multifaceted nature of voice and its implications across various transnational and transdisciplinary contexts.
Importantly, my discussion does not necessarily solve the adjacent problem that emerges from it; rather, it opens new avenues for investigation. That problem can be summed up in the question: Are our respective axes of identity an intrinsic part of our uniqueness as subjects, with the “what” making us “who” we are, or are we who we are despite what the world tries to tell us about “what” we are? It is only my hope that the proposed synthesis at least helps us to remain aware of both sides of this question at once, regardless of how difficult it might be to juggle two different lenses, two different apparatuses. However, to drop one—either one—such lens would be to lose sight of something extremely important. Accepting the limits of our situated knowledges (to echo Haraway 1988), I propose that we nevertheless grasp for a type of multiplied vision that tries to (fore)see what it does not see when it looks in one direction—or, to leave behind problematic optical metaphors, to (fore)hear what it cannot hear when it listens to one voice. Problems remain unresolved, questions remain unanswered, but the political stake of (fore)hearing a voice that is unlike our own strikes me as the most important one, no matter how we choose to theorize it.
Acknowledgments
I would like to thank Hypatia’s editorial board and reviewers for their assistance in refining this article. This research was conducted without any external funding.
Adrianna Zabrzewska is a feminist philosopher and interdisciplinary researcher with a publication track record spanning philosophy, social sciences, and children’s literature studies. As a Senior Research Fellow at the School of Applied Sciences, Edinburgh Napier University, Adrianna currently supports the impact work package and the case study on Poland of the EU Horizon grant RESIST. Fostering Queer Feminist Intersectional Resistances against Transnational Anti-Gender Politics.