Small Gestures: Generating radical sonic futures in an algorithmic world

Debashis Sinha

doi:10.1017/S135577182500010X

Published online by Cambridge University Press: 19 May 2025

Debashis Sinha*: Affiliation:
School of Performance, Toronto Metropolitan University, Canada
*: Corresponding author: Debashis Sinha; Email: sinhad@torontomu.ca

Abstract

The widespread deployment of artificial intelligence (AI) and machine learning tools has created a shift in knowledge culture. The marginalisation of slower, more traditional modes of engagement for quantifiable data easily parsed by mathematical algorithms has resulted in prioritising proprietary or opaque datasets (knowledge) explicitly constructed with measurable parameters. Well-documented concerns persist regarding the narrow range of human data used by algorithmic tools, data that arguably encapsulates the many failures of human society. The inevitable result of the use and priority of this data, alongside very particular notions of value and what is valuable, is a replication of many of the foibles of our history as a species.

Cultural practice in general necessitates the communication of what drives our hopes and underlies our experiences. In algorithmic times we can see that this kind of communication supports some of the many critiques of AI and machine learning already extant in activist circles. Through investigating some of the theoretical backgrounds of this resistance, this article uses the first iteration of HEXORCISMOS’S SEMILLA AI project and the resulting album release as one of the many possible ways in which we might use machine learning and AI tools alongside very deliberate and uplifting models of community and community building.

Information

Type: Article
Information: Organised Sound, Volume 30, Issue 1: Experimental sound from the Black and South Asian diaspora, April 2025, pp. 24-31

DOI: https://doi.org/10.1017/S135577182500010X
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

an invocation:

can we have a conversation?

can we share our knowledge on a walk, in the quiet of the evening, sitting side by side?

come. let us talk. let us find the truth through the live and vital moment of sharing, on these pages, in the world, looking to others but also ourselves

may we all live up to our ancestors’ dreams. (ds)

The imperative of algorithmic culture is strong: we exist now in a world that seeks ease, speed and predictability through the almost ubiquitous deployment of artificial intelligence (AI). The value propositions of slowness, which are deeply felt and deeply entrenched in human and non-human communities, are being eroded and overlooked in deference to one particular reading of reality, one that lies in service to very specific understandings regarding how the world operates and what we should be doing in it (or, perhaps more accurately, with it).

(and yet

still

even now, even on this day, the day you read this

if I were to invite you

we could sit for hours staring into a fire)

What, then, is cultural practice in an increasingly algorithmic world? How can we build on tradition in an increasingly quantised reality, where massive and opaque datasets are more and more positioned as the only and complete valid archives of knowledge, with the solutions that result from their analysis prioritised? What are the other vectors of knowledge or story that we miss in working with algorithmic tools (specifically machine learning, in the case of this article), beyond ones that rely on statistical analysis? Is there (and what is) the algorithmic equivalent of the state of staring into the fire? Is it even possible?

The entire field of machine learning and AI – algorithmic culture in general, in fact – is predicated on speed, on ‘accuracy’. Its outer, consumer-facing aspect reinforces this assumption at every turn. Everywhere one sees posts about prompt engineering for ChatGPT and how to generate and humanise its output. Everywhere the outputs, and in very few places the encouragement to interrogate the workflow that gives rise to those outputs, or to interrogate the outputs themselves.

I would suggest that the ‘consumer-facing’ outcomes we see seeded in our feeds and YouTube ads are driven by very, very, very specific ideas about how society operates and what generates value (and, to be sure, what ‘value’ even is). Given this assumption (that the interpretations of what is useful or valuable are a finite or knowable set), it follows then that there are other interpretations that lie outside this field. And that, if they lie outside this field, there must be a space for them – a different field, somewhere.

In the AI domain, we are presented with fait accompli outcomes in our current propositions of value, but there is much elided, and much discarded. If we spend some time imagining outside the lines, we can intuit that in this discarding we miss something. In prioritising the predictability of the market and the infallibility of statistical models, we are in danger of setting aside the strategies available to us to enact and embody creativity that might apply when working with machine learning and ‘AI’.

There is work being done in many real-life places and digital communities that folds algorithmic tools into processes of community building, advocacy and creativity – see, for example, the collective Dreaming Beyond AI (n.d.). These alternative workflows are predicated on a more realistic picture of how these tools operate, as they embrace unpredictability and are aware of digital detritus as part and parcel of the processes of machine learning. Machine learning is messy. We need to bring this into our understanding, because engaging with the mess yields the promise of rich, messy, glorious results.

There are, to be sure, intense and deep harms being caused by the current state of AI and its deployment – weapons systems deployed in conflict zones (e.g., Stewart and Hinds 2023), the significant environmental impact of training AI models (Coleman 2023), job displacement (Clark 2023), algorithmic racism (Kar 2023), surveillance (Waelen 2023), and many more. There is a market imperative that encourages us to treat AI as something ‘other’, eliding or outright ignoring the fact that ‘AI’ tools are trained on human behaviour and human data, very often data that is extremely problematic (Birhane et al. 2021). For many techno-optimists, the fact that algorithmic tools replicate the foibles, harms and danger of our human society is because the ‘datasets are incomplete’ or the ‘model is biased’. Somehow, we have come to a point where AI and machine learning have become something that is just there, out of our control. It is rare that we are encouraged to consider how it is a result and mirror of our own relationship to human history, and only a small and narrow portion of our history at that: the portion that is digitised in 1s and 0s, mostly on the internet.

So then let us take that argument – that the outputs of algorithmic tools are narrow expressions of human knowledge and experience, full of shortcomings, omissions, imperfections and misunderstandings. With that as a starting point, we can also see another possible path in working with them. We can look for a different field to play in.

I have no illusions. The artistic experiments in these domains will not immediately make a huge impact on the harms outlined incompletely earlier. They are, though, a step in the trying – a trying that has as its goal to figure out another path or a strategy of refusal that could help in understanding AI’s utility and limits, and force a push to create a more realistic conversation regarding its uses and shortcomings. I remain acutely aware, at the time of writing (and probably even now, at the time of your reading), that art-washing machine learning and AI could be and indeed is used to restrict or re-direct the conversations we must have. I am constantly questioning myself, and trying to distill what I feel are the important points of this creative process – what is it that I want to discover? What have I found? Am I just using these tools because they are au courant, or is there something more being revealed in this process?

I am not quite sure what the answers are to these questions, and I do feel more and more uncomfortable about how the tools I use to create a story are part of a family of tools that enact violence and harm on Indigenous and Black communities, harms on people of colour and the environment, not to mention the hidden and exploitative labour practices often omitted in speaking about how AI functions (e.g., Muldoon et al. 2024). It is an urgent undercurrent in the stories that I try to construct beyond the stories and works themselves. The conflict and contradiction perhaps might be the reason I stay in this space.

Suzanne Kite, a Lakota artist and researcher who works with AI, asks ‘how do I make art in a good way?’ (Kite and Benivolsk 2023: 11). I wonder: what is a ‘good way’ when algorithms can be/are extensions of capitalism, colonialism and the many, many imperfections and harms of the human world? What else can they be, and can they help us make art in a ‘good way’? How do we open up the conversation to dreaming, to slow practice, and to other knowledges and wisdoms? And how do we resist the pressure to co-opt these good ways to elide and dilute the problems these tools present to us?

This article offers some possible paths to a different engagement with machine learning tools and interpretations through a practice-based research lens, specifically in using machine learning tools as a site of community as illustrated by the MUTUALISMX project and record release. It is not within the scope of this article to interrogate and propose solutions for the problems arising from the conventional discourse around AI – the harms and difficulties alluded to previously, but also questions of copyright, ethics and dataset curation (there are others who do this extremely well, for example, the podcast Mystery AI Hype Theater 3000 at the Distributed AI Research Institute).¹ What is interesting is the potential in the ‘folding in’ of machine learning in non-algorithmic creativity, and the role it might play in creating a greater potential than we are currently confronted with. Leif Weatherby of the NYU Digital Theory H-Lab frames it well: ‘It’s not just a black box, It’s at least grey. When you open that up you start to see things that have either aesthetic value, critical value, or both’ (Arshake 2022).

Further, and before I go on, I have a proposition that bears repeating: that in an issue dedicated to alternative representations of Black and South Asian sound practices, we should soften our gaze and perspectives in how we speak about practice and knowledge, and understand that there are many valuable ways rigour is demonstrated. That we make a conscious effort to allow for a broader palette of what constitutes discourse and sharing, that we make space for felt knowledge as well as book knowledge, that we consider the smell of our grandmothers’ saris, our ancestors’ hopes, our future generations’ dreams, and let them enter into our ways of knowing and exchange.

2. Unpredictable actions hold great emotional load

The imperatives of machine learning – classification, efficiency, scale, speed – have never interested me greatly. They seemed to me poor outcomes to strive for in creative work. Transparency and smoothness have their place, but in my workflow it is the surprise, the mistake, the unexpected, the error that reveals the path, more often than not.

This messiness plays an important part in systems of creativity. Divergent thinking, brainstorming, spitballing and the like are examples of how we can find new paths and generate creative ideas, and a constant part of our working lives. To me, the value of the known is useful as a creative tool mainly by outlining the form and foundation, or perhaps as a foil to the unknowns, our surprises. The ‘what happens if X’ question is of immense value, and a central question in my own sound practice: what happens if I turn this knob up to 10? What happens if I stretch this audio 100x its length? What happens if I use this or that filter, or remove it?

In a section enticingly titled ‘Using machine learning against the grain’, Rettberg (2022: 4) offers the intriguing sentence that heads this section, and points out how algorithmic failure is a strategy used by many artists, citing early AI artistic projects such as Trevor Paglen’s project ImageNetRoulette (Paglen 2017) and Elwes’s Queering the Dataset (Elwes 2019). Mimi Onuoha’s Library of Missing Datasets (Onuoha 2018) is a further example of how failure exposes the assumptions and shortfalls of neural nets. Onuoha speaks of a number of reasons (and their corollaries) as to why data could be missing, and points out that ‘missing’ itself is a normative term (‘implying both a lack and an ought’), obscuring the impulse/agenda/structure that has led to the data being missing (but more likely omitted) in the first place (Onuoha 2016).

The term ‘thick data’, coined by Tricia Wang (2013), describes a way for ethnographers to interface with big data for their research in ways that emphasise story: ‘Thick Data analysis primarily relies on human brain power to process a small ‘n’ while big data analysis requires computational power to process a large ‘n’ …. Big Data delivers numbers; thick data delivers stories’ (ibid.). Indeed, there is a sizeable portion of ethnography scholarship that has considered thick data and even small data as far back as, for example, 2012 (Burrell 2012).

The takeaway I have in this conversation is the importance of relation of the storyteller to the data, and the relationality of the data points as the scene of exploration. It is the relationship that we subliminally execute or assume that we are constrained by that governs what we do with data. David Cecchetto in his book Listening in the Afterlife of Data: Aesthetics, Pragmatics and Incommunication (Cecchetto 2021) explores the idea that even though we know that ‘data’ is a flawed construct, we still live our lives as if it is valid. We hold that the ‘97% match’ on a dating app is a valid number, when in reality it is not.² The point here is not that we have to do work to refute these flawed data evaluations, but that the refutation has already been done, has been largely accepted, and yet still we act as if they are useful.

Data are useful, but maybe the ways in which we think about them are flawed. How can we codify a critical practice as regards data, especially now when we heavily rely on often problematic datasets?

One paper worth a deep dive on this issue is ‘Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence’ (Mohamed et al. 2020), one of the first papers I read that gave voice to some of my questions and misgivings in the early days of my own algorithmically driven sound practice.

The authors propose that the practice or study of decolonial theory can offer some strategies to mitigate or push back against harms connected to machine learning and AI. Further, the authors suggest that we can also use these strategies to form a ‘decolonial field of artificial intelligence: creating a critical technical practice of AI, seeking reverse tutelage and reverse pedagogies, and the renewal of affective and political communities’ (ibid.: 1). Not an easy proposition to operationalise in practice, especially as that labour often falls to marginalised communities – a regular occurrence everywhere, even in academia. I am reminded of a panel I was on discussing AI and cultural practice: even after my lukewarm and placating statements of ‘we have a long way to go, and we have to remain aware’, the next presenter responded by beginning his presentation with ‘it’s easy to complain’. I was too shocked to respond.

In essence, Mohamed et al.’s article defines coloniality as the continuation of the structures and mindsets put in place in service of colonialism, a kind of indoctrination of the mind that lingers, and which must be critically examined to enable the process of refusal. Knowing this, the authors outline the process of decolonisation: that it is both territorial (e.g., the Indigenous-led Land Back movement in Canada and elsewhere) and structural. The structural element is a confrontation with and undoing of colonial systems of power that include attitudes and assumptions around race, gender, work, power, language, culture and so on. We can, and should, add ‘data’ to this list.

The application to a decolonial AI field, as the authors propose, is easily imagined. The authors outline several strategies to activate decolonisation, framed by various views in decolonial theory: the ‘decentring’ view, the ‘additive-inclusive’ view and the ‘engagement’ view. They rightly point out that grand ‘meta-narratives’ (e.g., east vs. west, lumping together oppressed communities without regard for their unique needs and difficulties) are of limited use – some connections with the strategies of ‘thick data’ are evident here.

Strategies and perspectives are a useful foundation for re-imagination. They give us some possible ways forward when we sit with the pen, the brush, the microphone or mouse. Any person who has made a piece of music, a drawing, a collage, or even shared a particularly favourite image on social media knows the creative act is one of vulnerability and attention. We have laid the groundwork for a theoretical perspective for working with algorithmic tools, and some alternative possibilities in how we understand their results. What now?

3. How are we listening?

Integral to any creative practice, and of course particularly sound practice, is the act of listening. Listening to the material, finding out what it asks of us; listening to the ways in which its elements work together or against our own impulses; listening to the outcomes of the use of various tools on our sources; listening to our own heart as we listen back. We attune ourselves to our materials’ and co-creators’ silent and not-so-silent requests, and continue to listen as we try to make these requests manifest.

What are some of the different ways we listen, and how do these ways operate differently when we are looking to use algorithmic tools as a starting point, despite the shortcomings inherent in their makeup and processes? What is ‘listening’ in an algorithmic world?

3.1. Sovereign listening

In his award-winning book Hungry Listening: Resonant Theory for Indigenous Sound Studies, Dylan Robinson, xwélmexw (Stó:lō/Skwah) artist, curator and writer, states: ‘The desire for familiarity … is the demand that difference present itself in a form that accommodates settler recognition’ (Robinson 2020: 68). If we extend the definition of ‘settler recognition’ to include ‘data parsing by algorithmic tools’, we can start to see how we might be constrained by the machine learning process in working with sound.

The sovereign act of creation, with all its rich smells, memories and materiality is allowed, but at the end of the day there is an unspoken ‘agreement’ that the object must be fully apprehensible through the gaze of the scholar or the art critic or the listener. For some, my field recordings of the streets of Kolkata have value as ‘exotic’ soundscapes that evoke a picture of the East, rather than a sovereign expression of my experience, an experience that may not have multiple pathways to understanding, and which possibly ultimately can only be fully understood by those with my or similar experience. Even as I write, I feel the tension in uttering this – that I am somehow failing in my practice by proposing that universality is imposed, and I possess a strongly felt desire to accept that this may not have been the goal. I wrestle with the tension, that somehow my art fails if I withdraw it from epistemologies that I am not particularly interested in participating in or with.

Robinson’s concept of listening as entering into a ‘sound territory’ (ibid.: 53) has been useful at those times of discomfort and tension. In positioning listening as an action of a guest in a territory, one can never completely understand the many layers of what one listens to, but this is ‘not a lack to be remedied but merely an incommensurability that needs to be recognized’ (ibid.).

Accepting this frame paradoxically opens the object to a deeper understanding, a relationship to the culturally specific ways of knowing outside my experience, says Robinson:

By decoupling the deterministic relationship between sovereign object and reception, we can gain a more nuanced understanding of Indigenous and settler forms of sensory experience that extend beyond the overly reifying subject positions of ‘Indigenous’ and ‘settler’ … [we can] question the difference between listening to an object’s expression of sovereignty and listening through sovereignty. (ibid.: 64)

That ‘(refusing) entry to comforting origins and social context’ (Eshun 1998: 00[-004]) can jolt the perceiver to consciously and authentically engage with their ways of apprehension, making them more nimble, or at the very least opening the way to naming their ways of knowing and how they function.

Robinson suggests that, in Canada, this could take the shape of decoupling from the desire to ‘know’ what one listens to, to remove the settler assumption that all knowledge, everywhere, is available to everyone, all the time. The acceptance that to know some forms of knowledge embedded in the object is not necessary. Observing the imperative of ‘recognition’ and ‘analogy’ and allowing it to dissipate, much in the same way that meditation traditions observe thoughts and the breath. This is a listening not rooted solely in refusal, but rather in an acceptance of alternative ways of knowing and expression. The acceptance that some of those ways do not belong to us, and being open to their power anyway, and accepting that when we miss that power it is a function of ourselves rather than the work or the way.

This alternate value proposition around knowing is something that can be useful in dealing with the frameworks embedded in algorithmic tools and their outputs. The ways of knowing that are inherent in other or ‘non-default’ systems lie outside the quantisation of knowledge that is necessary for these tools to function. We can operationalise ways of knowing that are predicated on authenticity, vulnerability and community in any workflow and with any tool. Knowing the limits of a tool allows us to use those limits as a jumping-off point, a node in the system of apprehension and valuing. Sovereign listening is a call for us to investigate how we value and integrate knowledge and shows us the use of algorithms is not an endpoint but only an element of our perceiving.

3.2. Relationality

The relational aspect of sound is central to its understanding and deployment as a strategy of knowing. To approach sound as an object defined by a lexicon risks disconnecting the process of interconnection crucial to its power. Salome Voegelin and the Listening Across Disciplines project propose a set of protocols of listening (Voegelin et al. 2022) that sidestep the process of nailing down and limiting how we interact with sound, focusing on relationality while building a shared vocabulary that is intended to be non-limiting. In the building of the glossary, the priority became uncovering language related to sound that allowed for knowledge exchange across disciplines. Pairing the glossary with a set of listening protocols allows the words to ‘cease to be definitions, but become a point of connection, a skin, and touch, that a shared practice makes sensible and therefore thinkable’ (ibid.: 228).

Vijay Iyer refuses the idea of music being a thing outside of people and their relationality to each other and the act(s) and spaces inherent to music: ‘when you hear music, you hear a person, or you hear people, and you hear everything about them in those moments. They reveal themselves in ways that cannot be revealed any other way, and it contains historical truths because of that’ (in Ouzounian 2021: 510). Dr Ouzounian points out that this is a radical departure from our understanding of Western/Global North-centred models of listening, that centring relationality rejects the premise of listening being consumption, and makes it an act of the subjects who listen. Music becomes a site rather than an object (and, by extension, the understanding that music is/sites are inhabited by a constellation of relationships).

The many polyphonic sound relationships inherent in a locale declare themselves when right attention arises. Researcher and sound artist Budhaditya Chattopadhyay (2020) speaks of a mapping that occurs through the aural and physical travelling through the examined space, the application of attention to moments and locales. In other words, to practice sonic ethnography means less recording worlds as they are than fathoming how they might be alongside (some of) those who inhabit and compose them (Littlejohn 2021).

Auditing my own archives uncovers cartographies of feeling, sound and story that change on each listen. Sounds that drew me now recede, others come to the fore. Subjecting them to training epochs through various neural networks results in folders full of low-fidelity audio of checkpoints and training outputs, which add another layer of meaning and which place my listening to them in time and story.

Classification, where input data is mapped to a set of quantifiable features and then defined, is a kind of narrow authority: an assumption of ‘correct’ which is, to me, ultimately uninteresting. These outputs are based on training that seeks to replicate or reproduce expert knowledge, narrowly defined – to recognise the difference between a dog, pig, or loaf of bread (a running gag in the delightful animated film The Mitchells vs. the Machines (Rianda 2021)). Encountering this, scholar Beth Coleman’s thought is to ‘make AI more wild, not less … (a) generative possibility for the technology in opposition to the reproduction of the same’ (Coleman, B. 2021: 2).

I am interested in curiosity, how to activate it and have it inform investigation. Incorrectness, even when expressed singularly, implies multiplicity – a multiplicity of relations, of gazes, of knowledges, of embodiments, of potentials. Networks of knowing, invited into the conversation, each empowering and deepening the other. Prioritising relationality rather than extraction, making our investigation dependent on relationships rather than authority. We sense the field we were looking for, the field of the missing and the discarded.

Information theory tells us we cannot transmit information without noise. Sometimes the noise obscures the information, sometimes it augments it. Sometimes the noise is not-noise, no-sound. Gaps are traditionally perceived as deficits, negatives, spaces of zero that need filling, often in harmful and colonial ways (e.g., ‘closing the gap’ government programmes). This rhetoric imposes a hierarchy of knowledge (Tynan and Bishop 2022).

But gaps are silences, and silences are as much a form of rhetoric as words. They project a picture of ‘all well, nothing to see here’, but are full of structure(s) carrying omissions and conjecture. We can perceive them as obstacles, or we can perceive them as inviting. They can be cold and authoritative, brooking no argument, or they can point to spaces that are cozy and warm, places one snuggles, sits, contemplates and discovers something new. The spaces of failure and shortfall become places of potential, imagination and community, of future or past beauty and knowledge, the smell of your grandmother’s cooking, the warm sunlight through the trees. They can be also this.

3.3. Counterlistening

The proposal of ‘counterlistening’ (Ouzounian 2020) is a concept we can apply to our workflows with algorithmic tools. Counterlistening is ‘subversive, dangerous, and self-sacrificing. It takes risks in seeking to hear what would or “should” remain unheard: what is ignored, concealed, or denied’ (ibid.: 312). When considering the outputs of neural networks, and the ways in which they work, we can see how this idea could be a useful tactic. Ouzounian outlines many possibilities of counterlistening’s outcomes: speculative; futuristic; listening for possible worlds, in the margins. Through the practice of counterlistening we encounter other possible worlds and make them manifest.

We can apply the process of counterlistening when parsing the idea of the algorithm and its place in current knowledge culture. Amrute et al. point out that AI systems are a product of sociality, which they define as ‘the labour, assumptions, and practices of humans and more-than-human lives interacting’ (Amrute et al. 2022: 9). Their main question is ‘how are these underpinnings of its production hidden and to what end?’ (ibid.: 9). In the first chapter of the primer, ‘Decolonizing Feminist AI’, they explore what is hidden, what is encoded in the models and method, what is embedded in the landscape of machine learning, but passed over as we move through it. Again here we are speaking about gaps, voids, hidden corners that reveal as much if not more than the space they offer. We discover that the map can be rethought to become not about the certainty of its mapping, but about the stories the map elides.

Kate Crawford (2021) describes her work as a ‘politics of AI’ rather than an AI Ethics approach. She situates her work as exploring the question ‘what is AI’ in a variety of ways that expose the frame in which we look at AI systems and what that frame omits. For Crawford, AI is neither artificial nor intelligent, and she points out (alongside many, particularly researchers such as Suzanne Kite and other Indigenous AI practitioners) that AI is embodied, made up of resources – energy, plants, fuel, the earth. It extracts data, resources and labour, and props up the dominant ways of seeing and knowing that already are in place.

The project under discussion in the cited podcast is the ‘anatomy of AI’ project she undertook to examine the makeup and the associated politics of Amazon’s Alexa (Crawford and Joler 2018). In this project, they try to map out every resource and capability required to create one of these instruments, from domestic infrastructure to data labelling to geological processes – a ‘full stack modelling’ to reveal the hidden costs of producing AI.

The resulting .pdf is astounding in its detail and depth, and says as much about what is inside the process of producing Alexa as what is not considered. For example, there are nearly no people in the map. This is not a criticism of Crawford’s work, as I believe the whole point of the project is to reveal exactly this. There are no workers’ councils, no streets or playgrounds, no schools, no roads or meals or homes. People as people simply do not factor into the equation beyond the labels of ‘owner’ and ‘worker’.

When contemplating creativity (either with or without algorithmic tools), we can immediately sense that the proposition of counterlistening mirrors the creative process. Any one of us has uncovered something not immediately apparent, a lightning bolt of discovery that somehow reframes or re-energises how we think about something – a place, a story, an artwork or a piece of writing. Ouzounian states that counterlistening ‘resists the fixity and stability of an analysis’ (Ouzounian 2020: 312) – the acts of counterlistening and creativity are perfectly intertwined, and available to us.

4. A different set of solutions

Anil Dash, tech writer and former CEO of developer community Glitch, frames the search for solutions with compassion and joy: ‘being critical of extractive and exploitative technology *is* optimism. Saying that new tech shouldn’t happen at the expense of the vulnerable *is* an optimistic belief’ (Dash 2022). Using our frameworks of decoloniality, of unpredictability, of listening and counterlistening, we can approach technical and algorithmic tools through a frame of optimism, engaging with them as a part of a manifestation of the creative process and spirit.

What is the opposite of homology, and how do we make space for it in increasingly consolidated digital spaces? Is it enough to make our own spaces, to sow the seed of different working models and communities, different outcomes and an embrace of knowledge systems that allow us to work in these consolidated spaces in an alternate way?

4.1. Semilla and new models of collaboration

SEMILLA (which takes its name from the ancient Mayan numerical concept of 0) is a new machine learning powered application for sound processing based on the RAVE variational autoencoder model. The SEMILLA environment is designed and compiled by Mexican sound artist and researcher HEXORCISMOS (Hexorcismos n.d.). It is a standalone app that uses Max/MSP, a modular audio and video environment available to users to run ‘patches’ that act on media in different ways. Here, I discuss the use case for the pre-release v.1 of SEMILLA, which was for this project hosted as an interface (or ‘plugin’) in Ableton Live (usually referred to as ‘Live’). Live is a popular and industry-standard digital audio workstation that is developed for live use, although it functions as a standalone recording software as well.

The musician uses SEMILLA to interact with what HEXORCISMOS terms ‘AI Mutuals’, as opposed to the more conventional term ‘AI twin’. ‘AI twin’ is an extension of the term ‘digital twin’, which refers to a virtual replica of (usually) a physical device used to run simulations to predict behaviour or outcome. In AI, this term has been expanded to refer to a neural network that can respond authentically and automatically, trained on our own outputs: ‘I want an AI digital twin of me to scale up access to me, so that I can serve more of my friends and colleagues better when they ask me for a favor’ (Spohrer 2024). The thinking behind having an AI twin trained on one’s own outputs (e.g., emails, texts) is that one can automate requests and questions in a manner replicating one’s own voice and style of communication. Rejecting the ‘twin’ nomenclature and substituting ‘mutual’ in this project is a considered approach that is leveraged because of the nature of the project – to create a data representation of the participating artists’ creative output. This re-terminology practice in HEXORCISMOS’S vocabulary grows from a model of collaboration that permeates the project space.

The model embedded in SEMILLA v.1 is the RAVE variational autoencoder (Caillon and Esling 2021), which is a machine learning model that provides high-quality sound processing and playback after training. Essentially, the network learns the sonic features and patterns within the dataset corpus of audio material and uses it to generate new high-fidelity output. HEXORCISMOS implemented a version of RAVE in SEMILLA v.1, training it on datasets that the artists provided to him – for example, my Mutual was trained on approximately 90 minutes of improvisations using traditional and extended techniques on a variety of percussion instruments. The resulting outputs are accepted to illustrate the various sound personas of the artists contributing to the project, which are then used by the artists to create new music by interacting with the AI Mutuals in the model archive. Each artist had access to all participants’ Mutuals, not just their own.

Upon installation of the software in the Ableton Live environment, one is presented with several possible AI Mutuals from the artist cohort. Loading the Mutual (with the click of a button) allows for an automatic download of the data associated with the sound material provided by each participant. There are a variety of possible parameters that can be manipulated in the SEMILLA UI, many of which are labelled in Spanish or not labelled at all, prioritising listening in working with the AI Mutual.

4.1.1. Datasets

As mentioned earlier, the artists who are participating in this project have all contributed a large dataset of audio that reflects their own sound persona. This dataset was shared with HEXORCISMOS through file sharing applications and in person where possible. A minimum of one hour of audio was requested, and HEXORCISMOS processed the audio to be machine readable – compressing and converting the file formats to be easily processed by the machine learning algorithm in its training run(s).
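
The one-hour minimum can be checked before a dataset is handed over. The following is a minimal sketch and not the project's actual preparation pipeline, which is not documented here: it assumes the recordings are uncompressed PCM WAV files sitting in a single folder, and the function names and the `MIN_SECONDS` constant are hypothetical, introduced only for illustration.

```python
import wave
from pathlib import Path

# The one-hour minimum requested of each artist (assumed to mean total duration).
MIN_SECONDS = 60 * 60


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a single PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def dataset_duration_seconds(folder: Path) -> float:
    """Sum the durations of every .wav file directly inside `folder`."""
    return sum(wav_duration_seconds(p) for p in sorted(folder.glob("*.wav")))


def meets_minimum(folder: Path, minimum: float = MIN_SECONDS) -> bool:
    """True when the folder holds at least `minimum` seconds of audio."""
    return dataset_duration_seconds(folder) >= minimum
```

Calling `meets_minimum(Path("my_recordings"))` on a hypothetical folder would report whether the collected improvisations reach the requested hour; compressed formats would need converting first, which this sketch does not attempt.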

The dataset I provided the project was recorded in two sessions in Toronto, Ontario, and Stratford, Ontario in the spring and summer of 2022. Given the electronic/electroacoustic nature of many of the artists’ sound palettes, I made a decision to keep my sounds acoustic, haptic and tactile. Gongs, bass drums, culturally specific percussion instruments and other percussion were played traditionally and with extended techniques. The resulting improvisations were textural and purposely did not necessarily engage with tempo or a stable tempo grid, again as a decision to refuse approaches to music and time that are imposed by the digital audio workstation, with its built-in tools that prioritise linear composition and quantisation.
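
What a tempo grid does to a performance can be shown in a few lines. This is an illustrative sketch only, not a description of how Ableton Live implements quantisation: the function name, the default tempo and the four-subdivisions-per-beat grid are assumptions chosen for the example.

```python
def quantise(onsets, bpm=120.0, subdivisions=4):
    """Snap each onset time (in seconds) to the nearest line of a tempo grid.

    `bpm` and `subdivisions` (grid steps per beat) are illustrative defaults.
    """
    step = 60.0 / bpm / subdivisions  # seconds between adjacent grid lines
    return [round(t / step) * step for t in onsets]


# Three slightly "loose" percussion hits, as they might be played by hand.
played = [0.0, 0.13, 0.49]
snapped = quantise(played)
```

At 120 bpm with four subdivisions per beat the grid step is 0.125 seconds, so the onsets at 0.13 and 0.49 seconds are pulled to 0.125 and 0.5. The small deviations that carry the feel of an improvisation are exactly what the grid discards.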

All the SEMILLA outcomes from the artist cohort were released in 2024 as an album titled MUTUALISMX on Other People Records (Various Artists 2024).

4.1.2. A collaborative model

As mentioned earlier, the AI field demonstrably prioritises problem solving, accuracy, classification, quantisation and scale in its approach. Using opaque and proprietary datasets further restricts our interactions and the outcomes of the tool, basing it on digitally recorded knowledge, much of it acquired without permission. Tools such as SEMILLA and other communities and sites of sharing push back against these imperatives partly in the use of data over which the actor takes control, but also in ways that try to incorporate agency and collaboration. One such site of pushback is the Decolonial AI Manyfesto, part of which reads:

Our urgency arises from humans’ capacity to use AI as a knowledge system to create irrefutable ‘algorithmic truths’ to reinforce domination. In doing so, other systems of knowledge production and other visions are denied and erased, as are other peoples’ agency, autonomy, and contestation. In this way AI coloniality extends beyond data colonialism … In insisting on a decolonial AI, we stand for the right of each historically marginalized community to reshape reality on their terms. (AI Decolonial Manyfesto, n.d.)

HEXORCISMOS’S approach is deeply aligned with this type of anti-colonial thinking, positing new possible modes of incorporating technology as part of a strategy of resistance to the prescriptive interactions that the tools invite. The approach to the software encompasses the totality of the project, not just the software itself. He came to each artist organically, through common nodes of connection such as festivals, emails and community building in the AI space. Each person involved in the project has a personal connection to him and the work of resisting colonial mindsets in AI, in music, and in many cases life in general through activism and political work.

At every stage in the development of the project HEXORCISMOS ensured each step he undertook was transparent and understandable. He sent regular emails to the participants and kept us apprised of developments in programming. He offered flexible deadlines and assistance in compiling our audio as needed.

The agreement we signed at the beginning of the project in 2023 was provided to us with openness and an invitation to each artist to request further clarification as needed. It included clauses such as:

The Collaborating Artist/Mutual will hold legal intellectual copyrights to the trained model, known as {collaborating_artist_name}.mutualismx.ts (torch script) digital file, exported by the generative deep learning algorithm, known as RAVE … The Collaborating artist/Mutual will hold absolute legal rights in perpetuity to the trained model, known as {collaborating_artist_name}.mutualismx.ts (torch script) digital file from the Duration of project and into perpetuity. (Personal communication 2023).

The agreement also outlines the album remuneration and rights: ‘The Collaborating Artist/Mutual holds an equal share percent of the profits as the Lead Artist and Collaborating Artists/Mutuals from the subsequent commercialisation of the new musical work in the form of physical and/or digital musical albums and singles’ (ibid.).

All artists were paid 200 Euros for their datasets and labour. This amount was self-financed by HEXORCISMOS. He also arranged with Other People Records to release the work on LP and digitally, and an advance from the record label was paid out against future record sales, which is standard practice in the music industry.

4.1.3. Reflections

We can see multiple threads from the ideas raised in this article coalescing in the process of the execution of the project and the release of the MUTUALISMX record. The human, creative and machine processes invited questioning of each other and our modes and assumptions of music making, tying directly back to some of the thoughts expressed by Beth Coleman when she speaks about ‘wilding’ AI. Through the interactions with the artists’ AI Mutuals, the cohort activated curiosity in the machine learning domain – a central part of the creative process generally, but not often invited when encountering machine learning outputs. The omissions and shortfalls in sonic accuracy invited a process of integrating the outcomes of the neural networks into an alternative expression, rather than throwing them out in favour of re-training for more ‘accurate’ or authoritative results.

RAVE is a model that is particularly good at replicating or outputting accurate audio when trained on particular kinds of audio – monophonic, single instrument sounds in particular. Almost none of the artists provided that kind of audio for training. In my own work, I have been most interested in trying to apply audio that specifically was not meant to be parsed by models in an effort to uncover what might occur. This process continued for all of us in the MUTUALISMX project. The outcomes of the training were somewhat otherworldly and odd, and provided rich soil for us to dig into. Video Example 1 shows my first encounter with SEMILLA, with my own Mutual loaded.

Understanding our listening as analogous to entering into Robinson’s ‘sound territory’ allowed our compositional process to be open and unsure, and Ouzounian’s ‘counterlistening’ – encountering what is embedded in the landscape instead of passing through it – becomes sonically expressed through the compositional and collaborative process in the tracks of the record. Gaps in meaning in the sonic material open to allow for other meanings to be placed or discovered there.

It is of note that the AI Mutuals are included in the language in the agreement. There is a lively discussion about copyright of datasets – for example, who they ‘belong’ to – and controversy on this issue is high, with other sites of pushback from artist activists such as Holly Herndon and Matt Dryhurst, the minds behind haveibeentrained.com (Herndon and Dryhurst, n.d.). The inclusion of the datasets in the SEMILLA AI project as protected intellectual property is a further gesture of refusal against the prevailing discourse, explicitly acknowledging the labour and IP of the artists in the data used to create this version of SEMILLA.

The space HEXORCISMOS built with the cohort was messy and open, much like the outputs of machine learning itself. The entire project relied on a process of authentic community building, which itself is an already messy but joyful prospect. The full commitment on the part of all the artists to engage came from a shared standpoint of accepting that the prevalent models of working in this space are predicated on a kind of fatality: the industry-driven myth that there is little room for dreaming of new ways of working in the AI space, or that incorporating anti-colonial working models is pointless when encountering the 1s and 0s of code. Each artist attacked these assumptions in various ways, as illustrated by the compositions on the record.

This kind of true and engaged connection between artists, a shared commitment to finding room for each other – a ‘manyfesto’ if you will – is vital in creating sites of resistance and reimagining. It is clear these sites are necessary everywhere, not just with neural networks. They allow for all of us to create and exist in a space that is predicated on acceptance and multiplicity.

5. Conclusion

Algorithms abound. They are being added to tools we use every day, being offered as standalone new programs and apps, and embedded into the code of app interfaces without our knowledge. To re-evaluate our relationships to computing and algorithmic tools, we need to understand their frame and their potential. There is a need for us to educate ourselves on how our experiences might be being narrowed, how they are being activated, and how they are, in some cases, being re-thought.

Our inner lives are non-computational. They cannot be quantised, only reached for through quantisation. I approach my percussion instruments not with a desire to replicate the merciless grid of a computer’s perception of time, but with an understanding that I and my instruments stand inside a frame of time, and that musical time is a relational process wherein our networks of connections push and pull in their expression. That is what music is, no matter the set of tools we use.

There is more than one path. And while there remains work to be done, there will continue to be sites of resistance and refusal in cultural practice. What this resistance and refusal accomplishes is perhaps debatable, but the role of the artist is to explore the fringes and the depths, the edges of what we think possible.

Aarthi:

Refusal of the outcome as endpoint as openness, as sunshine, as freedom, as path, as connection, as new opportunity. Refusal of the outcome as endpoint tied to a welcoming of all, of others, tied to building a better community, tied to a celebration of a multiplicity – of gazes, of experiences, of stories and knowledges.

Refusal of

the outcome

as endpoint

as the ultimate starting point of joyousness. (ds)

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S135577182500010X

Footnotes

¹ www.dair-institute.org/maiht3k/.

² Personal communication, 15 August 2023.

References

AI Decolonial Manyfesto. n.d. https://manyfesto.ai/.

Amrute, S., Singh, R. and Guzmán, R. L. 2022. A Primer on AI in/from the Majority World: An Empirical Site and a Standpoint. SSRN. https://doi.org/10.2139/ssrn.4199467.

Arshake. 2022. Aesthetics of New AI at the Creative AI Lab. www.arshake.com/en/serpentine-rd-platform-creative-ai-lab/ (accessed 1 September 2023).

Birhane, A., Prabhu, V. U., and Kahembwe, E. 2021. Multimodal Datasets: Misogyny, Pornography, and Malignant Stereotypes. arXiv:2110.01963 [cs.CY]. https://arxiv.org/abs/2110.01963 (accessed 2 October 2024).

Burrell, J. 2012. Small Data: People in a Big Data World. Ethnography Matters. https://ethnographymatters.net/blog/2012/05/28/small-data-people-in-a-big-data-world/ (accessed 12 September 2022).

Caillon, A. and Esling, P. 2021. RAVE: A Variational Autoencoder for Fast and High-Quality Neural Audio Synthesis. https://arxiv.org/abs/2111.05011 (accessed 2 October 2024).

Cecchetto, D. 2021. Listening in the Afterlife of Data: Aesthetics, Pragmatics, and Incommunication. Durham, NC: Duke University Press.

Chattopadhyay, B. 2020. The Nomadic Listener. Berlin: Errant Bodies Press.

Clark, E. 2023. Unveiling the Dark Side of Artificial Intelligence in the Job Market. Forbes. www.forbes.com/sites/elijahclark/2023/08/18/unveiling-the-dark-side-of-artificial-intelligence-in-the-job-market/?sh=9c7682d6652f (accessed 27 September 2024).

Coleman, B. 2021. Technology of the Surround. Catalyst 7(2): 2–21.

Coleman, J. 2023. AI’s Climate Impact Goes beyond Its Emissions. SciAm. www.scientificamerican.com/article/ais-climate-impact-goes-beyond-its-emissions/ (accessed 27 September 2024).

Crawford, K. (Guest) 2021. Interdependence #9: Kate Crawford (AI Now). Interdependence. Episode 9, hosted by Holly Herndon and Matt Dryhurst. https://interdependence.fm.

Crawford, K. and Joler, V. 2018. Anatomy of an AI System. http://www.anatomyof.ai (accessed 9 November 2022).

Dash, A. [@anildash]. 2022. This Framing Makes a Common Error – Being Critical of Extractive and Exploitative Technology is Optimism. Saying That New Tech Shouldn’t… [Tweet, X], 5 February. https://twitter.com/anildash/status/1489684134922530818?s=12.

Dreaming Beyond, AI. n.d. https://dreamingbeyond.ai (accessed 20 October 2024).

Elwes, J. 2019. Project ZiZi. www.jakeelwes.com/project-zizi-2019.html (accessed 20 October 2024).

Eshun, K. 1998. More Brilliant Than the Sun: Adventures in Sonic Fiction. London: Quartet Books.

Hexorcismos. n.d. Semilla AI. https://semilla.ai (accessed 1 November 2024).

Herndon, H. and Dryhurst, M. n.d. Have I Been Trained. https://haveibeentrained.com/ (accessed 10 November 2024).

Kar, S. 2023. AI Models Propagate False Race-Based Medical Information, Stanford Researchers Find. https://stanforddaily.com/2023/12/07/ai-models-propagate-false-race-based-medical-information-stanford-researchers-find/ (accessed 6 January 2023).

Kite, S. and Benivolsk, X. 2023. In Conversation with Xenia Benivolsk: How to Make Art in a Good Way. Art Review Oxford 7(Autumn). https://artreviewoxford.com/issues/ARO%207.pdf (accessed 9 October 2024).

Littlejohn, A. 2021. Sonic Ethnography. In Grasseni, C., Barendregt, B., de Maaker, E., De Musso, F., Littlejohn, A., Maeckelbergh, M., Postma, M. and Westmoreland, M. R., Audiovisual and Digital Ethnography: A Practical and Theoretical Guide. London: Routledge.

Mohamed, S., Png, M.-T. and Isaac, W. 2020. Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence. arXiv [Cs.CY]. https://arxiv.org/abs/2007.04068.

Muldoon, J., Graham, M. and Cant, C. 2024. Feeding the Machine: The Hidden Human Labour Powering AI. Edinburgh: Canongate Books.

Onuoha, M. 2016. The Library of Missing Datasets. https://mimionuoha.com/the-library-of-missing-datasets (accessed 12 September 2022).

Onuoha, M. 2018. Missing-Datasets: An Overview and Exploration of the Concept of Missing Datasets. https://github.com/MimiOnuoha/missing-datasets (accessed 12 September 2022).

Ouzounian, G. 2020. Counterlistening. English Studies in Canada 46(2–4): 311–17. https://doi.org/10.1353/esc.2020.a903549.

Ouzounian, G. 2021. The Sonic Undercommons: Sound Art in Radical Black Arts Traditions. In Grant, J., Matthias, J. and Prior, D. (eds.) The Oxford Handbook of Sound Art. Oxford: Oxford University Press, 502–20.

Paglen, T. 2017. ImageNet Roulette. ∼www.chiark.greenend.org.uk/∼ijackson/2019/ImageNet-Roulette-cambridge-2017.html (accessed 1 November 2024).

Rettberg, J. 2022. Algorithmic Failure as a Humanities Methodology: Machine Learning’s Mispredictions Identify Rich Cases for Qualitative Analysis. Big Data and Society 9(2). https://doi.org/10.1177/20539517221131290 (accessed 6 January 2023).

Rianda, M. 2021. The Mitchells vs. the Machines [Film]. Columbia Pictures/Sony Pictures Animation.

Robinson, D. 2020. Hungry Listening: Resonant Theory for Indigenous Sound Studies. Minneapolis, MN: University of Minnesota Press.

Spohrer, J. 2024. Personal AI Digital Twins: The Future of Human Interaction? www.eitdigital.eu/newsroom/grow-digital-insights/personal-ai-digital-twins-the-future-of-human-interaction/ (accessed 20 January 2025).

Stewart, R. and Hinds, G. 2023. Algorithms of War: The Use of Artificial Intelligence in Decision Making in Armed Conflict. Humanitarian Law and Policy. https://blogs.icrc.org/law-and-policy/2023/10/24/algorithms-of-war-use-of-artificial-intelligence-decision-making-armed-conflict/ (accessed 25 October 2024).

Tynan, L. and Bishop, M. 2022. Decolonizing the Literature Review: A Relational Approach. Qualitative Inquiry 29(3–4): 498–508. https://doi-org.ezproxy.lib.torontomu.ca/10.1177/10778004221101594 (accessed 10 October 2024).

Voegelin, S., Barney, A., Wright, M. P., Stubbs, P., Weaver, J. and Smith, T. 2022. Protocols of Listening: Reflections on the Development of an Interactive Digital Platform for Cross-Disciplinary Sound Research. Resonance 3(3): 224–54. doi: https://doi.org/10.1525/res.2022.3.3.224.

Waelen, R. 2023. Computer Vision, Surveillance, and Social Control. https://montrealethics.ai/computer-vision-surveillance-and-social-control/ (accessed 12 January 2024).

Wang, T. 2013. Big Data Needs Thick Data. Ethnography Matters. https://ethnographymatters.net/blog/2013/05/13/big-data-needs-thick-data/ (accessed 12 September 2022).

Discography

Various Artists. 2024. MUTUALISMX. Vinyl/Digital. Other People, 15 February. https://otherpeople.bandcamp.com/album/mutualismx (accessed 20 October 2024).


