
When AI don’t sound like AI: Negotiating aesthetic expectations in technology-mediated musical practice

Published online by Cambridge University Press:  14 April 2026

Teresa Pelinski*
Affiliation:
Centre for Digital Music, Queen Mary University of London, UK
Adam Pultz Melbye
Affiliation:
Independent Artist and Researcher, Berlin, Germany
Andrew McPherson
Affiliation:
Dyson School of Design Engineering, Imperial College London, UK
*Corresponding author: Teresa Pelinski; Email: t.pelinskiramos@qmul.ac.uk

Abstract

This paper examines how aesthetics are constructed in technology-mediated musical practice, focusing on the interplay between cultural expectations of AI-generated sounds and the technical structures determining the behaviour of AI algorithms. Through a reconstruction of events in the Surfing Hyperparameters project, we capture how the sonic aesthetics of the system were constructed by negotiating between our sonic expectations (informed by cultural narratives of ghosts in machines) and the sound produced by the system. We argue that the aesthetics of AI-generated sound are often inspired rather than directly caused by the technology itself. While existing research has identified how tools embed ‘paths of least resistance’ towards certain sonic aesthetics, our work reveals a complementary force: how aesthetic expectations rooted in cultural narratives – from science fiction’s stories of autonomous machines to sonic hauntology’s spectral presences – actively shape design decisions and sonic outcomes. Through a radically transparent approach to documenting mismatches between expectation and reality, we show that the stories practitioners tell while building and making music with technology are performative, constructing rather than merely describing aesthetic realities. Addressing these interplays between imagination, expectation and material reality constitutes an important step towards addressing the complex sociotechnical assemblages in which technology-mediated musical practices come into being.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Core system architecture. Eight piezo signals $p$ are processed by the current model ${m_k}$ to produce latent outputs $l$, which are then sonified. When the switching model algorithm triggers a model switch, the current latent values $l$ are mapped to coordinates in the hyperparameter space $H$, and the nearest model ${m_{k + 1}}$ becomes the new active model.
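The model switch described in this caption can be sketched as a nearest-neighbour lookup. This is a minimal illustration under stated assumptions, not the project's implementation: the caption does not say how latent values are mapped to coordinates in $H$, so the per-channel averaging and tanh squashing in the hypothetical `latents_to_coords`, the Euclidean distance in the hypothetical `nearest_model`, and the two-dimensional toy pool of models are all assumptions made for the example.

```python
import math


def latents_to_coords(latents):
    """Map latent channels to a point in the unit hypercube.

    Assumption: each channel is averaged over the block and squashed
    into [0, 1] with tanh. The caption does not specify the mapping.
    """
    coords = []
    for channel in latents:
        mean = sum(channel) / len(channel)
        coords.append(0.5 * (math.tanh(mean) + 1.0))
    return coords


def nearest_model(coords, model_coords):
    """Return the index of the model closest to coords.

    Assumption: closeness is Euclidean distance in normalised
    hyperparameter coordinates.
    """
    return min(
        range(len(model_coords)),
        key=lambda k: math.dist(coords, model_coords[k]),
    )


# Hypothetical pool of trained models, each placed at the normalised
# hyperparameter coordinates it was trained with (toy values).
model_coords = [
    [0.1, 0.1],
    [0.9, 0.2],
    [0.5, 0.9],
]

# Toy latent block: two channels, three samples each.
latents = [[2.0, 2.0, 2.0], [-2.0, -2.0, -2.0]]

coords = latents_to_coords(latents)
next_model = nearest_model(coords, model_coords)  # index of m_{k+1}
```

In this sketch the switch itself is triggered elsewhere; the functions only answer which model becomes active once a switch has been requested.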

Figure 2. Piezo contact microphones attached at the front, tailpiece, bridge and inside the F-holes of the double bass. The piezos at the back and on the scroll are not shown in the picture. The faders and knobs visible on the FAAB control individual string gain and effects blend, respectively (Melbye 2023: 81), as part of the FAAB’s own signal processing. No effects were used in the supplementary video and audio materials.

Figure 3. Autoencoder architecture based on the Transformer encoder. Eight piezo signals ($p$, 8 × 1,024 samples) are combined with positional encodings $\theta$, compressed to four channels via linear layer ${L_{in}}$, and processed through Transformer encoder $TE$ to produce latent representation $l$ (4 × 1,024). During training, $l$ is reconstructed to the original dimensions as ${\tilde p}$ (8 × 1,024) via output layer ${L_{out}}$. During performance, the ${L_{out}}$ layer is dropped and the latents $l$ are used directly as output.
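
The data flow in this caption can be written out with its shapes, as a sketch using only the symbols the caption defines. Two points are assumptions: $\oplus$ stands for whichever operation combines the signals with the positional encodings (the caption does not say whether this is addition or concatenation), and ${L_{in}}$ and ${L_{out}}$ are taken to act on the channel dimension at each of the 1,024 samples. The training objective is not stated in the caption and is left out.

```latex
\begin{align*}
p &\in \mathbb{R}^{8 \times 1024} \\
l &= TE\big(L_{in}(p \oplus \theta)\big) \in \mathbb{R}^{4 \times 1024} \\
\tilde{p} &= L_{out}(l) \in \mathbb{R}^{8 \times 1024} && \text{(training only)}
\end{align*}
```

During performance only the second line is evaluated, so the system outputs four latent channels for every eight input channels.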