Soundscapes in English and Spanish: a corpus investigation of verb constructions

ABSTRACT This corpus study explores how sound events are communicated in English and Spanish. The aims are to (i) contribute production data for a better understanding of the couplings of meanings and their realizations, (ii) account for typological differences between the languages, and (iii) further the theoretical discussion of how sound is conceptualized through the window of language. We found that, while there are significant differences between the languages with respect to how sound events are communicated, they are similar with respect to what domains the sound descriptions are instantiated in, namely perception, motion, manipulation, emotion-reaction, consumption, and cognition. One striking difference has to do with the conflation of sound for action, e.g., creak, squeak, and sound for motion, e.g., slam, crash. This finding supports the received view of English as a language that may lexicalize manner in those kinds of verbs, while Spanish expresses manner through qualifiers outside the verb. Moreover, both languages employ three different perspectives on the soundscapes: Producer-, Experiencer-, and Phenomenon-based. While English favours the Producer perspective, Spanish features an even distribution between Producer and Experiencer. Phenomenon-based descriptions are relatively few in both languages.


Introduction
Researchers have long since abandoned the idea that human communication is a matter of simple encoding (naming) and decoding, but still have a long way to go in order to reach a proper understanding of how meanings are expressed in language use. A widespread view in research on how sensory meanings are mediated through language starts from the assumptions that (i) sensory experiences are primarily conveyed by specific, single-domain words, and (ii) meanings and lexical items enjoy a relatively static one-to-one relation in different contexts. Although there is indeed work in linguistics and related disciplines exploring the expression of sensory perceptions (e.g., Caballero & Paradis, 2015; Howes & Classen, 2014; Majid & Burenhult, 2014; Olofsson & Gottfried, 2015; Speed, O'Meara, San Roque, & Majid, 2019; Strik Lievers, 2015; Viberg, 1984, 2015; Winter, 2019a), most studies remain word-driven in that they typically start with preselected lexical items that are deemed to refer to sensory perceptions. This approach is referred to as the lexical category perspective by Strik Lievers and Winter (2018), in contrast to the sensory modality perspective, where meanings are the starting point. The latter approach is important since meanings and words do not stand in a one-to-one correspondence with one another. Indeed, using words as the window into conceptual space will overlook cases that are commonplace in language production, such as bad smells colloquially expressed by means of the speech verb cantar 'sing' in Spanish, as in "a mi hijo le cantan los pies" 'my son's feet sing'. The widespread use of such ways of describing sensory experiences has implications for the modelling of meaning in language.
In fact, multiple meanings of lexical items and a diversity of language realizations of meanings constitute the normal state of affairs and, because of this, there is a need to approach meanings, and how they are put to use through language, starting from domains of meaning to give the most accurate account possible for meaning construal in language production.
In this study, we explore how SOUND events (soundscapes) are portrayed through language in written communication in English and Spanish. A SOUND event is a conceptual gestalt that, when expressed through language, […] With this study, we aim to contribute to the still scarce research on SOUND in the language sciences by describing how such events are portrayed through language in English and Spanish, two languages that are known to exhibit interesting typological differences with respect to how MOTION and SPEECH events are represented in language (e.g., Caballero & Paradis, 2018; Ibarretxe-Antuñano, 2017; Talmy, 2000). Their differences are highlighted in (2), where an English text has been translated into Spanish by a professional translator.
(2) (a) "Need in, Robin! Quickly!" he shouted on the intercom, slamming his way through the outer door as soon as Robin had buzzed it open.
(b) -¡Ábreme, Robin! ¡Deprisa!-gritó por el interfono, y en cuanto ella le abrió desde la oficina, empujó la puerta de la calle y entró.
The present study uses complete texts from fictional narratives originally written in English and Spanish in order to explore the way soundscapes are described. The choice of fiction rather than impromptu language production is motivated by the relative frequency with which the novelists describe sensory events. Basing our analysis on whole texts rather than on concordances of preselected individual words enabled us to identify and explore a wider range of language realizations than is usually the case in corpus-based studies, and, in this regard, we hope to be able to provide a broader empirical basis for theorizing about how SOUND meanings may be conveyed in the two languages. In short, our first aim is to contribute to a better understanding of the couplings of sensory meanings and language realizations in the realm of SOUND. This has implications for meaning modelling, which in turn is of crucial importance for research using language data. Second, we aim at providing production data based on a meaning-driven approach to view typological differences and similarities in English and Spanish. Our final aim is to contribute to the theoretical discussion of how SOUND is conceptualized and what the categorial properties of this domain are.
2 Because we do not explore the grammatical constructions involved in the description of soundscapes, the glosses of the Spanish examples just provide a translation that highlights the lexical semantic realizations of the soundscape in order to facilitate the task of reading.

Previous work
Our study is situated in the broad theoretical framework of Cognitive Semantics, where sensory perceptions are subsumed under the notion of embodiment, i.e., the view that human thinking is motivated by our bodily configuration and sensorimotor experiences. Our basic assumptions are that words are cues to meaning, that is, cues for experiential simulations and for interlocutors to construct a conceptual representation of what is communicated (Fischer & Zwaan, 2008; Hartman & Paradis, 2018). While the usage-based approach to knowledge of and about lexical items is part and parcel of Cognitive Semantics (Geeraerts & Cuyckens, 2007; Paradis, 2003; Tomasello, 2003), it still deserves to be explicitly stressed that statistical patterns of language use across different contexts are crucial for language comprehension and production (Louwerse, 2018; Stefanowitsch & Gries, 2005). Meanings of words crystallize on the occasion of use and are highly dynamic and contextually sensitive in relation to the domains where they are instantiated (Paradis, 2005, 2015a, 2015b). Lexical meanings are not fixed but evoked in context, and this is the reason for our choice of a meaning approach to the exploration of soundscapes in this study. In the rest of this section, we review previous work on sensory meaning and language, both more generally and with reference to the two languages contrasted here.
[…] Suárez-Toste, & Paradis, 2019; Diederich, 2015; Ibarretxe-Antuñano, 1999; Strik Lievers & Winter, 2018; Winter, 2019a), on iconicity/onomatopoeia (Classen, 1993; Dingemanse, 2012; Winter, Perlman, Perry, & Lupyan, 2017), on sound talk in engineers' discursive practices (Porcello, 2004), and in neighbouring disciplines such as philosophy, psychology, and neuroscience (e.g., Borghi & Cimatti, 2010; Knöferle & Spence, 2012; Lacey, Stilla, & Sathian, 2012; Nudds & O'Callaghan, 2009).
As part of the work on sensory meanings from the lexical perspective, there has been an interest in the relation between sensory words and their meaning potentials. For instance, in order to tap into participants' interpretations of individual words and their strengths of association with the different sensory modalities, Lynott and Connell (2009) investigated 423 words expressing properties of objects that could be associated with one or more sensory modalities (dark, light, crackling, glowing, thin, acidic, yellow). They asked participants to rate their experiences of each of the perceptual modalities (sight, hearing, smell, taste, and touch) for each word, and showed that most word meanings are evoked through several senses. A much larger and more recent study, The Lancaster Sensorimotor Norms, used data from 3,500 individuals using Amazon's Mechanical Turk platform in order to measure the sensorimotor strength of 39,058 English lemmas (Lynott, Connell, Brysbaert, Brand, & Carney, 2019). This multi-functional characteristic of words becomes salient, for instance, in descriptions of wine, where property descriptors such as sharp, ruby, or soft, and object descriptors such as apple, leather, lemon invoke experiences across more than one sensory modality (Caballero et al., 2019, pp. 58-70).
These investigations suggest that cognition and language to a substantial degree appear to be cross-modally embodied (Johansson, Anikin, & Aseyev, 2019; Paradis & Eeg-Olofsson, 2013; Winter, 2019b). Brain research has found responses in taste and smell areas of the brain when participants were exposed to words such as cinnamon, garlic, and jasmine (González et al., 2006), and it has been proposed that the large areas of cortex situated between the sensory cortical areas are higher-level representational convergence zones (Binder & Desai, 2011). Several researchers (e.g., Barsalou, 2010; Pecher & Zwaan, 2005) have pointed out that there is a continuity between perceptual knowledge and the sensory modalities (visual, auditory, olfactory, gustatory, tactile), which is consistent with the idea that sensory perceptions and cognition are grounded in the same neural system, and this is ultimately revealed in the vocabularies of languages. All these findings have important implications for how meaning in language needs to be modelled.
[…] SOUND events: a sound, a sound-producing entity, and an experiencer. Such a set-up allows for the possibility of homing in on different aspects in order to set the scene in a particular way. For instance, in his work on sensory expressions in language, Viberg (2015, 2019) distinguishes two main types of verbs of perception, Experiencer-based verbs and Phenomenon-based verbs. For hearing, he identifies two types of Experiencer-based verbs, namely listen to (Activity) and hear (Experience), and three types of Phenomenon-based verbs, namely 'sound good' (sensory copula, as in "it sounds good"), be audible (perceptibility), and crack, creak, rattle (sensory verbs). His purpose was to give a typological account of the lexical resources that a number of languages have for expressing these perspectives through individual verbs.
Based on behavioural data, Dubois (2000) points out that the same acoustic phenomena can be categorized as events with focus on the source of the sound or the action that generates the sound. She also points out that noise and sound tend to be structured differently; noise is closely related to the emitting source and memorized as effects of the world on the perceiver, while sound is described more objectively in terms of its properties such as pitch and temporal evolution. These findings point to important issues of how human beings categorize phenomena in different domains. Categories are not necessarily populated by objects, as is the case for visual phenomena, but may be differently structured, namely as events including participants, and, moreover, they may be subjectively construed as effects on the perceiver.
The observation that realizations of sound events in language may be evoked through MOTION (e.g., descriptions of sound floating, lingering, or rising) points to the dynamic nature of our perception and conceptualization of sound as propagated through space; we can observe objects vibrating as a result of loud sounds (Strik Lievers & Winter, 2018, p. 50), and we can feel blows, i.e., motion in our bodies when we are near fireworks (Caballero, 2016). These facts indicate the role of directionality in sound events. Likewise, in a study of Finnish expressions of vision, audition, and olfaction, Huumo (2010) tests the hypothesis that these sensory perceptions are conceptualized as a directional relationship between the stimulus and the experiencer. His data include perception verbs in the above-mentioned domains in combination with case-marked locative elements. The outcome is that there are differences between different verbs and also between different sensory modalities. With respect to the latter, he shows that visual expressions favour static expressions to a greater extent than auditory and olfactory expressions, which favour directionality from the stimulus to the experiencer. He argues that this difference follows from the fact that auditory and olfactory perception involves motion of a sound or a smell, in contrast to vision, which is conceptualized as the perception of a concrete entity. This observation dovetails nicely with Dubois' (2000) findings about how audition and olfaction are categorized, and adds linguistic evidence in the form of directional case-marking for the conceptualization of SOUND and the representation of SOUND events in language.
To conclude this section, it should be clear that, apart from Dubois' work on auditory categorization and conceptualization, the role of perspective in meaning creation has only been considered with reference to domain-specific lexical items, as is the case of Viberg's work (2015, 2019). Perspectivization through language has not been studied using meaning-driven approaches and production data. In this study, however, we explore not only the different ways sound events are expressed but also the preferred perspectives in their description. Meanings in language are never neutral or fixed, but always view-pointed in different ways through the foregrounding and backgrounding of various elements of situations. Our work is an attempt at integrating perspectives in a meaning-based study of auditory events rather than focusing on whether a given language has a verb that realizes one of the perspectives or not.

2.3. Typological differences between English and Spanish
The reason for studying English and Spanish is that they have been described as primary representatives of the typological dichotomy between verb-framed and satellite-framed languages (Talmy, 2000), with Spanish as a verb-framed language since it lexicalizes MOTION and PATH in the main verb and MANNER as a co-event in a satellite (typically, gerunds or adverbials), and English as satellite-framed because it lexicalizes PATH in the satellite and conflates MOTION and MANNER in the main verb. This typological distinction has been questioned by many researchers as too simplistic (e.g., Beavers, Levin, & Tham, 2010; Zlatev, Blomberg, & David, 2010), and new insights have been offered through research on other languages (e.g., Filipović, 2007; Ibarretxe-Antuñano, 2017; Slobin, Ibarretxe-Antuñano, Kopecka, & Majid, 2014).
With regard to MOTION, Pedersen (2019) offers a particularly insightful study of directed motion events in Spanish and English that seriously challenges the above distinction and proposes an alternative account. First, he shows that both PATH verbs and MANNER verbs are regularly used in both languages in transitive directed motion event sentences. For instance, Pedro bajó las escaleras and Peter descended / went down the staircase are both sentences where PATH is expressed by the verb and MANNER by a direct object rather than a satellite, and Fernando saltó la valla and Ferdinand jumped the fence both describe a situation where MOTION and MANNER are conflated in the verb. However, there are also differences between the two languages in transitive directed manner of motion sentences, which involve displacement. English allows sentences such as Peter paddled the river, where a MANNER OF MOTION verb is used to describe a directed displacement event. This is not felicitous in Spanish, *Pedro remó el río, because such constructions require the spatio-temporal, directed displacement to be expressed by the verb.
Also, the use of intransitive MANNER verbs for path events is felicitous in English, but not in Spanish: Peter danced to the beach (*Pedro bailó a la playa). The reason for the restriction in Spanish, according to Pedersen (2019), is, again, that there is nothing in the semantics of the verb that supports the PATH component expressed by the directional adverbial, and that is what inhibits the use of manner meanings of motion events expressed through the verb in Spanish. Pedersen argues that in a verb-governed language such as Spanish, PATH has to be part of the verb meaning itself to sanction the goal expressed through to the beach, while this is fine in English since the use of a non-telic verb can be sanctioned by the construction as a whole, i.e., by the event schema. We return to this issue in the discussion of SOUND events and add that a construal of metonymy also has to be part of the explanation.
Comparisons between English and Spanish have also been carried out on speech framing expressions (Caballero, 2015, 2016; Caballero & Paradis, 2018; Rojo & Valenzuela, 2001). What is clear from those studies is that there is a rich flora of ways of describing SPEECH in both languages. Also, after identifying five main categories of verb meanings (SPEECH, ACTIVITY, PERCEPTION, COGNITION, and EMOTION), Caballero and Paradis (2018) show that Spanish features a more varied vocabulary and makes more use of verbs referring to thinking and reasoning, while expressions evoking physical meanings are preferred in English. Consider an example from translations of English into Spanish (Caballero, 2015), in (3). […] The translator's use of protestar 'protest' involves interpreting the intentions of the speaker while leaving out the fact that he is an adolescent and, hence, has a changing voice, as effectively conveyed by squeak. Caballero (2015) says that English narrators tend to describe speech events in a physical and filmic way ('showing' what happened), in contrast to the Spanish preference for explicating speaker intentions. Differences between English and Spanish in the domains of speech and motion provide the starting point in our present exploration of SOUND events.

Data, method, and analysis
The core questions guiding our research are as follows. […] Due to the explorative nature of our study, we did not start with a schema of categories beforehand, but the categorization was built up incrementally in a pilot study before the real annotation procedure (see below) took place. In the pilot study, we started out by exploring different chapters in the dataset in both languages in order to get a picture of how sound events were expressed, what conceptual domains were involved, and from which perspectives they were described. We decided not to include sound-related events describing speech in speech framing expressions of direct speech, e.g., 'Bill said' or 'Sheila shouted'. Those specific speech contexts are accounted for in Caballero and Paradis (2018). On the basis of this preliminary work, we then designed the annotation schema to be used for the analysis of the data.
Next, we turned the corpus data into .txt files for practical work on the annotation and analysis proper. The texts were read by one of the analysts, who identified and marked the sound events in the texts, i.e., the occurrences that describe sound. After that, the texts were analysed by the two analysts, who annotated the texts independently of one another using the annotation schema developed in the pilot study, and then compared their analyses, identified cases of inconsistencies, discussed them one by one, and resolved any outstanding errors and divergences. The .txt files were subsequently uploaded to a concordancer (MonoConc Pro) to facilitate data management and post-annotation searches.
In order to address research questions 1 and 2 above, we decided to make use of verbs (finite and non-finite) as the anchor points for our annotations of the individual sound events. This means that the nature and referential status of the verb determine the annotation schema and consequently the categorization of the sound events. The domains of instantiation that we identified in the pilot study are PERCEPTION, MOTION, MANIPULATION, EMOTION-REACTION, CONSUMPTION, and COGNITION. As a consequence of the decision to use verbs as the anchor point, we also ended up with a category that we refer to as support verb constructions, where the sound descriptions are primarily evoked by nominals (e.g., noise, din, silence) and adjectivals (e.g., loud, soft, jarring). Consider examples (4)-(7) from the data.
(4) From the courtyard, she could hear the other slaves shuffling toward the wooden building where they slept.
(5) More cheers rose up to meet them.
(6) The sound of her name startled her.
[…] (6), and support verb construction (7) with the underlined verbs as anchor points for the annotation in the .txt file and for searches in the concordancer. In all the examples, the descriptions concern SOUND events, but, as can be seen, the domains of instantiation of the descriptions differ, and the scenes depicting the events thereby highlight different aspects of the events.
To address research question 3, namely the perspective from which sounds are described, we drew upon Viberg's (2015, 2019) classification of the semantic components of perception verbs, as described in Section 2.2. We customized his categories for our own purposes since his focus is different from ours in that he was interested in typological differences of the vocabularies of verbs of perception across languages, while the starting point of our analysis is how SOUND events are communicated. Put differently, his focus is on lexical items whereas ours is on the domains of instantiation in language production. In our case, this called for a threefold grid of analysis. We distinguish between Experiencer (as in (4)), Phenomenon (as in (5), (6), and (7)), to account for those cases where sound is described either as a result of someone's or something's action or as the very agent in the event, respectively, and Producer (as in (8) and (9)) to account for the source or origin of the sound in the event, which can be either an animate being, i.e., the agent actively making sounds, as in Dorian whistling in (8), or an inanimate entity such as doors producing a banging sound when opening in (9). In the next section, we present the results of our annotations.

Results
[…] Table 2. English favours the Producer perspective (57%), while the distribution of perspectives is more even in Spanish, with the same proportion of Producer (38%) and Experiencer (38%) perspectives. (For additional information about per million words, see A-Tables 1 and 2.)
In addition to these quantitative differences, there are also differences of a qualitative nature. These are discussed in the next few subsections, where we will provide an overview of domains, the verb constructions, and the perspectives they adopt.
(10) (a) I'll not have some peasant woman banging on the gate, wailing that you've broken her heart.
(b) Héctor carraspeó y bebió un sorbo de gin tonic.
'Hector made a rasping sound in his throat and drank a sip of gin and tonic.'
(11) (a) She heard the beat of the drums.
(b) El único sonido que escuchábamos era el canto de los pájaros.
'The only sound that we listened to was the singing of the birds.'
(12) (a) He began to say something but running footsteps sounded from around the corner.
(b) El sonido retumbó como una campanada inmensa en mitad del bosque.
'The sound boomed like an immense bell stroke in the middle of the forest.'
As shown in Table 3 describing perspectives in the domain of perception, there are differences with respect to the favoured perspectives in English and Spanish: English favours the Producer perspective (62%), followed by the Experiencer and the Phenomenon perspectives, whereas most SOUND events in Spanish foreground the Experiencer (53%), followed by the Producer and Phenomenon perspectives. In addition, there are also big differences in lexical variation across the perspectives: the Experiencer perspective stands out as being described with very few types of verbs (a limited number of core verbs such as hear/oír or listen/escuchar, namely five types for English and four for Spanish), while for Producer and Phenomenon there is a good deal of lexical variation (see A-Tables 3, 4, 5, and 6). In the case of Experiencer and Phenomenon events, the verbs mostly combine with nominal meanings directly referring to sound, as shown in (11a, b) and (12a, b), respectively.
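The verb-anchored annotation schema and the perspective percentages can be illustrated with a minimal sketch. It assumes a hypothetical record layout (`SoundEvent`) and hypothetical helper names (`perspective_shares`, `per_million`); none of these come from the study's annotation files, and the toy labels are only an illustrative reading of examples quoted in the text.

```python
from collections import Counter
from dataclasses import dataclass

# Labels taken from the annotation schema described in the text.
DOMAINS = {
    "perception", "motion", "manipulation",
    "emotion-reaction", "consumption", "cognition", "support",
}
PERSPECTIVES = ("Producer", "Experiencer", "Phenomenon")


@dataclass(frozen=True)
class SoundEvent:
    """One annotated sound event anchored on a verb (hypothetical layout)."""
    language: str
    anchor_verb: str
    domain: str
    perspective: str

    def __post_init__(self):
        # Reject labels that are not part of the schema.
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain}")
        if self.perspective not in PERSPECTIVES:
            raise ValueError(f"unknown perspective: {self.perspective}")


def perspective_shares(events, language):
    """Percentage of each perspective among the events of one language."""
    counts = Counter(e.perspective for e in events if e.language == language)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {p: 100 * counts[p] / total for p in PERSPECTIVES}


def per_million(raw_count, corpus_words):
    """Normalize a raw token count to occurrences per million words."""
    return raw_count * 1_000_000 / corpus_words


# Toy sample; the labels are an illustrative reading of quoted examples,
# not the authors' annotations.
SAMPLE = [
    SoundEvent("en", "hear", "perception", "Experiencer"),    # example (4)
    SoundEvent("en", "rise", "motion", "Phenomenon"),         # "More cheers rose up"
    SoundEvent("en", "startle", "emotion-reaction", "Phenomenon"),
    SoundEvent("es", "escuchar", "perception", "Experiencer"),  # example (11b)
    SoundEvent("es", "chirriar", "perception", "Producer"),     # example (16)
]
```

Calling `perspective_shares(SAMPLE, "es")` on this toy sample gives equal shares for Producer and Experiencer. Any corpus size passed to `per_million` would have to come from the actual datasets, since the text does not state word counts.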
Table 3. PERCEPTION data in the English and the Spanish datasets: number and percentage.
The most salient difference between English and Spanish concerns the Producer perspective and involves both the number of expressions found in each language and the types of verbs used in them. The percentages are 62% for the Producer perspective in English as compared to 29% for Spanish. Moreover, there is a good number of onomatopoetic verbs portraying the production of sound in English, such as click, creak, crunch, and jangle. Such verbs exist in Spanish (e.g., chasquear 'snap', chistear 'make a tsk tsk sound') but are less numerous (see A-Table 4). While the frequent use of such verbs in English contrasts with the substantially fewer cases in Spanish, the most interesting difference concerns the way these two languages profile the meanings of such verbs. Before taking this point further, consider examples (13) and (14) with the verb click.
(13) There was no kindness in Chaol's face, and she clicked her tongue as she left.
(14) She clicked on the link and a single sentence was revealed.
Here click conflates an action and the sound that it typically produces in a SOUND FOR ACTION construal, where the contingent part (the sound) of the action is expressed. The woman in (13) […] (15) and (16).
(15) Le estreché la mano, recorrí el vestíbulo taconeando con paso presuroso.
'I squeezed his hand, travelled the hall clicking my heels with quick step.'
(16) Los goznes chirriaron suavemente y la puerta se abrió.
'The hinges creaked softly and the door opened.'
In contrast to the English examples, the Spanish examples profile the sound rather than the action that produces the sound. In (15) we have the sound produced by somebody wearing heels and pacing a space, and (16) describes the sound made by the hinges of a door opening. These usage differences between English and Spanish are substantial, as described in detail in the 'Discussion of the results' section.
[…] (17) and (19)), the Experiencer (20), or the Phenomenon ((18) and (21)).
(18) Silence fell, and Dorian tried not to fidget.
(19) Los tres hombres en un flanco y las tres mujeres en el otro alzaron sus seis voces […].
'The three men on one side and the three women on the other raised their six voices.'
(20) […] el runrún de las conversaciones ajenas y el borboteo del agua de una pequeña fuente acompañaron mi espera.
'the murmur from other people's conversations and the bubbling of the water in a fountain accompanied my wait.'
(21) Del resto de los asistentes, que eran muchos, brotó un divertido clamor.
'From the rest of the attendees, who were many, came up an amused roar.'
The distribution of the different perspectives taken in MOTION events is shown in Table 4. Table 4 reveals that the dominant perspective in English is Phenomenon (62%), followed by Producer, with no instances of Experiencer-based expressions at all. In the Spanish data, however, all three perspectives were found, with the majority belonging to the Producer (53%), followed by Phenomenon and Experiencer. There is a relatively high degree of lexical variation (see A-Tables 8, 9, and 10), in contrast to what is the case for Experiencer focus in the category of perception (A-Table 5).
With respect to the individual verbs used for foregrounding the Producer (shown in A-Table 8), we see that, although both languages use PATH verbs to describe the emission of sound from the Producer (let out/soltar, emit/emitir, spit/escupir, loose/lanzar), such verbs are more frequent in Spanish than in English. The English soundscapes, however, are more often described through MANNER OF MOTION verbs such as splash, flap, swish, bang, explode, plop, or pound. In Spanish, the only verb associated with MOTION that expresses manner is traquetear 'rattle'. This typological PATH/MANNER distribution in MOTION is consistent with previous research on motion events in these languages, with the restriction that such verbs in Spanish cannot be used in constructions expressing directed motion (Pedersen, 2019).
Next, the Experiencer perspective was only found in the Spanish dataset, which yielded the tokens shown in A-Table 9. These meanings profile PATH from the point of view of an Experiencer always present in the linguistic description, as indicated by "a mis oídos" 'to my ears' in (22), or through the directional expression come that profiles the trajectory in a direction towards the Experiencer, in (23). Finally, Phenomenon-based descriptions are found in both English and Spanish, as shown in A-Table 10. There are many more tokens of MOTION expressions in English than in Spanish in the Phenomenon perspective. The proportions in each language are also different: nearly two-thirds of the MOTION expressions in English are Phenomenon-based, whereas the corresponding figure for Spanish is one-third. Considering the actual verbs, it is also clear that the English dataset contains many more verbs lexicalizing MANNER (slam, crash, ripple, swish, quiver, stagger) and specifying the various ways in which different participants of the events produce or emit sound. However, the English dataset also contains verb meanings that foreground PATH, sometimes describing its direction (fall, come, circle, leave) and sometimes conflating direction and manner (erupt, drift, float, slither).

Support verb constructions
This category comprises anchor verbs such as be, have, continue, start, or give way, which convey existential, modal, possessive, or aspectual properties, and verbs of change such as change, weaken, or turn into that profile the change of state of the SOUND events. Table 5 shows that there is a slight distributional difference in support tokens between English and Spanish (516 tokens pm for English and 404 for Spanish), and also that there is more variation in Spanish (see A-Tables 11 and 12).
In both languages, most of the descriptions focus on the Producer of the sound (24) and (25), closely followed by Phenomenon-based descriptions (26) and (27), and very few descriptions from the Experiencer perspective (28) and (29).
There was a scraping noise somewhere beneath her feet.
Se produjo un murmullo unánime de aprobación.
'It was produced a unanimous murmur of approbation.'
(28) […] she was already out of earshot, at the bar.
Verbs such as make/hacer or produce/producir can express different meanings depending on the words that co-occur with them. In the cases above, the nouns express the SOUND meaning. In English there is a relatively large number of instances with be compared to only one example in Spanish. Given the few differences between English and Spanish in this regard, the support category will not be addressed in the 'Discussion of the results' section.
[…] Tables 13 and 14) MANIPULATION is the largest group with 20 occurrences in English and 53 in Spanish; EMOTION-REACTION has five in English and three in Spanish; CONSUMPTION features one example in English and eight in Spanish; and COGNITION holds none in English and one in Spanish. One observation worth pointing out is that, while most of the English occurrences in MANIPULATION profile the event from the point of view of the Phenomenon, portrayed as capable of performing actions as in (30), almost all Spanish MANIPULATION descriptions foreground the Producer of the sound, as in (31).
(30) Only the ceaseless crash of the surf against the offshore barrier reef broke the silence.
(31) A blast from a conch shell cut through the murmuring voices.
Here, too, English makes use of a variety of verbs for, say, cutting sounds, such as cut, rent, rip, slice, or slit, which contrasts with the common use of core verbs such as cortar 'cut' to describe similar scenes in Spanish.
As to the other domains in this group, the only one worth mentioning is emotion, used in both languages to describe the reaction of hearers to sounds, as in (32) and, in the case of English, to articulate Phenomenon frames, which often involve personifying non-human entities and presenting them as having human emotions, as in (33).
'The voice coming from the door surprised everybody.'
(33) Outside, the wind bellowed and raged against the glass spire.
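The occurrence counts reported above for the domains other than perception and motion can be turned into within-language shares with a few lines of Python. This is only an illustrative sketch: the figures are the ones given in the text, they are assumed here to be raw occurrence counts, and the names `counts` and `shares` are ours rather than part of the study.

```python
# Occurrences per domain as reported in the text: (English, Spanish).
# Assumption: these are raw counts, not frequencies per million.
counts = {
    "manipulation": (20, 53),
    "emotion-reaction": (5, 3),
    "consumption": (1, 8),
    "cognition": (0, 1),
}


def shares(counts, index):
    """Return each domain's share (in %) of one language's total."""
    total = sum(pair[index] for pair in counts.values())
    return {
        domain: round(100 * pair[index] / total, 1)
        for domain, pair in counts.items()
    }


english = shares(counts, 0)  # index 0 = English column
spanish = shares(counts, 1)  # index 1 = Spanish column

for domain in counts:
    print(f"{domain:17s} EN {english[domain]:5.1f}%   ES {spanish[domain]:5.1f}%")
```

On these figures, manipulation accounts for roughly three quarters of the English occurrences in this group and about four fifths of the Spanish ones, so the two languages differ more in the absolute number of such descriptions than in the relative weight of manipulation.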
Having shown what the datasets offer, we now proceed to discuss our results and observations.

Discussion of the results
This study has explored the way English and Spanish describe sound events, i.e., events representing the production and/or reception of sound. Our analysis has focused on the domains, the types of verb constructions involved in the descriptions, and the perspectives from which the events are portrayed (Producer, Experiencer, and Phenomenon).
We have shown that there are both quantitative and qualitative differences between English and Spanish, and that, although sensory meanings are traditionally treated as states in the semantics literature, a large number of the descriptions are dynamic. These general findings also indicate that there are interesting differences between languages and cultures with respect to how frequently sensorimotor modalities are included in the narratives and how those modalities are described. There is a quantitative discrepancy between English and Spanish in that there are more than twice as many descriptions of sound events in English. This is a striking finding that calls for more research based on production data to establish whether it holds more generally.
What is also clear from our data is that the way we communicate sound events is not restricted to the vocabulary commonly associated with sound and hearing out of context; it is richer, much more complex, and instantiated in domains beyond sound itself. The breadth of meanings and forms used to describe sound events in discourse is of crucial importance for the modelling of meaning in language. With regard to the perspectives from which sound events are described, the fact that English favours the Producer perspective supports its characterization as more prone to dynamic scenes. Spanish has an even distribution between Producer and Experiencer, yet its frequent use of the latter perspective suggests that its users are more inclined to explicate what is going on inside people's heads, which makes its descriptions less dynamic. This tendency is also in line with what Caballero and Paradis (2018) found for speech events, where English narrators favour agentive and dynamic descriptions, while Spanish narrators tend to instruct readers about how to interpret the situation.
One of the most interesting observations concerns the predilection for expressions of conflated meanings in English, which is evident in descriptions of sound events in both perception and motion. With respect to the former, the conflation consists of a sound element and a dynamic element and concerns the sound for action constructions, which include verbs such as ring, buzz, and bang. For instance, ring in English may be used for the sound produced by a bell (the bell rang) or may refer to the action carried out by an agent (she rang the bell). Like English, Spanish may use similar verbs to describe the sound itself, e.g., sonar, tintinear, and resonar ('sound', 'tinkle', and 'resound'), while sound for action has to be expressed through a combination of an action verb and the entity that creates the sound, as in pulsar el timbre 'press the doorbell', or with two verbs: a support verb (hacer) and the sound element in the subsequent verb, as in hacer sonar 'make sound'.
Next, we have also shown that sound events have a preference for descriptions that conflate sound and motion, and thereby also direction from a source to a perceiver, which reflects the very nature of sound as a phenomenon that travels through air and reaches the hearer. These observations are in line with work by Dubois (2000), where she reports on the flexibility of acoustic representations in terms of the source of the sound, the sound itself, or the effect on the perceiver. They are also in line with observations by Strik Lievers and Winter (2018), who show that "the association of sound with verbs is due to sound concepts being inherently more dynamic, motion-related and event-based, in contrast to other sensory perceptions which are phenomenologically less strongly associated with motion". This event representation is also true of motion and speech, and hence there are similarities between them as cognitive categories. Also, Huumo (2010) demonstrates that audition in Finnish is portrayed as a directional relationship between the source of the sound and the perceiver via a combination of perception verbs and case-marked locative elements that foreground the destination of the travelling sound and its displacement. Similarly, our corpus includes numerous descriptions of sound events that highlight this directionality by foregrounding the emission of sound, as in (34), its trajectory, as in (35), or its goal, as in (36).
(34) Her fingers slipped on the keys, which let out a loud, awful CLANK.
Furthermore, there are twice as many constructions with motion verbs in the English dataset as in the Spanish dataset, and most of the English motion verbs express manner (slam, crash, ripple, swish, quiver, stagger), while Spanish favours path.
The two languages also differ with respect to the distribution of the perspectives in the motion set in that Phenomenon is the dominant perspective in English, followed by the Producer perspective, with a complete absence of Experiencer-oriented meanings. In contrast, Spanish uses all three perspectives, with most descriptions focusing on the Producer, followed by Phenomenon and Experiencer. The most striking difference, however, concerns the way the two languages allow for descriptions of sound for motion. Consider (37), where, in addition to our own glossing, we also show professional translations in the Spanish version of the novels as they pinpoint the typological differences in a succinct way.
(37) (a) Strike blended well with the strong men banging their way in and out of the cafe.
(b) Strike no desentonaba con aquellos tipos corpulentos que entraban y salían de la cafetería con andares bruscos. 'Strike did not clash with those bulky men that entered and exited the café with brusque gait'
Example (37a) describes a situation of directed motion towards an endpoint, including the intransitive sound for motion verb bang couched in a way-construction. This realization is felicitous in English but not in Spanish, where the way-construction is replaced by a path expression (entering and exiting a place). What may cause the sounds involved in the motion event, originally described in English through banging, has been omitted and replaced by an adverbial focusing on the agent's gait (con andares bruscos 'with brusque gait').
What our data demonstrate is that directed motion event constructions also house sound events. Theoretically, both sound for motion and manner of motion constructions highlight the tension between the importance of the verb in a construction and the importance of the constructional schema as a whole. Extending Pedersen's (2019) claim to sound for motion, we note that there is nothing in the meaning of verbs such as bang that can sanction the use of directed displacement complementation such as their way in and out of the café in strongly verb-regulated languages such as Spanish. This is, however, fine in English because English verb meanings such as bang can be overridden by the constructional schema as a whole, which in this case also includes the directed motion and displacement complementation. The same explanation holds for sound for action, where it is fine in English to use ring in ring the bell, while in Spanish the action itself has to be expressed as in pulsar el timbre 'press the doorbell'. However, in order to fully account for this possibility in English, we also have to appeal to the ease with which English invokes construals of metonymization of the verb meaning to adapt to and sanction meanings of direction and displacement outside the verb itself. In other words, for a full explanation of sound for motion and sound for action constructions in English, a construal of metonymy proper is necessary to accommodate path and action in the final interpretation and modelling of the event (Paradis, 2004, 2011).

Conclusion
In this meaning-based study of how sound events are mediated through language in English and Spanish, we have shown that there are significant differences between the two. We have shown that, for both languages, the anchor verbs are not only instantiated in sound, or perception more generally, but also in domains such as motion, manipulation, and more rarely in emotion-reaction, consumption, and cognition. In addition, we also found a sizeable number of anchor verb constructions that did not fall nicely into these domains but formed a category of support verb constructions with the role of combining existential, modal, possessive, or aspectual properties. These general findings are theoretically important for approaches to language structure and meaning modelling, as these domain conflations may be indicative of the synaesthetic sensorimotor architecture in perception, closeness in conceptual space, and ultimate fusion in language. Current usage-based research in the language sciences has repeatedly shown that meanings of words are potential and sensitive to the contexts in which they are used. This is also the case in the description of sound events.
However, English and Spanish differ in how meanings are represented, primarily with respect to sound for action and sound for motion cases. In English, it is both possible and common to conflate a sound with the action causing it through onomatopoetic verbs such as huff, clink, splutter, thud, clang, creak, crunch, shriek, jangle, or squawk. This usage is not possible in Spanish. We might find expressions that refer to the same sounds, but then they do not express sound for action but just sound. In the case of fusions of motion and sound, the sound event is embedded in a description of a motion event profiling a trajectory between two entities (the gust of wind […] clattered to a stop). This way of describing a soundscape is felicitous in English, but in Spanish motion and sound are kept separate. In the case of directed motion, Spanish verbs have to realize the path of the sound through the verb rather than the manner. The possibility of conflation gives English language users the opportunity to give metonymical descriptions of soundscapes in an economical way. These observations tie in with the findings reported in the motion literature, where English is known to lexicalize manner in the main verb in directed motion, whereas Spanish has to refer to path instead.
There are also major differences regarding the perspectives from which soundscapes are profiled. In English, the most prominent perspective is Producer, while Spanish has an even distribution between Producer- and Experiencer-based descriptions. Phenomenon-based construals, where the sound itself is in focus, form the smallest category in both English and Spanish. The two languages are similar in that they feature a great deal of lexical variation with respect to the different domain instantiations as well as the different perspectives, except that there are very few anchor verb constructions with Experiencer perspective in the domain of perception in both languages, e.g., hear, oír.
Our study is a first attempt to explore how sound events are described in one Germanic language (English) and one Romance language (Spanish). More data are necessary to make stronger claims and to provide more extensive descriptions of lexicalization patterns, meaning representations, and typological characteristics of languages. Our study shows that there are twice as many instances of descriptions of sound events in the English dataset as in the Spanish one. Should this pattern prove to hold true, we might ask ourselves whether Spanish speakers are less inclined to describe sound events than English speakers, and if so, why? Our study also shows that both English and Spanish describe sound events through a range of different domains and a large number of different language realizations, which indicates that there is no simple one-to-one relationship between sound events and their wordings. It also shows that there is a particularly interesting difference between the two languages with respect to conflations of sound for motion or sound for action. Such metonymical construals are not allowed in Spanish. The explanation for this is that there is then nothing in this strongly verb-regulated