We consider Juslin & Västfjäll's (J&V's) article from the perspective of a distinction we propose between two classes of mechanisms: signal detection and amplification. Signal detection mechanisms are unmediated sources of emotion, including brain stem responses, expectancy, and evaluative conditioning. They are unmediated because they induce emotion by directly detecting emotive signals in music. Amplifiers act in conjunction with signal detection mechanisms. They include episodic memory, visual imagery, and possibly emotional contagion.
Signal detection mechanisms
J&V distinguish brain stem responses from the other mechanisms proposed. This neuroanatomical classification presents a source of confusion, however, because the brain stem has multiple functions and may be implicated in the other five mechanisms. An alternative conception is the psychophysical signal detector, which encompasses brain stem responses and evaluative conditioning. Balkwill and Thompson (1999) defined psychophysical signals as sound attributes having consistent emotional connotations across domains (e.g., music and speech prosody) and cultures. The signals may be learned or congenital. Learned signals arise through evaluative conditioning, acting on attributes correlated with emotion. Congenital signals trigger hard-wired affective responses including, but not restricted to, brain stem activity.
J&V restrict discussion of expectancy to syntax, but syntactic structure represents only one attribute relevant to expectancy. Expectancy implicates multiple mechanisms at several processing levels. Huron's (2006) expectancy model includes imagination, tension, prediction, reaction, and appraisal, underscoring the challenge of defining expectancy as a unified mechanism operating solely on syntactic structure. For example, the tension response is a physiological preparation for any imminent event; it involves changes in arousal that likely arise from brain stem activity and are adjusted according to the degree of uncertainty about the outcome. Prediction responses are transient states of reward or punishment arising from the accuracy of expectations: accurate expectations lead to positive states, whereas inaccurate expectations lead to negative states.
The mechanism of evaluative conditioning proposed by J&V conflates a process of learning following long-term exposure to environmental regularities with an emotion-induction mechanism that detects signals and induces emotion. However, feedback mechanisms that establish learned associations are usefully distinguished from signal detection mechanisms that decode emotions during listening. Learning mechanisms act both on musical pieces and on psychophysical attributes of sound. For example, the sadness of Shakespeare's monologue "Tomorrow and tomorrow …" fosters associations between the emotions communicated by verbal content and statistical parameters of the acoustic signal, such as slow delivery and little pitch variation. Such psychophysical signals are correlated with emotional states and connote them even when embedded in nonverbal stimuli such as music.
Amplification mechanisms
J&V posit visual imagery as an independent cause of emotional experience. But imagery primarily accompanies or amplifies emotional experience; emotional states induced by music are conducive to imaginative processes that elaborate and amplify that experience. Moreover, imaginative processes are not restricted to visual images. Some music has a conversational quality that stimulates an auditory image of talking; other music can stimulate a kinesthetic image such as floating. Music can even generate conceptual imagination, such as the idea of death. Imagery during music listening may have less to do with the music itself than with the absence of visual stimulation demanding a listener's attention.
Like visual imagery, episodic memories are rarely an independent cause of musically induced emotion but primarily amplify emotional experience. Episodic memory is powerful precisely because there is typically congruence in the emotional connotations of the music and episode. More generally, because self-report studies are susceptible to demand characteristics, the prevalence of episodic memory and imagery is probably overestimated. Most music listening accompanies activities such as driving, reading, and socializing, with little opportunity for imagery and episodic memory. Emotional effects of music are subtle but they occur continuously. In contrast, tangible visual images (a meadow) or episodic memories (a day at the beach) – because they are extraordinary – are over-reported.
According to the authors, emotional contagion is triggered by voice-like qualities of music, including intensity, rate, and pitch contour ("super-expressive voice"). However, such music-speech associations must be established in the first place through conditioning, and then decoded by psychophysical signal detectors. Once signals are decoded, emotional contagion converts perceived into felt emotion through a process of mimicry, amplifying the output of perceptual mechanisms. It is feasible that emotional contagion is directly activated by acoustic signals with no mediating process, but it should then be engaged not only by voice-like attributes but by any emotional signal.
Conclusions
J&V characterize the literature as confused. A more optimistic interpretation is that the field is developing, and the target article is a valuable stimulus for this progress. Researchers have carefully controlled musical attributes, and cross-cultural studies have elucidated the capacity of people to interpret emotional connotations of music or speech from foreign cultures by relying on psychophysical signals that are culture-transcendent (Balkwill & Thompson 1999; Thompson & Balkwill 2006) and that have similar connotations in music and speech (Ilie & Thompson 2006).
Emotional responses to music seem to arise from three broad sources: psychophysical signal detection, expectancies, and emotional amplifiers. Many issues remain unresolved. The difference between perceived and felt emotion – not explored here – has implications for theories of music and emotion (Schubert 2007). Moreover, research suggests that emotional responses to music implicate multisensory processes not acknowledged in the target article (Thompson et al. 2005; in press). Finally, it is important to define modularity explicitly (Peretz & Coltheart 2003), since this is a much-misunderstood concept (Coltheart 1999).