1 Introduction
Primary metaphors (Grady 1997a; Lakoff & Johnson 1999: 49–59) have proven an interesting focus for linguists and scholars in other cognitive disciplines (this volume: Casasanto; Cuccio; Winter & Matlock), since they form the basis for widely shared if not universal patterns of language and conceptualization, linking one idea (and element of experience) to another. In this section, we discuss the nature of primary metaphors, with a focus on previous characterizations, which will prepare the ground for the new proposals presented in the rest of the chapter. We begin with several examples of primary-metaphor patterns, which will serve as reference points for further discussion.
Examples (1) to (3) illustrate three common metaphorical patterns, associating ‘difficulty’ with ‘heaviness’ (DIFFICULT IS HEAVY), ‘social dominance’ with ‘higher spatial position’ (POWER/STATUS IS UP), and ‘happiness’ with ‘brightness’ (HAPPY IS BRIGHT), respectively.
(1) The New Orleans Saints’ rookie offensive tackle was completely spent after taking on a heavy workload with both the first-string and third-string offenses during Friday’s scrimmage. (“More highs, lows for Saints’ Andrus Peat under heavy workload,” ESPN.go.com, 8/7/2015)
(2) Plucked from the dirt in rural France by unexpected benefactors, Jean-Marie begins to rise through the hierarchy of an aristocracy which treats France as though she is their very own banqueting table. (“The Last Banquet, By Jonathan Grimwood,” review at Independent.uk.com, 7/7/2013)
(3) Her bright demeanor is a marked contrast to Sichel, who frequently confesses her fears and anxieties directly to the camera. (“A Woman Like Me,” film review at Hollywoodreporter.com, 7/7/15)
Like all other primary metaphors, these three patterns follow the basic template of “conceptual metaphors” as discussed in Conceptual Metaphor Theory (CMT) beginning with Lakoff and Johnson (1980): They constitute systematic conceptual associations that underlie both conventional and novel metaphoric usages of terms (originally) related to sensory domains. In (1) to (3), for instance, the words heavy, rise, and bright are used in their (highly conventional) non-sensory meanings of ‘difficult,’ ‘gain in status/power,’ and ‘happy.’
CMT has assumed that the link between the concepts paired in a primary metaphor is rooted in highly regular correlations in experience, such as those between the weight of an object and our ease or difficulty in handling it in the case of DIFFICULT IS HEAVY. Similarly, the metaphor POWER/STATUS IS UP plausibly originates in the advantages offered by higher spatial position when it comes to dominating another (given the effects of gravity). In the case of HAPPY IS BRIGHT, there is a clear experiential correlation between personal mood and ambient brightness.
CMT has stressed from the beginning that examples like (1)–(3) are not isolated instances, neither within languages, nor across them. It is easy to identify many other English expressions illustrating the same three mappings, for instance: A difficult task can be burdensome or weigh a person down. Privileged social ranks are routinely referred to as higher or upper ones and the people in them as higher-ups or superiors. The face of a happy person lights up or shines. In other languages, words expressing sensorimotor experiences related to ‘weight,’ ‘high spatial position,’ or ‘brightness’ frequently refer to the same non-sensory domains, as do their English counterparts: Arabic thaqiil can mean both ‘heavy’ and ‘oppressive.’ Finnish yli (‘high’) also refers to high status, or position within a hierarchy. And the Hausa root hask, which basically refers to brightness, can also refer to happiness.
2 Looking Back: Primary Metaphors in Conceptual Metaphor Theory
2.1 On the Nature of Primary Metaphors
Both the source and the target concept of a primary metaphor are equally basic concepts, which differ from each other only with regard to whether they are ultimately grounded in a particular sensorimotor domain. Primary Metaphor Theory (PMT) claims that the connection between the two concepts is strongly unidirectional, such that the source is used to conceptualize and talk about the target, but not vice versa. In the following, we take up each of these aspects in more detail, also responding to recent findings and suggestions.
Basicness
Primary metaphors are conceived as associations between fundamental concepts, i.e. concepts that are grounded in universal (rather than culturally determined) aspects of human experience. Considering the three patterns offered above as illustrations, while the six relevant concepts are of course subject to cultural variation (e.g. different cultures might certainly be characterized by different understandings of ‘happiness’), the spirit of the primary-metaphor proposal is that there is also a shared element of experience across cultures that forms the basis for these concepts.
That is, primary source and target concepts (such as HEAVINESS and DIFFICULTY, VERTICAL HEIGHT and SOCIAL DOMINANCE, or BRIGHTNESS and HAPPINESS), however they are elaborated in a given culture and language, should ultimately be grounded in what can plausibly be conceived as basic parameters of human physical, social, emotional, or intellectual experience. Other, more particular or culturally contingent concepts – ‘house,’ ‘chair,’ ‘wife,’ ‘election,’ ‘priest,’ etc. – do not participate in primary-metaphor patterns. On the other hand, primary metaphors may certainly play a role in figurative conceptualizations of such entities (see section 2.2).
Related to the discussion of possible neural substrates for aspects of primary metaphor in section 3 below, a more concrete definition of ‘basicness’ might entail that primary source and target concepts are grounded in experiences that are, in some meaningful sense, part of humans’ innate cognitive repertoire. That is, a definition of primary source and target concepts might ultimately include the requirement that we are “pre-wired” to have the associated experiences.1 While certainly not proven with respect to all such concepts, this proposal seems very plausible where concepts like ‘difficulty,’ ‘heaviness,’ ‘social dominance,’ ‘vertical height,’ ‘happiness,’ or ‘brightness’ are concerned – and there is even concrete knowledge of how the brain processes some of these aspects of experience, such as in the visual domain (see section 3).
It is worth considering an interesting challenge to the idea that primary metaphors should be stated at the level of basicness discussed in this section. Through a very clever series of experiments investigating SIMILARITY IS CLOSENESS, Casasanto (2008a) shows that spatial proximity on a screen causes abstract concepts to be judged more similar, but visual stimuli to be judged less similar. Based on these results, he argues that there is no conceptual metaphor linking proximity and similarity – the actual metaphor, on this account, only links proximity and conceptual similarity – and concludes that, “[i]f the present results are interpreted as validating conceptual metaphor theory overall, then the theory is rendered too vague to make falsifiable behavioral predictions” (ibid.: 1053). In other words, while people talk about perceptual similarity as closeness (as Casasanto observes, “two paint chips can be close in color”), they do not think in this way, and the experiments “directly contradict predictions that follow from conceptual metaphor theory” (ibid.).
We suggest an alternative interpretation of the experimental data that preserves the original metaphor analysis but requires a more layered consideration of cognitive processes – and one that is in fact closer to the “hierarchical” account offered by Casasanto (2014a, this volume): It is possible that there is a conceptual link between proximity and similarity (including perceptual similarity) but that this link is “trumped” in the context of a particular judgment task, and quite possibly in other contexts. While CMT is concerned with patterns of association, it does not assert that these associations are predominant in all cognitive situations. In fact, much is made of the finding that the very same target concept can be conceptualized in different ways. Grady (1997a), for instance, observes that moral goodness can be understood as either cleanliness or healthiness. Conceptual metaphors, including primary metaphors, do not absolutely determine how we experience or perceive a given topic; rather, they create the possibility or probability that we may conceptualize it in this way.2
In short, we believe it is both intuitively helpful and consistent with linguistic and experimental data to preserve the idea of a primary metaphor that links the two very basic experiential dimensions of proximity and similarity.
Sensory vs. Non-sensory Concepts
The three examples listed above all illustrate another central property of the patterns of association between one conceptual domain and another that constitute primary metaphors: They all include one concept, the source, that is associated with a particular aspect of sensory experience, and another concept, the target, that is not. While the target concepts – HAPPINESS, SOCIAL DOMINANCE, DIFFICULTY – can certainly be associated with experiences that include a sensory dimension (e.g. we might feel happy when we eat chocolate cake), they are not consistently grounded in one particular type of sensory experience and might be experienced independent of any relevant sensory aspect at all (e.g. in the context of the election of a favorite candidate). The relevant source concepts, however – BRIGHTNESS, HEIGHT, HEAVINESS – are each defined with relation to a particular aspect of physical, sensory experience. Given the basicness of these dimensions of experience, as just discussed, Grady (2005a) proposed that primary source concepts might in fact be usefully considered a subset of image schemas.
Overall, the distinction between “sensory” and “non-sensory” is obviously resonant with numerous previous claims (e.g. by Lakoff & Johnson 1980) that metaphors map the “concrete” onto the “abstract,” for instance (but certainly not with various observations that metaphors map the “familiar” onto the “unfamiliar”). On the other hand, the dimensions of experience in which concepts like ‘happiness,’ ‘difficulty,’ and ‘social dominance’ are grounded seem intuitively just as basic as the source experiences they are associated with, though different in kind. For this reason, depending on how we understand the term, it may be misleading to describe primary target concepts as “abstract.”
But is “non-sensory” an adequate label, with its implication that primary target concepts as a group are defined only by a quality they lack? Exploring the idea that primary target concepts may constitute a natural category of their own (not defined by exclusion from another category), and referencing his own earlier proposals among others, Grady (2005a: 47) offered the following brief characterization of the nature of primary target concepts and used the term “response content” to distinguish their referents from the sensory referents of associated source concepts: “[Primary target concepts] relate to our interpretations of and responses to the world, our assessments of the physical situations we encounter, their nature and their meaning.”
Whether or not “response content” is a fitting label, the spirit of the primary-metaphor proposal has been that source and target are distinguished by a fundamental difference in kind between the two sets of concepts – a distinction that may even be amenable to characterization in neural terms, as we consider in the second part of this chapter.
Directionality
The source–target relationship is unidirectional by definition, and in fact, there is a consistent directionality to the semantic extensions motivated by primary metaphors. In accordance with AFFECTION IS WARMTH, for instance, an affectionate person is warm, but there are no conventional usages that go in the opposite direction – i.e. with social-emotional terms mapped onto temperature; English speakers would presumably have difficulty guessing the intended meaning of “affectionate day,” which reverses the direction of the source–target relation. Likewise, relative difficulty is talked about as relative heaviness, but not vice versa (ex 4a), physical height is not figuratively expressed as social dominance (ex 4b) and so forth. Importantly, this directionality is not only a characteristic of conventional linguistic usages, but also predicts whether novel usages are likely to be interpretable.
(4) a. ?? She kept the blanket from blowing away by placing a difficult stone on it.
b. ?? Picture A is hanging in charge of Picture B on the wall.
c. I have a fifty-ton task to complete this weekend.
By contrast, virtually any reference to weight is easily understandable as a reference to relative difficulty (ex 4c). Note that while the discussion here assumes English speakers will agree with judgments of interpretability, there is also experimental evidence regarding the (lack of) aptness of many “reversed” metaphors. For instance, though we know of no experimental evidence specifically focusing on primary metaphors, Chiappe et al. (2003) find that a variety of metaphors are judged uninterpretable when reversed.
A number of recent psychological studies (for reviews, see Lakoff 2014; Shen & Porat, this volume) have raised very interesting questions about the unidirectionality of primary metaphors. The key finding reported is that there is non-linguistic behavioral evidence for two-way associations between primary sources and targets so that a feeling of suspicion, for instance, can support the detection of a “fishy” smell (cf. SUSPICIOUS IS SMELLY). Lakoff (2014: 8, emphasis in the original) offers the following summary of findings from Lee and Schwarz (2012): “They showed not just that fishy smells induce suspicion, but that by inducing suspicion in subjects, that subjects were better able to distinguish the smell of fish oil from other smelly oils.” Likewise, thoughts about morality can trigger a sense of feeling clean or unclean (cf. MORAL IS CLEAN); judgments of another individual’s social-emotional stance can influence sensations of temperature (cf. AFFECTION IS WARMTH); judgments of relative importance can affect assessments of physical weight (cf. IMPORTANCE IS SIZE); and judgments about similarity can influence assessments of spatial distance (cf. SIMILARITY IS CLOSENESS).
What do such bidirectional psychological relationships mean for the theory of primary metaphors? The answer depends in part on how we define “metaphor.” One possibility is that we reserve the term for patterns that include lexical instantiation, i.e. usages of words or signs (in signed languages).
Casasanto (e.g. 2013) suggests helpfully that it is worth distinguishing between what he calls “linguistic” and “mental” metaphors, which may overlap and be related but can also be considered separately. In a review of experimental research on various cognitive correlates of handedness, Casasanto (2014a) points out that left-handed individuals, unlike right-handers, tend to associate positive valence with the left side rather than the right – e.g. when asked to rate the appeal of images presented on one side or the other – and that this particular “space-valence association” can even be acquired (unconsciously) by right-handers, through a few minutes of training that constrains action with the dominant hand, allowing the left hand to move more fluently (ibid.: 112). Such a cross-domain association, which is not evidenced in language, would be a “mental” metaphor in Casasanto’s terminology.
By contrast, we propose that such patterns as the association between ‘left’ and ‘good’ – in certain individuals, or at certain times – not be considered metaphors per se. Rather, these often-unconscious, cross-domain patterns of mental association constitute, as Casasanto (2013) suggests, the basis for potential metaphors that may or may not arise, based on various environmental factors, such as the linguistic usages of other speakers. According to Casasanto (2014a), these patterns do obviously form the basis for the common, crosslinguistic metaphor pattern that associates the right side with good things, presumably because right-handedness is far more common than left-handedness. In fact, it might be helpful to refer to such patterns of cross-domain mental association as “pre-metaphors,” not least because of existing evidence of certain such cross-domain mental associations even in infants (e.g. Lourenco & Longo 2011).3
Returning to the question of bidirectional effects, if we reserve the term “metaphor” for the patterns that inform explicit conceptualization (“thinking”) and communication (a usage that we feel is consistent with widely shared and intuitive understandings of the nature of metaphors), as opposed to merely unconscious associations affecting behavior, then the directionality of primary metaphors is preserved. While we certainly expect correlated dimensions of experience to lead to cognitive and neural association (Hebb 1949), and while this means we should expect bidirectional effects of some kind related to these correlations (cf. also Lakoff 2014), this is not the same as calling these effects metaphors, which are most usefully and intuitively understood as patterns in language, (explicit) conceptualization, or both.
In sum, we propose that the recent psychological literature on bidirectional associations offers a rich and helpful context, but not counterevidence, for the idea of unidirectional primary metaphors. Certain kinds of correlation in experience (discussed next) lead to (bidirectional) patterns of mental association and to “pre-metaphors” (such as ‘side of the dominant hand is good’) that may in the end become (unidirectional) primary metaphors if they are regularly instantiated in conceptual and (consequently also reflected in) linguistic structure. The neurological basis for the directionality of these metaphors is the subject of section 3 of this chapter.
Correlation and Covariation
The origins and motivations for primary metaphors are presumed to be related to recurring correlations in experience, as the three examples offered at the beginning of this section illustrate. Brightness in our environment is correlated with positive affect, the weight of an object is correlated with the degree of effort required to lift or move it, and so forth. Lakoff and Johnson (1980) pointed out such correlations as the basis for some metaphors, and Primary Metaphor Theory (Grady 1997a) takes the step of identifying primary metaphors as ones that are plausibly motivated in this way – as opposed to more complex, culture-specific metaphor patterns such as the conceptualization of THEORIES as BUILDINGS (Grady 1997b).
As Grady (2005b) points out, however, mere correlation between two aspects or dimensions of experience (one sensory, the other not) is not enough to predict a metaphorical relationship. These aspects must also be construable as sharing the same “superschematic structure”:
For instance, they must both be construable as states (viability – erect posture), as scalar properties (bright – happy), as atemporal relations (inside X – member of category X), or as actions (achieving a purpose – arriving at a destination), as events (failure – collapse; seeing – coming to know), as entities which can serve as landmarks, or as trajectors (state of affairs which is allowed to continue – physically supported object; fact – object which is seen), and so forth.
Even if eating is correlated with positive affect, for instance, we do not find conceptual patterns such as HAPPINESS (state) IS EATING (action/process). Additionally, to form the basis for a primary metaphor, the two correlated dimensions of experience must “covary”:
[A] difference in one domain, such as a difference in degree or quantity, for example, must be associated with a corresponding difference in the other. This is easiest to see in the case of scalar properties, and especially cases where either source causes target or vice versa. For instance, increased anger means increased heat, and increased heaviness means increased difficulty.
Finally, while these relationships predict the likely existence of a primary metaphor in most or all languages and cultures (see below), logical prediction is not the same as causal explanation. How exactly do primary metaphors arise? Are they natural products of experience and development, i.e. associations learned through repeated experiential correlations from infancy on? Might they be innate? Or are they culturally transmitted?
From the perspective of Primary Metaphor Theory, the most plausible general answer seems to be that experiences lead to natural cognitive associations (“pre-metaphors”), which then may or may not be established as conventional patterns of conceptual and linguistic associations, depending on the presence or absence of reinforcement from the surrounding linguistic and cultural environment. On the other hand, Casasanto (this volume) cites experimental evidence with infants, suggesting that at least some primary metaphors may in fact be based on innate patterns of cross-domain association, e.g. between magnitude in time and space, respectively. “Pre-metaphors” could thus be either learned or innate (for a detailed discussion of this, see Casasanto 2014a, this volume).
Wide Distribution across Languages
Table 2.1 offers a number of additional examples of words from sixteen languages that follow five different primary-metaphor patterns, including two of the metaphors introduced above.
Table 2.1 Cross-linguistic survey of linguistic polysemy motivated by primary metaphor
| Language | HEAVY ➔ DIFFICULT, ARDUOUS | HIGH ➔ SOCIALLY DOMINANT | LARGE ➔ IMPORTANT | HOLD/GRASP ➔ CONTROL | HOT ➔ EMOTIONALLY AGITATED |
|---|---|---|---|---|---|
| Arabic | thaqil (‘heavy’) | ‘ala (‘above’) | kabir (‘large’) | qabda (‘grip’) | harara (‘heat’) |
| Basque | pezütarzün (‘weight’) | goi (‘high’) | handi (‘large’) | esku (‘hand’) | bero (‘hot’) |
| Finnish | raskas (‘heavy’) | yli- (‘above’) | suuri (‘large’) | käsi (‘hand’) | kuuma (‘hot’) |
| Hausa | naunaya (‘make heavy’) | kai (‘top’) | ‘bagume (‘become large’) | dank’a (‘hand to’) | zafi (‘heat’) |
| Hawaiian | kaumaha (‘heavy’) | luna (‘above’) | nui (‘large’) | kaohi (‘hold’) | wela (‘heat’) |
| Hungarian | nehéz (‘heavy’) | magas (‘high’) | nagy (‘large’) | szorít (‘grasp’) | forró (‘hot’) |
| Japanese | omoi (‘heavy’) | ue (‘above’) | ookii (‘large’) | te (‘hand’) | netsu (‘heat’) |
| Malay | berat (‘heavy’) | tinggi (‘high’) | besar (‘large’) | pegang (‘hold’) | panas (‘hot’) |
| Mandarin | zhóng (‘heavy’) | sháng (‘above’) | dà (‘large’) | ba (‘hold’) | rè (‘be hot’) |
| Old Irish | tromm (‘heavy’) | ós (‘above’) | mór (‘large’) | lám (‘hand’) | té (‘hot’) |
| Russian | t’azolIj (‘heavy’) | v’erx (‘top’) | krupnIj (‘large’) | d’erzat’ (‘hold’) | gor’acij (‘hot’) |
| Sanskrit | guru (‘heavy’) | upári (‘above’) | mahâ- (‘large’) | grah- (‘grasp’) | usna- (‘hot’) |
| Swahili | -zito (‘heavy’) | juu (‘above’) | -kubwa (‘large’) | mkono (‘hand’) | moto (‘heat’) |
| Tagalog | mabigat (‘heavy’) | ipataas (‘raise’) | malaki (‘large’) | hawak (‘grip’) | mainit (‘hot’) |
| Turkish | agIr (‘heavy’) | üst (‘top’) | büyük (‘large’) | tutmak (‘hold’) | kizgin (‘hot’) |
| Zulu | -nzima (‘heavy’) | enyuka (‘rise’) | -khulu (‘size’) | phatha (‘hold’) | -shis- (‘heat’) |
Evidence for primary-metaphor patterns is found in languages so widely distributed in space and time that borrowing is not a plausible explanation. Instead, as implied by the kinds of considerations discussed above, it seems that primary metaphors reflect universal aspects of human experience, cognition or neural structure, or a combination of these.
Note, however, that the universality of a set of motivations for primary metaphors does not imply that lexical patterns themselves must be universal. There are a number of intervening factors between experience, for instance, and linguistic conventionalization, including cultural mediation, so that even a conceptual association that is well motivated may not end up leading to a productive pattern of semantic extension. Bernárdez (2013: 417–420) reviews a number of studies and offers a compelling discussion of the fact that quite a few languages (in Australia, South America, and elsewhere) have a strong metaphorical pattern linking ‘knowing’ with ‘hearing’ but only a weak pattern, or none at all, linking ‘knowing’ and ‘seeing’ – the pattern familiar in usages of myriad English words such as introspection.
If particular patterns of conceptual association are well motivated yet not lexically conventional in a given language, it is an interesting question – not yet experimentally tested, to our knowledge – whether speakers of such a language might nonetheless easily grasp and learn these patterns. For instance, might speakers of an Australian language apparently lacking the KNOWING IS SEEING pattern nonetheless easily guess the intended meanings, related to knowledge, of novel usages of seeing verbs in their language? The discussion in this chapter, along with other discussions of motivations/origins for primary (or correlation-based) metaphors (e.g. Casasanto 2014a), predicts that such patterns would be easily learned.4
Bases for More Complex Conceptualization
Grady (1997b) reexamined the pattern that Lakoff and Johnson (1980) had called THEORIES ARE BUILDINGS and concluded that, for several reasons, it is best understood as a combination and special case of two more basic mappings: one relating CAUSAL/LOGICAL STRUCTURE (ORGANIZATION) to PHYSICAL STRUCTURE (ex 5a), the other relating VIABILITY/FUNCTIONALITY to ERECTNESS (ex 5a, b; cf. Grady 1997b). Each of these patterns is easily motivated by the kinds of correlations associated with primary metaphors, and each leads to a wide range of linguistic usages that have nothing to do with buildings (instead making reference to textiles, works of art, etc.) and can apply to a wide range of concepts from target domains (besides theories), such as social relations or ecosystems.
(5) a. the fabric of society, an array of procedures, a masterpiece of logical construction
b. the foundation of marriage, the collapse of the Bay’s ecosystem
Following up in a similar vein, Grady (2005b) discussed primary metaphors as a special type of input to conceptual integration (Fauconnier & Turner 2002). For example, a novel linguistic metaphor that frames aloofness as a “glacier, slow to melt away” (Grady 2005b: 1608) is an elaboration of the more basic (primary) association between ‘temperature’ and ‘affect’ (AFFECTION IS WARMTH). Likewise, many other complex and culture-specific metaphors are also based on and informed by primary patterns: Grady et al. (1999) discuss how the “Ship of State” metaphor pattern, while culturally determined to a significant degree, is also grounded in such primary (and widely distributed) patterns as ACTION IS SELF-PROPELLED MOTION, GOALS ARE DESTINATIONS, and SOCIAL RELATIONS ARE CONTAINERS.
In sum, primary metaphors are often the starting points for richer, more vivid, specific, and idiosyncratic conceptualizations (for a discourse-based illustration of this point, see Deignan, this volume).
2.2 On the Function(ality) of Primary Metaphors
Given that primary-metaphor patterns are so ubiquitous, both within English and across languages, it is natural to ask why. At least three broad categories of hypothetical answers to this question suggest themselves:
(i) There is no “function”: The patterns are epiphenomenal, not “useful,” an accidental by-product of some other aspect(s) of experience or brain function that happen to lead to particular linguistic and conceptual associations. (The possibility that primary metaphors are epiphenomenal in this sense can be considered the null hypothesis with respect to function.)
(ii) Primary metaphors offer communicative advantages: Source concepts are sensory in nature, and by definition associated with particular “images” – in a broad sense that includes kinetic and auditory images as well as visual ones. Thus, as basic elements of a communicative setting (as opposed to the internal, emotional world of the interlocutors, for instance), they might serve as “anchors” for communication, as speakers are able to point to and agree on their referents. If relatively more subjective or “abstract” target concepts such as ‘happiness’ or ‘difficulty’ can be linked with these concepts, there may be communicative advantages to these associations.
(iii) Primary metaphors offer cognitive advantages: If cognition related to source concepts is in any way different in character from cognition related to target concepts, then even if we were not communicating with others, primary metaphors might hypothetically play a useful role in conceptualization – namely by allowing us to harness cognitive capacities associated with source concepts when processing (in whatever sense) target concepts.
Any of these hypotheses may be true, and the last two are not mutually exclusive. But the third is arguably the strongest and most interesting. It suggests a reason to explore possible neural substrates for “sourceness” and “targetness,” in case these point to processing differences that would lead to cognitive advantages of some kind when source and target are associated.
3 Thinking Ahead: A Proposal on the Neural Character of Primary Source and Target
While scholars in several fields have explored linguistic, psychological, and other aspects of primary metaphors (this volume: Casasanto; Cuccio; Mittelberg & Joue; Winter & Matlock), it is currently unknown whether primary metaphors reflect aspects of neural organization or function. But given the basic and (seemingly) universal nature of these patterns, it is potentially fruitful to speculate whether there may be neural correlates to some aspects of primary metaphor – possibly related to the cognitive characteristics referred to in the discussion of potential cognitive advantages above. More specifically, can neural signatures of sourceness vs. targetness be identified? We suggest in the remainder of the discussion that the answer may be yes and explore several related hypotheses, which may ultimately describe the neural signatures in question.
Lakoff (2014: 5) proposes that the asymmetry between source and target, and, in particular, the directionality between them, can be explained based on ordering: “The asymmetry of the mappings appears to arise via STDP – spike-timing dependent plasticity – from which metaphor sources and targets can be predicted.”
To the best of our understanding, Lakoff is arguing that if (the experiences underlying) two concepts tend to be associated, according to STDP, the more frequent of the two would tend to become pre-synaptic and the less frequent would tend to become post-synaptic. It is useful here to recall the basic rule of STDP: If neuron A fires “just before” neuron B, the synapses from (the axons of) A to (the dendrites of) B are strengthened, while the synapses from (the axons of) B to (the dendrites of) A are weakened. Thus, Lakoff’s STDP hypothesis makes assumptions about which neurons fire more frequently and also assumes that the neurons firing more frequently would tend to fire “just before” those firing less frequently.5 We take no position on these assumptions except to suggest that they would need to be substantiated. In any case, we believe there are additional hypotheses worth exploring when it comes to the distinctions between source and target, and the possible cognitive advantages of linking one to the other.
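The basic STDP rule recalled above can be made concrete with a small numerical sketch. The code below is an illustration only, not part of Lakoff's proposal or of the present argument: it assumes the commonly used pair-based form of STDP with exponential time windows, and the function name `stdp_delta` as well as all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are hypothetical choices made for the example.

```python
import math


def stdp_delta(t_pre, t_post, a_plus=0.01, a_minus=0.012,
               tau_plus=20.0, tau_minus=20.0):
    """Weight change for a single pre/post spike pair (times in ms).

    Assumed pair-based rule with exponential windows; the amplitudes
    and time constants are illustrative, not empirical estimates.
    """
    dt = t_post - t_pre
    if dt > 0:
        # Pre-synaptic neuron fired first: the synapse is strengthened.
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        # Post-synaptic neuron fired first: the synapse is weakened.
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


# Neuron A fires "just before" neuron B (here: 5 ms earlier).
t_a, t_b = 0.0, 5.0

# Synapse from A to B: A is pre-synaptic, B is post-synaptic.
delta_a_to_b = stdp_delta(t_pre=t_a, t_post=t_b)

# Synapse from B to A: B is pre-synaptic, A is post-synaptic.
delta_b_to_a = stdp_delta(t_pre=t_b, t_post=t_a)

print("A -> B strengthened:", delta_a_to_b > 0)
print("B -> A weakened:", delta_b_to_a < 0)
```

On these assumptions, a consistent firing order yields an asymmetric connection: the A-to-B synapse grows while the B-to-A synapse shrinks. Whether the experiences underlying source concepts in fact fire "just before" those underlying target concepts is exactly the assumption that, as noted above, would need to be substantiated.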
For purposes of subsequent discussion, it will be helpful to distinguish four aspects of the complex set of experiential, cognitive, and social processes that ultimately yields semantic meaning:
stimulus ➔ (phenomenological) experience ➔ concept ➔ language
While the relationships among these are certainly not as simple as a linear, ordered chain, the ordering above is also not random and reflects an intuitive scenario in which the existence of a stimulus is a logical starting point and a fact about language is the end result. A given stimulus (e.g. an apple) is perceived in ways that lead to an experience of that stimulus (e.g. of the flavor, visual image, or weight of an apple); concepts, in turn (e.g. the concept ‘apple’), are in some way connected with and refer to such experiences; and language, in turn (e.g. the English word “apple”), is built on and refers to such concepts.
There are endless possible ways of reinterpreting these relationships, but such discussions are not germane to the present chapter, and the schema above is presented only to clarify that the discussion is primarily concerned with neural structures and processing related to experiences – e.g. the neural processes that allow us to experience the look, feel, or taste of an apple, or any emotional responses associated with an apple – as opposed to the neural substrates of concepts or language per se.6 More specifically, our discussion will largely focus on the substrates for “primary source concept experiences” (source experiences, or SEs, for short) and “primary target concept experiences” (TEs).
3.1 Localized vs. Distributed Neural Substrates
Having established that we will be focusing on distinctions at the level of neural substrates of perception and phenomenological experience, there are several related hypotheses worth exploring regarding the ways in which the substrates for SEs may differ systematically from those of TEs. The main hypothesis we explore here is that the brain activity patterns representing source experiences are organized based on cortical positions (as sets of localized “hotspots”), which we will refer to as “neural maps.” In other words, we propose that the experiences associated with source concepts such as HEAVINESS, HEIGHT, and BRIGHTNESS are more likely to have neural substrates characterized by neural maps, than are the experiences associated with target concepts such as DIFFICULTY, DOMINANCE, or HAPPINESS.
This position is generally consistent with the known functional organization of the mammalian neocortex pertaining to perception of the body and of the external world (Bednar & Wilson Reference Bednar and Wilson2015). For example, the sensory cortex represents somatic sensation (such as touch, pain, and cold or hot detection) from skin receptors approximately preserving the local organization of the periphery. A similar “body map” in the motor cortex (known as the homunculus) codes for the planning and execution of intended movements (Graziano & Aflalo Reference Graziano and Aflalo2007).
One type of cortical localization is the case of so-called “topographic mapping,” in which nearby cortical locations encode for similar stimuli (Chklovskii & Koulakov Reference Chklovskii and Koulakov2004). In other words, parts of the cortex are organized such that adjacent neurons respond to the same type of stimulus (e.g. edges of visual shapes or frequencies of sound tones), and the spatial positions of the neurons themselves encode variations that can be detected, such as relative inclination angles of an edge or the pitch of a voice. All primary sensory and motor maps in the mammalian neocortex are topographic, but higher associative areas such as the frontal neocortex are also known to follow a topographic architecture (Silver & Kastner Reference Silver and Kastner2009).
At the same time, due to the complex and far-reaching, tree-shaped structure of their input and output extensions (dendrites and axons, respectively), cortical nerve cells can also form non-topographic maps. That is, they can form complexes (so-called “cell assemblies”) that act together to perceive, process, and represent particular types of stimuli in the way that topographic maps do, though they are not physically adjacent in the same sense. Among the most evolutionarily conserved examples of these non-topographic maps (i.e. showing significant similarities among species related to humans and to each other) are the neural systems responsible for spatial representation, which have also been shown to play a role in higher cognition and language (Landau & Lakusta Reference Landau and Lakusta2009). Some of the most striking examples of these spatial maps are given by the so-called systems of “place cells” in the hippocampal formation. Place cells are neurons that are selectively active when the subject is in a particular location within an environment. For example, in the context of a residential setting, a specific hippocampal neuron might be systematically active when the subject stands by the stove in the kitchen, another neuron when the subject is in the shower, and yet another when the subject crosses the entrance doorway. The map is non-topographic because nearby locations are not reflected by nearby neurons and vice versa: the ‘stove’ neuron and the ‘shower’ neuron might be adjacent in the hippocampus (even if the kitchen and bathroom are in opposite locations in the house), whereas a neuron representing the medicine cabinet in the bathroom just next to the shower might be physically located in a relatively faraway position of the hippocampus.
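The contrast between topographic and non-topographic maps can be illustrated with a toy sketch. Each list below gives, for neurons in their physical order along a hypothetical strip of tissue, the stimulus value that each neuron prefers. The values and the helper `mean_neighbor_gap` are illustrative assumptions, not recorded data.

```python
def mean_neighbor_gap(preferred):
    """Average difference in preferred stimulus between adjacent neurons."""
    gaps = [abs(a - b) for a, b in zip(preferred, preferred[1:])]
    return sum(gaps) / len(gaps)

# Topographic: adjacent neurons prefer adjacent stimulus values.
topographic = [0, 1, 2, 3, 4, 5]

# Non-topographic (place-cell-like): each neuron is still selective
# for one value, but physical neighbours prefer unrelated values.
non_topographic = [3, 0, 5, 1, 4, 2]

topo_gap = mean_neighbor_gap(topographic)
scrambled_gap = mean_neighbor_gap(non_topographic)
print(topo_gap < scrambled_gap)
```

In both lists every stimulus value is represented by exactly one neuron, so both count as maps in the sense used here; only the first preserves neighbourhood relations.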
Although the activity of all hippocampal neurons is spatially modulated (McNaughton et al. Reference McNaughton, Barnes, Gerrard, Gothard, Jung and Knierim1996), pyramidal cells (the main neuron type in the hippocampus) display the strongest and most reliable place fields. More generally, activity in the neural circuits of the hippocampus appears to be involved in spatial navigation (Hartley et al. Reference Hartley, Lever, Burgess and O’Keefe2014) and, of relevance to our proposal, in the spatial aspects of imagination (Bird et al. Reference Bird, Bisby and Burgess2012).
In both topographic and non-topographic maps, the cortical representations of experience remain fairly localized and encoded in the actual positions of the activation foci. This survey represents only a greatly simplified account of the experimental neuroscientific evidence, as the extent of the topography (and even map-like character) of cortical representation may vary substantially depending on the considered level of analysis, from microscopic neuronal assemblies to macroscopic regional activation (Kanold et al. Reference Kanold, Nelken and Polley2014).
The hypothesis we suggest here is that the neural representation of an SE (such as the pattern of activity necessary for representation of an experience of heaviness or brightness) corresponds to a significantly smaller set of such loci than the representation of a typical TE (such as difficulty or happiness). In this sense, the cortical representation of SEs would be substantially more localized than the representation of TEs. For instance, the physical brightness of a light stimulus is encoded as visual intensity in the primary visual cortex in the occipital lobe, organized as a retinotopic map (that is, cortical positions actually reflect the locations in which light entered the retina at the back of the eye). Similarly, heat is encoded in temperature-sensitive patches of somatosensory cortex in the parietal lobe, organized as a body map (the homunculus). Likewise, heaviness is encoded as force intensity during movement planning in the premotor cortex in the frontal lobe, organized as a directional map.
If SEs are, as just discussed, associated with more localized representations, TEs appear to be associated with more distributed representations and diffuse brain states. As an extreme case of diffuse, distributed representation, many target concepts – specifically those associated with affect – appear to be considerably related to a system of neurotransmitters that modulate cortical activity across the entire cerebral cortex rather than locally (Zaborszky Reference Zaborszky2002). These modulatory neurotransmitters include, among others, acetylcholine (associated with states of purposeful arousal, as during a search activity), dopamine (positive and negative valence), serotonin (satisfaction or aversion to risk-taking), and noradrenaline (sudden and unexpected novelty). It seems that many TEs – such as those associated with happiness and difficulty – may consist of or include affective states represented by a combination of such diffuse representations throughout cortical areas. Although the literature on emotional representation in the brain is immense (and there has also been considerable study of the culturally determined dimensions of emotion), our specific emphasis on target concepts in primary metaphors leads us to focus on the neural correlates of declarative experience (Mitchell & Greening Reference Mitchell and Greening2012), as well as the most basic psychological and physiological dimensions that experiences of happiness, for instance, might share across cultures.
It is tempting to speculate that other primary target concepts, including DOMINANCE or others related to social status, may also be less correlated than source concepts with localized areas of activity, and more correlated with diffuse cortical states (see Table 2.2). If true, such a pattern would add substance to previous suggestions (e.g. Grady Reference Grady2005a) that primary target concepts encode aspects of our response to the environment, while source concepts encode our physical perceptions of it. In effect, SEs would make up something like a “physical map” of a given scenario, i.e. reflect the physical aspects of the scenario, while TEs would make up a “goal-related map,” i.e. reflect various aspects of our goal-orientation, correlated in turn with characteristic balances of neurotransmitters, and possibly other non- or less localized aspects of cortical activity.
Table 2.2 Primary source and target, and potential neural substrates
| Source experience | Target experience | English examples | Neural map [sensorimotor “pre-wiring”] | Relevant neurotransmitter(s) |
|---|---|---|---|---|
| HEAVINESS | DIFFICULTY | work load, light duty | primary motor | acetylcholine (Ach) |
| HEIGHT | SOCIAL DOMINANCE | middle class, high officer | visual topography | noradrenaline (NA) |
| BRIGHTNESS | HAPPINESS | sunny disposition, dark hours | visual intensity | dopamine (DA) |
| HEAT | EMOTIONAL AROUSAL | cool off, heated discussion | somatic homunculus | serotonin (HT) |
| SIZE | IMPORTANCE | a big day, a small change | secondary visual | NA/Ach |
| FORWARD MOTION | PROGRESS TOWARD GOAL | getting there, one step forward, two steps back | secondary motor | DA/HT |
| PROXIMITY | SIMILARITY | close (e.g. in skills), far apart (e.g. in ideas) | visual topography | Ach¹ |
¹ Acetylcholine could be correlated with TEs for ‘similarity’ if these are in any way related to search tasks – that is, if detecting similarity between two objects of attention is related to identifying objects of search tasks.
3.2 Attentional Mechanisms
An interesting consequence of the proposed neural distinctions between SEs and TEs along the lines of cortical localization vs. neuro-modulatory diffuseness is that SEs may be more closely associated than TEs with attentional mechanisms, making it easier in some sense to focus on source concepts. According to this perspective, attention might be bootstrapped to target concepts via source concepts, potentially yielding a cognitive advantage of primary metaphors.
A brain region relevant to this consideration is the thalamus, which plays a crucial role in selective gating (Shipp Reference Shipp2004): the process of suppressing or allowing activity in a given set of neurons. The thalamus is robustly and bidirectionally connected with all areas of the neocortex and is believed to modulate relative levels of activity in various neocortical locations, giving rise to the ability to selectively experience/attend to some stimuli vs. others.7 Given the previous discussion of cortical localization, i.e. the positional encoding of the neural correlates of SEs, it may be the case that the attentional mechanisms associated with the thalamus can act more directly on these localized representations than on the more diffuse states associated with TEs.
An additional notion worth considering related to the aforementioned attentional mechanisms concerns a micro ‘aha’ or ‘eureka’ effect, which may again involve bootstrapping of target concepts by source concepts. An intriguing possibility is that when we “successfully” attend to or focus on a concept – i.e. when the brain successfully stabilizes into a state representing a particular experience, which might happen on the order of many times per second – the resulting neural and cognitive events correspond in some sense to effective thought, as opposed to random noisy activity (Kounios & Beeman Reference Kounios and Beeman2014). If it is easier to achieve this type of stabilization, and the resulting cognitive “reward,” with source than with target concepts – e.g. due to attentional mechanisms discussed earlier – then linking the two may enable this effect with target concepts as well. In short, SEs may be associated with more stable or identifiable representations than target concepts, perhaps mediated in part by the neural map organization discussed above, and this stability could in turn lead to a cognitive distinction and advantage.
4 Conclusion
The speculation in the second part of this chapter is partly motivated by a desire to capture the intuition that metaphors are “helpful” in some sense, while also recognizing that primary target concepts, or at least TEs, are so basic to our experience that it is difficult to see how we would need help thinking or talking about them. The hope going forward is that further research may help clarify, possibly by following up on suggestions offered here, the types of cognitive advantage that could result from linking a concept correlated with localized neural map(s) on the one hand, with concepts correlated with more diffuse representation on the other.
The discussion is further motivated by questions regarding “sourceness” and “targetness” as real categories of concepts, and tries to build on the widely shared intuition that these concepts are of meaningfully different kinds. If so, can we expect the conceptual differences to be reflected in distinct types of neural signature? In at least some cases, there seem to be promising directions to look in for such distinctions.
An additional, related question not substantively addressed in the body of the chapter concerns “primary-ness” itself, and whether this too might be a meaningful category from a conceptual and/or neural point of view. For instance, it might be useful to define primary (source and target) concepts (like HEAVY, DIFFICULT, HIGH, DOMINANT, BRIGHT, HAPPY, etc. as opposed to more specific concepts like ‘house,’ ‘marry’) as those whose associated experiences the brain is pre-wired to process or represent. Neuroscience research is making promising progress in elucidating the molecular, physiological, and behavioral dynamics underlying the formation and alignment of topographic brain maps within and across representation modalities (Cang & Feldheim Reference Cang and Feldheim2013).
Finally, all of the above hypotheses arise from looking at a small sample of primary metaphors. We hope that future research might follow up on some of the questions and hypotheses suggested here by looking more systematically at a broader set of patterns.
1 Introduction
According to theories of metaphorical mental representation, metaphors in language are more than just ways of talking; they are clues to a pervasive way of thinking (Lakoff & Johnson Reference Lakoff and Johnson1980). On this view, when people use expressions like a “long vacation,” a “high price,” or a “close resemblance,” they are using mental representations of space (i.e. ‘length,’ ‘height,’ ‘proximity’) to scaffold mental representations in non-spatial conceptual domains (i.e. ‘time,’ ‘value,’ ‘similarity’). Although initial evidence for metaphor theory was based on descriptive analyses of how people talk, there is now abundant experimental evidence that people also think metaphorically – even when they are not using any metaphorical language (or using language at all) (for a review, see Casasanto & Bottini Reference Casasanto and Bottini2014a). That is, people often think in “mental metaphors” (Casasanto Reference Casasanto2008a, Reference Casasantob): point-to-point mappings between nonlinguistic representations in a “source domain” (e.g. SPACE) and a “target domain” (e.g. TIME) that is typically more abstract (i.e. hard to perceive) or abstruse (i.e. hard to understand; Lakoff & Johnson Reference Lakoff and Johnson1980), which support inferences in the target domain.
The term “mental metaphor” is used contrastively with “linguistic metaphor” here, the former designating a mapping between non-linguistic mental representations, and the latter between linguistic representations.1 Distinguishing mental metaphors from linguistic metaphors becomes particularly important in contexts like the present chapter where I raise questions like: “Do people who use different linguistic metaphors think in correspondingly different mental metaphors?” and “Do people sometimes think in mental metaphors that are absent from language?”
Where do our mental metaphors come from? Does everyone use the same mental metaphors, at least when they are thinking about universal experiences like the passage of time? Once established, do our basic mental metaphors ever change? Some of the answers to these questions offered by metaphor theory’s founders (Lakoff Reference Lakoff1993; Lakoff & Johnson Reference Lakoff and Johnson1999) are at odds with a growing body of experimental results. In this chapter I first illustrate the tension between some core tenets of Conceptual Metaphor Theory (Lakoff & Johnson Reference Lakoff and Johnson1999) and experimental tests of metaphorical thinking. I then describe a proposed resolution, Hierarchical Mental Metaphors Theory (Casasanto Reference Casasanto2008b; Casasanto & Bottini Reference Casasanto and Bottini2014a, Reference Casasanto and Bottinib), and review three sets of studies that serve as testbeds for this proposal.
2 Origin and Universality of Mental Metaphors: Puzzles and Paradoxes
According to Lakoff and Johnson (Reference Lakoff and Johnson1999), our most basic mental metaphors are universal because we learn them from universal correlations between source and target domains in the natural environment:
We acquire a large system of primary metaphors automatically and unconsciously simply by functioning in the most ordinary of ways in the everyday world from our earliest years. We have no choice in this […]
When the embodied experiences in the world are universal, then the corresponding primary metaphors are universally acquired […] Universal conceptual metaphors are learned; they are universals that are not innate.
Once acquired, these basic mental metaphors are said to constitute “fixed conceptual mappings” (ibid.: 149), implemented in “permanent neural connections … across the neural networks that define conceptual domains” (ibid.: 46).
To summarize these points, our basic mental metaphors are proposed to be: (a) learned early in life on the basis of source–target correlations in the world; (b) universal, so long as the source–target correlations in the world are universal, and (c) fixed, by virtue of their implementation in permanent neural connections.
Yet the claims that basic mental metaphors are learned, universal, and fixed are all challenged by experimental data that have accumulated over the past decade. The claim that primary metaphors are learned early in life on the basis of observed source–target correlations was called into question by the most direct test to date. De Hevia and colleagues (Reference de Hevia, Izard, Coubart, Spelke and Streri2014) showed that neonates between 0 and 3 days old are already sensitive to relationships between spatial, temporal, and numerical magnitudes that are encoded in linguistic expressions like “a long time” and “a large number.” At 0 to 3 days, these infants presumably had no understanding of linguistic and cultural conventions linking these domains, and had very little experience with correlations between them in the natural world. The data presented by de Hevia and colleagues thus suggest that cross-domain mappings between space, time, and number may be innate, not learned.
The claim that mental metaphors are universal is challenged by numerous experiments demonstrating crosslinguistic, cross-cultural, and cross-individual variation in spatial mappings for basic domains of experience. For example, some languages talk about musical pitch in terms of one-dimensional spatial height, whereas other languages metaphorize pitch in terms of multidimensional spatial thickness; speakers’ mental metaphors for pitch differ accordingly (Dolscheid et al. Reference Dolscheid, Shayan, Majid and Casasanto2013). Some cultures depict temporal sequences as unfolding rightward across calendars, graphs, and written timelines, whereas other cultures depict them as unfolding leftward; people’s mental timelines follow the directions of these culture-specific SPACE–TIME mappings (Casasanto & Bottini Reference Casasanto and Bottini2014b; Fuhrman & Boroditsky Reference Fuhrman and Boroditsky2010; Ouellet et al. Reference Ouellet, Santiago, Israeli and Gabay2010; Tversky et al. Reference Tversky, Kugelmass and Winter1991). Right-handers implicitly associate positive ideas and emotions with the right side of space and negative ideas with the left, but left-handers show the opposite associations between space and emotional valence (Casasanto Reference Casasanto2009a). These studies showing that mental metaphors can be language-specific, culture-specific, or body-specific also challenge the claim that cross-domain mappings in our minds are fixed; on the contrary, mental metaphors that are deeply entrenched and highly automatic can be changed – even reversed – after as little as 5 minutes of exposure to different cross-domain relationships. How can mental metaphors be grounded in universals of experience if they vary across people? How can they be fundamental to our conceptualizations of target domains if they can change in a matter of minutes?
3 A Proposed Solution: Hierarchical Mental Metaphors Theory
A solution to these apparent paradoxes emerges if we consider that even our most basic mental metaphors are constructed over multiple timescales, on the basis of multiple kinds of experience. According to Hierarchical Mental Metaphors Theory (HMMT; Casasanto & Bottini Reference Casasanto and Bottini2014a, Reference Casasanto and Bottinib; see also Casasanto Reference Casasanto2008b, Reference Casasanto2014b; Dolscheid et al. Reference Dolscheid, Shayan, Majid and Casasanto2013), the cross-domain mappings that people use at any moment are members of a superordinate family of mappings. The superordinate family is typically constructed on the basis of source–target relationships in the natural world. These relationships could be learned from early experiences with source and target domains, as Lakoff and Johnson (Reference Lakoff and Johnson1999) suggest, or they could be part of infants’ innate “core knowledge” (Srinivasan & Carey Reference Srinivasan and Carey2010). Cross-domain relationships like MORE TIME IS MORE DISTANCE are survival-relevant and could plausibly become encoded in the human genome. Whether learned or innate, each superordinate family of mental metaphors constitutes a set of mappings that can be used for scaffolding target-domain thinking and can be encoded in linguistic and cultural conventions. For a given target domain (e.g. PITCH), children acquire (or manifest) a superordinate family of source-domain mappings early in life. To the extent that source–target relationships in the natural world are found universally, superordinate families of mental metaphors should be universal.
Once learners are exposed to relevant conventions in language and culture, or to regularities in the way they use their particular bodies, a second process begins which can continue throughout the lifetime and gives rise to the diversity of mental metaphors found across individuals and groups. One of the mappings from a superordinate family (or a subset of the mappings) becomes strengthened through a process of competitive associative learning. For example, each time someone uses a linguistic metaphor, the corresponding non-linguistic source–target mapping is activated. Activating a mapping strengthens this source–target association and, importantly, also weakens the competing source–target mappings in the same family, as a consequence. The process of strengthening the frequently activated mappings within a family, and of weakening their “sibling” mappings, follows naturally from the dynamics of long-term memory networks for families of associations (e.g. Anderson et al. Reference Anderson, Bjork and Bjork2000).
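The strengthening-and-weakening dynamic described above can be sketched as a toy model. It is only an illustration under stated assumptions: the update rule (add a fixed increment to the activated mapping, then renormalize so that the family's total strength stays constant), the function name `activate`, the `rate` value, and the starting strengths are all hypothetical, and none of them is taken from HMMT or from the memory literature cited above.

```python
def activate(weights, used, rate=0.2):
    """Strengthen the activated mapping, then renormalize the family.

    Renormalizing keeps total strength at 1.0, so every boost to one
    mapping reduces the relative strength of its "sibling" mappings.
    """
    boosted = {k: w + (rate if k == used else 0.0) for k, w in weights.items()}
    total = sum(boosted.values())
    return {k: w / total for k, w in boosted.items()}

# A superordinate family with two sibling mappings, initially equal.
family = {"HEIGHT-PITCH": 0.5, "THICKNESS-PITCH": 0.5}

# Long-term habit: linguistic metaphors repeatedly activate one mapping.
habitual = family
for _ in range(10):
    habitual = activate(habitual, "HEIGHT-PITCH")

# Brief training: the weakened sibling is activated repeatedly instead.
retrained = habitual
for _ in range(10):
    retrained = activate(retrained, "THICKNESS-PITCH")

print(habitual["HEIGHT-PITCH"] > habitual["THICKNESS-PITCH"])
print(retrained["THICKNESS-PITCH"] > retrained["HEIGHT-PITCH"])
```

In this toy version the weakened mapping never drops to zero, which is why a short run of activations is enough to make it the stronger one again.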
The process of strengthening and weakening specific mappings within superordinate families can account for several otherwise mysterious properties of mental metaphors. First, this hierarchical model can explain how mental metaphors can be grounded in universal source–target relationships in the natural world and yet be variable across people. Even if superordinate families are universal, the specific mappings that get used most frequently or automatically can vary across individuals and groups. On this view, there is no single answer to the question, “Are our mental metaphors universal?”
Second, HMMT can explain how mappings can change rapidly in response to new patterns of experience. In the examples that will be reviewed below, participants are implicitly using “new” mappings that differ from – and in some cases directly contradict – the mappings they ordinarily use, after only brief experimental interventions. This rapid change is possible because the “new” mappings introduced during the experiment are not really new; rather, they are members of the same superordinate family as the mappings participants normally use and can become strengthened through repeated use to the point that they are (at least temporarily) stronger than the mappings that participants ordinarily use.2
Here I will focus on the three sets of mental metaphors mentioned above, whose use is conditioned by different streams of physical and social experience. Spatial representations of a particular dimensionality or directionality serve as the metaphorical source domain that structures people’s mental representations in the target domains of MUSICAL PITCH, TIME, and EMOTIONAL VALENCE. These target domains are more “abstract” than the domain of space insomuch as they are impossible to see or touch; in the cases of time and valence, they are impossible to experience through any of the five senses. That is, we can see the spatial length of a rope or the height of a ladder, but we can never see the length of a vacation (i.e. its duration) or the height of a musical note (i.e. its auditory frequency).3
Spatializing these non-spatial domains in our minds may make our experiences of pitch, time, and valence easier to imagine, compare, or remember. It may be a human universal to conceptualize these domains in terms of space (cf. Eitan & Timmers Reference Eitan and Timmers2010; Whorf Reference Whorf and Carroll1956), but the particulars of these spatial representations vary across groups of people, according to the particulars of their linguistic, cultural, or bodily experiences. The mechanism underlying all of these effects of experience, I will suggest, is the same: Habitual experiences cause a certain mental metaphor to be activated frequently, strengthening this source–target association in memory, at the expense of competing associations within the same family of mappings.
4 Spatial Representations of Musical Pitch: Universals and Language-Specificity
In many languages, pitch is metaphorized in terms of vertical space: High-frequency pitches are “high” and low-frequency pitches “low.” But this is not the only possible spatial metaphor for pitch. In other languages like Farsi, Turkish, and Zapotec, high-frequency pitches are “thin” and low-frequency pitches are “thick” (Shayan et al. Reference Shayan, Ozturk and Sicoli2011). Beyond talking about pitch using spatial words, do people think about pitch using spatial representations? Several studies suggest that speakers of “height languages” like English activate vertical SPACE–PITCH mappings when judging pitches (e.g. Pratt Reference Pratt1930; Roffler & Butler Reference Roffler and Butler1968). In one set of experiments, Dolscheid et al. (Reference Dolscheid, Shayan, Majid and Casasanto2013) investigated (a) whether people still activate SPACE–PITCH associations even when they are not using language, and (b) whether speakers of “height languages” and “thickness languages” tend to use the same non-linguistic SPACE–PITCH associations, or whether their mental metaphors for pitch are shaped by their experience of using linguistic metaphors.
Like English, Dutch describes pitches as hoog (‘high’) or laag (‘low’), but in Farsi, high pitches are ‘thin’ (nāzok) and low pitches are ‘thick’ (koloft), as already noted. Dolscheid et al. (Reference Dolscheid, Shayan, Majid and Casasanto2013) tested Dutch and Farsi speakers on a pair of non-linguistic pitch reproduction tasks in which participants were asked to reproduce the pitches of tones that they heard in the presence of irrelevant spatial information: either lines that varied in their height (height interference task) or their thickness (thickness interference task). Dutch speakers’ pitch estimates were strongly affected by irrelevant spatial height information. On average, a given tone was sung back higher when it had been accompanied by a line that was high on the computer screen, and lower when it had been accompanied by a line that appeared low on the screen. By contrast, lines of various thicknesses had no measurable effect on Dutch participants’ pitch estimates. Farsi speakers showed the opposite pattern of results. Lines of varying heights had no measurable effect on Farsi speakers’ pitch estimates, but tones accompanied by thin lines were sung back higher, and tones accompanied by thick lines were sung back lower.
4.1 Differences in Nonlinguistic Pitch Representations Not Due to Verbal Labeling during the Task
The pattern of spatial interference on people’s pitch reproduction performance reflected the SPACE–PITCH metaphors in their native languages: Dutch speakers could not help incorporating irrelevant height information into their mental representations of pitch (but could ignore thickness), whereas Farsi speakers could not help incorporating irrelevant thickness information into their mental representations of pitch (but could ignore height). This pattern cannot be explained by differences in overall accuracy of pitch reproduction, or by differences in musical training between groups.
Importantly, this pattern also cannot be explained by participants using language covertly during the task: labeling the pitches they needed to reproduce as “high/low” or “thick/thin.” This explanation was ruled out by the experimental design, in which nine different pitches were paired with each of nine different spatial heights or thicknesses. This “crossing” of all of the levels of pitch and of space meant that variation in each domain was orthogonal to variation in the other: There was no correlation between space and pitch in the stimuli. As such, covertly labeling high pitches as “high” (or “thin”) and labeling low pitches as “low” (or “thick”) could not produce the observed effects of space on pitch reproduction; on the contrary, labeling pitches using the spatial metaphors in one’s native language during the task could only work against the effects we predicted and found.
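The orthogonality of the crossed design can be checked with a short sketch. The nine-by-nine crossing comes from the description above; coding the levels as the integers 1 to 9 is an assumption made purely for illustration, since the actual pitch and spatial values are not given here.

```python
from itertools import product

# Assumed ordinal coding of the nine levels in each domain.
pitch_levels = range(1, 10)
space_levels = range(1, 10)

# Full crossing: every pitch is paired with every spatial level once.
trials = list(product(pitch_levels, space_levels))

def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

pitches = [p for p, _ in trials]
spaces = [s for _, s in trials]
r = pearson(pitches, spaces)
print(len(trials), abs(r) < 1e-12)
```

Because the stimuli carry no correlation between space and pitch, a label derived from the pitch alone cannot predict the spatial value shown on a given trial.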
Rather than an effect of using language “online” during the task, these experiments show an effect of people’s previous experience using either one linguistic metaphor or the other, and thereby strengthening either one mental metaphor or the other (i.e. strengthening a non-linguistic HEIGHT–PITCH or THICKNESS–PITCH mapping in memory). To confirm that the observed effects did not depend on participants covertly labeling pitches during the task, Dolscheid et al. (Reference Dolscheid, Shayan, Majid and Casasanto2013) repeated the height interference task in Dutch speakers with the addition of a concurrent verbal suppression task. On each trial of the task, participants had to rehearse a novel string of digits while perceiving and reproducing the pitches. Secondary tasks like this have been used across many experiments to prevent participants from labeling the stimuli (e.g. Winawer et al. Reference Winawer, Witthoft, Frank, Wu, Wade and Boroditsky2007). As predicted, verbal suppression had no effect on the results of the pitch reproduction task. Dutch speakers still showed strong height–pitch interference, consistent with an “offline” effect of participants’ previous experience using language on their subsequent non-linguistic pitch representations (see also Casasanto Reference Casasanto2008b).
4.2 Does Using Different Linguistic Metaphors Cause People to Use Different Mental Metaphors?
The results reviewed so far show a correlation between people’s linguistic metaphors and their non-linguistic mental metaphors, but they do not provide any evidence that language causes Dutch and Farsi speakers to mentally represent pitch differently. Dolscheid et al. (Reference Dolscheid, Shayan, Majid and Casasanto2013) reasoned that if using THICKNESS–PITCH metaphors in language is what causes Farsi speakers to activate THICKNESS–PITCH mappings implicitly when reproducing pitches, then exposing Dutch speakers to similar THICKNESS–PITCH metaphors in language should cause them to reproduce pitches like Farsi speakers. A new sample of Dutch speakers was recruited and assigned to one of two training conditions: Half of the participants, in the “thickness training” group, learned to describe pitches using Farsi-like metaphors (e.g. “a tuba sounds thicker than a flute”), whereas the other half, in the “height training” group (i.e. the control group), described pitches using standard Dutch metaphors (e.g. “a tuba sounds lower than a flute”). After about 20 minutes of this linguistic training, participants in both groups performed the non-linguistic thickness interference task described above. Whereas “height-trained” participants showed no effect of irrelevant thickness information on their pitch estimates, “thickness-trained” participants showed a thickness interference effect that was statistically indistinguishable from the effect found in native Farsi speakers. Even a brief (but concentrated) “dose” of thickness metaphors in language was sufficient to influence Dutch speakers’ mental metaphors, demonstrating that linguistic experience can cause the differences in non-linguistic pitch representations found across natural language groups.
Notably, Dolscheid et al. (Reference Dolscheid, Shayan, Majid and Casasanto2013) also included a training condition in which Dutch-speaking participants were taught to use “thick” and “thin” to describe pitches in a way that contradicts both the linguistic mappings in languages like Farsi and the relationships between thickness and pitch in the natural world (e.g. they learned to use expressions like “a tuba sounds thinner than a flute”). This training intervention was identical to the Farsi-like “thickness training” condition described above in every way except for the pairing of high/low with thick/thin. Although participants learned to use the “reverse-Farsi” mapping with high accuracy (95%), this linguistic training had no effect on their non-linguistic mental representations of pitch.
The contrast between the Farsi-like and reverse-Farsi training supports a prediction of HMMT: People should be able to adopt mappings that are included in a superordinate family of mappings (e.g. the SPACE–PITCH mappings that are evident in the natural world) more easily than they can adopt mappings that are not included in the superordinate family (i.e. mappings that run contrary to, or orthogonal to, the source–target mappings found in the natural world).
4.3 When Does Language Shape SPACE–PITCH Mappings?
The results reviewed up to this point leave open the question: Do SPACE–PITCH metaphors in language cause people to develop the corresponding non-linguistic SPACE–PITCH mappings, or does using linguistic metaphors change how likely people are to use a preexisting mental metaphor? To evaluate these possibilities, Dolscheid et al. (Reference Dolscheid, Hunnius, Casasanto and Majid2014) tested 4-month-old infants on a pair of space–pitch congruity tasks. Infants heard pitches alternately rising and falling while they saw a ball rising and falling on a screen (height congruity task) or a cylinder growing thicker and thinner (thickness congruity task). For half of the trials, changes in pitch and space were congruent with the HEIGHT–PITCH and THICKNESS–PITCH mappings encoded in Dutch and Farsi, respectively, and for the other half of the trials they were incongruent with these SPACE–PITCH mappings. The data showed that infants looked longer at congruent SPACE–PITCH displays than at incongruent displays. This was true both in the height congruity condition (consistent with an earlier experiment by Walker et al. Reference Walker, Bremner, Mason, Spring, Mattock and Slater2010) and in the thickness congruity condition. There was no difference in the magnitude of the congruity effect between conditions, suggesting that there was no difference in the strength of the HEIGHT–PITCH and THICKNESS–PITCH mappings in the infants’ minds.
Four-month-olds are completely unable to produce SPACE–PITCH metaphors in language and are also, presumably, unable to understand them. Yet, they are sensitive to two of the SPACE–PITCH metaphors that are found in languages like Dutch and Farsi, and in their speakers’ non-linguistic pitch representations. These results suggest that people who use different linguistic metaphors for pitch come to think about pitch differently, not because language instills in them one spatial mapping instead of the other, but rather because language strengthens one of their preexisting SPACE–PITCH mappings, at the expense of the other.
4.4 Hierarchical Construction of Spatial Metaphors for Pitch
How could infants who are sensitive to both HEIGHT–PITCH and THICKNESS–PITCH mappings turn into adults who appear to activate only one of these mappings when they represent pitch? This process can be understood in terms of HMMT. First a superordinate “family” of mappings is established, which in the case of space and pitch includes both the HEIGHT–PITCH and THICKNESS–PITCH mappings. These mappings may be constructed, over either ontogenetic or phylogenetic time, on the basis of observable correlations between space and pitch in the natural world. The HEIGHT–PITCH mapping reflects the fact that people involuntarily raise their larynxes, chins, and sometimes other body parts (e.g. their eyebrows) when they produce higher pitches, and lower them when they produce lower pitches. It also reflects a statistical tendency for higher pitches to originate from higher locations, and lower pitches from lower locations (Parise et al. Reference Parise, Knorre and Ernst2014). The THICKNESS–PITCH mapping reflects a pervasive correlation between pitches and the size of the objects or creatures that produce them: Consider the different pitches produced by, for example, strumming thin vs. thick strings on a guitar; banging on a large steel oil drum vs. a small steel can; barking by a small dog vs. a big dog. Although Dolscheid et al.’s (Reference Dolscheid, Hunnius, Casasanto and Majid2014) data confirm that both the HEIGHT–PITCH and THICKNESS–PITCH mappings are present in infants’ minds, they leave open the question of exactly how and when these mappings become established initially.
Whatever the ultimate origin of SPACE–PITCH mappings in pre-linguistic children may be, when children learn metaphors in language, a second process begins. The findings in adults reported by Dolscheid and colleagues (Reference Dolscheid, Shayan, Majid and Casasanto2013) suggest that each time people use a linguistic metaphor like “a high pitch” they activate the corresponding mental metaphor, strengthening this mapping at the expense of competing mappings in the same family of SPACE–PITCH associations. As a consequence, speakers of height languages like Dutch and English come to rely on vertical spatial schemas to scaffold their pitch representations more strongly than on multidimensional spatial schemas, whereas speakers of thickness languages like Farsi come to rely on multidimensional spatial schemas more strongly than on vertical spatial schemas.
According to HMMT, the process of strengthening certain mental metaphors via the use of the corresponding linguistic metaphors results in the weakening of other members of the family of mappings – but this does not cause these dispreferred mappings to be extinguished. This aspect of the theory may explain the representational flexibility demonstrated in the training experiment by Dolscheid and colleagues (Reference Dolscheid, Shayan, Majid and Casasanto2013). Dutch speakers could be induced to use a non-linguistic THICKNESS–PITCH mapping (like Farsi speakers) after only a brief training intervention because no spatial mappings had to be created or destroyed; rather, the new pattern of language experience boosted the strength of the thickness–pitch mapping that had presumably been present in the Dutch speakers’ minds since infancy, causing them to think about pitch in a way that was not new, just rarely used.
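One way to make this competition concrete is a toy simulation. Nothing below comes from Dolscheid and colleagues or from a published formalization of HMMT: the normalized-weight scheme, the learning rate, the number of uses, and all names are assumptions, chosen only to illustrate a mapping being strengthened at the expense of its competitor without the competitor being extinguished.

```python
def use_mapping(weights, used, rate=0.05):
    """Strengthen the mapping `used`, then renormalize so its competitors weaken.

    weights: dict of mapping name -> strength, summing to 1 (hypothetical scheme).
    """
    boosted = dict(weights)
    boosted[used] += rate * (1.0 - boosted[used])
    total = sum(boosted.values())
    return {name: w / total for name, w in boosted.items()}

def simulate(weights, used, n_uses, rate=0.05):
    """Apply `n_uses` successive activations of one mapping."""
    for _ in range(n_uses):
        weights = use_mapping(weights, used, rate)
    return weights

# Assumed starting point: both mappings equally strong, as in the infant data.
infant = {"height": 0.5, "thickness": 0.5}

# Long exposure to height metaphors (2000 uses is an arbitrary number).
adult = simulate(infant, "height", 2000)
print(adult)    # height dominates; thickness is weak but still above zero

# Brief exposure to thickness metaphors (25 uses, also arbitrary).
trained = simulate(adult, "thickness", 25)
print(trained)  # thickness now outweighs height
```

In this sketch the rapid reversal follows from the diminishing-returns update rule, which is a modeling assumption rather than a claim of the theory. What the sketch shares with HMMT is that the dispreferred mapping is weakened but retained, so it can be boosted again without having to be rebuilt.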
5 Spatial Representations of Temporal Sequences: Universals and Culture-Specificity
Spatial metaphors for time are common across languages (Alverson Reference Alverson1994). In English, time appears to flow along a sagittal (front–back) axis: the future is “ahead” and the past is “behind.” No known spoken language uses the lateral (left–right) axis to talk about time conventionally: Monday comes before Tuesday, not to the left of Tuesday (Cienki Reference Cienki and Koenig1998). Yet, despite the total absence of left–right metaphors in spoken language, there is strong evidence that people implicitly associate time with left–right space and that the direction in which events flow along people’s imaginary lateral timelines varies systematically across cultures. In a seminal study (Tversky et al. Reference Tversky, Kugelmass and Winter1991), children and adults were asked to place stickers on a page to indicate where breakfast and dinner should appear relative to the lunch sticker, in the middle of the page. Whereas English speakers placed breakfast on the left and dinner on the right of lunch, Arabic speakers preferred the opposite arrangement. This cross-cultural reversal in the lateral SPACE–TIME mapping has been corroborated by reaction time tasks (e.g. in English vs. Hebrew speakers; Fuhrman & Boroditsky Reference Fuhrman and Boroditsky2010; Ouellet et al. Reference Ouellet, Santiago, Israeli and Gabay2010).
The sagittal mapping of time, enshrined in linguistic metaphors, has been proposed to arise from the canonical experience of moving forward through space (not backward or sideways) due to the construction of our feet, hands, and sensory organs, all of which are directed toward the front of our bodies (Clark Reference Clark and Moore1973). As we use these bodies to move forward through the world, objects that we will encounter in the future lie literally ahead of us, and objects we have already passed lie behind us. Thus, a correlation between anteriority and the future and between posteriority and the past is reinforced nearly every time we walk (or run, bike, drive, fly, etc.). But where does the lateral mapping of time come from?
The left–right mapping of temporal sequences has been hypothesized to arise from our experience with the written word. As we read or write, we move our eyes and attention through both space and time, from left to right for some orthographies (e.g. Roman script) and from right to left for others (e.g. Arabic script). In English, for each line of text we read we begin on the left side of a page (at an earlier time) and arrive at the right side (at a later time). Thus, reading English entails a correlation between “progress” through space and time, from left to right. To find out whether experience using one orthography or another is sufficient to determine the direction of the mental timeline, Roberto Bottini and I asked Dutch participants to perform a space–time congruity task on stimuli written in standard (left-to-right) Dutch orthography, mirror-reversed orthography, or orthography that was rotated either 90 degrees upward or downward (Casasanto & Bottini Reference Casasanto and Bottini2014b). When participants judged temporal phrases written in standard orthography, their reaction times were consistent with a rightward-directed mental timeline: Past-related phrases (e.g. “a day earlier”) were judged faster with the left hand, and future-related phrases (e.g. “a year later”) with the right hand. After a few minutes of exposure to mirror-reversed orthography, however, participants showed the opposite pattern of reaction times; their implicit mental timelines were reversed. When standard orthography was rotated 90 degrees upward or downward, participants’ mental timelines were rotated, accordingly.
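The logic of a space–time congruity effect can be sketched in a few lines. The trial format, the function name, and the reaction times below are placeholders invented for illustration; they are not data or analysis code from Casasanto and Bottini. The function scores each trial as congruent or incongruent with a rightward timeline (past on the left, future on the right) and returns the difference in mean reaction time.

```python
from statistics import mean

# Response pairings that are congruent with a rightward-directed timeline.
RIGHTWARD_CONGRUENT = {("past", "left"), ("future", "right")}

def congruity_effect(trials):
    """Mean RT (ms) on incongruent trials minus mean RT on congruent trials.

    Each trial is (phrase_type, hand, rt_ms), with phrase_type in
    {"past", "future"} and hand in {"left", "right"} (hypothetical format).
    """
    congruent, incongruent = [], []
    for phrase_type, hand, rt_ms in trials:
        if (phrase_type, hand) in RIGHTWARD_CONGRUENT:
            congruent.append(rt_ms)
        else:
            incongruent.append(rt_ms)
    return mean(incongruent) - mean(congruent)

# Placeholder reaction times, for illustration only.
trials = [
    ("past", "left", 600), ("past", "right", 650),
    ("future", "left", 660), ("future", "right", 610),
]
print(congruity_effect(trials))  # 50
```

A positive value is consistent with a rightward-directed timeline and a negative value with a leftward-directed one, so a reversal of the mental timeline shows up as a change in the sign of this difference rather than as its disappearance.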
5.1 Separating Effects of Language and Culture on SPACE–TIME Mappings
These data show that the experience of reading is sufficient to determine the direction of people’s implicit mental timelines (though they do not rule out the possibility that other culture-specific practices, such as gesturing about time or using calendars, could influence people’s lateral representations of time, as well). But why is this an effect of cultural experience as opposed to linguistic experience? Language can be considered an aspect of culture, and in many cases it is difficult to disentangle linguistic and non-linguistic practices. At first glance, reading might appear to be an ambiguous case, since it is a cultural overlay on natural language. Yet it is notable that reading is extremely recent and rare in human history; although reading may seem integral to language use in our culture, only a tiny fraction of all of the humans who have ever used language have been able to read. More to the point, in Casasanto and Bottini’s experiments, language was held constant across the orthography conditions. The words and phrases in natural language were invariant; all that changed was the direction and orientation of the orthography in which they appeared. Thus, changes in orthography determined the flow of time in people’s minds independently of any changes in the structure or content of language.
5.2 Hierarchical Construction of Spatial Metaphors for Temporal Sequence
How could a few minutes of exposure to a new orthography completely reverse people’s usual mental timeline, established over a lifetime of reading experience? As in the case of language, space, and pitch, HMMT may explain the flexibility of this culture-dependent mental metaphor. To elaborate, Casasanto and Bottini (Reference Casasanto and Bottini2014b) proposed that people’s implicit associations between source and target domains can be characterized, not just as families of mappings but alternatively as a set of nested intuitive hypotheses (Goodman Reference Goodman1983). At the top of the hierarchy is the overhypothesis, which comprises a family of specific hypotheses. In this case, the overhypothesis could be: “Progress through time corresponds to change in position along a linear spatial path.” This correspondence could be learned as children observe the relationship between space and time in moving objects, or it could be innate (Casasanto Reference Casasanto, Evans and Chilton2010; de Hevia et al. Reference de Hevia, Izard, Coubart, Spelke and Streri2014; Srinivasan & Carey Reference Srinivasan and Carey2010). Either way, the overhypothesized association between space and time is presumably universal across cultures, and it should be omnidirectional, since more time passes as moving objects travel farther in any direction.
Once children are exposed to cultural practices with consistent directionality, they accumulate a preponderance of evidence for one specific hypothesis. For Dutch children, reading and writing experience provides evidence for the specific hypothesis “Progress through time corresponds to rightward change in position along a linear spatial path,” strengthening this hypothesis at the expense of its competitors and causing Dutch speakers to use a rightward-directed mental timeline by default. Exposure to a different orthography in the experimental setting increased the weight of evidence for one of the participants’ overhypothesized (but culturally dispreferred) SPACE–TIME mappings, strengthening it to the point that it influenced behavior, transiently weakening their culturally preferred mapping as a consequence.
5.3 Spatial Representations of Emotional Valence: Universals and Body-Specificity
Several years ago I proposed a theory of bodily relativity (Casasanto Reference Casasanto2009a), by analogy to the theories of linguistic and cultural relativity. By hypothesis, the contents of our minds are constructed, in part, through our physical interactions with the environment. People with different kinds of bodies interact with their environment in systematically different ways. Therefore, myriad aspects of their thinking should vary relative to the particulars of their bodies. The spatial mapping of emotional valence has provided one fruitful testbed for this proposal (for reviews, see Casasanto Reference Casasanto2011, Reference Casasanto2014a).
Across languages and cultures, good things are often associated with the right side of space and bad things with the left. This association is evident in positive and negative idioms like “my right-hand man” and “two left feet.” Beyond language, people also conceptualize good and bad in terms of left–right space, but not always in the way linguistic and cultural conventions suggest. Rather, people’s implicit associations between space and valence are “body specific.” When asked to decide which of two products to buy, which of two job applicants to hire, or which of two alien creatures looks more trustworthy, right- and left-handers respond differently. Right-handers tend to prefer the product, person, or creature presented on their right side but left-handers tend to prefer the one on their left (Casasanto Reference Casasanto2009a). This pattern persists even when people make judgments orally, without using their hands to respond. Children as young as 5 years old already make evaluations according to handedness and spatial location, judging animals shown on their dominant side to be nicer and smarter than animals on their non-dominant side (Casasanto & Henetz Reference Casasanto and Henetz2012).
Beyond the laboratory, the association of ‘good’ with the dominant side can be seen in left- and right-handers’ spontaneous speech and gestures (Casasanto & Jasmin Reference Casasanto and Jasmin2010): In the final debates of the 2004 and 2008 US presidential elections, positive speech was more strongly associated with right-hand gestures and negative speech with left-hand gestures in the two right-handed candidates (George W. Bush, John Kerry), but the opposite association was found in the two left-handed candidates (John McCain, Barack Obama).
In summary, a body-specific mental metaphor links lateral space and emotional valence, and influences the way people think and communicate about positive and negative ideas. The observed SPACE–VALENCE mappings cannot be explained by influences of language or culture since, in the case of the GOOD-IS-LEFT mapping in left-handers, the implicit mental metaphor goes against the explicit GOOD-IS-RIGHT mapping enshrined in linguistic idioms and other cultural conventions (e.g. raising the right hand to swear to tell the truth).
5.4 Experiential Basis of Lateral SPACE–VALENCE Mappings
Where do body-specific SPACE–VALENCE mappings come from? All of the results reviewed so far demonstrate correlations, but Casasanto (Reference Casasanto2009a) proposed a causal relationship between the way people use their hands and the way they implicitly spatialize ‘good’ and ‘bad.’ In general, greater motor fluency leads to more positive feelings and evaluations: People like things better when they are easier to perceive and interact with (e.g. Ping et al. Reference Ping, Dhillon and Beilock2009). Bodies are lopsided. Most of us have a dominant side and a non-dominant side and therefore interact with the physical environment more fluently on one side of space than on the other. As a consequence, right-handers, who interact with their environment more fluently on the right and more clumsily on the left, come to implicitly associate ‘good’ with ‘right’ and ‘bad’ with ‘left,’ whereas left-handers form the opposite association (Casasanto Reference Casasanto2009a).
To test this proposal, Evangelia Chrysikou and I studied how people think about ‘good’ and ‘bad’ after their dominant hand has been handicapped, either due to brain injury or to something much less extreme: wearing a bulky ski glove. In one experiment, right-handed university students performed a motor fluency task, arranging dominoes on a table while wearing a cumbersome glove on either their left hand (which preserved their natural right-handedness) or on their right hand (which turned them temporarily into left-handers, in the relevant regard). After about 12 minutes of lopsided motor experience, participants removed the glove and performed a test of SPACE–VALENCE associations, which they believed to be unrelated. Participants who had worn the left glove still thought RIGHT was GOOD, but participants who had worn the right glove showed the opposite LEFT-IS-GOOD bias, like natural lefties (Casasanto & Chrysikou Reference Casasanto and Chrysikou2011).
5.5 Hierarchical Construction of Spatial Metaphors for Valence
Even a few minutes of altered motor experience can change people’s implicit associations between space and emotional valence, causing a reversal of their usual judgments. HMMT provides a potential explanation of this representational flexibility. In the case of mental metaphors linking lateral space and valence, the overhypothesis may be: “The fluent region of space is good.” For right-handers, who act more fluently on the side of their dominant hand, typical motor experience provides a preponderance of evidence for the specific hypothesis that “the right side of space is good,” whereas typical motor experience for left-handers increases the strength of the evidence for the hypothesis that “the left side of space is good.” In terms of memory networks, this means that either the association between ‘right’ and ‘good’ or the association between ‘left’ and ‘good’ is strengthened at the expense of the competing associations – which are weakened but not lost and can therefore be strengthened again by new patterns of motor experience.
6 Hierarchical Construction of Language-, Culture-, and Body-Specific Mental Metaphors
The case studies described in this chapter illustrate ways in which mental metaphors can be constructed via similar mnemonic processes driven by different kinds of linguistic, cultural, or bodily experiences. Early in ontogenetic time (or perhaps over phylogenetic time), families of source–target mappings are constructed, which reflect sets of observable relationships between source and target domains in the natural world. Specific members of these families are then strengthened according to an individual’s language-specific, culture-specific, or body-specific experiences. As a result, other mappings in these families are weakened. This process of competition among mappings in long-term memory explains why all of the mappings within a given family are not active at once: for example, why adult Dutch speakers do not typically conceptualize pitch using representations of both height and thickness, even though both mappings appear to be equally active in infants’ minds (Dolscheid et al. Reference Dolscheid, Shayan, Majid and Casasanto2013; Dolscheid et al. Reference Dolscheid, Hunnius, Casasanto and Majid2014).
The hierarchical structuring of mental metaphors in terms of source–target families and their constituent members may also explain how spatial source–target mappings can be fundamental to our conceptions of non-spatial domains, yet also remarkably flexible. It is notable, for example, that about 5 minutes of exposure to mirror-reversed writing did not simply modulate the direction of participants’ mental timeline; it completely reversed its direction, as indicated by a reversal of reaction-time congruity effects (Casasanto & Bottini Reference Casasanto and Bottini2014b). Had space–time congruity effects been extinguished by exposure to the new orthography, this would have been consistent with the possibility that people cease to represent time spatially when their preferred mental metaphor is challenged. Instead, the reversal of these effects indicates that participants did not abandon a spatial mapping of time; rather they rapidly adopted a different mental timeline, consistent with their new orthographic experience.
How are the various members of an overhypothesized family of mappings preserved in long-term memory, even though some of the mappings may never be reinforced explicitly (e.g. by the use of corresponding linguistic metaphors or by cultural conventions)? Presumably they are maintained by the recurrence of the same sorts of physical experiences that are ultimately responsible for the family’s construction. Even in a language group that exclusively uses height metaphors for pitch in language, physical thickness–pitch relationships can still be observed (e.g. in the sound of thick vs. thin guitar strings). Even in a left-to-right reading culture, it should still be possible to observe moving objects progressing through space and time from right to left (and in all other directions). Even right-handers occasionally experience greater fluency with their left hand, or on their left side of space. The preservation of dispreferred mappings explains why they can be adopted so quickly when people are given new patterns of experience.
By seeking to understand common mechanisms by which our mental metaphors are shaped by language-specific, culture-specific, or body-specific patterns of experience, we can better understand the origins of our thoughts, the extent of cognitive diversity, and the dynamism of our mental lives.
1 Metaphorical Directionality: Introduction
Verbal metaphors are fundamentally directional in that terms referring to more accessible (“source”) domains are commonly used to talk about less accessible (“target”) domains, but not the other way around (e.g. Lakoff & Johnson Reference Lakoff and Johnson1980, Reference Lakoff and Johnson1999). For example, a conventional metaphor people use in talking about SOCIAL RELATIONS (a less accessible domain) employs expressions referring to TEMPERATURE (a more accessible domain), as in “She is a warm person.” However, inverse mappings, in which we talk about temperature in terms of social relations, as in “This is a kind/friendly temperature,” are not conventional and are hard to interpret.
Conceptual Metaphor Theory (CMT) claims that the directionality of verbal metaphors (especially those labelled as “primary”) reflects an underlying conceptual directionality in which more accessible (typically, more concrete) source domains provide the conceptual structure for conceptualizing less accessible (typically, more abstract) target domains. Thus, SOCIAL RELATIONS are thought of in terms of TEMPERATURE, and not merely talked about in that way (cf. Grady Reference Grady1997a; Lakoff & Johnson Reference Lakoff and Johnson1980, Reference Lakoff and Johnson1999). It is further maintained that this “conceptual directionality” is rooted in our bodily interaction with the environment, in which cross-domain associations originate in the frequent correlation (and resulting experiential conflation) of subjective (abstract) and sensorimotor (concrete) experiences, such as the co-occurrence of infants’ experience of affection and the physical warmth of their caretaker’s body (Grady Reference Grady1997a; Grady & Ascoli, this volume; Grady & Johnson Reference Grady and Johnson2002; Lakoff & Johnson Reference Lakoff and Johnson1999).
Regarding the origins of verbal metaphors as ultimately rooted in our bodily experience has become a central tenet of embodied-cognition approaches to metaphor (e.g. Boroditsky Reference Boroditsky2000; Casasanto & Boroditsky Reference Casasanto and Boroditsky2008; Lakoff & Johnson Reference Lakoff and Johnson1980, Reference Lakoff and Johnson1999; Williams et al. Reference Williams, Huang and Bargh2009), which propose that “our conceptual system recycles concrete concepts to help understanding the abstract. Representations of location, motion, size, color, brightness, weight, smell, temperature, and other perceptually based dimensions of experience are used to understand more abstract concepts as if, at least in part, the latter were examples of such concrete experiences” (Santiago et al. Reference Santiago, Ouellet, Román and Valenzuela2012: 1051).
2 Verbal vs. Conceptual Metaphors
2.1 Experimental Evidence for Conceptual Metaphors: The Directionality Problem
Until recently CMT relied primarily on linguistic evidence for the consistency of metaphorical expressions in language. Given that the occurrence of such clusters of expressions as “warm person,” “chilly reception,” and “cold shoulder” is unlikely to be a mere coincidence, it was suggested that such metaphors do not constitute an arbitrary group of isolated expressions but are rather the linguistic manifestations of a single, implicit mapping between concepts from different domains (e.g. TEMPERATURE and INTERPERSONAL RESPONSIVENESS), that is, a “conceptual” (or, more specifically, a “primary”) metaphor (Grady Reference Grady1997a; Lakoff & Johnson Reference Lakoff and Johnson1980). In other words, the existence of a pre-/non-linguistic metaphorical mapping was deduced from the regularity of mappings in some groups of verbal metaphors, an argument that was criticized for its circularity (cf. e.g. Murphy Reference Murphy1996).
In recent years, however, various psycho-physical studies in the area of embodied cognition have provided a growing body of psycholinguistic/experimental evidence in support of the theory, as manipulating one domain was found to implicitly affect the perception of another, metaphorically related one. For example, participants who held a warm (rather than a cold) beverage tended to judge target individuals as having a “warmer” (i.e. ‘more friendly’) personality, in accordance with the hypothesized conceptual (/primary) metaphor AFFECTION IS WARMTH (Williams & Bargh Reference Williams and Bargh2008a; see also Citron & Goldberg Reference Citron and Goldberg2014). In another study, participants holding a heavy (rather than a light) clipboard tended to judge a currency to be more valuable, in accordance with the conceptual metaphor IMPORTANT IS HEAVY (Jostmann et al. Reference Jostmann, Lakens and Schubert2009). These and similar studies were interpreted by CMT advocates as evidence for the extension of metaphorical relations beyond the realm of language and, consequently, as support for the CMT model (e.g. Gibbs Reference Gibbs2014b: 175–177; Lakoff Reference Lakoff2014).
However, these recent psycho-physical studies have also uncovered a puzzling discrepancy between the conventional verbal instantiations of such (primary) metaphors as AFFECTION IS WARMTH or IMPORTANT IS HEAVY and their conceptual/mental counterparts: In contrast to the clear unidirectionality of verbal metaphors, behavioral research on the kinds of cross-domain associations underlying primary metaphors in CMT consistently reveals bidirectional effects (IJzerman & Koole Reference IJzerman and Koole2011; IJzerman & Semin Reference IJzerman and Semin2009; Lee & Schwarz Reference Lee and Schwarz2012). While evidence accumulated that manipulating a concrete domain affects individuals’ judgments in the correlated abstract domain – in accordance with the predictions of CMT – the reverse pattern was also found. For example, thinking about pleasant or unpleasant social situations likewise tends to alter judgments of room temperature as being warmer or colder, respectively (Zhong & Leonardelli Reference Zhong and Leonardelli2008). Following the logic of the original study, this finding corresponds to the non-hypothesized conceptual metaphor *WARMTH IS AFFECTION, which defies the regular concrete-to-abstract pattern and has no conventionalized verbal equivalent.
The WARMTH-AFFECTION case is not an isolated example, as this bidirectional effect extends to many other associations between abstract (non-perceptual) and sensorimotor (perceptual) experiences (of the kind hypothesized to underlie primary metaphors), including moral judgment and physical cleanliness, verticality and positive or negative mood, importance and weight, and many others (Schneider et al. Reference Schneider, Rutjens, Jostmann and Lakens2011; Zhong & Liljenquist Reference Zhong and Liljenquist2006; for an overview, see Landau et al. Reference Landau, Meier and Keefer2010). In contrast, the corresponding verbal metaphors exhibit a robust unidirectionality, e.g. moral judgment is verbally described in terms of physical cleanliness but not vice versa, mood is described in terms of verticality but not the other way around, importance in terms of weight, and so forth (see also Grady & Ascoli, this volume).
We hold that the discrepancy between the unidirectionality of verbal metaphors and the apparent bidirectionality found in psycho-physical studies poses a serious problem for CMT, especially concerning primary metaphors. As CMT maintains that abstract concepts/domains are “embodied,” i.e. conceptualized via bodily experiences, it predicts that manipulating the experienced source concept should impact the representation of the target concept, but not vice versa. To return to our example, if affection is represented in terms of temperature, then changing experiences of physical warmth should affect judgments about affection/empathy. Since physical warmth, in turn, is not conceived in terms of affection, however, perceiving it should not be affected by our social feelings.
In sum, despite CMT’s long-standing claims that verbal metaphors are mere reflections of underlying conceptual mappings, the reported psycho-physical findings (taken as reliable indications of the properties of their mental/conceptual counterparts) suggest that there is a clear difference between the two, with the former being unidirectional and the latter bidirectional.
2.2 Previous Attempts to Cope with the Directionality Problem
Although largely overlooked, a small number of researchers have noticed this discrepancy and the challenge it poses to CMT. While Landau, Meier, and Keefer (Reference Landau, Meier and Keefer2010: 364) “hold out the hope that future research can resolve this issue while preserving the benefits of a metaphor-enriched perspective,” IJzerman and Koole (Reference IJzerman and Koole2011) as well as Schneider and colleagues (Reference Schneider, Rutjens, Jostmann and Lakens2011) take these findings as refuting the CMT notion of conceptual metaphors with a clear target–source distinction altogether; they do not attempt, however, to account for the directionality of verbal metaphors.
In contrast, Lee and Schwarz (Reference Lee and Schwarz2012) claim that the discrepancy can be fully accommodated within Lakoff’s paradigm, relying on (a) the distinction between the linguistic and the psychological consequences of conceptual metaphors; (b) the distinction between representational structure and online processing; and (c) the distinction between the two mechanisms involved in the formation of metaphors according to Lakoff and Johnson (Reference Lakoff and Johnson1999: 55–57), “co-activation” and “projection.”
The first two of these distinctions explore the discrepancy rather than fully explain it. No attempt is made, for example, to address questions such as why there should be a difference between the linguistic and psychological consequences of conceptual metaphors, and, in particular, why these two should differ in directionality. The third distinction presents a more convincing argument. According to Lakoff and Johnson (Reference Lakoff and Johnson1999: 48–49), early life experience involves repeated conflation between different domains – for example, between the feeling of affection while being held or hugged as an infant and the physical warmth of the caretaker’s body. These experiential correlations may lead to neural co-activation of the two domains, a bidirectional process by which experience in one domain activates the other and vice versa. In turn, these correlation-based associations form the basis for metaphors such as AFFECTION IS WARMTH, in which properties are projected from one (concrete) domain onto another (abstract) one. According to Lee and Schwarz’s proposal, it is possible that both mechanisms remain active so that aside from the unidirectionality of verbal metaphors that derives from the underlying process of projection, there may also be a bidirectional metaphorical effect that results from an underlying (bidirectional) process of co-activation.
However, Lee and Schwarz’s proposal does not constitute a full account of the issue at stake, since the very distinction between the two processes (which we fully agree with) does not provide an answer to such questions as: What is the relation between the two processes? How does one shift from one process to the other? And, most importantly, why do verbal metaphors reflect the (unidirectional) process of projection rather than the (bidirectional) process of co-activation? Lakoff (Reference Lakoff2014) himself, apart from pointing to a possible solution to Lee and Schwarz’s findings in neural terms, concludes that their experiment points to the need to better understand the difference between metaphorical mapping and experimental effect.
To conclude, past references to the conflict between the CMT predictions and the empirical findings on directionality have led to unresolved attempts to reconcile the two, dismissal of the reliability of psycho-physical findings as “psychological consequences” of the original conceptual metaphors, or even to the rejection of CMT altogether. In contrast, we propose an alternative account that not only provides a simpler solution for the discrepancy but also allows us (i) to accept the experimental findings reported above as reliable representations of metaphorical relationships at the conceptual level, and (ii) to largely preserve CMT’s central claim that verbal metaphors are reflections of a deeper pre-linguistic/conceptual relationship. The solution we propose relies on the commonly overlooked role of a central factor that distinguishes the conceptual level from the verbal one, namely, language itself.
3 Verbal Metaphors and Conceptual Associations
3.1 Two Phases of Metaphor Processing
We agree with the CMT model that verbal metaphors originate from a conceptual pre-linguistic level so that the relation between ‘affection’ and ‘warmth,’ for example, as represented by expressions such as “She is a warm person,” does indeed reflect an underlying implicit association between the two domains. We disagree, however, with CMT’s claim that this implicit pre-linguistic association represents the same type of metaphorical relation as the corresponding metaphorical expressions. Instead, we claim that it represents a simpler, more symmetric type of linkage in which the two domains are simply “associated,” and which involves no attribution of “target” or “source.” We suggest that for this symmetrical “bare association” between the two domains to become a full-blown metaphor in which aspects of the source are unidirectionally “projected” onto the target, another process is required, in which language plays an essential role. In other words, we claim that the unidirectionality of metaphorical expressions in language is not a direct reflection of a corresponding unidirectionality of pre-linguistic metaphorical associations. Instead, we suggest that language is (at least partially) the generator of this unidirectionality, by turning a latent (conceptual) asymmetry between the two concepts/domains into an active one. As noted, our suggestion relies on the separation of metaphor processing into two different phases to be elaborated on next: “bare concept/domain association” and “directional projection.”
Bare concept/domain association is the initial phase of metaphorical association, which is derived from experiential correlation between two domains and corresponds to Lakoff and Johnson’s suggested mechanism of co-activation. We use the word “bare” to emphasize that at this phase the two domains (e.g. ‘affection’ and ‘warmth’) are not assigned target and source functions but are merely associated with each other. Crucially, this bare association between the two domains is, in principle, bidirectional, in that each domain is associated with the other and can therefore prompt or interfere with it, as the aforementioned experiments show. We agree with CMT that this basic association between the two domains then motivates a more developed type of relationship which is expressed in full-blown metaphors such as AFFECTION IS WARMTH. However, we do not claim that the shift from this initial phase to the more developed one is triggered automatically, probably in early childhood (see Lakoff & Johnson Reference Lakoff and Johnson1999: 45–59). Rather, we claim that this bare association remains the prominent type of connection between metaphorically related domains – unless directionality is externally triggered, which occurs when the relationship between the two domains is expressed in language. Put simply, we propose that instead of assuming full-blown metaphors at the conceptual level, it would be more accurate to describe the pre-linguistic level as consisting of “conceptual (bidirectional) associations.”
Directional projection refers to the process of mapping one domain onto the other. This type of relation is built on the simpler, bare association between the two domains and includes the additional components of (i) assigning the TARGET and SOURCE functions to the two domains and (ii) projecting some properties from the source to the target domain, which results in the conceptualization of the target in terms of the source. This relation is a unidirectional one, in that properties of the source are used to conceptualize the target, rather than the other way around.
Bearing in mind this separation between the two “types” of metaphor processing, we propose the following:
1. The bidirectional bare association between concepts from two domains is the initial phase of metaphor formation and comprehension, preceding unidirectional projection.
2. Language plays an essential role in changing this bare association into a directional relation of target and source.
Taken together, these two claims provide a fairly straightforward solution to the discrepancy discussed above: The bidirectionality found in the perceptual and sensorimotor experiments reflects the initial bidirectional relation between two domains. In contrast, the robust unidirectionality of verbal metaphors reflects the output of another, “higher” process, namely, the assignment of a target–source relation to the associated domains, whereby language plays an essential role.
This proposal has direct bearing on the debate about (the depth of) the level at which conceptual metaphors exist. The present proposal suggests a compromise between those who maintain the position that there are primary metaphors at a preverbal (conceptual or even embodied) deep level (Lakoff & Johnson Reference Lakoff and Johnson1980, Reference Lakoff and Johnson1999; this volume: Grady & Ascoli; Winter & Matlock; see also Casasanto’s “mental” metaphors) and those who challenge this position by arguing that there is no evidence for this (e.g. Keysar et al. Reference Keysar, Shen, Glucksberg and Horton2000; for conceptual metaphors generally, see also Murphy Reference Murphy1996).
3.2 Evidence for the Distinction between the Two Phases of Metaphor Processing
Since the empirical investigation of the conceptual level poses major methodological difficulties, the existence of the hypothesized bidirectional bare association between the two domains can only be deduced from converging pieces of evidence, e.g. through analytic reasoning (as done by CMT) and from the findings of relevant psycho-physical studies. However, there is another piece of evidence for the existence of two distinct mechanisms in the process of metaphorical comprehension, which seem much like the types we are proposing, that is, an initial symmetrical relation followed by a more developed, directional one.
For example, Wolff and Gentner (Reference Wolff and Gentner2011: 1456–1457) managed to reconcile two of the most dominant (and competing) views regarding the nature of metaphor in recent years, to which they referred as the “emerging commonalities” and the “directional projection” views: While the former claims that metaphor relies on the detection of commonalities in different domains, the latter states that metaphor is a process of projection, in which properties from one domain (the source) are conveyed to the other (the target).1 Crucially, these two approaches differ in their view of directionality, since the process of projection is unidirectional by definition, but detecting commonalities is at least theoretically a bidirectional process.
Wolff and Gentner (Reference Wolff and Gentner2011) used a mix-deadline procedure (exposing subjects to the stimuli for either a short or a long time) to show that these two processes should not be taken as separate mechanisms, but rather as two stages of metaphor processing. Participants were asked to judge the comprehensibility of verbal (hence directional) A-IS-B metaphors, such as “some arguments are wars,” and their reversed versions “some wars are arguments,” under either short or long exposure times. In accordance with their model, forward and reversed orders did not differ in comprehensibility at the early stage of processing (600 ms), while at later stages forward metaphors were clearly more comprehensible than reversed metaphors (as demonstrated in numerous studies). These results indicate that, despite the common view of metaphor as inherently unidirectional, directionality is in fact attributed at a later stage of processing. As the researchers themselves point out (ibid.: 1480), their findings pose a problem for Lakoff’s view, which predicts directional processing from early stages (both explicit and implicit). However, these findings are consistent with our modified version of CMT, which also claims that there is a shift from symmetrical to unidirectional processing. According to our model, therefore, the first 600 ms in Wolff and Gentner’s study may be regarded as a “window” into a deeper conceptual level.
A few developmental studies provide further support for a “shift” from symmetrical to asymmetrical processing. For example, several studies indicated that children are insensitive to asymmetry in metaphors.2 Most notably, one study showed that although 4-year-olds respond better to grammatically asymmetrical similes – e.g. are more likely to identify the metaphorical “ground” (i.e. the relevant metaphorical property) of “a boat is like a leaf” than of “a boat and a leaf are alike” – they are insensitive to the order of the two concepts: e.g. “a leaf is like a boat” and “a boat is like a leaf” were equally easy to extract a ground from.3 These findings, too, support the notion that metaphorical directionality is a later development. Since young children do notice the asymmetry of grammatical structure, at least one possible explanation for these findings is that children at a young age simply do not use metaphorical projection but, rather, rely on the simpler association between the two concepts.
An even more long-term “developmental” shift was demonstrated in a study that compared cross-domain interference using human and non-human subjects (Merritt et al. Reference Merritt, Casasanto and Brannon2010), which found that while humans demonstrate asymmetrical space–time interference (with spatial information influencing temporal judgments much more than temporal information influenced spatial judgments), rhesus monkeys demonstrate symmetrical interference (with spatial information strongly influencing temporal judgments, and vice versa). As the researchers point out, one obvious (but not necessary) explanation for the difference between the two species may be the availability of language, which could have long-term effects on the creation of asymmetrical mappings (ibid.: 199).
Combined with the discrepancy between the characteristics of verbal metaphors and the findings of psycho-physical studies, these examples provide some empirical support for the first claim of our model, i.e. that under the directional type of metaphor processing there is a more basic and simpler, non-directional linkage between metaphorically related domains. We now introduce some evidence supporting our second and major claim, namely, that language plays an essential role in enhancing, or even establishing, metaphorical unidirectionality.
4 The Impact of Language on Metaphorical Directionality
The distinction between the two types of cross-domain relations immediately raises the question of how the system changes the initial bare association into a unidirectional projection from source to target. According to CMT, the shift from non-directional to unidirectional processing occurs automatically due to differences in the relative accessibility of the associated concepts, their conceptual prominence, or some other conceptual asymmetry between the two: It is more natural to map properties from a more accessible concept to a less accessible one than the other way around, from a more familiar to a less familiar one, from a concrete to an abstract one, and so forth (e.g. Lakoff Reference Lakoff2014; Lakoff & Johnson Reference Lakoff and Johnson1980, Reference Lakoff and Johnson1999; for a proposal in neural terms, see Grady & Ascoli, this volume).
According to that view, once the target–source relation has been established on the basis of this pre-linguistic asymmetry, expressions referring to source and target concepts respectively are placed in the respective slots of the grammatical templates that express this relation (for an in-depth study of this, see Sullivan Reference Sullivan2007). In the (primary) metaphor “warm smile,” for instance, the head noun (“smile”) and the modifier (“warm”) represent the target and source functions, respectively (as opposed to “smiling warmth”). Similarly, in “Our relation has been in the doldrums for a while,” instantiating LOVE IS A JOURNEY, the clause is a statement about the subject NP, thus assigning the target role to it, while the predicate “be in the doldrums” constitutes the source. The same goes for fully explicit A-is-B metaphors like “libraries are goldmines,” where the subject noun (“libraries”) again provides the target and the predicate noun (“goldmines”) the source. Changing the allocation of the nouns to the two slots (“Goldmines are libraries”) also “reverses” source–target functions.
Crucially, under this account the bulk of the work is relegated to the conceptual, pre-linguistic level, while language itself plays the marginal role of merely providing the means of expressing this conceptual asymmetry between the domains in question. In other words, CMT assumes that the division of labor is determined prior to the representation of the metaphor in language, i.e. that the assignment of grammatical functions merely reflects a conceptual asymmetry that already exists and does not in itself contribute to the directionality of the metaphor. However, we suggest that this is only partially correct.
4.1 The Importance of Linguistic Form
First, linguistic form is by no means “transparent.” Consider, for example, the way people interpret “reversed” metaphors, in which the more natural candidate for the source concept appears in the target position, as in the reversed (unconventional) A-IS(-LIKE)-B metaphor “An anchor is (like) a friend.” The interpretation of such reversed metaphors may cause processing difficulties (e.g. Glucksberg & Keysar Reference Glucksberg and Keysar1990; Ortony et al. Reference Ortony, Vondruska, Foss and Jones1985) or even produce new meanings (Shen Reference Shen2008), as people rely on the grammatical form in determining the metaphorical target (and source) terms, even when this form is incongruent with the conceptual bias to map from more to less accessible concepts (e.g. interpreting the reversed metaphor “an anchor is like a friend” as ‘an anchor is a sailor’s best friend’). Arguably, these difficulties are caused by a clash between grammatical and conceptual asymmetries, that is, between the tendency to assign the target and source on the basis of the grammatical form and the conceptual bias to map from more to less accessible concepts.
Second, in some cases linguistic form can even determine the target–source assignment, as in the case of conceptually symmetrical (metaphorically related) pairs, in which neither of the concepts can be regarded as the natural candidate for the role of target or source. A case in point is “This butcher is a surgeon,” which yields a totally different interpretation (highlighting exaggerated precision and care) than its reversed version, “This surgeon is a butcher.” In this case it is the grammatical form that determines the direction of mapping, given that both directions yield equally plausible interpretations (cf. Glucksberg & Keysar Reference Glucksberg and Keysar1990).
Third and most importantly, grammatical asymmetry may not only trigger or enhance the corresponding conceptual asymmetry but even necessitate it: Even if, as our model predicts, the fundamental (pre-linguistic) association between metaphorically related concepts is non-directional, the very placement of the two concepts in a grammatically asymmetrical form would still impose some division of labor between them. Consider, for example, the butcher–surgeon case: The concepts of ‘surgeon’ and ‘butcher’ may be related in several respects (e.g. both cut flesh), and hence, in the absence of grammar (or some other directional cue, see also Forceville, this volume), both directions of mapping are possible, from ‘butcher’ to ‘surgeon’ or vice versa. Once grammatical asymmetry is employed, however – for example, when one is asked to verbalize the relation between two domains in a typical (i.e. grammatically asymmetric) metaphorical form – certain choices regarding the assignment of the target and source functions to the concepts have to be made, determining the direction of the mapping.
In that respect, we assume that expressing metaphors in a grammatically asymmetrical structure may illustrate a special type of “thinking for speaking” (Slobin Reference Slobin, Gumperz and Levinson1996), where the very use of certain linguistic forms requires the speaker to apply a certain mode of thinking, in this case, to assign target–source functions to an otherwise bidirectional association; the hearer, of course, is also forced to understand the metaphorically related concepts in the way determined by this linguistic form.
It is important to note that though all of these effects are easily demonstrated by using nominal metaphor or simile (A-IS-B/A-IS-LIKE-B) – as done throughout this section – we intend our arguments to equally apply to other types of grammatically asymmetrical constructions usually discussed in CMT, such as noun–adjective constructions of the type “a smooth person,” “a big event,” “a warm welcome,” or “a heated debate” (which reflect the basic, correlation-based associations underlying primary metaphors), or the above-mentioned instantiation of a complex metaphor like LOVE IS A JOURNEY. The important point is that linguistic forms which are used to express metaphorical association are almost always grammatically asymmetrical and therefore impose some division of labor (between target and source functions) on the associated concepts.
4.2 Language as a Prerequisite for Metaphorical Unidirectionality
We have so far stressed that language may trigger, determine, and necessitate unidirectional mappings. But is it also a prerequisite for such mappings? Unlike the ‘butcher’–‘surgeon’ example, the vast majority of metaphors do maintain some type of conceptual asymmetry between the two concepts/domains which determines a preference for a particular target–source assignment. As discussed above, such a division of labor between target and source is a sufficient condition, according to the CMT view, for the creation of a “full-blown” directional metaphor, especially one of the primary type.
But what if the arrangement of the two concepts within a grammatically asymmetrical structure did not merely reflect a pre-linguistic division of labor but rather led to this division of labor in the first place? In other words, could it be that the mere existence of conceptual asymmetry between ‘fear’ and ‘cold,’ for example, is not sufficient to trigger a directional process of mapping – and that, therefore, an external intervention in the form of grammatical asymmetry (or some other directional cue) is needed?
To test this prediction, we investigated the effect of grammar on directional judgments of novel metaphorical associations (of the A-IS-LIKE-B type) with a clear abstract–concrete asymmetry, such as “childhood memories” (coding for an abstract concept) and “migrating birds” (coding for a concrete concept).4
The experiment was conducted in two phases. In the first phase 10 independent judges (mean age 40) were presented with two directional comparisons for each pair of the aforementioned terms (e.g. a pro-hierarchical order in which the source term is more concrete than the target one, as in “childhood memories are like migrating birds” and the corresponding anti-hierarchical order, as in “migrating birds are like childhood memories”). They were asked to judge which of these comparisons seems more natural, and to rate (on a scale of 1–7) the degree of confidence in their choice. Unsurprisingly, the judges demonstrated a very strong preference toward pro-hierarchical ordering (91% of all choices, with average confidence level of 5.3), as would be predicted by CMT. This finding suggests that even without understanding the relation between two given concepts (the judges were not required to interpret those novel similes), the mere conceptual asymmetry between abstract and concrete by itself is sufficient to dictate a division of labor if (and, as soon to be explained, this is a big “if”) grammatical asymmetry is involved.
In the second phase, 43 participants (mean age 33) were presented with the same set of metaphorical pairs. Half of the pairs (the “asymmetrical condition”) were embedded in a pro-hierarchical asymmetrical comparison (“A is like B,” as in “childhood memories are like migrating birds”); the other half (the “symmetrical condition”) were embedded in the corresponding symmetrical form (“A and B are alike,” as in “childhood memories and migrating birds are alike”).
Each experimental item was followed by two alternative contexts in which the comparison could have been uttered: the “pro-hierarchical context” (e.g. “a nostalgic writer speaking about his youth,” which determines the abstract concept ‘childhood memories’ as the topic of the comparison), and the “anti-hierarchical context” (e.g. “an enthusiastic ornithologist describing the flight of birds,” which determines the concrete concept ‘migrating birds’ as the topic of the comparison). Participants were asked to decide which was the more likely context for the comparison to be uttered, and to rate (on a 1–7 scale) their level of confidence in their choice. If indeed the preferred ordering of the two concepts reflects a directional mapping existing prior to its representation in language, then the pro-hierarchical context should be preferred regardless of the grammatical form (symmetrical or asymmetrical) representing it in language.
In contrast to this prediction, however, a notable difference between the two structures was found. Comparisons in the asymmetrical condition demonstrated a strong preference for the pro-hierarchical topic assignment (95.1% of all cases, average confidence level of 5.3); in contrast, the comparisons in the symmetrical condition demonstrated a much weaker tendency for the pro-hierarchical assignment (69%, confidence level of 4.9). The difference between the two conditions was significant for both choice (Z = −4.739, p < 0.001) and confidence level (t = −4.417, p < 0.001).
The results of this experiment suggest that the conceptual asymmetry alone was not sufficient to establish a clear division of labor between the two concepts. Rather, embedding the terms of these comparisons in a grammatically asymmetrical structure is required in order for this conceptual asymmetry (that is, the assignment of the target and source functions to the abstract and concrete terms, respectively) to be fully realized.
One might argue, however, that the novelty of associations in the previous experiment and/or the metaphor type chosen might have tilted the results, as they did not represent the metaphorical relationships discussed as “conceptual” or “primary” in CMT. In order to test this, we replicated the experiment (10 judges, mean age 40; 31 participants, mean age 35) using expressions reflecting conventional metaphorical associations that CMT would regard as instantiations of conceptual metaphors, both complex and primary ones. Example item pairs are given by “a political argument” and “a boxing match” (reflecting ARGUMENT IS WAR, Lakoff & Johnson Reference Lakoff and Johnson1980: 5), or by “fear” and “cold” (reflecting FEAR IS COLD, Kövecses Reference Kövecses2005: 289).
This replication yielded a pattern of results similar to that of the previous experiment. As predicted, the judges showed a strong preference for pro-hierarchical ordering (92% of all cases, average level of confidence 5.7), and a strong preference for pro-hierarchical topic assignment was found in the asymmetrical condition (94%, level of confidence 5.5). In contrast, a much weaker tendency toward pro-hierarchical topic assignment was found in the symmetrical condition (72%, level of confidence 4.9). Again, the difference between the two (i.e. symmetrical vs. asymmetrical) verbalizations was significant for both choice (Z = −3.785, p < 0.001) and confidence level (t = −3.604, p < 0.001).
From these findings it follows that even preexisting conceptual hierarchies such as abstract–concrete require a directional task for their actualization and are not in themselves sufficient to prompt a strongly unidirectional mapping. If the very existence of an abstract–concrete asymmetry were sufficient to trigger a directional process, as implied by CMT, we would expect to find the same topic assignment (preferring the abstract concept as the topic of the metaphor) regardless of grammatical structure, i.e. in both the symmetric and the asymmetric condition. The fact that the tendency toward a pro-hierarchical context was much weaker in the symmetrical condition suggests that the actual division of labor between a given pair of concepts does not derive automatically from their conceptual asymmetry but, rather, is only fully actualized by their insertion into the grammatically asymmetrical form.
To conclude, presenting metaphorically associated words in a grammatically asymmetrical form strongly enhances (and in some cases even triggers) metaphorical directionality in two complementary ways. First, it may actualize a potential asymmetry between the two domains that otherwise might have remained latent: by inserting relevant verbal expressions into the grammatical slots defining targets and sources, a clear division of labor is imposed. Second, the very placement of the two concepts in these slots may in turn serve as a cue for their metaphorical assignment regardless of their semantic content.
5 From Mental Associations to Words
This discussion raises a broader, related question, which concerns the impact of language on conceptual hierarchies. Many recent studies have investigated the various ways in which language may shape our experience, as even the perception of the most purely perceptual stimuli may be altered when they are verbalized (see Lupyan Reference Lupyan2012; Wolff & Holmes Reference Wolff and Holmes2011). It seems, however, that while the way in which language modulates the processing of single concepts has gained much attention (see, e.g., Lupyan et al. Reference Lupyan, Rakison and McClelland2007; Winawer et al. Reference Winawer, Witthoft, Frank, Wu, Wade and Boroditsky2007), very few studies have dealt with the impact of language on the relations between concepts.
In a series of studies, Shen and colleagues (Shen & Gil Reference Shen, Gil, Cohen and Lefebvre2017) examined the effects of language on the categorization of visual hybrids, i.e. images that are constructed from two or more entities from different domains (see Figure 4.1).5 They asked, particularly, whether this categorization would be affected by the (conceptual) “animacy hierarchy” (with humans on top, followed by animals, plants, and inanimate objects). The animacy hierarchy plays a central role in general and linguistic cognition (Keil Reference Keil1979; see also Deane Reference Deane1992; Lakoff & Turner Reference Lakoff and Turner1989 about the “great chain of being”), as well as in the domain of metaphor comprehension (Connor & Kogan Reference Connor, Kogan, Honeck and Hoffman1980).

Figure 4.1 A visual hybrid (half man, half tree)
The interesting finding was that this hierarchy affects the categorization of hybrids only in verbal tasks: When people had to describe the visual hybrid verbally, they tended to use the pro-hierarchical description (e.g. “a man with a lower part of a tree”) rather than an anti-hierarchical one (as in “a tree with a man’s upper part”). In contrast, non-verbal tasks, such as the categorization of the visual hybrid into one of two visually presented categories (e.g. ‘humans’ or ‘trees’), yielded no hierarchy effect: Participants in those tasks did not show a preference for the higher (or lower) category. Moreover, this discrepancy between linguistic and non-linguistic tasks seems to be correlated with grammatical asymmetry, as grammatically symmetrical tasks (e.g. using descriptions such as “half-human, half-tree”) reveal a lesser effect or none at all (Mashal et al. Reference Mashal, Shen, Jospe and Gil2014).
This finding demonstrates a point parallel to the one discussed in the previous section: Although both parts of the hybrid clearly maintained a conceptual asymmetry (humans are “higher” on the hierarchy than plants), this asymmetry in itself was not sufficient to yield a preference to categorize the hybrid as belonging to the “higher” rank in the animacy hierarchy. That is, in the absence of any directional task, the two concepts were simply associated, and the conceptual asymmetry between them remained implicit. The fact that this effect of language was correlated with the degree of grammatical asymmetry (that is, grammatically asymmetrical descriptions yielded a stronger effect than their grammatically symmetrical counterparts) further supports the claim that in this case, too, grammatical asymmetry served as the relevant directional trigger.
These examples bring us back to the question of automaticity. As noted earlier, in contrast to the common view that metaphorical directionality is a direct and automatic outcome of the linkage between conceptually unequal domains or concepts (Lakoff & Johnson Reference Lakoff and Johnson1999: 45–59), we suggested that a distinction should be drawn between the pre-linguistic, experienced “bare” association and its expression as a full-blown metaphor in language. To further illustrate this point, consider the following thought experiment: Suppose you look at a pictorial representation of only two entities, a tired runner and a wilted plant, which are associated in our perceptual experience based on their conceptual (and, possibly, perceptual) similarity (Figure 4.2).
While one of the two entities may still be perceived as more prominent than the other, richer in its inferential capacity, or demonstrating any other conceptual difference which makes it a better candidate for becoming the source concept of a metaphor, there is nothing in the actual experience of these entities that leads the perceiver to project properties from one to the other. However, if a person is asked to judge the “proper” way to assign these two entities into target and source (for example, in the grammatically asymmetrical form of the simile “An X is like a Y”), then a potential directional mapping is prompted, actualizing the implicit conceptual hierarchy between the two entities (e.g. the person chooses “The runner is like the wilted plant” rather than “The plant is like the exhausted runner”).
Reconsidering the (primary) metaphor AFFECTION IS WARMTH (“a warm smile”), the same logic can be applied. The very association of the two domains does not necessarily yield a division of labor between source and target, that is, it does not impose a unidirectional mapping, at least not in the strict sense. Indeed, that is exactly what the psycho-physical experiments discussed above show: Just as warmth co-activates affection, affection co-activates warmth. According to the view proposed here, the metaphorical directionality does not therefore characterize a metaphorical relation from its early formation but rather originates at a later phase, in which this pre-linguistic relation is “converted” into a unidirectional verbalized form, following the grammatical templates of source–target assignment.
Further support for this account comes from the following small-scale study, which used non-verbal stimuli.7 Twenty-four participants were presented with 16 pairs of pictures representing metaphorically related concepts with a clear asymmetry between them (either human/non-human or abstract/concrete), such as the aforementioned ‘tired person’ and ‘wilted flower’ or more conceptually related concepts such as ‘raging man’ and ‘volcano.’ On each trial, one picture appeared on the left side of a computer screen and the other on the right side, as illustrated in Figure 4.3.
The participants were divided into two groups, the verbal and the non-verbal. In the verbal group, they were asked to decide which of the two pictures in each pair is more similar to the other, to express their choice in a full sentence (e.g. “The raging man is similar to the volcano” or vice versa), and to rate their confidence in their choice (on a 1–6 scale). Participants in the non-verbal group were asked to express their decision by using only their pointing finger, without speaking (no preference was marked by “0” confidence on the scale).
As predicted, participants in the verbal condition showed a clear preference (83%) for pro-hierarchical ordering (by choosing, e.g., “the raging man is similar to the volcano” rather than the other way around); however, almost no such preference was exhibited in the non-verbal condition (54%). Furthermore, participants expressed greater confidence in their choice in the verbal than in the non-verbal task (average of 3.1 and 1.9, respectively). The difference between the conditions was significant for both choice (Z = −2.857, p < 0.005) and confidence level (Z = −2.049, p < 0.05).9
These findings suggest that language is indeed a crucial factor largely determining metaphorical directionality: When encountering only the pictorial representations of ‘raging man’ and ‘active volcano,’ the participants did understand the conceptual relation between the two but had no preference for a mapping direction; when they were asked to express this relation in the form of a fully articulated metaphor, however, a clear pro-hierarchical preference emerged.
These findings strongly support the proposed model. First, it seems that in the absence of language, the relation between metaphorically related concepts tends to be processed non-directionally, even when a clear conceptual hierarchy is involved. Second, verbalizing such purely conceptual (or perceptual) metaphorical relations is sufficient to activate a directional process, thus turning the original relation between the two concepts into a full-blown (asymmetrical) metaphor.
6 Conclusion
Conceptual Metaphor Theory profoundly changed the way we think about metaphor. In contrast to the common traditional depiction of metaphor as a primarily linguistic phenomenon, Lakoff and Johnson (Reference Lakoff and Johnson1980) suggested that verbal metaphors are merely the explicit manifestations of unconscious processes through which we relate different domains, thus depicting metaphors as a window into a more fundamental level of cognition. Naturally, the rising popularity of this view shifted the focus of attention from the linguistic to the conceptual components of metaphors, as the significance of language to its formation was largely pushed aside (for a noteworthy exception to the neglect of the relevance of language for CMT, see Casasanto Reference Casasanto2013).
In this chapter we tried to reevaluate some of Lakoff and Johnson’s fundamental claims by pointing out and addressing a challenge to CMT: While verbal metaphors tend to be clearly unidirectional in their mapping from metaphorical source to target, their pre-linguistic counterparts, as manifested in psycho-physical manipulations, are not. Under the account proposed here, this commonly overlooked discrepancy is accounted for by a revision of a central tenet of the CMT model: In the revised model, unidirectional mappings are no longer regarded as an inherent component of metaphorical relationships at the conceptual level. Instead, we propose that the bidirectionality demonstrated in the psycho-physical studies derives largely from the structure of pre-linguistic metaphorical relations, which (contrary to central tenets of CMT) are based on a bare association between concepts/domains, with no clear assignment of source and target. The unidirectionality of verbal metaphors, we argue, is largely determined by being instantiated in a linguistic form. Language, by requiring a unidirectional projection, should thus be regarded as an important contributor to the formation of full-blown metaphor.
1 Introduction: The Role of the Body in Cognition
What is the role, if any, that our body plays in metaphorical cognition? No doubt, this question is of vital importance to research on conceptual metaphors and embodiment more generally. However, the answer is far from simple. A consistent answer to this question requires a clarification of what we mean by “body.” Although defining the notion of body may seem trivial, the literature on this topic is quite controversial.
In the embodiment literature, the notion of body has been defined in many different ways; consequently, many different roles have been ascribed to the body in human cognition. According to some authors, for example, the body directly affects cognition. According to others it contributes to cognition by being the object of a representation. In this latter view, representations of the body, and not the body in itself, have a role in the structuring of our conceptual system (for a survey, see Alsmith & de Vignemont Reference Alsmith and de Vignemont2012).
Related issues concern the level at which the body plays a role in cognition. A relevant distinction is that between the “personal” and “subpersonal” level in the explanation of our experience (Dennett Reference Dennett1969): The personal level pertains to “the explanatory level of people and their sensations and activities”; the subpersonal level in contrast concerns “the level of brains and events in the nervous system” (ibid.: 93). Personal-level phenomena are thus those mental processes that characterize our lives as subjects, while subpersonal phenomena are physical processes. The issue is thus whether the role of the body is explicit and even conscious or implicit, subpersonal, and unconscious.
Particularly controversial in this debate has been the notion of the ‘body schema.’ It was first introduced by the French neurologist Pierre Bonnier to refer to an internal representation of the body (Bonnier Reference Bonnier1905). Since then not only neurologists but also psychologists and philosophers have made frequent use of the notion (Merleau-Ponty Reference Merleau-Ponty1945). However, there has been considerable conceptual as well as terminological confusion as to what precisely body schemas are. Not only have there been many contrasting definitions of the notion, it has also frequently been used interchangeably with the notion of ‘body image.’1
In order to clarify, Shaun Gallagher (Gallagher Reference Gallagher1986, Reference Gallagher2005; Gallagher & Zahavi Reference Gallagher and Zahavi2008) proposed a clear distinction between the two notions of body schema and body image. According to Gallagher and Zahavi (Reference Gallagher and Zahavi2008: 164), the body image “is composed of a system of experiences, attitudes, and beliefs where the object of such intentional states is one’s own body.” The body schema, in contrast, “includes two aspects: (1) the close-to-automatic system of processes that constantly regulates posture and movement to serve intentional action; and (2) our pre-reflective and non-objectifying body awareness. So, the body schema is a system of sensorimotor capacities and activations that function without the necessity of perceptual monitoring” (Gallagher & Zahavi Reference Gallagher and Zahavi2008: 165; see also Dijkerman & de Haan Reference Dijkerman and de Haan2007; Gibbs Reference Gibbs2006a: 27–39). Two neurological pathologies, “deafferentation” and “personal neglect,” which selectively affect the body schema or the body image, have been cited in support of the theoretical distinction between these two notions.2
What can this distinction tell us about the role of the body in cognition and, more specifically, about the hypothesis of the metaphorical nature of human cognition? I propose that a first level of cognition relies directly on body schemas. I will define the relationship between body schema and cognition as having a metonymic nature. I further propose that a second level of cognition relies on body images. The relationship between body image and cognition has a metaphoric nature. Depending on the source domain, i.e. body schema or body image, the body thus plays different roles in cognition that correspond to two levels of embodiment.
The first is the level of metonymies that have the body schema as their source domain. The power of these metonymies can be explained by their function of structuring perception and, consequently, also our conceptual system. In this case, the body is not the object of our attention and explicit knowledge. It is, instead, the condition of the possibility of action and knowledge. At this level, our cognition is intrinsically metonymic, implying a foundational relation between sensorimotor abilities and perceptual experiences. This first level of embodied metonymic cognition is prior to any other level of metaphorical cognition in thought, language, and communication; its role is that of making possible perception and, hence, knowledge of the world.3 I will describe the shift from sensorimotor abilities to perceptual experiences as “invisible” and “transparent,” because our body becomes invisible to us during perception. Except in particular circumstances, we do not focus our perceptual attention on our own body; otherwise we would lose what perceptual experiences are about. From now on, I will refer to this first level of embodiment as the level of invisible metonymies.
The second level of embodiment is the level of embodied metaphorical cognition. It is claimed here that embodied metaphors occur at this level. In this case our body as object of knowledge and attention (i.e. our bodily experiences, and attitudes and beliefs about the body) is the source domain of a metaphorical mapping. Thus, the contribution of the body to thought and language is not direct. The body contributes to our metaphorical cognition as an object of knowledge. Accordingly, its contribution is mediated by our cultural, environmentally situated, and linguistically structured representations of the body. Body images can be conscious representations, but they do not necessarily have to be. From now on, I will refer to this second level of embodiment as the level of visible metaphors. It will be argued that the definition of visible metaphors provided includes not just Grady’s (Reference Grady1997a, this volume) “complex” metaphors, but also those described as “primary.”
In contrast to the neural, phenomenological, and cognitive-unconscious levels of embodiment often postulated (e.g. Gibbs Reference Gibbs2006a: 39–40; Lakoff & Johnson Reference Lakoff and Johnson1999), my proposal identifies two logically different processes: (1) the foundation of perceptual experiences (at the phenomenological level) on our sensorimotor abilities (including their neural basis), and (2) the foundation of the conceptual system (including what is called the “Cognitive Unconscious”) on our representations of the body.
In the following, I will first discuss the notions of body schema and body image (section 2) and then take up in more depth the different mappings they give rise to (section 3). In doing so, I will also discuss in which way my notion of invisible metonymy differs from that of primary metaphor.
2 Body Schema and Body Image
As indicated above, the notion of body schema has been used inconsistently, both across authors and even within the work of single authors (for reviews, see Gallagher Reference Gallagher2005; Gibbs Reference Gibbs2006a: 27–36; Tiemersma Reference Tiemersma1989). The most confusing and ambiguous aspect of the definitions of body schema and/or body image is related to consciousness (Gallagher Reference Gallagher2005: 21). To what extent is consciousness a necessary feature of a schema or image of the body? The answers to this question have been quite divergent. Consequently, the notions of a schema or image of the body have been described (i) as a subpersonal phenomenon occurring at the neural level, (ii) as a conscious mental representation, (iii) as an unconscious representation, or (iv) as the way we organize our bodily experiences (ibid.: 23; see also Gibbs Reference Gibbs2006a: 28–29).
This conceptual confusion points to a deeper theoretical problem: At least two different levels of embodiment have often been collapsed. A conceptual and terminological clarification, which also considers interactions/dependencies between the body schema and body image, is thus a necessary and preliminary step to any discussion of the role of the body in cognition.
In what follows, I will discuss the distinction between body schema and body image with the aim of fostering a deeper understanding of the role of the body in human cognition that also takes into account possible interactions between different levels of embodiment.
2.1 The Body Schema
The body-schema is a system of sensory-motor processes that constantly regulate posture and movements – processes that function without reflective awareness or the necessity of perceptual monitoring.
In Gallagher’s terms, the concept of body schema refers to the sum of the sensorimotor capabilities that allow us to move and constrain our movements, to navigate and perceive the world. In this sense, the body schema is not the result of our perception of our own body and does not refer to a perceptual or mental image of the body. With this definition, Gallagher refers to the motor abilities that are foundational to action and perception.
As such, the body schema does not usually involve awareness or constant perceptual monitoring. Unless something does not work in the right way or unless we are facing the task of learning a new motor pattern, for example a dance step, we do not generally focus our attentional awareness on the body and on the processes underlying action and perception. Let us think of what we normally do when we walk. We know that we need to move one foot after the other in a certain direction. However, while walking, we do not need to attend to our body and its fine movements, that is, we do not need to focus our attention on our legs and on our feet to calculate the exact distance of space we need to move or to estimate how to balance the weight of our body on one foot when the other is moving forward, and so on. So, even though we are not aware of what we do in order to walk, we are usually able to walk and to effortlessly adjust our movements to the ground. We do not continuously lose our balance if the surface where we are walking is not completely regular, and we do not fall down all the time. To make our movements faster and safer, there is a set of subpersonal mechanisms, proprioception in primis, that allow us to register the inputs we get from the outside and on this basis to regulate what we need to do in order to move in the environment. Of course, we can suddenly have a physical problem, such as a severe pain in a leg that will not allow us to move any more. In such a case we may lose our balance or even fall down, and then need to give conscious attention to our body and its movements to be able to walk again. But this is not the norm. Pain, together with other peculiar circumstances related, for example, to fatigue, sex, sickness, or even to philosophical introspection or medical investigation, are limiting cases in which the body itself becomes the focus of our attention. 
When the body enters our perceptual field, becoming the object of our conscious perception, we are already dealing with the body image. In sum, the body schema is not an image of the body but the condition of the possibility of action and, thus, also of perception (Noë Reference Noë2004).
According to Gallagher (Reference Gallagher2005: 45), three functional subsystems constitute the body schema. The first is the system responsible for the processing of information about posture and movement. Information about posture and movement comes from different sources. Somatic proprioception is certainly the primary kind of information we use. We get proprioceptive information from kinetic, muscular, articular, and cutaneous sources. Vestibular and equilibrial functions and the visual sense are also sources of proprioceptive information. In the latter case, we automatically register visual information about the movement of our own body in the environment. The visual sense, however, can also be a source of explicit information in cases when we directly focus our perceptual attention on the position of our limbs. In these cases we are again already dealing with a form of body image. Note that proprioceptive information is defined here as a set of physiological, subpersonal, and unconscious processes that lead to proprioceptive awareness. In some pre-reflective sense, we are always aware of the position of our body, but this awareness is not the result of an explicit act of reflection.
The second functional system of a body schema is responsible for the production of motor programs and movement patterns (output). We have a number of motor habits. Developmental studies suggest that some of these are innate, such as swallowing, while others are learned, such as writing (Meltzoff & Moore Reference Meltzoff and Moore1977, Reference Meltzoff and Moore1989; see also Gallagher Reference Gallagher2005). At the neural level, a motor program can be described as the activation of some neural circuits in the motor cortex that enables the execution – at the behavioral level – of the single motor acts that together constitute the chain of a motor action. The processing of proprioceptive information is essential for the production of motor patterns. We constantly need feedback about our current body posture to efficiently move in the environment.
The third functional system of a body schema regards cross-modal communication. We have an innate ability to transform visual inputs into motor competence. This means that when we observe other people’s actions, this visual stimulus is immediately and automatically translated into motor terms. This is achieved by means of mirror neurons, multimodal neurons in the pre-motor cortex which respond both to action observation and to action execution (di Pellegrino et al. Reference di Pellegrino, Fadiga, Fogassi, Gallese and Rizzolatti1992). The motor neurons that enable the movement of the hand when we grasp a cup, for example, will be activated not only when we effectively grasp a cup but also when we observe someone else grasping a cup. The somatotopic activation of motor neurons in the pre-motor cortex in the absence of a corresponding action has been considered as a “motor simulation” of the observed action. It has been suggested that motor simulation is a constitutive process of action understanding; we understand other people’s actions by means of the activation of our own motor competence (Gallese & Sinigaglia Reference Gallese and Sinigaglia2011). Furthermore, it has also been hypothesized that motor simulation is involved in the process of learning by imitation (Rizzolatti et al. Reference Rizzolatti, Sinigaglia and Anderson2008). In this connection, the mirror mechanism determines automatic activation of the motor programs that are necessary to carry out the observed action.
It has lately been discovered that the mechanism of simulation is not limited to motor neurons in the pre-motor cortex. This mechanism is widespread in the brain, and it also characterizes perception- and emotion-related areas. When we see other people’s facial expressions indicating their emotions, for instance, emotion-related areas of our brain will be activated. Furthermore, it has also been observed that the mechanism of simulation is activated not only by direct observation of others but also by cognitive tasks such as mental imagery and language comprehension (Gallese & Sinigaglia Reference Gallese and Sinigaglia2011; for a discussion of the latter, see Ritchie, this volume). Given that the process of simulation is neither activated by action observation only nor confined to the motor areas of the brain, it is usually referred to by the more general term of “embodied simulation.”
2.2 The Body Image
The body-image consists of a complete set of intentional states and dispositions – perceptions, beliefs and attitudes – in which the intentional object is one’s own body.
As suggested by this definition, different types of intentional relations can be involved in the constitution of a body image. First, the body can be the object of a perceptual state (“body percept”). When we move around perceiving the world, for example when we look at a beautiful landscape from the peak of a mountain, we are usually not aware of our body. It becomes invisible to us while we are absorbed in the breath-taking beauty of the landscape. But if we suddenly have a terrible itch on our left foot, this will become the object of our perceptual attention and of our efforts to relieve the complaint. This intentional relation, clearly, also implies conscious awareness. We are perfectly aware of what our foot feels like when it itches. This kind of epistemic access to our body is from the first-person perspective (Pauen Reference Pauen2012).
Second, the body can be the object of our conceptual knowledge (“body concept”). This happens when it is accessed from the third-person perspective (ibid.). When we open a handbook of anatomy and we start to study the structure and composition of the human body, it becomes the object of our conceptual understanding. In this case, we cannot know how it feels to have neurons firing, but we have a scientific description of this neuro-physiologic process. This is clearly an intentional relation, also involving conscious awareness. Note that our scientific knowledge about the body does not always imply a conscious level of awareness: We do not always have the functions and processes of the nervous system or of other parts of the body consciously present in our mind. According to Gallagher (Reference Gallagher2005: 25), this knowledge is part of a set of beliefs and attitudes that – even unconsciously – entertains an intentional relation with the body considered as an epistemic object. Scientific knowledge is not the only source of conceptual knowledge about the body; commonsense beliefs are an equally important part of our conceptual understanding of the body. Common sense and scientific knowledge constantly interact with one another. Our beliefs are inevitably historically, linguistically, and culturally situated, and the progress and results of scientific research are an integral part of our historical, linguistic, and cultural environment. What scientific research reveals about the body does, in a slow but constant process, influence our commonsense understanding of the body.
Third, the body can also be the object of an emotional attitude (“body affect”). This is the case when our feelings, positive or negative, have the body as their object. We look in the mirror every day, and this routine action is usually associated with a positive or negative feeling, according to our mood on that day and to many other variables. We do or do not like what we see. This feeling toward the body is an intentional relation that can be conscious but is very often unconscious. It is of great importance that our feelings and attitudes toward the body are inevitably mediated by our body concept, that is, by our beliefs about the body. If we live in a society where the standard of beauty is a thin and athletic body and we do not match that standard, this can lead us to have negative feelings toward our body. It is also possible and fairly common that our feelings toward the body influence and affect our perception of our own body and our beliefs and conceptualization of it.
Body affect, body concept, and body percept are three different but deeply interconnected aspects of our image of the body. The body image, then, is a set of interconnected intentional states, whereby each type of intentional relation with the body affects and influences the others and all of them are deeply situated in a historical, linguistic, and cultural environment. The image we have of our own body is inevitably the result of our being situated in a specific context including a particular physical, linguistic, and sociocultural environment.
Finally, it is important to note that the body image is almost always a partial representation of the body: Our perception, emotional attitude, or conceptualization of the body are usually directed toward a part or a single aspect of the body at a time, as in the example of the itch on the left foot discussed at the beginning of this section. We single out one aspect of our bodily experience and this aspect or part becomes the object of a representational state.
2.3 Relations between Body Schema and Body Image
I indicated in the previous sections that, although body schema and body image are conceptualized as distinct, the boundary between them is not rigid in human cognition, so that the body image can affect the body schema. This is the case when our beliefs and concepts about our own body affect perception and movement in space. Even a basic activity acquired very early in life, such as walking, can be affected by our beliefs; thus, within the range of biomechanical possibilities determined by the structure of our body, it can to some extent be culturally shaped.
Vice versa, it is also possible that in particular circumstances, for example, when we are facing a situation of particular physical effort (e.g. during some sports training), we focus our attention on specific motor patterns that are usually below the threshold of awareness. In such a case, body schema functions become the object of attention and perception.
Finally and most relevant to the present concerns, the body schema and the body image interact in embodied simulation. It has been proposed that embodied simulation is involved in the sensorimotor foundations of the conceptual system because metaphorical thought seems to directly recruit the sensorimotor system (Gallese Reference Gallese and Pineda2009; Gallese & Cuccio Reference Gallese, Cuccio, Metzinger and Windt2016; Gallese & Lakoff Reference Gallese and Lakoff2005). The exploitation of the sensorimotor system, which is part of the body schema, in the structuring of the conceptual system by means of the mechanism of simulation highlights the tight interaction between body schema and body image (see section 4).
3 Two Levels of Embodiment
I have so far argued that an understanding of how the body contributes to cognition requires a clear distinction between body schema and body image. In this section I will distinguish between two levels of embodiment that are based on the body schema and the body image, respectively. I will argue that they provide the source domains of two different kinds of mappings, one metonymic and one metaphorical.
3.1 Invisible Metonymies: The Body Schema as a Source Domain
As previously defined, the notion of the body schema refers to the set of (partially innate) sensorimotor functions that enable our active interaction with the environment. In this section I will discuss in depth the hypothesis that the body schema is the basic and ontogenetically primary source of a metonymic mapping from sensorimotor abilities to perception (and, hence, ultimately to cognition, too).
Our perceptual experiences are always functions of our movements. This is the case even for vision, which is often taken to be the most passive of our perceptual modalities and has also served as the paradigm case for thinking about perception in general (cf. Noë Reference Noë2004: 2). There is empirical evidence showing that a paralysis of the eyes causes blindness. Vision requires saccades and microsaccades (eye movements), many times per second, to make us able to see (ibid.: 13). These constant movements of the eyes are constitutive of vision. Furthermore, it has also been shown that perception does not only require bodily movements relative to the surrounding environment. It also requires self-actuated movements, i.e. movements that individuals actively perform, not passive movements (ibid.; Held & Hein Reference Held and Hein1963). From this point of view, perception is grounded in and constrained by our motor abilities (see also Gibbs Reference Gibbs2006a: 42–78). In other words, perception is something that we literally act out (and thus cannot adequately describe as a passive reception of stimuli).
Calling this level of embodied metonymic cognition “transparent” emphasizes that, when we enact perception, our actions and the body that enables them are usually outside the focus of our attention and not visible to us. To have perceptual experiences, we need to map sensorimotor contingencies, here defined as the “interdependence between stimulation and movement” (Gibbs Reference Gibbs2006a: 65), onto the stimuli we receive. Crucially, the stimuli we get from the environment are always partial and incomplete, but our perceptual experiences are not. For example, we usually see objects only from one side but we perceive them as three-dimensional. To do so, we project sensorimotor contingencies onto the partial stimuli we receive. A partial and incomplete stimulus is, thus, always seen in terms of the actions and movements that would allow us to have a more complete perceptual access to it.
The first and primary level of embodiment is at play in our perception of the world, being its condition of possibility, and it is, thus, foundational to any other level of cognition. This is particularly evident when we look at cognitive development. Our body schema is the primary source of knowledge because movement in space is our primary source of knowledge. On the basis of partial stimuli, we map from movement to complete perceptual experiences. This level of embodiment does not reveal itself in the linguistic dimension.
Although taking up vital aspects of Conceptual Metaphor Theory (henceforth CMT) and Primary Metaphor Theory (henceforth PMT), namely, their claim that metaphors are cross-domain mappings from sensory source domains onto more abstract domains (including social and affective ones), the present proposal also goes beyond CMT and PMT insofar as these do not account for the modality of constitution of sensory experiences. Sensory experiences (see Grady Reference Grady1997a; Grady & Johnson Reference Grady and Johnson2002) seem to be a starting point that can be taken for granted. Whether the sensory experiences assumed to provide the source domains of primary metaphors belong to the body schema or to the body image is thus left open.
It is claimed here that we also need to account for the constitution of sensory experiences. A closer scrutiny of the constitution of sensory experiences will allow us to realize that this basic and primary level of cognition implies a metonymic description: The mapping goes from the body schema to perception. The body schema provides the structure to every perceptual experience: Although we are not aware of this, we map patterns of sensorimotor contingencies onto partial perceptual stimuli, and this allows us to have full perceptual experiences. This mapping is neither conscious nor directly visible to us, but it allows perceptual experiences to take place. The level of invisible metonymies is thus of a constitutive nature.
Clearly, the definition of this first level of embodied cognition is profoundly different from the level of primary metaphors introduced by Grady (Reference Grady1997a, Reference Grady2005a). In Grady’s account, primary metaphors are constituted by the association of sensory experiences, referred to as their “image content,” with non-sensory “response” concepts, which include social and emotional assessments of the sensory experience. Both the source and the target of a primary metaphor are equally basic aspects of the holistic experience given by a so-called “primary scene”:
The correlation between emotion and skin temperature is real and experienced. We feel warm when our emotions are aroused, and we feel warm when we are close to other people, as we are when we interact intimately. There is a conceptual association between coldness and lack of feeling, not because interacting with a cold object and interacting with an unfeeling person are perceived as similar experiences, but because through recurring experience we associate the conceptual domain of temperature with that of emotion.
Though PMT does not distinguish between body schema and body image, Grady’s remarks can be taken to place the association between the source and target of a primary metaphor already at the conceptual level. Hence, it can be concluded that, in PMT, the body contributes to cognition as something that is already an object of knowledge, i.e. as body percept.
At the first level of embodiment, in contrast, the body is not an object of knowledge. It is, instead, the condition of possibility of knowledge because our motor abilities, together with the sensory apparatuses we are equipped with, are the condition of possibility of every perceptual experience. The relationship between the physical body and motor abilities, on the one hand, and perceptual experiences, on the other, is constitutive and considered to be metonymic. To have perceptual experiences, we map sensorimotor contingencies onto the partial stimuli we get from the environment. Only this allows us to build complete perceptual experiences.
However, the boundary between this level of embodied metonymic cognition and the level of visible metaphors is not so rigid. The lived body and its motor abilities can enter our perceptual field in different situations and can be exploited, by means of the mechanism of embodied simulation, to carry out cognitive tasks that also involve the use of representations of the body. I will discuss this in more detail in section 4.
3.2 Visible Metaphors: Body Image as a Source Domain
The power of metaphors having a body image as a source domain can be considered audible and visible, in comparison with the silent and transparent power of embodied metonymies based on the body schema. To be more precise, in metaphors from the body image the source of the metaphor is our body as an object of representation and knowledge (i.e. a body percept, body concept, or body affect), and the metaphorical mapping takes place independently of the presence of these metaphors at the level of language, although language may reflect them. As such, the metaphorical mapping is now from one representation to another representation, and the contribution of the body is mediated by our cultural, environmentally situated, and linguistically structured representations of the body itself.
According to my proposal, Grady’s (Reference Grady1997a) primary and complex metaphors both take their source domains from the body image. As noted before, a primary metaphor is based on a correlation between a concept in the conceptual domain (matrix) of sensory experiences and another one in an abstract conceptual domain. The metaphorical mapping takes place at the conceptual level, and hence, in this account, the body contributes to metaphorical cognition as an object of knowledge. Metaphors such as INTIMACY IS CLOSENESS or AFFECTION IS WARMTH are classic examples of experientially motivated primary metaphors in which we associate two different conceptual domains.
Complex metaphors are not considered to be directly experientially motivated. They are a combination of primary metaphors recruiting wider frame knowledge (including beliefs, norms, etc.) (e.g. Grady Reference Grady2005b; Lakoff Reference Lakoff2008a). Standard examples of complex metaphors are long-discussed conceptual metaphors such as LOVE IS A JOURNEY or THEORIES ARE BUILDINGS. In these cases, the metaphorical mapping doubtlessly takes place at the conceptual level. It is thus suggested here that both primary and complex metaphors imply a mapping that takes place at the conceptual level and have (aspects of) the body image as a source domain.
A key element to discuss with respect to the difference between primary and complex metaphors is the notion of context. I propose that both primary and complex metaphors are contextually determined, although in very different ways. In order to highlight the contextually mediated nature of metaphors from the body image, we need to address a classic and hotly debated topic in CMT, which concerns the universal or nearly universal nature of conceptual metaphors, especially primary ones (e.g. Lakoff Reference Lakoff2008a). However, universality claims seem to be undermined both by theoretical considerations about the role of context in metaphorical conceptualization and by the huge amount of data on cross-cultural variation.
Kövecses (Reference Kövecses2005, Reference Kövecses2010a) stressed that there is a great amount of data on universal conceptual metaphors, and their universality seems to rely precisely on the universality of our bodily experiences. As Lakoff (Reference Lakoff2008a: 26) suggested, we all have the same bodies and almost the same relevant environments, and during our childhood the same connections between domains are likely to be built. We all have lived the physical experience of being up when we are happy. We all have lived the physical experience of warmth when feeling affection. Hence, this first level of metaphorical conceptualization is universal and deeply grounded in our bodies – although, in Lakoff’s account, it does not follow that those conceptual metaphors are expressed exactly in the same way in all languages. Nor does it follow that cross-cultural variation does not play any role at higher levels of conceptualization. In Lakoff’s view (Reference Lakoff2008a, Reference Lakoff2014), primary metaphors are atomic units that can be the basis for wider systems of contextually determined complex metaphors.
However, empirical evidence abounds on the side of cross-cultural or within-culture variations that can even be applied to primary metaphor (Kövecses Reference Kövecses2010a).4 According to Kövecses (ibid.: 204), this tension between universality and cultural variations in conceptual metaphors is the result of two different and concomitant pressures underlying our metaphorical conceptualization: One is the pressure of embodiment that forces metaphorical conceptualization toward its universal characterization; the other is the pressure of context, which is determined by local culture and forces metaphors in the direction of cross-culture and within-culture variations. To reconcile these different pressures, Kövecses (ibid.) proposes the notion of ‘differential experiential focus.’ Based on the idea that our bodily experiences consist of many different components, it assumes that we can single out and emphasize different aspects of that experience when using the same bodily experience as the source domain of a metaphor – depending on the context in which the metaphorical conceptualization takes place. Context is defined broadly by Kövecses (ibid.: 204) as the set of physical, social, cultural, and discourse aspects, the last of these also in relation to factors such as topic, audience and medium, that can affect the conceptualization of metaphors. Differences in experiential focus result from the role of context in the creation of primary metaphors. Although we all share a large number of physical and subjective experiences, which of these experiences we focus on when establishing a primary metaphorical connection depends on the context in which these experiences are embedded. Hence, at this level, the role of context is that of selecting some aspects of our universally shared experiences. The context works as a lens that allows us to isolate some features of primitive universally shared human experiences that often and very early on occur together in our life (for a related view, see Casasanto, this volume).
Things are different with complex metaphors. Here, the role of context is much more pervasive. The blending of elements (primary metaphors, frame knowledge, beliefs, other complex metaphors, metonymies etc.; see Grady Reference Grady2005b; Lakoff Reference Lakoff2008a; Ruiz de Mendoza, this volume) that is involved in building complex metaphors is largely determined by contextual factors. Complex metaphors completely rely on cultural practices.
Issues related to the universality vs. culture-specificity of metaphorical conceptualization clearly need to be addressed in great detail because the metaphorical mappings take place at the conceptual level, from concept to concept – and even in the case of primary metaphors – from a conceptual representation of the body to another conceptual representation.
When considering the mapping from motor abilities to perception, the issue of the different pressures applied by embodiment and culture has a very different import. On the one hand, it is true that there are motor activities that are deeply culturally embedded and that these activities, for example, the practice of a specific dance or sport, have specific motor programs underlying them. As such, these motor programs are culturally determined.
In this context, it should furthermore be noted that the mechanism of embodied simulation is sensitive to these cultural differences. If I have never seen someone dancing capoeira, I will likely have no motor simulation of the observed action, or only a very low activation of the corresponding motor areas, when I finally watch someone dancing it. The motor programs that in our motor system implement and underlie any culture-specific practices are the product of our learning activities in specific contexts.
On the other hand, when considering the contribution that our body schema makes to cognition, we primarily think of the set of processes and mechanisms that are the condition of possibility of any learning activity. These mechanisms and processes are innate and universal. Hence, the contribution the body schema makes to perception is the matrix of species-specific and universal aspects in human cognition.
4 Embodied Simulation
By means of the mechanism of embodied simulation, the metonymic mapping from motor abilities to perceptual experiences can be exploited to support cognitive activities at the level of the body image even when we are not carrying out any overt movement (Gallese & Cuccio Reference Gallese, Cuccio, Metzinger and Windt2016; Gallese & Lakoff Reference Gallese and Lakoff2005). Indeed, both thought and language recruit the mechanism of simulation. It is important to highlight that the mechanism of simulation allows us, in particular circumstances, to go from the activation of motor areas of the brain to the level of bodily sensations (Cuccio Reference Cuccio2014, Reference Cuccio2015), and this further highlights the foundational relation that connects actions/movements and perception.
This hypothesis is confirmed by empirical data. For example, Costantini and colleagues (Reference Costantini, Galati, Ferretti, Caulo, Tartaro and Romani2005) carried out an fMRI study to investigate the potential simulation elicited by the observation of both possible and biomechanically impossible movements of fingers. The authors of this study observed the activation of a motor simulation in both conditions. Additionally, data from this study also showed the activation of the Posterior Parietal Cortex (PPC), an area of the brain that is involved in the processing of bodily sensations, in both experimental conditions, but the PPC was significantly more activated in the biomechanically impossible movement condition. Moreover, particular circumstances – e.g. related to attention, emotion, pain or pleasure – can amplify the simulation effect. The experimental stimulus in the condition with the biomechanically impossible movements clearly suggested to the participants a potentially very painful situation. The simulation, in this and analogous cases, is not confined to the motor areas but also involves areas related to the experience of bodily sensations. This involvement of brain areas related to bodily sensations led participants of the study to report that they felt “sensations” that were determined by the observation of the experimental stimuli.
It has also been found that the mirror-neuron system is involved in action prediction: To predict other people’s actions, we recruit our motor system (cf. Abreu et al. Reference Abreu, Macaluso, Azevedo, Cesari, Urgesi and Aglioti2012). An fMRI experiment by Abreu and colleagues (ibid.) studied the ability of basketball players, both expert athletes and novices, to predict the outcome of free throws performed by others. Abreu and colleagues investigated whether other neural regions are also involved in predicting others’ actions. They found that in both expert basketball players and novices, the observation of the experimental stimuli (free throws performed by others) not only led to the recruitment of the mirror system and, hence, to motor simulation but also determined the activation of the sensory cortex. “Motor simulation,” the authors say, “may imply a mapping of specific sensory features” (ibid.: 1652). The specific experimental task of this study required participants to pay attention to the action features carried out by basketball players observed in short movies. The task performed led both to the activation of a motor simulation and to the involvement of the sensory cortex. Very likely, bodily sensations were directly recruited in solving the cognitive task the participants were asked to perform (to predict the outcome of free throws performed by others). The involvement of the sensory cortex is, thus, considered to be task-dependent.
In other words, embodied simulation does not always determine the activation of bodily sensations. There are cases, for instance, in which it has the function of preparing us for an action we can potentially carry out. A good example is tool-use behavior. Our interactions with tools are usually underpinned by the activation of the mechanism of simulation (Caruana & Cuccio Reference Caruana and Cuccio2015). In these cases, embodied simulation does not have the role of helping us understand others, and it likely does not lead to the experience of any bodily sensations. In this case, the cognitive role of simulation is to “suggest” to us the right action and prepare our motor system to perform it. Bodily sensations are not required for this.
The mechanism of simulation seems to be able to determine bodily sensations, even though this does not always happen (and when it does, these bodily sensations are not always, or even very often, conscious).
In the hypothesis proposed here, embodied simulation allows us to exploit the metonymic mapping that goes from the body schema to perceptual experiences. Interestingly, Raymond Gibbs (Reference Gibbs2005b) reconsidered the notion of ‘image schema,’ defined in CMT as recurrent patterns of/in our sensorimotor experiences (Johnson Reference Johnson1987), in terms of their role in the mechanism of Embodied Simulation. For him, they are psychologically real experiential “gestalts,” construed “on the fly,” and not abstract mental representations unconsciously recalled during linguistic processing.
The hypothesis that image schemas play a crucial role in embodied simulations does not contradict the two levels of embodiment proposed here. In Gibbs’ account, embodied simulation is defined not as a merely neural phenomenon but as the experience of bodily feelings activated by an imaginative process. Embodied simulation, so conceived, can be considered as the mechanism that allows us to exploit the first foundational level of embodied metonymic cognition. In other words, the direct and unmediated contribution of the body to cognition that characterizes the first level of embodiment can be recruited to carry out cognitive tasks that involve the use of representations of the body (as part of the imaginative process, that is).
5 Conclusions
To summarize, I have proposed that – on the basis of the distinction between body schema and body image – it is possible to identify two levels of embodiment. The first level is the level of invisible metonymies that are foundational to every possibility of perception and cognition. They have (aspects of) the body schema as a source domain. The power of these metonymies is silent and transparent because in this case our bodily capabilities are the glasses through which we look at the world. The body is not the object of our attention and explicit knowledge. It is, instead, the condition of possibility of perception and knowledge. In this first case, the contribution of the body to thought and language is direct.
The second level of embodiment is the level of visible metaphors. They take source domains from the body image. In comparison with the silent and transparent power of metonymies based on the body schema, the power of these metaphors can be considered audible and visible. In this case our body, i.e. our bodily experiences, attitudes, and beliefs about the body, becomes the object of explicit knowledge and attention. The body is thus explicitly considered as an object of knowledge, and the contribution of the body to thought and language is mediated by our cultural, environmentally situated, and linguistically structured representations of the body itself.
1 Introduction
Metaphor is far more than a rhetorical tool used in poetry and literature. Metaphors of all shapes and sizes permeate our everyday communication, written and spoken. They also appear in visual media, including sculpture, paintings, and architecture, as well as in music. Metaphors, as argued by cognitive linguists, are anchored in our embodied experiences and recruited to help us make sense of abstract and complex entities and situations in the world (Gibbs Reference Gibbs1994; Johnson Reference Johnson1987; Kövecses Reference Kövecses2002; Lakoff & Johnson Reference Lakoff and Johnson1980).
Metaphors can be analyzed in different ways, depending on the goals of the researcher. One common distinction made among cognitive linguists is between “primary” versus “non-primary” metaphors (Grady Reference Grady1997b, Reference Grady1999, Reference Grady2005b; Grady et al. Reference Grady, Oakley and Coulson1999; Grady & Ascoli, this volume). Primary metaphors are thought to arise from our most basic physical and perceptual experiences in the world. On this view, the metaphor MORE IS UP, for instance, emerges from repeatedly observing the natural correlation between verticality and quantity (e.g. stacking cookies, piling up rocks, filling up glasses) (Lakoff Reference Lakoff1987: 276–277). Repeatedly creating or observing a “higher” stack of cookies comes to be associated with a greater amount of cookies. The same goes for a “higher” pile of rocks or for “rising” water levels.
Cognitive linguists often contrast primary metaphors like these with “non-primary” metaphors, an example of which is THEORIES ARE BUILDINGS, which underlies statements such as, “He destroyed my theory” or “My theory rests on solid ground” (Grady Reference Grady1997b; cf. also Kövecses Reference Kövecses2002: 108–110). Conceptual metaphor theory in particular maintains that, in understanding such statements, people map the source domain of BUILDINGS onto the target domain of THEORIES (for an extensive discussion of this point, see Steen, this volume). In contrast with MORE IS UP, there is no obvious embodied correlation associated with THEORIES ARE BUILDINGS; the presence of buildings does not necessarily correlate with the presence of theories. Rather, there is a perceived resemblance between THEORIES and BUILDINGS. Moreover, primary metaphors such as MORE IS UP are assumed to be universal because the correlation between verticality and quantity is a fact of the natural world. By contrast, THEORIES ARE BUILDINGS presupposes a specific cultural context, a context where people theorize and talk about creating and maintaining theories (see also Cuccio, this volume).
This is not to say that primary metaphors are acultural. Primary metaphors have abundant cultural reflections. With MORE IS UP, for example, measuring cups, thermometers, and graphs reflect the mapping of verticality onto quantity that we often see in Western cultures. In this chapter, we argue that such cultural reflections of primary metaphors do not merely signal metaphorical content but also play a more active role, shaping the conceptual systems of people who witness those reflections (for closely related views, see Gibbs Reference Gibbs1999; Kövecses Reference Kövecses2015; Marghetis Reference Marghetis2015). Once primary metaphors, perhaps originating from experienced embodied correlations, take hold and become part of multiple conceptual systems (i.e. of multiple speakers), the linguistic and cultural structures that emerge from this help maintain and strengthen the underlying metaphorical relationship. Linguistic and cultural reflections of primary metaphors may thus “feed back” into the underlying conceptual structure. This view resonates with approaches in cognitive science that emphasize the role of cultural artifacts in scaffolding conceptual knowledge (e.g. Hutchins Reference Hutchins1995, Reference Hutchins2005) and with the view that primary metaphors are not always purely “embodied” but may have different origins (e.g. also linguistic, cultural ones) (Casasanto Reference Casasanto2014b, this volume). Cognitive linguists could benefit from a broader perspective on metaphor, one that embraces the complex web of interactions between language, culture, and embodied experience.
Here we focus on three primary metaphors: MORE IS UP (section 2.1), SOCIAL DISTANCE IS PHYSICAL DISTANCE (section 2.2), and SIMILARITY IS PROXIMITY (section 2.3). For each, we discuss linguistic reflections (for a larger crosslinguistic survey, see Grady & Ascoli, this volume), gestural reflections, and non-linguistic cultural reflections, as well as the relevant evidence from behavioral experiments. In section 3, we argue that linguistic, gestural, and cultural representations of metaphor should not be viewed merely as passive indicators of underlying conceptual mappings but, rather, as building blocks for creating and re-creating metaphor (cf. Gibbs Reference Gibbs1999; Kövecses Reference Kövecses2015; Marghetis Reference Marghetis2015; Winter Reference Winter2014). In section 4, we discuss how primary metaphors overlap and interact.
2 A Complex Web of Language, Culture, and Cognition
2.1 MORE IS UP
We first discuss linguistic, gestural, cultural, and cognitive reflections of primary metaphors with MORE IS UP. This metaphor is expressed when English speakers make statements like “high tax rates,” or when German speakers make statements like “Die Preise sind gestiegen” (‘The prices have risen’). Besides these linguistic reflections, MORE IS UP is expressed via gesture. Metaphor is often expressed in manual gestures (Chui Reference Chui2011; Cienki Reference Cienki and Koenig1998; Cienki & Müller Reference Cienki and Müller2008b; Sweetser Reference Sweetser1998). Figure 6.1, a still image from the TV News Archive, shows a gesture by Michael Hayden, former CIA director (2006–2009).1 Hayden is talking about employment in the CIA and, specifically, about a division of the CIA that he classifies as “core support.” According to Hayden, there is a “disturbingly high number of contractors” in this core support division. In Figure 6.1, the palm of his left hand is facing toward the audience and toward the camera. As he says “high number,” he moves the hand upward. This movement is consistent with MORE IS UP. That the movement is time-locked with the verbal phrase suggests that gesture and the metaphorical semantics are tightly coupled in this example (for a close discussion of further examples, see Winter et al. Reference Winter, Matlock, Knauff, Pauen, Sebanz and Wachsmuth2013).


Figure 6.1 MORE IS UP expressed in co-speech gesture on the phrase “high number.” The hand starts low and moves up to a higher position (a), with the end point shown in (b).2
Cultural reflections of MORE IS UP are pervasive. Floors in tall buildings are numbered with smaller numbers at the bottom. Doctors measure humans using scales with smaller numbers at the bottom and increasingly larger numbers going upward. Beakers and measuring cups have small numbers at the bottom and large numbers at the top, and so do thermometers. Cultural reflections of MORE IS UP are prevalent in graphs (Tversky Reference Tversky2011), where it is a convention to put larger numbers higher on the y-axis than smaller numbers. This is particularly characteristic of bar plots and line graphs, which are often used in science and newspapers. Empirical work on the understanding of graphs has furthermore shown that vertical bar plots that embody MORE IS UP are easy to interpret, more so than horizontal bar plots (Fischer et al. 2005).
Some researchers have pointed out exceptions in the cultural patterns of MORE IS UP (e.g. Holmes & Lourenco Reference Holmes, Lourenco, Carlson, Hölscher and Shipley2011; see also Tversky Reference Tversky2011). Consider numbers on cell phones, where smaller numbers are at the top rather than the bottom, or the rank orders of tournaments, where the first rank is listed at the top and lower ranks at the bottom. Crucially, however, these number uses contrast with measuring cups and thermometers in that they do not imply a true sense of quantity, which is commonly called a “cardinal” use of number. Instead, cell phones present a primarily “nominal” use of numbers (e.g. the phone number +1-998-532-9193 is not “more” than the number +1-138-777-6124), while tournament ranks represent an “ordinal” use of number (see Nieder Reference Nieder2005 for excellent discussion). Hence, cultural reflections that do map true quantity onto verticality tend to obey the MORE IS UP principle (see Winter et al. Reference Winter, Matlock, Shaki and Fischer2015).
Experiments have provided abundant behavioral evidence to support the claim that people have a mental association between verticality and quantity (for a review, see Winter et al. Reference Winter, Matlock, Shaki and Fischer2015). For example, when people are asked to generate a sequence of numbers as randomly as possible, they tend to generate larger numbers after having moved their eyes upward (Loetscher et al. Reference Loetscher, Bockisch, Nicholls and Brugger2010), or after having moved their head upward (Winter & Matlock Reference Winter, Matlock, Knauff, Pauen, Sebanz and Wachsmuth2013). Similar associations between verticality and quantity have been found in more classic button-press paradigms. For example, when participants are asked to indicate whether a number is “even” or “odd” through using two buttons, they tend to respond faster to a larger number when the button is located at a relatively higher position (Hartmann et al. Reference Hartmann, Gashaj, Stahnke and Mast2015; Ito & Hatta Reference Ito and Hatta2004; Müller & Schwarz Reference Müller and Schwarz2007).
2.2 SOCIAL DISTANCE IS PHYSICAL DISTANCE
People often talk about social distance in terms of physical distance. For example, in discussing friendships or romantic relationships, English speakers make statements such as “The couple is slowly drifting apart,” or “Bill and Marco have gotten closer lately.” Such cases imply a change in social distance, not physical distance. German speakers also do this, as seen in “Wir waren uns einmal sehr nah” (‘We were very close once’). In these and other linguistic reflections of the primary metaphor SOCIAL DISTANCE IS PHYSICAL DISTANCE (aka INTIMACY IS CLOSENESS), people talk about aspects of social relationships in terms of physical space. More precisely, the amount of space separating people reflects the nature of their relationship, such that larger distances indicate larger degrees of estrangement/alienation, etc.
An example of gesture related to SOCIAL DISTANCE IS PHYSICAL DISTANCE is shown in Figure 6.2, also from the TV News Archive. In this example, Sir Paul McCartney is being interviewed by David Letterman. In describing his relationship with pop star Michael Jackson he says, “So, we kinda drifted apart.” He describes how the relationship deteriorated, and in doing so, he gestures. He raises both hands to his chest, where they are momentarily held close together, and quickly moves them away from each other. Critically, the distance between McCartney’s two hands is smaller at the beginning of the gesture than it is at the end, reflecting increased distance, showing how people spontaneously use the spatial modality of gesture to express state changes in social relationships.


Figure 6.2 Co-speech gesture expressing SOCIAL DISTANCE IS PHYSICAL DISTANCE. First the hands are close together (a), then farther apart (b).3
Above and beyond gesture, we see other cultural reflections of SOCIAL DISTANCE IS PHYSICAL DISTANCE. Social scientists have discussed “segregation effects” and “peer effects” in the context of human relationships.4 Segregation effects refer to cases in which people physically move closer to others they perceive as similar to themselves (Miller & Page Reference Miller and Page2007: 143–146; see also Bishop Reference Bishop2008). Peer effects refer to cases in which people get physically close to each other and then begin to pick up certain behaviors from each other (Christakis & Fowler Reference Christakis and Fowler2009). So people tend to move toward others they perceive as similar, and to become more like them once they are physically close. These two tendencies can lead to large-scale correlations of social distance and physical distance. This principle not only characterizes modern societies but has also been shown for old hunter-gatherer settlement sites (Wiseman Reference Wiseman2014). Thus, culture at large reflects the principle of SOCIAL DISTANCE IS PHYSICAL DISTANCE.
The association between ‘social distance’ (‘intimacy’) and ‘physical distance’ is also reflected in film, another form of cultural representation. Here we discuss a scene from Before Midnight, the third movie in a trilogy about two characters in a stormy relationship.5 The main characters, Jesse and Celine, husband and wife, are on vacation on a Greek island and, instead of having a romantic evening, have a heated discussion about Jesse’s teenage son, Henry. The argument drags on for 20 minutes, spanning a wide range of heated topics, such as irrational thinking in men versus women, personal sacrifices in marriage, and infidelity. This part of the argument fluctuates in emotional intensity, becoming very loud and caustic at times, but calm and subdued at others. In the end, Celine leaves the room when she becomes enraged with Jesse.
Over the course of the argument, the physical distance between Jesse and Celine changes in ways that are consistent with SOCIAL DISTANCE IS PHYSICAL DISTANCE. When the argument becomes aggressive and heated, there appears to be greater physical distance between the characters. Figure 6.3 shows two successive moments at the beginning of the argument. Jesse sits on the bed in (a), and Celine, on a couch in the adjacent room, in (b). Jesse pointedly asks, “So this is how you want to be spending this evening? I mean, this is what you wanna do tonight?” to which Celine curtly responds, “Well, you started it!” At this time, Jesse and Celine appear to be as far from each other as possible in the hotel room. Importantly, the distance between the viewer and the characters in the film is also accentuated via the camera. In (a), Jesse appears to be very far away, both from Celine and from the viewer. The physical distance is highlighted by the fact that the door frame is included in the shot, showing physical separation between the two. Moreover, both shots in (a) and (b) show the full body (total shots). In the previous scenes where the couple was intimate, there were more close-ups, focusing on the face and suggesting physical proximity. Thus, both the relative positioning of the characters and the position of (and perspective taken by) the camera adhere to SOCIAL DISTANCE IS PHYSICAL DISTANCE.


Figure 6.3 Spatial positions of characters expressing SOCIAL DISTANCE IS PHYSICAL DISTANCE in Before Midnight. (a) Jesse criticizes Celine for arguing when they should be making love instead; (b) Celine retaliates by blaming Jesse for having started it.
Figure 6.4 shows two shots from before the argument, where physical (and hence also social) distance appears to be smaller, even when Jesse and Celine do not share the same frame. One way that physical proximity is suggested is by the size of the characters in each shot: Rather than the full total shots in Figure 6.3 (showing each body almost entirely), the characters are shown up close, with most of the image space taken up by their bodies. Further, Figure 6.4b shows the same door frame as Figure 6.3a, but separation seems less because Celine is standing at the door.


Figure 6.4 Spatial positions of characters expressing SOCIAL DISTANCE IS PHYSICAL DISTANCE in Before Midnight. Jesse (a) and Celine (b) in a friendly interaction about a gift from their friends.
So far, we have talked about linguistic, gestural, and cultural reflections of SOCIAL DISTANCE IS PHYSICAL DISTANCE. Several experimental studies suggest that the domain of physical distance is automatically accessed when thinking about likeability and intimacy. For example, in Matthews and Matlock (Reference Matthews and Matlock2011), people drew a path through a park depicted by a map. On the map, stick figures represented characters described as “friends” (low social distance) or “strangers” (high social distance). On average, lines intended to represent paths were closer to the “friends” than to the “strangers” on the map. In another experiment (Williams & Bargh Reference Williams and Bargh2008b), people drew two points on a sheet of paper, either very close to each other or very far away from each other. People in the “far” condition reported that they felt less of an emotional attachment to family members than did people in the “close” condition (although Pashler et al. Reference Pashler, Coburn and Harris2012 report a failure to replicate this result). These and other studies indicate that when performing spatial tasks or when making social judgments, people automatically consider social distance and physical distance together. Such work supports the idea that SOCIAL DISTANCE IS PHYSICAL DISTANCE is not merely expressed in linguistic, gestural, and cultural content but is part of our conceptual system.
2.3 SIMILARITY IS PROXIMITY
English speakers often talk about SIMILARITY in terms of how PROXIMAL or DISTAL things are relative to each other. For example, your friend describes her political views as being “very far from” or “very close to” your political views (cf. examples in Casasanto Reference Casasanto2008a: 1047), or a chef tastes her sauce and says, “It’s getting close now,” referring to how similar the sauce is to the sauce she made a week ago. A comparable German example is seen in statements such as “Diese Ansichten sind weit voneinander entfernt” (‘These views are far away from each other’).
Just as with SOCIAL DISTANCE IS PHYSICAL DISTANCE, the primary metaphor SIMILARITY IS PROXIMITY can be expressed through gesture. The speaker in Figure 6.5, Michael Powell, CEO and president of the National Cable & Telecommunications Association, is asked by an interviewer whether wired and wireless markets should be regulated in the same way. He responds that regulations should be “harmonized” and that wired and wireless markets are “increasingly trending toward being more similar, not more different.” When he is talking about wired and wireless technologies becoming “more similar,” he moves his hands, palms facing toward each other, toward the middle of his body. While saying “not more different,” he moves his hands apart. This sequence is integrated, with his hands continuously approaching each other and retracting again, beginning with “increasingly” in the utterance. The gesture makes it clear that two spatial positions are prominent, one being close (coinciding with the “similar” part of the sequence) and one being far (coinciding with the “different” part of the sequence). In this example, distance is primarily marked through dynamic movement toward or away from the midpoint of the body. The amount of distance between the hands is associated with the degree of similarity or difference.


Figure 6.5 Co-speech gesture expressing SIMILARITY IS PROXIMITY. Both hands move toward a location in the center of the body for expressing SIMILARITY (a). For expressing DIFFERENCE, the hands move apart (b).6
SIMILARITY IS PROXIMITY also appears in the spatial location and configuration of cultural artifacts. We see this in how rooms in houses are arranged. Things that are similar by virtue of function are grouped together. For instance, substances for cleaning the body are co-located in bathrooms. Edible things are found in kitchens. Clothing items are found together in bedrooms. This pattern is also seen in design, including web design, where things that perform similar functions, such as menu buttons, are positioned close to each other. This pattern is also seen in the design of virtual worlds (e.g. Waterworth et al. Reference Waterworth, Lund, Modjeska, Munro, Höök and Benyon2003). Humans intentionally use space to sort things in their environment according to similarity (cf. Kirsh Reference Kirsh1995). Grouping like things together in space helps people perceive them as similar (Wertheimer Reference Wertheimer and Ellis1938).
Again, these are relatively small-scale reflections of a larger cultural principle that is evident in society-wide scales. For example, people within a city tend to self-organize into districts that are relatively homogenous with respect to factors such as ethnicity or socioeconomic status (Schelling Reference Schelling1971; Bishop Reference Bishop2008). Often, similar people are located near each other because of necessity. Students are located near other students because they attend the same university and live near campus. Nurses are located in or close to hospitals or other medical facilities because they provide care for patients. Commuters are near other commuters because they share the same road or mode of transportation.
What is the experimental evidence for the conceptual nature of the metaphor SIMILARITY IS PROXIMITY? Casasanto (Reference Casasanto2008a) asked participants to rate unrelated words while they were being presented on a computer screen at varying distances. When participants were presented with two words that were close to each other, they rated them as more similar than when they were presented with words that were relatively far from each other. Boot and Pecher (Reference Boot and Pecher2010) found that participants were quicker to judge colors (e.g. two shades of blue) as being similar when the colors were presented close together, and quicker to judge them as different when they were presented relatively far apart (see also Breaux & Feist Reference Breaux, Feist, Love, McRae and Sloutsky2008). Winter and Matlock (Reference Winter, Matlock, Knauff, Pauen, Sebanz and Wachsmuth2013) showed that cities or people that were described as similar were subsequently placed closer to each other in a drawing task. Finally, Guerra and Knoeferle (Reference Guerra, Knoeferle, Carlson, Hölscher and Shipley2012) showed how visual depictions of various distances can affect the comprehension of sentences involving similarity. In their task, participants were shown concepts, i.e. words such as “stupidity” and “wisdom,” on two cards that were near or far from each other, and then read the sentence, “Stupidity and wisdom are certainly different.” Sentences about differences were read more quickly when the words “stupidity” and “wisdom” had been far from each other. Conversely, sentences about similarity were read more quickly when the concepts had been presented close to each other.
As in our discussion of MORE IS UP and SOCIAL DISTANCE IS PHYSICAL DISTANCE, such research on SIMILARITY IS PROXIMITY suggests that this primary metaphor is deeply entrenched in our cognitive system, and there is much convergent evidence to support the hypothesis that SIMILARITY IS PROXIMITY is a mapping not just in language and culture but also in our conceptual systems.
3 A Cultural Feedback Loop
In the preceding section, we reviewed MORE IS UP, SOCIAL DISTANCE IS PHYSICAL DISTANCE, and SIMILARITY IS PROXIMITY. Of key interest was how these exemplary primary metaphors are not purely conceptual. They are also expressed through language, gesture, and culture. In this section, we discuss some implications of this co-expression through multiple channels (for related arguments, see Gibbs Reference Gibbs1999; Kövecses Reference Kövecses2015; Marghetis Reference Marghetis2015; Winter Reference Winter2014). We begin by discussing the traditional view of primary metaphors.
Primary metaphors are thought to come from repeatedly experiencing a set of embodied correlations. This is a plausible proposal given that we know that children are exceptionally good at detecting statistical regularities in their environment (Kirkham et al. Reference Kirkham, Slemmer and Johnson2002; Saffran et al. Reference McNaughton, Barnes, Gerrard, Gothard, Jung and Knierim1996; Saffran et al. Reference Saffran, Johnson, Aslin and Newport1999). Moreover, there is a plausible neuronal mechanism that can readily explain the cognitively entrenched nature of conceptual metaphors. This mechanism is “Hebbian learning” (Hebb Reference Hebb1949), often summed up in the slogan “neurons that fire together wire together” (cf. Lakoff Reference Lakoff2012). For example, repeatedly experiencing the correlation between verticality and quantity will repeatedly activate neurons associated with the perception of space and neurons associated with numerical estimation. Over time this pattern strengthens the connections between the neurons thus frequently co-activated. Tapping into and entrenching such correlational structures is generally thought to be the source of primary metaphors (for more discussion, see this volume: Casasanto; Grady & Ascoli).
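The Hebbian mechanism described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the two binary "units" (one standing in for the perception of vertical position, one for numerical estimation), the learning rate, the number of trials, and the co-occurrence probabilities are all hypothetical. The sketch uses the simplest textbook form of Hebb's rule, in which a connection weight increases only on trials where both units are active, so the weight grows without bound; it is meant only to show that a stronger environmental correlation yields a stronger learned association.

```python
import random


def hebbian_association(n_trials, p_correlated, learning_rate=0.05, seed=0):
    """Return the connection weight after n_trials of simple Hebbian updates.

    Hypothetical setup: on each trial a 'space' unit is active with
    probability 0.5. With probability p_correlated the 'quantity' unit
    copies the space unit (a correlated experience); otherwise it is
    active independently with probability 0.5.
    """
    rng = random.Random(seed)
    weight = 0.0
    for _ in range(n_trials):
        space_active = rng.random() < 0.5
        if rng.random() < p_correlated:
            quantity_active = space_active
        else:
            quantity_active = rng.random() < 0.5
        # Hebb's rule: the weight grows only when both units fire together.
        weight += learning_rate * int(space_active) * int(quantity_active)
    return weight


# A strongly correlated environment versus an uncorrelated one.
strong = hebbian_association(1000, p_correlated=0.9)
weak = hebbian_association(1000, p_correlated=0.0)
print(strong > weak)
```

On this toy picture, any source of co-occurrence, whether natural or cultural, feeds the same update, since the rule registers only that the two units were active together.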
Yet, from the perspective of a learning child, there is no principled difference between environmental correlations commonly subsumed under the “embodied origins” of primary metaphors and the cultural correlations discussed above. That metaphors come to be expressed in language, gesture, and culture means that language, gesture, and culture yield a new set of correlations that provide input to a child’s metaphor system. Children grow up in a cultural world where they are surrounded by metaphors becoming expressed in cultural artifacts, gestures, and metaphorical verbal language.
These correlations may continue to play a role once a particular metaphorical mapping has been learned in childhood. While growing up in a metaphor-infused culture, people are constantly “reminded” of the metaphorical mappings they learned at a young age. Such an argument is presented in Winter (Reference Winter2014) in an analysis of a specific cultural reflection: horror movies.7 The argument is that horror movies often reflect the primary metaphors BAD IS DOWN and BAD IS DARK (or EVIL IS DARK). This is even evident in DVD stores: DVD covers get noticeably darker as one goes from the comedy section to the horror section. Within a given horror movie, shifts in darkness and verticality are frequently expressed over the course of the narrative. For example, the 2012 movie The Cabin in the Woods follows a downward trajectory as things become progressively worse for the protagonists. Often, these primary metaphors are expressed more locally in a single shot, such as the camera panning down to a dark hole from which a monster emerges. So, nearly every time people watch such a movie, they (re-)experience (old, already entrenched) correlations between the source domains of VERTICALITY and DARKNESS and the target domain of GOOD/BAD. On this view, the horror movie serves as a reminder of the metaphorical mapping already engrained in the general cultural context, but it also extends and elaborates those mappings by giving them concrete, entertaining, and emotionally engaging cultural representations (see also Forceville Reference Forceville2008).
This general principle is not limited to horror movies of course. All kinds of cultural reflections of metaphors (e.g. posters, advertisements, books) function as such reminders, further strengthening the mapping and maintaining it in our culture. When we discussed MORE IS UP, SOCIAL DISTANCE IS PHYSICAL DISTANCE, and SIMILARITY IS PROXIMITY in section 2, we talked about linguistic, gestural, and cultural “reflections” because the representations are often analyzed as merely reflecting underlying conceptual content. In particular, in the domain of media, “multimodal metaphors” are often seen as passive reflections of our internal conceptual world, a view that is driven by the largely cognitive orientation of Lakoff and Johnson (Reference Lakoff and Johnson1980), which sees underlying conceptual structures as the ultimate cause of metaphorical language. However, metaphor theory has to acknowledge that verbal, gestural, and cultural reflections are witnessed by others, and when they are, external representations of metaphor affect the cognitive systems of those observers.
Marghetis (Reference Marghetis2015) discusses a similar concept in what he calls “gestural contagion,” the idea that co-speech gestures help to propagate metaphorical concepts. His research experimentally demonstrates that seeing a particular metaphor expressed in gesture changes the subsequent understanding of the target domain in a non-gestural task. For example, seeing a gesture in line with ARITHMETIC IS MOTION ALONG A PATH (Lakoff & Núñez Reference Lakoff and Núñez2000) activates spatial representations where small numbers are to the left of larger numbers more so than seeing a gesture in line with ARITHMETIC IS OBJECT COLLECTION.
A useful way of looking at the influence of culture and external representations of metaphor is presented in Kövecses (Reference Kövecses2002), who distinguishes three levels at which metaphor should be investigated: the “sub-individual” level (i.e. embodied experience), the “individual” level (i.e. cognitive mappings inside single minds), and the “supra-individual” level (cultural representations). Some metaphor theorists maintain that the direction of causality goes from the sub-individual to the individual to the supra-individual level, in a feedforward manner (but see this volume: Casasanto; Gibbs). Winter (Reference Winter2014), in line with Gibbs (Reference Gibbs1999), Marghetis (Reference Marghetis2015), and others, argues that it is important to remember that cultural representations of metaphors (understood broadly as including artifacts, gesture, and linguistic expressions) feed back into the cognitive systems at the individual level. That is, if we view metaphor as a multi-scale phenomenon distributed across different levels, we should not assume that metaphors in the “underlying” conceptual systems (the individual level) simply lead to cultural representations (supra-individual); rather, the connection between individuals and representations is a two-way street.
The precise ways through which cultural representations interact with cognitive metaphorical systems are still underexplored. Winter (Reference Winter2014) proposes at least three different ways in which culture and cognition come together with respect to metaphor. First, cultural representations elaborate on metaphors and enrich them with specific examples, e.g. a monster in a horror movie that instantiates the more general BAD IS DARK in a highly specific and concrete way (cf. the many examples in Forceville Reference Forceville, Kristiansen, Achard, Dirven and Ruiz de Mendoza Ibañez2006a, Reference Forceville2008, this volume; Forceville & Urios-Aparisi Reference Forceville and Urios-Aparisi2009). Second, cultural representations (including artifacts) may strengthen metaphorical representations, i.e. act as reminders in different ways and at different time points. Third, cultural representations may create new metaphorical representations, or re-create existing metaphorical representations in a new generation of speakers in a culture. In contrast to gesture and language, non-verbal cultural representations (e.g. artifacts, movies) play a special role because they are less ephemeral (e.g. a movie can be watched again and again) and distributed more widely (e.g. movies are watched by millions of people). Thus, these cultural representations have a way of stabilizing metaphorical representations throughout a culture, and hence also in the minds of its members.
Similar ideas have been offered by Daniel Casasanto, who argues against the idea that all metaphors are embodied and proposes that metaphors have diverse origins. In discussing CONSERVATIVE IS RIGHT/LIBERAL IS LEFT, Casasanto (Reference Casasanto2014b) argues that – given the lack of any obvious embodied correlations in the natural world for this association – this metaphor must be acquired on the basis of language use, specifically, exposure to phrases such as “the right-wing party” or “his political views are left-leaning” (and would thus not count as “primary”; see Grady & Ascoli, this volume). Casasanto (Reference Casasanto2014b) also proposes that the left–right orientation of time (in Western cultures), which is not reflected by linguistic metaphors (and would thus not count as “primary” either), is influenced by the frequent use of calendars and other cultural representations of time. Finally, he argues that GOOD IS RIGHT/BAD IS LEFT emerges from individual embodied experience (ibid.), namely people’s handedness (see also Casasanto Reference Casasanto2009a; Casasanto & Henetz Reference Casasanto and Henetz2012), essentially through correlations of feeling positive emotions when performing actions fluently with their dominant hand, i.e. the right one in the vast majority of cases. This metaphor – in contrast to THE FUTURE IS RIGHT/THE PAST IS LEFT – has some linguistic reflections (e.g. “this is the right thing to do,” “He is not in his right state of mind”), but language does not reflect that left-handed people mentally associate GOOD with LEFT.
So, although embodiment is a powerful explanatory tool in metaphor analyses, we should avoid immediately defaulting to it, especially at times when evidence for it seems to be lacking (cf. Casasanto & Gijssels Reference Casasanto and Gijssels2015). Some claims about the embodied origin of primary metaphors would be difficult to test experimentally (see also Casasanto Reference Casasanto2014b: 249). For instance, what experiment would be capable of testing whether MORE IS UP is derived from perceiving the natural correlation of quantity and verticality and nothing else? This would require factoring out all cultural reflections associated with this mapping, which is likely to be impossible. As previously discussed, because our environment is so infused with metaphor, it is challenging to tease apart the relative contributions of various correlations.
4 Interactions and Gradations between Primary Metaphors
If, as we argue, children and adults are sensitive to metaphorical correlations in the natural world, in language (including gesture), and in culture, then we have to acknowledge that some correlations do not correspond as clearly to particular metaphors as we might expect. There is a tendency in metaphor research to characterize metaphors as discrete entities (see also Gibbs Reference Gibbs2011c). Yet, it is precisely the “embodied correlations view” that, when taken to its full logical conclusion, can lead us to question the discreteness of metaphors (see also this volume: Gibbs; Jensen).
Imagine a bar graph next to an upward-pointing arrow of green color, a form of visual representation that features prominently in TV and online reports of sales, revenues, and the stock market. In these financial contexts, the upward-pointing arrow and the bar graph are conventionally interpreted in terms of numerical quantity, e.g. increased revenues. However, an affective or evaluative component may also come into play, with the upward arrow indicating that things are going in a POSITIVE direction, i.e. UPWARD. This emotional message is highlighted by the fact that frequently in news reports, such arrows are in green color if they point upward and in red color if they point downward. In other words, the color scheme emphasizes the evaluative component on top of the association between quantity and verticality. A metaphor analysis of such a visual display thus needs to recognize that there is the potential for MORE IS UP and GOOD IS UP being co-present.
Similar cases of co-present metaphors are SIMILARITY IS PROXIMITY and SOCIAL DISTANCE IS PHYSICAL DISTANCE (INTIMACY IS CLOSENESS), discussed in section 2. The target domains SIMILARITY and SOCIAL DISTANCE are both correlated in the social world and covary with the source domain of PHYSICAL DISTANCE. People who are more similar to each other tend to like each other more. People who like each other tend to become more similar. And people who like each other and are similar to each other tend to be in close physical space. Hence, in the social world, the metaphors SIMILARITY IS PROXIMITY and SOCIAL DISTANCE IS PHYSICAL DISTANCE are often conflated, and this carries over to the linguistic instantiations of the respective metaphors.
Take, for example, the sentence “We’re headed in opposite directions” (Gibbs Reference Gibbs2011c: 531), commonly discussed as an instance of the metaphor LOVE IS A JOURNEY. Depending on the intentions of the speaker and nature of the situation, this example reflects LOVE as a MOVEMENT ALONG A PATH, with “headed” and “directions” implying motion. Increased distance is also inferred, suggested by the physical movement of the two people in two different directions. The sentence also conveys the negative status of the relationship, with loss of intimacy and increased social distance. Thus, this specific instantiation of the complex metaphor LOVE IS A JOURNEY reflects SOCIAL DISTANCE IS PHYSICAL DISTANCE, in addition to such primary metaphors as ACTION/PROGRESS IS MOTION, STATES ARE LOCATIONS, and PURPOSES ARE GOALS (see Lakoff Reference Lakoff2008a). In addition, SIMILARITY IS PROXIMITY may play a role, as the experiential contexts overlap with those motivating some of these primary metaphors, such as RELATIONS ARE CONTAINERS and INTIMACY IS CLOSENESS. A couple that goes in different directions often does so because of insurmountable differences in attitudes, tastes, lifestyles, opinions, or beliefs. Moreover, once “apart,” the members of the separated couple are likely to become more dissimilar from each other over time because of the lack of frequent interaction. The statement “We’re headed in opposite directions,” said with the intended LOVE IS A JOURNEY reading, is likely to lead to inferences about intimacy and similarity consistent with both SOCIAL DISTANCE IS PHYSICAL DISTANCE and SIMILARITY IS PROXIMITY. The same argument applies to other instances of LOVE IS A JOURNEY, as in “our relationship is at a crossroads” (cf. Gibbs Reference Gibbs2011c).
Let us consider non-linguistic cultural representations that conflate coherent conceptual metaphors. Timelines, such as calendars, can represent both THE FUTURE IS TO THE RIGHT as well as MORE IS TO THE RIGHT. As one goes from left to right on a timeline, temporal distance increases concomitantly with numerical quantity (for discussion, see Winter et al. Reference Winter, Matlock, Shaki and Fischer2015; Marghetis Reference Marghetis2015; Marghetis & Youngstrom Reference Marghetis, Youngstrom, Bello, Guarini, McShane and Scassellati2014). In a similar fashion, MORE IS BIG often correlates with IMPORTANCE IS SIZE because bigger quantities of any object are generally more important.
So, culture, language, and gesture provide speakers with many correlations, along with environmental/natural correlations. But just what is being correlated may not always be clear, and this transfers over to metaphors, where there is ambiguity about which metaphor is highlighted in a given situation. This is also in line with Gibbs’ dynamic systems approach to metaphors in discourse:
A given conceptual metaphor is not just activated, and employed as a single entity, to help interpret a metaphorical utterance. Instead, multiple conceptual metaphors, which may have arisen to prominence at a specific moment in time, given the particular dynamics of the system at that moment, may collectively shape the trajectory of linguistic processing so that no one conceptual metaphor has complete control over how an utterance is interpreted.
Observational evidence for the co-activation of metaphors comes from Walker and Cooperrider (Reference Walker and Cooperrider2015), who show for time metaphors that English-speaking gesturers simultaneously move their hands forward and to the right when talking about the future, and backward and to the left when talking about the past. Thus, in interactions about time, gestures indicate that speakers simultaneously conceptualize the target domain in terms of two distinct metaphors, one in which time is thought to extend along a front-to-back axis and one in which time is thought to extend along a left-to-right axis (as on a calendar). This discussion is not intended to criticize labels such as “MORE IS UP” or “SIMILARITY IS PROXIMITY.” Delineating metaphors gives us a convenient way to talk about the recurrent patterns that they entail. Still, we need to remind ourselves that metaphors such as MORE IS UP, SOCIAL DISTANCE IS PHYSICAL DISTANCE, and SIMILARITY IS PROXIMITY are not discrete entities.
5 Conclusions
In this chapter, we reviewed evidence for three primary metaphors: MORE IS UP, SOCIAL DISTANCE IS PHYSICAL DISTANCE, and SIMILARITY IS PROXIMITY. Apart from being entrenched conceptually, these are jointly expressed through language, gesture, and culture. We have taken this multitude of multimodal metaphorical “reflections” to argue that primary metaphors have multi-causal origins and that embodied experience of natural correlations cannot be the only story.
We have argued that the conflation of (different, but coherent) metaphors in the natural and the sociocultural world is mirrored by a conflation in linguistic expressions of metaphor, as well as the manifestations of metaphor through other modalities. Finally, these considerations suggest that it is fruitful to focus on the interactions between different modalities of metaphorical expression, as well as interactions between different metaphors, rather than to focus exclusively on particular modalities or particular metaphors in isolation.