1 Introduction
Emblematic gestures (or emblems) have been given a range of denominations in the literature (e.g. autonomous, quotable, semiotic, folkloric, or symbolic gestures). Emblems are culture-bound gestures; they differ interculturally and intraculturally, that is, both among different cultural and linguistic areas and among individuals and social groups within the same culture. These gestures are easily translated into verbal language, and they are quotable; they are equivalent to utterances, and in many cases, they have names. Typical emblems are gestures used – alongside or without words – for greetings (welcome or farewell), for (often obscene) insults or mockery, to indicate places or people (deictics), to refer to the state or characteristics of a person (to be drunk, to be asleep, etc.), to give interpersonal orders (shut up!, come!, move away!, listen!), or to represent actions (to eat, to drink, to copulate, to commit suicide, etc.). Many emblems show a clear illocutionary component (to offer, to threaten, to praise, to promise, to swear, etc.) in the sense of Austin (1962).
The mainstream tradition in the study of emblems has always emphasized their autonomy from speech, but this does not mean that they cannot appear simultaneously with verbal (or vocal paralinguistic) elements. Rather, it means that the gesture has reached very high levels of conventionality and systematicity so that emblems are interpretable (like words, to a degree) with a high level of context independence. It has also been emphasized that emblems can be precursors of certain units of sign languages and that they often play a role in the latter’s origin. On the other hand, the emblematic capacity can be regarded as associated with illocutionary force, which is one of the most characteristic features of these units.
All of the preceding features explain why we often find emblematic gestures described in dictionaries, with their own entries or in more or less conventional colloquial expressions: for example, give sb [somebody] the finger (US, “to show someone in an offensive way that you are angry with that person by turning the back of your hand towards them and putting your middle finger up,” Cambridge Dictionary, 2021, Definition 1); or two fingers (UK, “in Britain, a sign that is considered rude, made by holding your hand up with your palm facing towards you and your first and second fingers held in a V shape: She drove past and stuck two fingers up at him,” Cambridge Dictionary, 2021, Definition 2). The comparison of these two emblems already gives many clues as to why a semiotic and sociocultural analysis of these units is needed. This is even more evident if we contrast the first item with a similar emblem made with the index finger (which can have different meanings by culture: “first,” “request”/“question,” etc.), or if we contrast the second item with the emblem that is made with the same morphological configuration of the hand but with the palm facing out (the well-known gesture of “victory,” internationally widespread).
As regards the lines of research on this topic, the most traditional one has focused on the collection and analysis of emblems from the viewpoints of history and cultural anthropology, dialectology, and linguistic geography. More recently, from a pragmatic/semiotic and ethnographic view, emblems have been conceived of as multimodal tools on the frontier between verbal and nonverbal modes which form part of the communicative repertoire of individuals and sociocultural groups. On a cognitive dimension, they show clear cases of embodiment of meaning and are susceptible to many processes of metaphorization, metonymy creation, and interference between modalities. Emblems can be conceived of as prototype categories, and their salience and relevance are evident in the communicative processes of production and comprehension. The applications of their analysis are numerous: lexicography, second language learning, and natural language processing, inter alia.
2 Emblems and Other Gestures or Nonverbal Acts
The analysis of emblems is associated with the study of the categorization of gestures and all subsequent attempts to establish different classes and subclasses of gestures. It is also related to the implicit or explicit definition of gestures, which is much less straightforward than it may at first appear. McNeill (1992, p. 37) was right in saying that “[m]any authors refer to all forms of nonverbal behavior as ‘gesture’, failing to distinguish among different categories, with the result that behaviors that differ fundamentally are confused or conflated.”
Common dictionaries define gesture, essentially, as an expressive movement, which is also the core of the definitions in traditional studies. In fact, the etymology of gesture takes us to the Latin noun gestus, “attitude or movement of the body,” which derives from the verb gerere, “carry,” “drive,” “carry out (actions or activities),” “show (attitudes).” The same Latin etymological family also includes manager/management, gestation/ingest, and register/suggest, which are all related to the idea (and the metaphor) of carrying something, where the something in question is (generally) some meaning or some feeling/emotion.
In ordinary dictionaries we often find the distinction between a nonsignificant gesture (e.g. a bad gesture or an involuntary gesture) and a significant or expressive gesture (a farewell gesture or a gesture of threat/fear). In classical rhetoric we already find a particular interest in gestures that can function as words, while, in the nineteenth century, Gratiolet (1865) begins to speak of symbolic gestures (in a very particular way), and Wundt (1900/1973) subsequently makes much clearer references to this class of gestures [Footnote 1]. Efron (1941/1972) incorporates the term emblem from the Renaissance (cf. Teßendorf, 2013, p. 83), but only for symbolic, conventional, and arbitrary gestures. Later, Ekman and Friesen (1969, p. 63) extend the use of the term to “those non-verbal acts which have a direct verbal translation, or dictionary definition, usually consisting of a word or two, or perhaps a phrase” (cf. also Ekman & Friesen, 1972).
Since then, the denomination of emblems or emblematic gestures has alternated with other names, which have nonetheless failed to have the same good fortune. Examples include folkloric gestures (Hayes, 1951), semiotic gestures (Barakat, 1973), and, in the French tradition, the term quasi-linguistic (Dahan & Cosnier, 1977). Kendon has proposed the terms autonomous and quotable [Footnote 2], which have become more commonly used than the previous ones (especially the latter term), but not more common than the term emblem. Each of these alternative denominations emphasizes an obvious trait or aspect of the gesture, but at the same time hides others, and perhaps for this reason – because of their partialness – they have not met with success. By contrast, the term emblem hides under a technical “surface” the possibility of a variety of definitions and ultimately becomes less compromised. Surely this (relative) vagueness – and a (broad) consensus on some of its features – has ended up becoming a practical advantage for research, far removed from any terminological dissensus.
2.1 Emblem as Category: Emblematicity Criteria
McNeill (1992) called the arrangement that Adam Kendon had made of gestural categories in previous studies “Kendon’s continuum”: Gesticulation –> Language-like Gestures –> Pantomimes –> Emblems –> Sign Language. Gullberg (1998, p. 97) proposes an expansion of the preceding continuum (which Kendon [2004] does not consider useful), whereas McNeill (2000) later breaks down the previous categories into four continua (see Table 1.1), again demonstrating that the question of the categorization of emblems is always a part of the joint categorization of nonverbal acts.
Table 1.1 Position of emblems in four continua, according to McNeill (2000, pp. 2–5)
| Continuum 1: relationship to speech | | | | | | |
| Gesticulation | ---> | Emblems | ---> | Pantomime | ---> | Sign Language |
| obligatory presence of speech | ---> | optional presence of speech | ---> | obligatory absence of speech | ---> | ditto |
| Continuum 2: relationship to linguistic properties | | | | | | |
| Gesticulation | ---> | Pantomime | ---> | Emblems | ---> | Sign Language |
| linguistic properties absent | ---> | ditto | ---> | some linguistic properties present | ---> | linguistic properties present |
| Continuum 3: relationship to conventions | | | | | | |
| Gesticulation | ---> | Pantomime | ---> | Emblems | ---> | Sign Language |
| not conventionalized | ---> | ditto | ---> | partly conventionalized | ---> | fully conventionalized |
| Continuum 4: character of the semiosis | | | | | | |
| Gesticulation | ---> | Pantomime | ---> | Emblems | ---> | Sign Language |
| global and synthetic | ---> | global and analytic | ---> | segmented and synthetic | ---> | segmented and analytic |
The clarity of McNeill’s proposal, which is summarized in Table 1.1, is based on the explicit formulation of four criteria, the impossibility of simple or dichotomous distinctions, and an assumption of the graded nature of the concept of gesture.
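For readers who work with annotated gesture data, the four continua of Table 1.1 can be encoded as a simple lookup structure. The following is only an illustrative sketch: the names `CONTINUA` and `profile` are assumptions made here, the “ditto” cells are expanded to the value they repeat, and nothing in it is part of McNeill’s own proposal.

```python
# Illustrative encoding of Table 1.1. The variable and function names are
# assumptions made for this sketch, not part of McNeill's proposal.
CONTINUA = {
    "relationship to speech": [
        ("Gesticulation", "obligatory presence of speech"),
        ("Emblems", "optional presence of speech"),
        ("Pantomime", "obligatory absence of speech"),
        ("Sign Language", "obligatory absence of speech"),
    ],
    "relationship to linguistic properties": [
        ("Gesticulation", "linguistic properties absent"),
        ("Pantomime", "linguistic properties absent"),
        ("Emblems", "some linguistic properties present"),
        ("Sign Language", "linguistic properties present"),
    ],
    "relationship to conventions": [
        ("Gesticulation", "not conventionalized"),
        ("Pantomime", "not conventionalized"),
        ("Emblems", "partly conventionalized"),
        ("Sign Language", "fully conventionalized"),
    ],
    "character of the semiosis": [
        ("Gesticulation", "global and synthetic"),
        ("Pantomime", "global and analytic"),
        ("Emblems", "segmented and synthetic"),
        ("Sign Language", "segmented and analytic"),
    ],
}


def profile(category):
    """Return {continuum: (position, value)} for one gesture category.

    Positions are counted from 1, left to right, as in Table 1.1.
    """
    result = {}
    for continuum, steps in CONTINUA.items():
        for position, (name, value) in enumerate(steps, start=1):
            if name == category:
                result[continuum] = (position, value)
    return result
```

Calling `profile("Emblems")` makes the graded character of the category visible: emblems occupy the second position on the speech continuum but the third position on the other three.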
In addition to McNeill’s contribution, other proposals have emphasized similar or complementary aspects. This diversity also shows clearly that the emblem is not a “natural” category that combines perceptions of the world, but a projection of scientific theory onto reality; it is therefore not a reactive and descriptive exercise but a proactive one. Hanna (1996) had already constructed a semiotic notion of emblem in which the graded character mentioned above is fundamental: “[I]ndividual emblems have a developmental trajectory, and so emblematic status may be seen as a point on a scale, rather than as in total opposition to other sign types” (p. 289). Hanna sought a nonverbocentric definition of the emblem and concretely gave this one:
I propose that the emblem be considered as a sign of which the interpretants in a given cultural group fulfill at least the following tasks:
(a) Set up a piece of human gestural activity as a sign.
(b) Set up a sign in such a way that it is usually interpreted as having been deliberately produced, and communicative intention is generally [?] attributed to the immediate producer of the sign.
(c) Set up the sign as the replica of a type already known, that type being fairly precise as regards the physical shaping and the interpretation of significance. Strong conventions govern emblems so that the tokens of the one type closely resemble each other (Hanna, 1996, pp. 289–290).
Payrató (2003) and Payrató and Clemente (2020, Sections 2.1.1–2.1.2) partly followed this approach and based their construction on a prototypical categorization of both the notions of gesture and of emblem. Instead of handling closed categories, their prototypical categories are open (in the sense of Croft & Cruse, 2004). The different “specimens” are more or less close to an ideal pattern, existing or not, which is the one that satisfies more features (or satisfies the features to a greater degree). The theory is applicable to many typologies, for example, to grammatical categories in the linguistic field and to the concepts of gesture and emblem. In the former case (gesture), the fundamental morphological feature is “Bodily action or movement,” and the pragmatic feature is “Meaningful and relevant action (‘ostensive’) accompanied by verbal language or in the absence of verbal language.” In the latter case (emblem), the features of the prototypical conception are presented in Table 1.2.
Table 1.2 Basic and additional optional features to characterize emblematic gestures as a prototypical (physical/morphological) and a pragmatic category (Reproduced, with permission of the publisher, from Payrató and Clemente [2020, p. 50, Table 2.3])
| Feature | Description |
| Basic physical or morphological feature | |
| a.1 | Bodily action or movement |
| Basic pragmatic features | |
| b.1 | Meaningful and relevant action (“ostensive,” intended as a message), even in absence of verbal language, addressed to a copresent recipient in an interactional setting |
| b.2 | Illocutionary force |
| b.3 | Sociocultural conventional action |
| b.4 | Semantic core of non-natural meaning (symbolic) |
| b.5 | Deliberate (non-accidental) action |
| b.6 | Conscious action |
| Additional optional physical or morphological features | |
| c.1 | Action or movement involving only hands and arms |
| c.2 | Action involving head movement |
| c.3 | Action involving facial movement |
| c.4 | Action involving eye movement |
| c.5 | Action involving other body parts |
| Additional optional pragmatic features | |
| d.1 | Attachable to verbal language |
| d.2 | Quotable |
| d.3 | Translatable easily to verbal language |
In addition to a basic physical or morphological trait (a.1, as bodily action) and a basic pragmatic trait (b.1, as a relevant, ostensive action), the prototypical emblem necessarily includes traits (b.2–b.6), with optional additional physical or morphological features (c.1–c.5, on a prototypical scale) and additional pragmatic features (d.1–d.4).
In each case, the lower the feature appears in the table, the more optional its presence is. For example, (d.4) is very relative and dispensable: certain emblems do not appear to be equivalent – at least not exactly so – to crossculturally common verbal speech acts, but rather appear to be culturally specific, for example, the mano a borsa in Italy (see Poggi, 1983). It is also relative or debatable whether emblematic gestures can be translated “easily” (d.3); in some cases, we find emblems that have a very specific meaning and yet do not exactly reflect a verbal correlate. The higher the feature appears in the table, the more obligatory it is. So, the emblematic gesture can be defined – as a gesture – by (a.1), and specifically (its emblematicity) by the series (b.1)–(b.6): an ostensive gesture, with illocutionary force (which materializes a communicative act: an assertion, order, promise, etc.) [Footnote 3], which is conventional (in a sociocultural environment) and has a core that is meaningful (nonnatural, symbolic), deliberate (nonaccidental), and conscious. All of these features – one at a time and all together – allow (prototypical) emblems to be distinguished from other gestures with partially common features, such as what are often called recurrent, interactive, or pragmatic gestures (cf. Section 2.2). In this way, prototypical emblems can also be distinguished from other “pseudo-emblems” or “quasi-emblems,” in the sense of gestures that gradually approach the complete performance of the features.
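The logic of this prototypical definition can be made concrete with a small sketch. It assumes, purely for illustration, that a gesture has already been annotated with the feature codes listed in Table 1.2; the function names and the three labels returned are hypothetical and are not taken from Payrató and Clemente (2020).

```python
# Illustrative only: feature codes follow Table 1.2. The function names and
# the three labels are assumptions made for this sketch.
OBLIGATORY = ("a.1", "b.1", "b.2", "b.3", "b.4", "b.5", "b.6")
OPTIONAL = ("c.1", "c.2", "c.3", "c.4", "c.5", "d.1", "d.2", "d.3")


def missing_obligatory(features):
    """Return the obligatory features a gesture lacks, in table order."""
    return [code for code in OBLIGATORY if code not in features]


def optional_present(features):
    """Return the optional features a gesture shows, in table order."""
    return [code for code in OPTIONAL if code in features]


def classify(features):
    """Label a gesture by how fully it satisfies the obligatory features."""
    if "a.1" not in features:
        # Without bodily action or movement there is no gesture at all.
        return "not a gesture"
    if not missing_obligatory(features):
        return "prototypical emblem"
    # Some of b.1-b.6 are missing: the gesture only approaches the prototype.
    return "quasi-emblem or other gesture"
```

For a hypothetical annotation such as `{"a.1", "b.1", "b.3", "c.1"}`, `classify` returns "quasi-emblem or other gesture" and `missing_obligatory` lists the traits still lacking. A graded score, rather than these discrete labels, would arguably be closer to the scalar view described above; the discrete version is used here only because it is easier to read.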
2.2 Emblems, Recurrent Gestures, and Pragmatic Gestures: Vocal Items and Sign Items
Another dimension that must be considered in the study of emblems is the temporal one. The analysis can take time into account (and be diachronic) or not (synchronic). In the latter case, we study the repertoire constituted at a certain point in time, for example, the present. In the former case, we study the evolutionary process that has led to the birth of the emblem, its emergence, and origin (Teßendorf, 2013, Section 6.2; cf. Kendon, 1988a). This process can be understood basically as a threefold one: (1) acquisition of illocutionary force, (2) conventionalization of action (in social terms), and (3) progressive autonomy or independence from speech (Payrató & Clemente, 2020). The notion of recurrent gestures (see Ladewig, 2014 and in this Handbook for a synthesis) is very much concerned with these processes and refers to gestures that are one step before emblems (which, in turn, appear one step before sign language items), if we adapt Kendon’s continuum. The repertoires of emblems and the repertoires of recurrent gestures in a community are to be understood in synchronic analysis as open wholes with fuzzy boundaries [Footnote 4], whereas in diachronic analysis they are, in many cases, regarded as a representation of successive phases.
On the other hand, the notion of what constitutes a pragmatic gesture is often confused (Payrató & Teßendorf, 2014). The term is very general, but it can also be used to delimit a highly specific category: in broad terms, one composed of gestures that do not contribute to the propositional meaning of a statement but that do serve as guides to understand or modify it. Kendon (2017) distinguishes four functions of pragmatic gestures: modal, performative, parsing, and operational (the last one indicates the evidential status of what is said). In conjunction with these gestures – or considered as a separate category – interactive gestures (Bavelas, Chovil, Coates, & Roe, 1995; Bavelas, Chovil, Lawrie, & Wade, 1992) operate on relevant aspects of interaction (maintenance of the turn, request for turns, etc.).
Emblems can also be related in many ways to vocal paralinguistic elements, not least because they fulfill a number of similar functions. For instance, there are sounds for calling, asking for silence, and conveying rejection or acceptance. Some of these already have a verbalized version (onomatopoeia or ideophones), which represents “their entry” into verbality (e.g. in Catalan ecs!, uix!, uf!, or buf! as different types of rejection). Cultural roots are visible in the repertoire of both sound and onomatopoeia. The repertoire of a culture’s conventionalized sounds is usually much smaller than its repertoire of emblems, but many studies are needed to describe and relate them and to make intercultural comparisons.
As for sign language items, McNeill characterizes them as fully conventional, segmentable, analytical, and with (full) linguistic properties (see Table 1.1). In contrast, emblems have only a few linguistic properties (though more than gesticulation or pantomime), are segmentable but synthetic, and are not fully standardized. These features mean that there is no “language of emblems” similar to a natural spoken or signed language: there is no sequentiality or proper syntax. At most we can juxtapose some emblems and convey relatively undeveloped meanings (“eat” + “after,” “you” + “shut up,” “me” + “drink” + “leave,” “you” + “move” + “over there” …). These combinations can also occur with linguistic elements so that the emblem fills in empty verbal boxes (in the so-called “mixed syntax” of Slama-Cazacu, 1976) or builds “composite utterances” (Enfield, 2009). Nor does the compositionality of emblems seem to be maintained (Payrató, 2004): attempts to treat them as sets of minimal traits (see e.g. Meo-Zilio’s [1961a] attempt for the Spanish of Uruguay; cf. also Sparhawk, 1978) yield rather poor results, which are very different from those found with linguistic items (phones/phonemes, morphs/morphemes from complex syntactic strings). Instead, one type of compositionality that warrants further study is the grouping of two emblems performed by different parts of the body, for example, in gestures of insult (the forearm jerk with the arms plus horns or the fig with the finger) [Footnote 5] or gestures of indifference or disregard made at the same time by shrugging the shoulders and turning the mouth downwards (cf. Debras, 2017, and Jehoul, Brône, & Feyaerts, 2017).
3 Origins, Types, Structures, and Functions of Emblems
Differences between emblems arise as a consequence of their origins and the functions they carry out. Posner (2003) shows that emblems can come from processes of ritualization of widely varied (and originally noncommunicative) actions. Thus, the concept of emblem closely approximates the concept of display (as a communicative signal, the result of a ritualization). Displays are present in a host of evolved animal species from birds to primates (cf. Eibl-Eibesfeldt, 1972).
Other emblems come from the categories that Ekman and Friesen (1969) called affect displays and illustrators. As mentioned above, from a pragmatic view we can understand their emergence as a complex process of acquiring illocutionary force, conventionalization, and autonomy from speech. Other examples concern very particular historical or cultural processes (see, in particular, Morris et al., 1979, and recent cases in Brookes, 2001, 2011) or specific taboos (see Brookes, 2014b). They are sometimes related to linguistic expressions, such as the emblem for asking for “time out” representing a T with the hands (in many sports), or an ad hoc case, such as a flight attendant on a plane representing “tea” or “coffee” by making a T or C with the fingers. Many others have a very obvious iconic component and can be associated mimetically with basic human actions (eating, drinking, watching, listening …) and with image schemas (cf. Section 5.5: path, container, cycle, matching, pressure …).
Apart from their referents, Kendon (1981, p. 152) proposed speaking of their “base” as “the object, action, or (in some cases) abstract entity that the gestural form may be regarded as being modeled upon,” and he distinguished six types [Footnote 6]: (1) interpersonal actions (as in the Fingertips Kiss), (2) “intentional movements” (the Head Toss), (3) action patterns (among others, the Flat-hand Flick), (4) concrete objects (the Fig Hand), (5) symbolic objects (the Finger Cross), and (6) abstract entities (the Hand Purse meaning “many”).
The functions of emblems can be analyzed in different ways. As speech or communicative acts of illocutionary value, a canonical classification (à la Searle, see Searle, 1976) would separate emblems into assertive, exhortative (directive), commissive, expressive, and declarative (producing formal changes in world entities), as in Payrató (1993). Considering them from a viewpoint that combined function with meaning (a semantic core), Johnson, Ekman, and Friesen (1975) distinguished eight domains: (1) interpersonal directions or commands, (2) one’s own physical state, (3) insults, (4) replies, (5) one’s own affect, (6) greetings and departures, (7) physical aspect of persons, and (8) unclassified. In his comparative analysis of a variety of repertoires, Kendon (1981) found three basic categories: interpersonal control, announcement of one’s current state, and evaluative response. In the French research tradition, Dahan and Cosnier (1977) proposed a classification similar to the preceding ones [Footnote 7]:
I. Expressive
1. Affective expressions
- Detachment
- Disorientation and reflection
- Annoyance
2. Appreciations and comments
- Declarative with negative connotation
- Declarative with neutral or positive connotation
II. Conative
III. Phatic
1. Appreciations and comments
- Interactional modulators
2. Greetings
3. Deictics
4. Others
IV. Operational
Resting on semantic and pragmatic criteria, this latter classification is more accurate and comprehensive than preceding ones. Other types of emblems can be distinguished by their dependence on the particular linguistic expressions of a language or variety or by their usage in vast cultural areas. There is a gradation that would take us from local to regional or supraregional solutions (Payrató, 2008). In fact, no one has dared to speak of universal emblems because it has always been recognized that the character of these units is intrinsically culturally specific. This does not mean, however, that their extent cannot vary, depending largely on the degree of iconicity and the type of action. Basic human actions can be reflected in strongly iconic emblems that are very similar to one another, resulting in a stark contrast to symbolic emblems, which are often associated with local sociocultural or linguistic content.
4 Repertoires of Emblems: Methodology
In fact, many published collections of gestures – characterized with or without adjectives as “symbolic,” “folkloric,” “traditional,” etc. – are mostly collections of emblems because they are merely gestures that happen to have more regular formal standards and keep a basic core meaning out of context. That is why we find them also represented in images, paintings, and objects (amulets, key chains, etc.).
Notwithstanding some nuances, the first repertoires can be regarded as those of Bonifacio (1616/2018), Bulwer (1644/1974), Austin (1806), and especially Jorio (1832/2000), Mallery (1881a, 1881b, 1891), and Pitrè (1889) (cf. Kendon, 2004; Payrató & Clemente, 2020; Teßendorf, 2013). From there we can jump to compilations that begin around 1950 (in chronological order: Devereux, 1949; Brewer, 1951; Cardona, 1953–54; Amades, 1957; Meo-Zilio, 1961a, 1961b) and then to the 1970s with the appearance of the earliest established repertoires that have explicit samples and criteria (Johnson et al., 1975; Saitz & Cervenka, 1972) and that in some cases refer to vast geographical and cultural areas (Barakat, 1973; Creider, 1977).
It is very difficult to give a complete list of repertoires. Apart from (uncommon) technical repertoires in, for instance, sports and professional domains, Payrató and Clemente (2020, pp. 83–84) have compiled roughly 100 works on current emblems. These works range from short articles (with a limited number of gestures) to dictionaries that can stretch to hundreds of pages (like Meo-Zilio & Mejía, 1980–83). Teßendorf (2013, pp. 89–90) separates “mono-cultural” repertoires from “cross-cultural and contrastive emblem collections.” Among the former group, the most prominent and most numerous are dedicated to Italian, Spanish, Portuguese, French, and English (cf. Bonaiuto & Bonaiuto, 2014; Galhano-Rodrigues, 2014; Payrató, 2008, 2014; Payrató & Clemente, 2020; Poggi, 2014; Rector, 2014; Teßendorf, 2013). Included in the latter group are comparative studies on two repertoires (the first probably being Saitz and Cervenka, 1972, dedicated to Colombian and U.S. American) or on geographical regions, with Morris et al. (1979) being the most well-known work, while Rector and Trigo (2004) compared “Portuguese communication” on three continents (cf. also Reiter, 2014, and Brookes, 2004, 2014a).
Considering that there are between 6,000 and 7,000 languages in the world and that the number of corresponding cultural fields is very high but impossible to calculate with accuracy, the amount of available data on emblems is clearly very small. Considering also the issue of the diversity of comparison methodologies (cf. Payrató, 2001; Teßendorf, 2013, p. 89), solving the global problem becomes a serious challenge for future studies of emblems and for intercultural pragmatics. Poyatos (1975) and Kendon (1981) were pioneers in this comparative line (cf. Kendon, 2004; Payrató, 2014; Poggi, 2014; and Teßendorf, 2013, for an overall view).
Leaving aside precedents linked to anthropology and popular culture (especially Hayes, 1959, a guide to collecting gestures), work on the methodology for the analysis of emblems began with the proposal put forward by Poyatos (1975), whose central question was how to create a repertoire. As for recordings, Efron (1941/1972) was the first to use cameras, while Johnson et al. (1975) first proposed the technique to follow: a double method of coding and decoding from a questionnaire, while controlling for the degree of informants’ certainty in relation to their answers, and for the degree of the units’ naturalness. Calbris (1990) also proposed measures of this kind for emblems associated with French, with intracultural and intercultural comparisons, in order to check units, and Poggi and Magno Caldognetto (1997) and Poggi (2002, 2014) provided a great deal of information on the development of the so-called “gestionario.” Brookes (2005), following Sherzer’s ethnographic approach (see Section 5.3), introduced a host of innovations by combining the importance of establishing the repertoire with a microanalysis of the usage of emblems in natural communicative interactions, searching for the contextual meaning of emblems and their pragmatic values.
In principle, the notation, transcription, and representation of emblems do not differ from those used for other gestures (cf. Müller, Cienki, Fricke, Ladewig, McNeill, & Teßendorf, 2013, 2014). Drawing and photography have been the traditional methods of representation. Some collections can now be viewed on video on the Internet and the gain is obvious because all phases of the emblem are visible, not just the stroke (or central phase), which is the phase reproduced in static media such as drawing or photography. Moreover, videos also provide more information on accompanying facial expressions and eye gaze. Images from video or films make it possible to better appreciate the differences between emblems. Some emblems may appear the same if we look at only one image (such as the examples mentioned in Section 1 above: “victory” and “two fingers”), and in fact they often cause much confusion in work with informants. When using video, however, the differences in the production process are very evident and the interpretations become much more consistent.
5 Perspectives on the Analysis of Emblems
The precedents for our current analyses of emblems first arose in the fields of local historical studies (Jorio, 1832/2000), studies of popular culture (Amades, 1957; Hayes, 1940, 1951), anthropology (Mallery, 1881a, 1881b, 1891), and dialectology (Danguitsis, 1943; Rohlfs, 1959). Efron (1941/1972) and Ekman and Friesen (1969) combined anthropological and psychological perspectives (but also a semiotic basis) in the earliest studies that can be viewed as scientific stricto sensu. At the same time, Brault (1963) and Green (1968) undertook the first studies that could be considered applied or for specific purposes, in particular for the teaching of French and Spanish, respectively. Throughout the second half of the twentieth and the beginning of the twenty-first century, we can distinguish different perspectives on the analysis of emblems, which appear grouped together, as described in Sections 5.1–5.5.
5.1 A Cultural View: Geographic Distribution
Morris et al. (1979) developed one of the best-known and most frequently cited studies on emblems. It was based on a survey of 1,200 informants, randomly selected in public places, in 40 European cities. They surveyed the 20 gestures reproduced in Figure 1.1 and then compiled their results in tables and maps according to meaning and frequency of use.
Figure 1.1 Illustration of the 20 symbolic gestures analyzed by Morris et al. (1979), with their names: (1) The Fingertips Kiss, (2) The Fingers Cross, (3) The Nose Thumb, (4) The Hand Purse, (5) The Cheek Screw, (6) The Eyelid Pull, (7) The Forearm Jerk, (8) The Flat-Hand Flick, (9) The Ring, (10) The Vertical Horn-Sign, (11) The Horizontal Horn-Sign, (12) The Fig, (13) The Head Toss, (14) The Chin Flick, (15) The Cheek Stroke, (16) The Thumb Up, (17) The Teeth Flick, (18) The Ear Touch, (19) The Nose Tap, and (20) The Palm-Back V-sign
Despite obvious problems in the selection of the localities surveyed (due to an imbalance in the representation of the respective cultures) and considerable methodological confusion (cf. Payrató & Clemente, 2020), it is undeniable that the study by Morris et al. (1979) contributed a great deal of information on the knowledge and use of these emblems, their geographical and cultural distribution, and their historical origins. Some of the emblems – e.g. nos. (1) and (12) – have more than 2,000 years of history (they come from classical Greek and Roman civilization; see Fornés & Puig, 2008). Others are quite recent, such as (16), which became popular because of American influence throughout the second half of the twentieth century, rather than coming from ancient Rome, despite popular lore to the contrary (Morris et al., 1979, p. 187). Some extend across broad domains (e.g. [3], [7], [9], and [16]), while others belong to a much more closed cultural domain (e.g. [5], [14], [15], [17], and [18], which are typical for Italy). In some cases, a boundary (or gestural isogloss) may be established, analogous to a dialectal isogloss, such as the one that separates the use of (13) as a negation, typical of southern Italy (and some other European areas, such as Greece, Bulgaria, Albania, Macedonia, and Romania) and the Middle East, from the denial made with lateral movements of the head, typical of northern Italy, above Naples, and of many other European areas (cf. also Rohlfs’ pioneering 1959 work on the same gesture or that of Geiger and Weiss, 1950, with a map of gestures used to seal a transaction in the market).
The cultural history of some emblems is so specific that it has been collected in monographs, for example, those on the fig ([12] above, see Leite de Vasconcellos, 1925), the “V” for “victory” (Schuler, 1944), the nose thumb ([3], Taylor, 1956) and many others (cf. Fornés & Puig, 2008; Serenari, 2003). According to Teßendorf (2013, p. 90), cross-cultural findings on emblems can be grouped into “issues of varying complexities: Differences in the meaning(s) of individual gestures, their spread and distribution; differences in cultural key concepts expressed by emblems and finally differences in the use, size and diversity of a gestural repertoire” (cf. also Kita, 2009; Matsumoto & Hwang, 2013).
5.2 The Semiotic and Pragmatic View
The meanings and functions of emblems are linked to the immediate interactive situation, but the usage of emblems can be associated basically with two broad communicative domains. The first and fundamental one is that of ordinary language (oral, spontaneous, and informal); in other words, the usual communicative interaction within speech communities. Emblems are thus associated with the colloquial or ordinary variety of a language and are literally “everyday gestures” (Payrató, 2004). In a second domain, emblems can be used as equivalents of special languages (i.e. for specific purposes), for example, in professional domains when certain messages need to be “signalled,” often due to the distance between interlocutors (e.g. outdoor workers exchanging instructions). Special domains are also found in cases such as monasteries (when there is a vow of silence, see e.g. Barakat, 1975 and cf. Kendon, 1990, 2004) and very clearly and popularly in sports domains (i.e. in the codes of referees and umpires: Payrató & Clemente, 2020). In some cases, the emblems of both domains – current and specialized – can coincide, for example, the emblems used to represent numbers (which also show clear cultural differences). A different situation arises with the emergence of sign languages that can substitute for speech in almost any circumstance (and thus alternate with speech), for instance, Aboriginal Australian sign languages (Kendon, 1988b) or Plains Indian Sign Language (Mallery, 1881b, cf. Kendon, 2004, pp. 303–306).
In the case of current repertoires, Kendon (1981, 2004) has highlighted the contexts in which the usage of emblems is most common: at a certain distance, when the environment is noisy, or vice versa, if silence is required, when no answers are needed, etc. In these cases, the simplicity of the emblematic “performance” exceeds the qualities of verbal language. However, a more in-depth analysis of how emblems and verbal language interact should be undertaken in order to discover the semantic-pragmatic specificities of the combination or to better explain the absence of speech. Emblems can be performed as completely autonomous gestures, they can fill empty grammatical slots (Alturo, Clemente, & Payrató, 2016), or they can be combined with speech. Poggi (2014, p. 1484) proposes drawing a “proto-grammatical” distinction between articulated and holophrastic gestures: “A holophrastic signal is a unitary signal that conveys a whole communicative act, of both performative and propositional content, while an articulated signal conveys only part of it” (Poggi, 2014, p. 1485, see also Poggi & Magno Caldognetto, 1997). These alternatives warrant further study, and indeed some ethnographic analyses such as those discussed in Section 5.3 have already shed light on these points and on the birth and spread of certain emblems.
5.3 The Ethnographic Analysis of Emblems
Sherzer (1973) has shown that the pointed-lip gesture (typical of Cuna de San Blas, Panama, but also of some other places in Central and South America) is much more than a simple deictic gesture with univocal meaning. His ethnographic analysis of the peculiarities of this emblem demonstrates that it can have up to nine different meanings. These nine meanings can be classified according to the dichotomous application of four criteria (+/− pointing, +/− interactional, +/− opening, and +/− mocking).
Subsequently, Sherzer (1991) conducted a similar study in the case of the Brazilian thumbs-up gesture (no. 16 in Figure 1.1), and Brookes (2001) followed the same line in her analysis of the “Clever gesture” among young black urban dwellers of Johannesburg and surrounding areas in South Africa. With an innovative methodology based on a direct and natural analysis of interaction (and not on questionnaires and surveys), Brookes also showed how different pragmatic meanings and functions take shape around a semantic core associated with the action of “seeing.”
An added value of the ethnographic study of emblems is that, despite the (relatively) strong stability of their formal standards, they are items subject to a high degree of sociocultural, generational (historical), and functional (stylistic) variation, so their contextual and contrastive analysis is imperative. In his analysis of Arabic gestures, Barakat (1973, p. 760) has noted that “[s]emiotic gestures are linked to male, female, and age groups, although not exclusively by each group since the same configuration for one gesture may be used by more than one group but with different meanings.”
Ethnographic studies should also provide data on why profound differences exist between emblem repertoires. Efron (1941/1972) has shown that a group of Italians analyzed in New York used many emblems (more than a hundred), while a group of Eastern European Jews used very few (six). Kendon (2004) attributes the abundance of emblems (and other gestures) in the city of Naples to the ecology of interaction (and the vitality of street life).
5.4 Cognitive Dimensions of Emblems
Despite the advancement of the cognitive-studies paradigm and its impact on communication studies, emblem analysis has not benefited much from the cognitive perspective. However, it does appear that this perspective could be applied in many ways. To begin with, emblems are generally perceived very clearly and unambiguously in the communities or social groups where they are used. Expressed in terms of relevance theory (e.g. Sperber & Wilson, 1986; Wilson & Sperber, 2004), they clearly contribute to an interlocutor’s increased cognitive environment (or mental context) and they do so with very low processing costs. They are at the opposite end of what interlocutors interpret as an irrelevant gesture or action (at least in communicative terms). In any event, the need to take into account a principle such as relevance (which includes other conversational maxims of a smaller scope) in any explanation of linguistic and gestural (at least emblematic) use is reinforced by the application of the principle to these two domains (cf. Wharton, 2009).
On the other hand, though we do not yet have detailed studies on the subject, many emblems appear to be related to image schemas, as in the case of other types of gestures (Cienki, 2005, 2013; Ladewig, 2011). This would appear to be a useful avenue to pursue in the future analysis of emblems (cf. Payrató & Clemente, 2020, Section 4.1.1).
Consideration should also be given to applying the family resemblance model (cf. Wittgenstein, 1953, and also Rosch & Mervis, 1975) to many emblems that are related to one another or may be considered variants, and which are affected by various phenomena of synonymy or polysemy. The radial conception of emblems in networks or structures (Payrató, 2003; Payrató & Clemente, 2020) seems more plausible and explanatory than a lexicographic list of units.
The metaphorical and metonymic tropes present in emblematization processes are also a subject for analysis. The cognitive paradigm has been useful in discovering many aspects of gesture (Cienki, 2013; Cienki & Müller, 2008, 2013; Mittelberg & Waugh, 2014), but much remains to be done precisely in the case of emblems. Some of the metaphorical, metonymic, and combined processes that occur in these cases have already been identified (cf. Payrató, 2008; Payrató & Clemente, 2020). Poggi (2014) also nods in this direction when she describes ironic/sarcastic or nonliteral uses of emblems.
5.5 Applied Studies
The lexicographic side of emblem collection is obvious. Indeed, the vast majority of gesture dictionaries are emblem dictionaries, though they do not use the word because it is likely to sound too technical (especially for publishers). Entries are usually sorted by the basic meaning of units, but obviously this creates a great many problems. Managing an emblem dictionary is not as easy as managing a word dictionary: It looks more like a dictionary of colloquial expressions (where it is often not easy to find certain items). Only one dictionary is organized around the body parts involved (Bäuml & Bäuml, 1975), but while its criteria are highly original, it frankly does not seem to be very helpful in making searches easier.
The most extensive dictionary is likely the one from Meo-Zilio and Mejía (1980–83), which has two volumes, 425 pages, and a large number of items (sorted by semantic slogans) from an immense territory (Spain and Latin America). Unfortunately, it contains very little information on the data-collection methodology. Other dictionaries worth mentioning are inter alia: Munari (1963, 1994) for Italian; Calbris and Montredon (1986) and Calbris (1990) for French; Kreidlin (2004) and Monahan (1983) for Russian; and Papas (1972) for Greek. In some cases, the purpose of the dictionary is not merely compilation, description, or disclosure but rather to make progress in basic research, for example, the dictionary planned in the case of Berlin (which takes into account both the German and Turkish languages and their speech communities, cf. Serenari, 2003, 2004).
Some of these dictionaries have been devised for the teaching of a second language, such as Green’s (1968) for the teaching of Spanish or the brief article by Brault (1963) for the teaching of French. The language and culture of the recipient is also sometimes taken into account (see Takagaki, Ueda, Martinell, & Gelabert, 1998, for teaching Spanish, mainly to Japanese students).
Other fields that have been addressed very little in the case of emblems, but that also warrant in-depth study, include speech pathology and speech recognition and synthesis. In the former case, we have no data on how certain pathologies affect the knowledge and ability to use emblems, but they would seem to be linked only with verbal abilities (see Xu, Gannon, Emmorey, Smith, & Braun, 2009; cf. Lindenberg, Uhlig, Scherfeld, Schlaug, & Seitz, 2011). This is indeed what we would expect, given that the quasi-verbal nature of these units has always been asserted. Nor do we have many studies in the field of speech synthesis and speech recognition that include references to emblems. Nevertheless, it is equally clear that the meaning conveyed through emblems should be able to be recognized (and analyzed), on the one hand, and be elaborated (and synthesized), on the other. This would contribute to a more natural and faithful picture of human–computer communication.
6 Closing Remarks
Emblems are a subject of study in which many distinct fields of analysis converge, though they can sometimes appear as disparate as the history of art (and of painting in particular), lexicography, computational linguistics, and speech recognition, to name but a few. This could also be said if we look at the main perspectives of analysis: anthropological, psychological, cognitive, semiotic, or linguistic, to list only the main ones. In all cases, we can discover a common interest: communication and multimodal interaction (cf. Mondada, 2013). Nonetheless, studies that analyze the particular operation of emblems in interaction remain scarce, while other issues such as the categorization of items and their collection in repertoires have hitherto received priority. In addition, because these repertoires have been compiled using different methods, they are not easy to compare, so the data at our disposal may sometimes seem large in quantitative (absolute) terms but in fact be scattered and poorly contrasted in qualitative terms. Moreover, there are many communities whose repertoire of current emblems has not yet been described. These future repertoires should constitute common research corpora available to all researchers, marked by strict control of functional (/stylistic or contextual), sociocultural, and historical (/generational) variables as well as the sex of the emblem users.
There are no easy recipes or tips for future research, but studies on emblems cannot be limited to the taxonomic stage (compilation and description), nor can they be “atheoretical,” that is, autarkic with respect to theorizing about communication. The need for an interdisciplinary theoretical framework to ground interpretations and explanations seems obvious, and intercultural pragmatics and cognitive science need to play a role in the shaping of such a framework. The goal can be summed up in a nutshell: to discover the interrelationships between cognitive, sociocultural, and pragma-stylistic factors in the use of these units, which constitute a perfect example of the multimodal complexity of human communication.
1 Introduction
Recurrent gestures are conventionalized co-speech gestures. They show a stable form–meaning relationship and occur repeatedly across different contexts and speakers. They are often derived from practical actions and are engaged in semantic and pragmatic meaning-making. In fact, the communicative potential of these kinds of gestures has always been a subject of study in the field of rhetoric (Mosher, 1916; Ott, 1902; Quintilian, 1969). However, within the field of gesture studies, recurrent gestures remained a marginal research phenomenon (Bavelas, Chovil, Coates, & Roe, 1995; Calbris, 1990; Kendon, 1995) until the upsurge of interest observable in recent years. A psychological perspective on gesture, which conceives of gestures as spontaneous creations reflecting the imagistic side of thinking (cf. McNeill, 1992), has dominated the field of gesture studies for a long time. With it, conventionalized manual movements, such as recurrent gestures, were excluded from the research agenda (cf. Harrison, 2018; Müller, 2018). As such, the rising interest in recurrent gestures “is quite significant for a field that has long wondered about the purpose and function of co-speech gesturing and that has kept itself in a theoretical bind by ruling out the conventionality and language-like nature of hand gestures” (Di Paolo, Cuffari, & De Jaegher, 2018, p. 301).
Recurrent gestures have been studied from different disciplinary perspectives that have explored processes of their emergence and stabilization as well as facets of their communicative potential. In what follows, three different lines of research will be outlined that explore the individual, the linguistic, and the cultural side of recurrent gestures.
1.1 What Are Recurrent Gestures?
Adam Kendon’s (2004) and Cornelia Müller’s (2004) studies on the Palm Up Open Hand can be considered as the starting point for a growing interest in “recurrent gestural forms” (Kendon, 2004, p. 226; Müller, 2004, p. 239). Their research pointed out that speakers of different cultural backgrounds not only perform spontaneously created and idiosyncratic gestures (McNeill, 1992) but also show gestures which are stable in form and meaning. These gestures recur across different contexts and speakers while responding to the local exigencies of an interactive situation (Ladewig, 2010, 2014b). Their investigations identified topics and research questions which can be incorporated in an overarching research agenda to be addressed from a variety of disciplinary perspectives as will be shown in this chapter (and in Harrison & Ladewig, 2021).
The most salient feature Kendon and Müller point to is the stabilization of gestural forms and meanings. Recurrent gestures, like any other gestures, are performed spontaneously, meaning their performance is not planned by a speaker. However, they differ from “singular gestures” (Müller, 2010, 2017), whose forms and meanings emerge while speaking. In the field of gesture studies, singular gestures are also called iconic gestures or metaphoric gestures (McNeill, 1992). They have not undergone stabilization processes and are therefore not culturally shared as is the case with recurrent gestures and emblems.
Of course, gestures themselves form culturally shared practices (cf. Streeck, 2009), falling back on different semiotic strategies of meaning-making (cf. Kendon, 1980; Müller, 2014). Yet recurrent gestures, as the term suggests, are born through repetition and have become sedimented forms of embodied meaning “available in the joint embodied know-how” (Di Paolo et al., 2018, p. 151) of the individuals of a community.
The second characteristic of recurrent gestures that arouses the interest of gesture scholars is their specialization in pragmatic meaning-making. This aspect was first addressed from the point of view of rhetoric (Quintilian, 1969) and in the education of actors (Mosher, 1916; Ott, 1902). The pragmatic meaning potential of recurrent gestures can be discovered in studies on contexts of use. They show:
how certain stable gestural forms with what we called “pragmatic” functions are used. […] They appear to have emerged as ways of treating certain recurrent features of discourse in interaction, including topic specification, refusal, negation, and offering and asking. That gestural expressions are used to treat these kinds of moments may be explained in part by the fact that such moments transcend any particular form of verbal expression, they can be indicated and treated apart from speaking as well as simultaneously with it and can serve, thus, as modulators of or as operators upon whatever spoken discourse may be involved.
This means that, while singular gestures often contribute to the proposition of a multimodal utterance, recurrent gestures mainly serve pragmatic functions. Kendon (2004, pp. 158–159) identified four pragmatic functions of co-speech gesture, that is, (a) modal – framing how an utterance should be interpreted, (b) performative – enacting a speech act such as offering ideas or stopping someone, (c) parsing – punctuating the spoken discourse into logical components, and (d) interactive and interpersonal functions regulating turns at talk such as holding the floor or requesting a turn. Recurrent gestures are specialized in fulfilling these functions, which is reflected in other terms for them proposed by a “function-centered approach” (Cooperrider, 2019, p. 226) including “pragmatic gestures” (Kendon, 1995; Payrató & Teßendorf, 2014; Streeck, 2009; Teßendorf, 2014), “gestures with pragmatic function” (Kendon, 1995, 2004), “interactive gestures” (Bavelas, Chovil, Lawrie, & Wade, 1992), or “speech handling” gestures (Streeck, 2009). Their ability to participate in pragmatic meaning-making is considered to be based on and informed by the manual action of which recurrent gestures are born. To be more precise, many recurrent gestures “are abstracted versions of practical actions” (Streeck, 2017, p. 203). They are not images of actions as psychological accounts have suggested (cf. Müller, 2018, p. 9) but are movements of the hands which have become communicative actions (examples are shown in Figure 2.3).
Their embodied basis in everyday actions is considered as one of the reasons why recurrent gestures are so widespread and why they occur even in completely unrelated languages (see Section 4). Bressem and Wegener (2021, p. 245), for example, argue that the “instrumental actions on which, for instance, Holding Away gestures are based, are common practices not particular to specific cultures, but [they] are rather elementary human experiences.”
A third aspect that Kendon (2004) and Müller (2004) address is the diversity of recurrent gestures in terms of their contextual and kinesic variation. Both researchers identified changes in the gestural forms that go along with changes in their “semantic theme” (Kendon, 2004, p. 227). These form variations may also stabilize. The Palm Up Open Hand gesture, for instance, shows “some highly recurrent features which are combined or blended with the core form” (Müller, 2004, p. 253). These features include the use of both hands, the repeated downward movement and/or the rotation of the hand. Other form features such as an antagonistic and wide lateral motion appeared to be less recurrent.
The configuration of the hand and the orientation of the palm often form the core of a recurrent gesture, while the movement of the hand or the position in the gesture space may form their variants. The Holding Away gesture, or VP gesture in Kendon’s (2004) terms (Figure 2.3), for instance, is characterized by a flat hand oriented away from the speaker’s body. In cases where it marks the change of a topic or signals inference, it is held close to the speaker’s body (cf. Bressem & Wegener, 2021). In cases of stopping or interrupting a coparticipant’s line of action, it is often directed toward the addressee and thus moved into the interactional gesture space (cf. Calbris, 2003; Harrison, 2018; Kendon, 2004). Another example is the Cyclic gesture, which is characterized by the continuous outward rotation of the hand (Ladewig, 2010, 2011, 2014a). Its various interactional functions are embodied by the position in gesture space and the size of movement. When embodying the speaker’s communicative activity of searching for a word or concept, the gesture is located in the central gesture space. When requesting the interlocutor to continue with an activity, it is often located in the lateral periphery of the gesture space and performed with a large movement.
What these examples show is that, similar to any other gesture, recurrent gestures are engaged in meaning-making in situ (“local meaning,” Ladewig, 2011, 2014b). Their response to the local exigencies of an interactive situation becomes evident in different form and meaning variants, which themselves can become stabilized. These gestural variants specialize in dealing with recurrent cognitive, interactive, and communicative tasks (cf. Streeck, 2017) associated with recurring communicative activities or “contexts of use” (Scheflen, 1973; Sherzer, 1991). This observation advanced the idea of “gesture families,” that is, socially shared groupings of gestural forms (Fricke, Bressem, & Müller, 2014; Kendon, 2004; Müller, 2004), which is a useful notion to study the “diversity of recurrency” (Harrison, 2018, p. 213; see also Harrison & Ladewig, 2021) and explore stabilization processes in gestures (see Section 3.1).
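For readers who annotate gesture corpora, the distinction between a stable formal core and its positional or kinesic variants can be made explicit in a small data structure. The Python sketch below is purely illustrative: the class names (`RecurrentGesture`, `Variant`), the field names, and the wording of the labels are hypothetical conveniences rather than an established coding scheme, and the entries only restate the two examples described in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    """One contextual variant of a recurrent gesture (hypothetical schema)."""
    position: str                        # location in gesture space
    function: str                        # function reported for this variant
    movement_size: Optional[str] = None  # filled in only where the text mentions it


@dataclass(frozen=True)
class RecurrentGesture:
    """A stable formal core plus its position/movement variants."""
    name: str
    core_form: str
    variants: tuple


# Entries paraphrase the Holding Away and Cyclic examples; labels are assumptions.
HOLDING_AWAY = RecurrentGesture(
    name="Holding Away",
    core_form="flat hand oriented away from the speaker's body",
    variants=(
        Variant("close to speaker's body", "marking topic change or inference"),
        Variant("toward addressee", "stopping or interrupting a coparticipant"),
    ),
)

CYCLIC = RecurrentGesture(
    name="Cyclic",
    core_form="continuous outward rotation of the hand",
    variants=(
        Variant("central gesture space", "searching for a word or concept"),
        Variant("lateral periphery", "requesting the interlocutor to continue",
                movement_size="large"),
    ),
)


def functions_at(gesture: RecurrentGesture, position: str) -> list:
    """Return the functions recorded for a gesture at a given position."""
    return [v.function for v in gesture.variants if v.position == position]


print(functions_at(CYCLIC, "central gesture space"))
```

Keeping the core form on the gesture and the position and movement size on the variant mirrors the idea that the former stays constant while the latter carry the contextual differences; a real annotation scheme would need far finer-grained form parameters.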
As mentioned before, recurrent gestures were studied from different disciplinary approaches which explored processes of their emergence and stabilization, and the facets of their communicative potential. In what follows, three different lines of research will be outlined dealing with the individual, the linguistic, and the cultural dimension of meaning construal involved with recurrent gestures.
2 Recurrent Gestures as Sedimented Individual and Social Practices
Understanding gestures as part of a “‘cultural body’ which is the sedimentation of its spontaneous acts” (Merleau-Ponty, 1963, p. 249) is particularly advanced within a praxeological account of gestures. This approach is informed by the practice turn (cf. Schatzki, Cetina, & Von Savigny, 2005) which has inspired theories to contradict the view that (social) meaning is mediated only by immaterial, disembodied, mental representations of the world (the linguistic turn). For praxeology, the corporeality of practices and the materiality of contexts are crucial for the creation of intersubjectivity (the material turn):
A whole range of material social science research fields, from organizational research, sociology of science and technology, and gender studies to media research and life-style analysis, now regularly fall back on praxis-theoretical vocabularies to reconstruct the routines in companies, the forms of using technical and media artifacts, the characteristics of gendered performances, or, for example, the “doing culture” in everyday time practices.
The basic notion of practice was introduced to bridge the theoretical gap between agency and structure, or subjectivism and objectivism. It is informed by, among others, what has been called micro-sociology, which studies social actions between social actors in smaller social units (Garfinkel, 1967; Goffman, 1956), Wittgenstein’s (1953) usage theory of meaning, and the agency versus structure debate addressing the fundamental question of whether the individual’s agency or the social structure shapes human societies (Bourdieu, 1977; Giddens, 1984). The notion of practice highlights the idea of doing culture and “is typically invoked to explain continuities or commonalities among the activities of social groups” (Rouse, 2005, p. 199). It allows us to focus on recurring and collective conduct based on practical knowledge rather than an intellectual “knowing that” (schema, codes). This knowing how can be considered as “a conglomerate of everyday techniques, a practical understanding in the sense of understanding something” (Reckwitz, 2003, p. 289, my translation). Practices rely on the recurrence and habitualization of acts which are considered as starting points for their emergence in language and gestures. However, the “logic of practice” (Bourdieu, 1990) is to bring about openness and changeability that enable agents to adapt them to specific situations while dealing with them in a skillful way (cf. Reckwitz, 2003, p. 294).
When gestures are considered as practices, they are understood as routinized, nonreflexive, and preconceptual activities of the body in a material world (cf. Streeck, 2013). They are conceived of as products of practices of which they are also resources (cf. Streeck, 2016, p. 69) and as prereflexive forms of enaction (i.e. the body is in action before being conscious of it). This embodied practical knowledge becomes explicitly visible in recurrent gestures, as many of them are born of practical actions “though not [a] particular body’s actions – anybody’s actions” (Streeck, 2017, p. 269). These “abstracted versions of practical actions” (Streeck, 2017, p. 203) are both personal and cultural and are always related to a situation in which they are embedded. As such, the meaning of gestural practices and their practical understanding cannot be found only in the form itself but also in its relationship to a communicative context (cf. Bateson, 1972). The interrelation of gestural practices with their surroundings has been captured by the notion of “ecologies of gestures” (Streeck, 2009), of which six have been introduced: “Making sense of the world at hand,” “[d]isclosing the world within sight,” “[d]epiction,” “[t]hinking by hand,” “[d]isplaying communicative action,” and “[o]rdering and mediating transactions” (pp. 8–11). Recurrent gestures are most often observed in the latter two ecologies because they embody communicative actions which often regulate human interaction, such as “giving and receiving,” “holding,” “holding at bay,” “slicing” or “cutting,” and the “precision grip” or “rotation” (Streeck, 2009, 2017).
These recurrent gestures are conceived of not only as more personal habits but also as routines that speakers develop as solutions for recurrent communicative tasks. They form an individual’s repertoire of communicative practices, invoking sensations and affective dimensions every time they are performed, making a speaker “feel himself, consciously or not, a particular kind of person” (Streeck, 2017, p. 296). As such, recurrent gestures are part of a process of constantly shaping and reshaping a speaker as a speaker – a process described as self-making or autopoiesis (Di Paolo et al., 2018; Streeck, 2017). Individuals, while reproducing themselves as speakers, also reproduce society (cf. Heller, 1984) because practices of every kind sustain a community. Recurrent gestures are, thus, also understood as a form of “cultural action” embedded in a circle of individuation and of sustaining society:
Cultural development is possible because embodied persons adaptively and creatively sustain and reproduce themselves. We must therefore turn to the individual body if we want to understand the reproduction – or re-instantiation and “re-inscription” – of embodied culture. Thus, in the study of embodied communication practices, biological, phenomenological, sociological, linguistic, and anthropological perspectives merge.
Conceived of as both forms of self-individuation and resources that form a “community of practice” (Lave & Wenger, 1991), recurrent gestures make visible the practical knowledge of dealing with communicative, interactional, and cognitive tasks. As incorporated and, thus, sedimented social acts, they create a shared cultural dimension affirming their meanings when being enacted but also creating novel significance that emerges from the particular situations and further social acts in which they are embedded (cf. Di Paolo et al., 2018). Thus, recurrent gestures, like other stabilized expressions, are embedded in “complex processes of sedimentation and spontaneity” (Di Paolo et al., 2018, p. 10). Novel gestural expressions can evolve as local repertoires that find ways of becoming shared in a community. However, some gestural expressions might remain a once-in-a-lifetime occurrence or develop into a personal habit. If particular performances of a recurrent gesture that respond to the local exigencies of an interactive situation become part of an “in-group pragmatics” (Di Paolo et al., 2018, p. 158), the emergence of systematic form, meaning, and context variation can be observed. The genesis of such mutually sedimented structures shows similarities to the emergence of grammatical structures in spoken and signed languages and, thus, reveals gestures’ potential to become language-like. Both aspects, that is, stabilization and the “linguistic potential of gesture” (Müller, 2013; see also Armstrong, 1999/2002), are addressed in Section 3.
3 Recurrent Gestures and Their Linguistic Potential
What does it mean to become language-like? Clearly, this aspect goes back to a linguistic view of gestures and, thus, to the linguist’s trained eye searching for units and patterns that show a degree of stability and recurrence. As a matter of fact, the idea that gestures show some degree of stability has been conceived of as a criterion for identifying types of gestures ever since antiquity (Barakat, 1969; Mosher, 1916; Ott, 1902; Quintilian, 1969). Yet, processes of becoming stable forms have been addressed only selectively (see Section 1). During the past few years, researchers, most of them linguists, have become increasingly interested in defining stability and fleshing out parameters that contribute to the emergence of stabilized gestural forms (Fenlon, Cooperrider, Keane, Brentari, & Goldin-Meadow, 2019; Ladewig, 2010, 2014a; Müller, 2018). This interest was particularly encouraged by the development of new computational, statistical, and technical tools and growing archives of audio-visual data. While conducting corpus analyses of either verbal expressions or recurrent gestures, researchers became aware of recurring “speech gesture ensembles” (Kendon, 2004, p. 310), which brought the close relationship of recurrent gestures and speech to the fore. Starting from a particular verbal unit or a recurrent gesture (cf. Bressem & Müller, 2017), researchers have begun reflecting upon the cognitive entrenchment of these multimodal patterns, dubbed “multimodal constructions” (e.g., Andrén, 2010; Bressem, 2021; Schoonjans, 2017; Zima, 2017).
The point of departure is the premise that “constructions” (Goldberg, 1995), that is, entrenched form–meaning pairings taken as basic units of language, are potentially multimodal in nature because language use is multimodal (with modality understood here as a semiotic resource creating meaning). Among the gesture-based multimodal constructions discussed so far are the “Negative-Assessment-Construction,” incorporating the “Away action scheme” (Bressem & Müller, 2017), the “‘Tell me’ Joint Action Construction,” including the Cyclic gesture (Ruth-Hirrel, 2018), and “complex beat-point constructions,” incorporating pointing gestures with the movement pattern of beats (Ruth-Hirrel & Wilcox, 2018). The speech-based constructions analyzed are, for instance, the German existential construction es gibt (“there is/are”) incorporating the Palm Up Open Hand (Mittelberg, 2017), linguistic motion constructions of English, including “[V (motion) in circles],” “[zigzag],” “[N spin around],” and “[all the way from X PREP Y]” incorporating different movement patterns (Zima, 2017), or appositions accompanied by head nods (cf. Lanwer, 2017). Along with the systematic description of multimodal units comes the discussion of suitable grammatical models in which the phenomenon of systematic and frequent co-occurrences can be integrated. Proposals that have been made address, for instance, a continuum of constructions from infrequent to frequent systematic co-occurrences of gestures and speech (cf. Zima & Bergs, 2017, p. 2) or prototype theory (Cienki, 2017).
Research on stabilization processes in the manual modality alone began in the early years of gesture studies (Kendon, 1988), where stabilized gestures were defined as “stylized gestures” or “repeated expressions” (Kendon, 1995, pp. 274–275) showing semantic and pragmatic consistencies. With a focus on the study of particular gestural forms, the methods of analyzing gestures were improved and interindividual similarities in gestural patterns were documented. What is more, the systematic description of gestural forms (Calbris, 1990) linked with contexts-of-use analyses (Scheflen, 1973) revealed formal variations which reflect meaning varieties in recurrent gestures. As pointed out before, one of the landmark studies is on the Palm Up Open Hand, revealing that different aspects of form, such as a repeated downward motion or a wide lateral motion, embody how the arguments offered on the open palm are presented. According to Müller (2004, p. 252), a repeated downward motion embodies the listing of arguments whereas a wide lateral motion embodies a wide range of entities offered. Other examples are the Cyclic gesture or recurrent gestures of negation, where varying aspects of form embody different interactional and cognitive functions such as reference to the speaker or the addressee or the speaker’s perspective on discursive objects and how they are handled (see Section 1). In these cases, interactional, situational, and cognitive aspects become embodied and stabilized in certain form parameters such as the position in gesture space or a lateral motion.
The idea that varying meanings are embodied in different gestural forms was put forward early (Jorio, 1832/2000; Neville, 1904; Wundt, 1901) but pursued only selectively (Efron, 1941/1972; Sparhawk, 1978) until the rise of gesture studies in the 1990s (see e.g. Calbris, 1990; Kendon, 1995). It was particularly advanced by research on so-called gesture families. This notion is inspired by work on recurrent gestures and highlights the interrelatedness of gestures showing similarities in form and meaning:
When we refer to families of gestures we refer to groupings of gestural expressions that have in common one or more kinesic or formational characteristics. […] The forms within these families, distinguished as they are kinesically, also tend to differ semantically although, within a given family, all forms share in a common semantic theme.
Hence, gesture families are distinguished not only by a stabilized core but also by a variation of form that embodies different aspects of the situational exigencies and of meaning. These form–meaning variations were conceived of as “‘morpho-kinetics’ of gesture” (Kendon, 1996, p. 6), “emerging morpho-semantics” (Kendon, 2004, p. 224), or “rudimentary gesture morphology” (Müller, 2004, p. 254). They are aspects of the “diversity of recurrency” (Harrison, 2018, p. 213; see also Harrison & Ladewig, 2021) which make this type of gesture particularly interesting for linguistic analyses because what we can observe here are “emergent forms of compositionality” (Müller, 2018, p. 16; see also Fricke, 2010). Accordingly, the stabilization of gestures is closely related to the process of “de-composition of holistic gestural movements into smaller form Gestalts (the kinesic core with its prototypical meaning)” (Müller, 2017, p. 297) where those aspects of form that lose meaning can take over other functions and become stabilized as well. Therefore, the study of recurrent gestures provides insights into the emergence of structures that allow movements of the hands to develop into language, that is, sign language:
In recurrent gestures, we see how a holistic gestalt begins to break down into formational features, some of which contribute to the thematic core of the gesture [while] others [do] not. In gesture families we can observe how structural islands based on two different types of principles emerge (gesture families based on a formational and semantic core and gesture families based on a shared and basic action underlying all members of the family).
Although the idea of language-like structures in gestures still needs to be fleshed out in future research, research on embodied concepts that show similar grammatical meanings in spoken and signed languages goes in this direction. Among the phenomena studied so far are embodied forms of negation (Bressem & Müller, 2014a; Bressem & Wegener, 2021; Calbris, 2003, 2008; Harrison, 2018; Kendon, 2004), aspectuality (Boutet, Morgenstern, & Cienki, 2016; Cienki & Iriskhanova, 2018; Duncan, 2002; Ruth-Hirrel, 2018), and plural forms (Bressem, 2014, 2021). In fact, many of these concepts are embodied by recurrent gestures. What is more, the different form–meaning variations of recurrent gestures have been treated as an important factor in the description of stabilization processes in gestures as they relate to the stabilization of different gestural form parameters. In more detail, similar to grammaticalization processes in spoken and signed languages, stabilization processes in gestures can be conceived of as continua where the occurrence of certain parameters is indicative of different degrees of stabilization (e.g. Ladewig, 2010, 2014a; Müller, 2018). Their investigation certainly gives insights into the linguistic potential of gestures because “this admitting of degrees is not true only of gesture, or of semiosis, but is the beating heart of all that we call linguistic” (Di Paolo et al., 2018, p. 301).
This aspect is elucidated in the following section.
3.1 Recurrent Gestures as Hybrid Forms on the Gesture Continuum
The idea that gestures are stabilized to different degrees has inspired gesture scholars to introduce the notion of the “Gesture Continuum” (McNeill, 2013), formerly known as “Kendon’s Continuum” (McNeill, 1992; see Footnote 2), because it gives insights into the evolution of gestural forms. Conceiving of gestures in terms of continua goes back to Kendon’s (1988) article “How gestures can become like words,” where he addressed lexicalization processes of gestures and facial expressions and described important stages in the emergence of conventional forms:
What is to be noted about this is not just the process by which an “iconic” form becomes “arbitrary,” but also that, as the form comes to be shaped by the requirement that it be a distinctive form within a system of other forms, it is freed of the requirement of being a “picture” of something, so that then it becomes free to take on a general meaning as well. It thus becomes available for recombination with other forms and so may come to participate in compound signs or sentences. […]. [T]here is an implication that these forms emerge in the course of their use in interaction. This is crucial to bear in mind. That is, the process of lexicalization of gesture requires that there be a community of users. Again, however, this is not only essential if stable forms are to emerge; it is also essential if these forms are to have general, that is to say, conceptual meanings. What two or more people jointly agree to refer to cannot be private, and thus it will take on a general reference rather than a specific one.
Kendon’s thoughts on the emergence of lexicalized gestures gave rise to the notion of “Kendon’s continuum” (McNeill, 1992), which has been very influential in the field of gesture studies but ironically excludes recurrent gestures (or equivalent categories at the time with other names) from the research agenda and, thus, disregards stabilization processes in gestures as a proper subject for investigation (for a discussion, see Müller, 2018). Kendon himself has not systematically outlined gesture continua, but he undoubtedly set the stage for conceiving of dynamic processes of stabilization in terms of different degrees showing different parameters (e.g., Ladewig, 2010, 2014a; Müller, 2018). In one of his early case studies, for instance, he described “the ‘emergence’ of quotable gestures or emblems” as “a process of conventionalization of spontaneously created metaphorical gestures” (Kendon, 1995, p. 275) and, thus, conceived of processes of abstraction as one parameter for stabilization. Moreover, he suggested that gestures are “a mode of expression that varies in the degree to which it is conventionalized and also in the degree to which it is ‘detachable’ from speech” (Kendon, 1995, p. 267). In so doing, he paved the way for understanding stabilization in gestures in terms of dimensions rather than categories (illustrated by the shades in Figure 2.1; see Ladewig, 2014b, p. 1570).
Figure 2.1 Continuum of stabilization from gestures to sign
The stabilization of gestures goes along with a change of gestural meaning and often with the emergence of pragmatic functions. Brookes (2001), referring to quotable gestures (emblems), points out that
[v]ariation in the range of meanings and speech act functions these gestures fulfil suggests that quotable gestures may begin as spontaneous depictions that are used to fulfil immediate communicative needs. As they are found to fulfil important practical and then social functions offering opportunities to express important conditions and social relations, the meanings and functions of these gestures expand.
Recurrent gestures are a proper subject for studying these processes because they are “hybrids of idiosyncratic and conventional elements” (Müller, 2017, p. 280) where some aspects of form contribute to the core of the gesture and others are free to respond to the local exigencies of an interactive situation. Due to their hybrid character, they occupy a place between spontaneous (singular) gestures and emblems on a continuum of increasing stabilization (e.g. Ladewig, 2014b; Müller, 2017). Here they are in a special position because they show variants that are similar both to singular gestures and to emblems. In view of these observations, it is proposed to regard a taxonomy of gestures in terms of dimensions (see Footnote 3) rather than in terms of categories.
The idea that the stabilization of gestures is a dynamic process in which recurrent gestures form an important step is supported by distributional analyses of recurrent gestures. They show that recurrent gestures themselves are characterized by different degrees of stabilization and should therefore be considered as forming a continuum (Ladewig, 2010, 2014b). To give an example (Figure 2.2), the variants of the family of the recurrent Cyclic gesture differ in the ranges of meaning and in the stabilization of their form aspects (Ladewig, 2010, 2011, 2014b). Whereas the Cyclic gesture in descriptions embodies continuous aspect (Vendler, 1957) and thus can depict various events, the Cyclic gesture in a word or concept search has only three possible meanings, reflected in three sequential positions: that is, (a) during a phase of non-fluent speech while searching for the word or concept, (b) during a phase of fluent speech or in the transition from non-fluent to fluent speech, when finding the word or concept, and (c) during a phase of fluent speech after the search. The Cyclic gesture in requests prompts the addressee to continue to do something. Thus, the latter appears to be the most stabilized variant. The variation in the gestural form supports this argument. Cyclic gestures which embody aspectual meaning show the lowest degree of stabilization and are therefore closest to singular gestures. In addition to the core form of a continuous circular movement, this variant is used with a flat hand. The Cyclic gesture embodying the activity of searching for a word or concept shows the same hand shape and is positioned in the center of the gesture space.
The Cyclic gesture used in the context of requests shows three stabilized aspects of form, that is, the hand shape, the position in gesture space, and the movement size. Furthermore, it is the only variant that can substitute for speech. Gestures showing these characteristics have been labeled emblems or “quotable forms” (Kendon, 1995, p. 272).
Figure 2.2 Continuum of stabilization in the Cyclic gesture
The observed variation in the position and movement size of the gesture shows interesting analogies to sign language. As Wilcox and colleagues (Wilcox, 2005; Wilcox, Rossini, & Pizzuto, 2010) have shown, the sign IMPOSSIBLE in Italian Sign Language (LIS), which is also characterized by a rotating movement, exhibits different “pronunciations” (Wilcox, 2005, p. 30). It is argued that the identified variations in the location and movement size of this sign are “analogous to prosodic stress” (Wilcox et al., 2010, p. 353). However, in the case of modal verbs in Italian Sign Language, both form aspects have achieved a grammatical status and mark morphological alternations of strong and weak forms. A second observation to be mentioned is that the movement pattern of a continuous rotation has been observed to mark aspectuality in sign languages (e.g. Klima & Bellugi, 1979, p. 293). Tying these findings back to the analysis of the Cyclic gesture, it can be argued that this recurrent gesture has developed into a marker of aspect in sign languages as it is often used to mark continuous events. It may also have developed into lexical morphemes related to time such as PROCESS or HOUR in German Sign Language (Ladewig, 2020, pp. 37–38).
The observed commonalities of recurrent gestures and signs of sign language give reasons to consider gestures as a source for stabilization processes in sign languages (see e.g. Shaffer & Janzen, 2000; van Loon, Pfau, & Steinbach, 2014; Wilcox, 2005). This aspect was taken up in the different versions of gesture continua where signs mark their endpoint. Studies investigating the developmental paths of gesture to sign adduced evidence that gestural forms can undergo processes of stabilization in form and meaning and develop into linguistic elements of signed languages, such as discourse markers or lexical and grammatical morphemes (e.g. Janzen, 2012; Janzen & Shaffer, 2002; Pfau & Steinbach, 2006; Shaffer & Janzen, 2000; Wilcox, 2004, 2005). Recurrent gestures appear to be an important stage in these processes. Studies on the Palm Up Open Hand in different sign languages (e.g. Conlin, Hagstrom, & Neidle, 2003; Cooperrider, Abner, & Goldin-Meadow, 2018; Engberg-Pedersen, 2002; van Loon et al., 2014) or the evolution of modal markers in American Sign Language (ASL) and French Sign Language (LSF) (Shaffer, 2000; Shaffer & Janzen, 2000) support this argument.
These studies show that recurrent gestures may be considered as a stage in the grammaticalization process from gesture to sign where they may bypass the stage of emblems. However, this does not imply that recurrent gestures are borrowed from spoken languages and incorporated into the sign system. Although cases of language contact occur (see Müller, 2018), “[s]uch gestures are not ‘hearing people’s’ gestures, they belong to deaf people, too, and evidence is mounting that they are integral to both lexicalization and grammaticalization patterns in sign languages” (Janzen, 2012, p. 836). The similarities in forms and functions of manual movements across spoken and signed languages are indicative of a modality-independent motivation and evolution of gestures and signs to the point where the forms are integrated into a language system (see Footnote 5). Then they can develop into linguistic units within the confines of a language system.
4 The Cultural Dimension of Recurrent Gestures
As mentioned before, recurrent gestures can become sedimented forms of embodied meaning available in the joint embodied knowledge of individuals and communities where they may form repertoires. Researchers have only started to document repertoires of recurrent gestures in different communities. Streeck (2017), for instance, has reconstructed a repertoire of “[c]onventional gestures and personal habits” (Streeck, 2017, p. 203) observed in one speaker which recur over various contexts and which are considered as a “result of ongoing self-making” over time (Streeck, 2017, p. 287, see Section 2). Bressem and Müller (2014b) reconstructed a repertoire of recurrent gestures with pragmatic functions of German speakers (Figure 2.3). Their corpus includes several gestures of negation, such as the Brushing Away gestures (Teßendorf, 2014), the Holding Away gesture, the Sweeping Away gesture, and the Throwing Away gesture. Bressem and Müller also documented different gestures embodying a back-and-forth motion, including the Vague gesture, the Weighing Up gesture, or the Change gesture (Figure 2.3).
Figure 2.3 Repertoire of recurrent gestures based on Bressem and Müller (2014b, pp. 1580–1584)
Recently, Will (2022) published a study of a repertoire of recurrent gestures in Hausa speakers. Among the gestures documented are known forms such as the Sweeping Away gesture or the Holding Away gesture. Will also introduced new gestural forms including the Two-Finger Tap gesture, the Holding gesture, the Washing gesture, the Shaking gesture, and the Snapping gesture. The latter brings an aspect to the fore that has long been neglected in gesture studies. The sound the hands produce while gesturing can be an essential part of multimodal meaning-making. Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and BressemBressem and Müller (2014b) made similar observations for the Dropping gesture (“Dropping of the hand,” see Figure 2.3) where the lax flat hand drops on the lap and produces an “acoustic signal” (Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and BressemBressem & Müller, 2014b, p. 1584). This gesture is used to “dismiss topics of talk by marking parts of the talk as less important and interesting” (p. 1584).
Repertoires of recurrent gestures provide the basis for cross-cultural comparisons to explore the distribution of recurrent gestures across different communities and to determine their functions and variations. Some researchers argue that the documentation of gestural repertoires and their description can only be a first step in cross-cultural studies. “[E]xplanation is the end goal” as Reference CooperriderCooperrider (2019, p. 227) notes. Reference KendonKendon (1981) invites researchers to study the “geographic distribution” of gestures and to inquire the reasons for functional similarities and differences across cultures (Reference KendonKendon, 1981, p. 108). As a matter of fact, cross-cultural studies of recurrent gestures are rare, but researchers started to reason about the distribution of recurrent gestures across different communities. One reason for their emergence across cultures is their engagement in the construal of pragmatic meaning. “[G]estures reveal a great deal about interactional practices, the social norms that underlie them and how local and wider ideologies in societies shape the nature of gestures and their use” (Reference Brookes and Le GuenBrookes & Le Guen, 2019, p. 129). Accordingly, if certain practices, norms, and ideologies are shared among speech communities, they may share gestural forms.
As outlined in the sections above, there has been much study of the interactional dimension of recurrent gestures. Their social functions are slowly moving into the focus of attention. Examples are studies on recurrent gestures within politics, which provide insights into processes of social power and control (e.g. Streeck, 2008; Wehling, 2017). Streeck (2008), for instance, documented a repertoire of recurrent gestures used by Democrats in the 2004 US election campaign where the members of the Democratic Party appeared to share a “code” or “public gesture style” (Streeck, 2008, pp. 156, 178). While avoiding mimetic forms of depiction, the politicians use recurrent gestures with pragmatic functions to mark speech acts and visualize the information structure “hereby providing viewers with visual structure that facilitates the parsing and processing of speech” (Streeck, 2008, p. 154). Based on his analysis, Streeck formulates a broader research agenda where the study of recurrent gestures in political discourse may provide insights into the “theory of self-presentation (Goffman, 1959) within the context of electoral politics” (Streeck, 2008, p. 183). This research can also contribute to the notion of “para-interaction,” which is currently discussed in the field of media-linguistics (e.g. Luginbühl & Schneider, 2020). Lempert’s (2011) analysis of the Ring gesture observed for Barack Obama contributes to such an agenda. His study shows that Obama used the Ring gesture recurrently to make a sharp point. These occurrences of the Ring gesture go along with the creation of a “persona” (Horton & Richard Wohl, 1956) who exhibits the attributes of “being argumentatively ‘sharp’” (Lempert, 2011, p. 241) and thus appears to be authoritative.
In another study, Lempert (2017) compared forms and functions of the Precision-grip with the Slice gesture, the Index-finger-extended, the Power grip, and enumeratives used by Barack Obama and Hillary Clinton during the primary season of 2007–08. He observed a kinesic and functional relationship between the precision grip and the extended index-finger, termed “pragmatic affinity” (Lempert, 2017, p. 37). Both gestures are used to make a point. Their closeness, and thus their capacity to fulfill similar functions, is evidenced by the way in which the gestures are performed. In some cases, the hand shapes seem like blends. In other but rare cases, these gestures slip within a single intonation unit. Lempert concludes that the gestures documented belong to a register that has not developed into a stable code yet. “To turn to gestural enregisterment is to turn to the messiness of an assemblage of gestural signs and make that very state of existence an object of investigation” (p. 62). Streeck’s (2008) conclusions go a step further because he sees “a surprising congruence between the type of gestures that Quintilian advocated” and “what appears to be an unspoken consensus about adequate gesticulation among the Democratic Party politicians” (Streeck, 2008, p. 178). While rejecting Quintilian’s notion of a normative pairing between gestures and pragmatic functions, the consensus described includes the avoidance of iconic or depictive gestures, the use of gestures to mark discourse functions, and the restriction of the movement range of co-speech gestures (cf. Streeck, 2008, p. 178). One motivation for the emergence of such a way of gesturing is that the politicians are eager to keep a rhetorical style which may go along with the creation of persona in para-(social) interaction.
This may well be the case, but the preference for recurrent gestures is also related to the observed mode of speaking, which in the context of election campaigning is mainly argumentation. In other televised conversational settings, Barack Obama, for instance, uses a lot of depictive gestures, which goes along with presenting himself as an individual who creates a narrative of his legacy.
Coming back to the question of why recurrent gestures are found across different speech communities, two cross-linguistic studies should be mentioned. Ruth-Hirrel (2018) compared the forms and functions of the Cyclic gesture (see Figure 2.3) in English and Farsi. One of her observations is that language-specific properties appear to interact with the stability of gestural variants of the Cyclic gesture co-occurring with progressive constructions in English. In these cases, the Cyclic gesture is performed with both hands using asynchronous large circles. These instances suggest a stabilization of form and meaning on the level of multimodal constructions (see Section 3). This is different in Farsi. Although the Cyclic gesture occurs with progressive constructions in Farsi, only a low degree of form stability could be documented. The study is currently being expanded by integrating the Cyclic gesture used in German (cf. Ladewig, 2020). Both authors argue that the common basis for the occurrence of the Cyclic gesture in the three speech communities is the domain of time (see “domain-centered approach,” Cooperrider, 2019). Accordingly, in all three languages, the Cyclic gesture has emerged from experiences with cyclic motions and the recurrence and repetition of events through time.
Bressem and Wegener’s (2021) cross-linguistic study of the Holding Away gesture (Figure 2.3) in German and Savosavo (a Papuan language of the Solomon Islands) reveals parallels in the pragmatic function in both (completely unrelated) languages. This recurrent gesture relates discourse segments in different ways. It may operate on the level of the message, when setting up a contrast or making inferences, but it can also be used as a topic-relating discourse marker when emphasizing the speaker’s focus on the conclusion of a topic and a subsequent topic change. In addition to the common functions, the study also identified two message-connecting functions of the Holding Away gesture that are unique to each language. Savosavo speakers occasionally use the Holding Away gesture to mark elaboration in speech and thus the insertion of additional information. German speakers, on the other hand, use this gesture “to indicate that the present utterance is an inference drawn from the previous utterance. This inferential use of the gesture is in fact the most frequent use of the holding away gesture in our German data set” (Bressem & Wegener, 2021, p. 231). The action scheme of holding something away, showing the effect of clearing the body space, is considered as the derivational base of this gesture. Variations observed in both speech communities are not only motivated by cultural specificities, according to the authors, but also by the type of data. As such, the perceived differences may arise from the data collection technique in terms of purpose of recording (for research or entertainment), the degree of interactivity (monologic or dialogic), and generic expectations shaping discourse in different situations.
Although cross-cultural studies of the kind mentioned are still rare, the documentation of recurrent gestures in different speech communities has already shown how widespread and diverse they are. Due to these observations, Cooperrider (2019) treats recurrent gestures as “natural conventions,” which means that “they are culturally selected (i.e., conventionalized) from a menu of motivated (i.e., natural) options” (p. 229). He argues that basic communicative functions result from a set of motivated possibilities from which groups tend to select a few. The consequences resulting from this proposal are twofold. “First, there will be very few absolute universals – specific recurrent gestures or gestural practices that are found the world over. But, second, there will be very few one-off cases that are found in one place and only one place” (Cooperrider, 2019, p. 230). Cross-cultural comparisons can spell out the continuum from “one-off cases” to absolute gestural universals, providing insights into the motivation and stabilization of gestural forms and their diversity of recurrency (Harrison & Ladewig, 2021). They should be considered as the next step of the research agenda introduced at the beginning of this chapter.
5 Conclusion
This chapter has given an overview of the research strands on recurrent gestures. These gestures are stabilized forms that embody a practical knowledge of dealing with different communicative, interactional, and cognitive tasks. Due to their hybrid character, they occupy a space between singular gestures and emblems on a continuum of stabilization. Whereas the early days of recurrent-gesture research focused on the identification of these gestures and on the refinement of descriptive methods, their role in self-individuation and their social role are moving into the focus of attention. Based on the thorough studies of recurrent gestures in individual speech communities, the time is ripe to study recurrent gestures cross-culturally. Moreover, these studies pave the way for fleshing out modality-independent and modality-dependent processes of stabilization in manual movements and, thus, for the further development of an interface between gesture and sign.
1 Introduction
When human beings communicate with each other, they use their body’s natural media – movements of their hands, eyes, eyebrows, mouth, and head, as well as postural shifts – to make meaning. These meaningful expressions of the body are “inseparable” from the spoken or signed signs they accompany (Kendon, 2009, p. 363). The speaker or signer uses them to spontaneously create dynamic physical images of objects and scenes that they have previously experienced or that they are newly imagining. When recounting a story, for instance, speakers may imitate the posture, manual actions, and/or facial expressions of the friend or story character – human or otherwise – that they are talking about. Just as a speaker’s hands can represent a physical object, so too can her entire body. For instance, when verbally describing a huge old tree swaying in the wind one might have seen on a walk, one may for a moment actually become the tree by aligning one’s legs and torso vertically to portray the trunk and stretching one’s arms upward, and swaying from side to side to portray the branches moved by the wind. If, rather than swaying in the wind, the tree had been struck by lightning, one might bend one’s torso or arm to indicate the angle at which the trunk now stands. These kinesic, bodily means of communicating human experience and the world around us are examples of the common semiotic practice understood as gesture, in which parts of the speaker/signer’s body, most frequently the hands, represent objects or scenes in the space around a speaker’s body, or in which a speaker’s entire body becomes a corporeal icon of something, herself or someone else performing an action or expressing a sentiment (Mittelberg, 2014; Müller, 1998a, 1998b).
According to the US philosopher and semiotician Charles Sanders Peirce (1839–1914), the relationship between the gestures described above and the persons, objects, or scenes they represent is iconic. Iconicity, one of Peirce’s three semiotic modes alongside indexicality and symbolicity, rests upon a perceived similarity between the form of a gesture and what it stands for (Peirce, 1955). Peirce defines icons as having “qualities which ‘resemble’ those of the objects they represent” (Peirce, 1903, CP 2.276).¹ While the term “icon” might suggest a visual bias, Peirce already had a multimodal understanding of iconicity: something serving as a material sign carrier – like a word or gesture – may look, feel, move, smell, sound, or be structured like something else. In the tree example we began with, iconicity underpins the relationship between the structural and behavioral features of the bodily posture and gestures, on the one hand, and the gesturer’s mental representation and perceptual experience of the tree (and the wind going through it) on the other hand. The tree example also demonstrates the abstraction that underlies gestural processes and that results in signs with varying degrees of schematicity. As Arnheim asserts, “[b]y the very nature of the medium of gesture, the representation is highly abstract” (1969, p. 117). In other words, the representation of a living being or an object through the human body is always, as any sign process is, partial, or metonymic (Mittelberg & Waugh, 2014), and conditioned by the affordances and constraints of the medium of the human body and the ways it is possible to move and produce meaningful gestalts.
This is not only true of gestural sign formation, but of sign formation in visuo-spatial modalities in general, including in signed languages, which exploit the same bodily means of representing yet have their specific ways of incorporating iconicity into lexicalized signs (Wilcox & Occhino, 2017). The sign in British Sign Language for TREE, shown in Figure 3.1, features a vertically raised arm with thumb and fingers outstretched and the other arm held horizontally and contiguous with the upwards-stretched arm, providing the “ground” on which the tree is situated (see Taub [2000], in which the American Sign Language (ASL) sign for TREE is the basis for her account of abstraction and schematization in ASL; see also Section 3.2 below). The lexicalized image icon of TREE “both preserves the structure of the image and fits the phonotactic constraints of the language” (Taub, 2000, p. 34).
Figure 3.1 Sign for TREE in British Sign Language
Iconicity is not relegated only to visuo-spatial languages but rather is a property that motivates language structure regardless of the mode of communication (e.g. manual signs, spoken forms, and written forms) (e.g. Hodge & Ferrara, 2022; Nielsen & Dingemanse, 2021; Perniss, Thompson, & Vigliocco, 2010). While these recent works demonstrate a resurgence in the study of iconicity as a language-general property over the last decade, studies from a range of approaches in the previous half-century demonstrate the iconic grounding of structures in spoken language. For instance, phonological and morphological structure (Jakobson, 1966), the lexicon (Waugh, 1992), and syntax (Haiman, 1985, 2008) have been shown to be motivated by perceptual and structural similarity. (For recent work on iconically motivated structures, see e.g. Devylder’s [2018] account of possessive constructions in the Paamese language of Vanuatu.)
Leaving iconicity in other language structures aside, this chapter presents an overview of the fundamental role iconicity plays in the formation and interpretation of co-speech gestures. Iconic and representational aspects of communicative body postures and hand movements, which have always been a central issue in gesture research (e.g. McNeill, 1992; for overviews see Hodge & Ferrara, 2022; Mittelberg & Evola, 2014), typically have a close semantic relationship with the propositional content of the verbal utterance they occur with (Kendon, 2004; Kita, 2000). Iconic gestures (which McNeill [1992] terms iconics, Müller [1998a] calls referential, and Kendon [2004] and Streeck [2009] call depictive gestures) are broadly understood as manual gestures and body postures that represent concrete objects and actions, as in the examples given above. However, iconicity plays a far greater and more complex role in gestural communication beyond simple resemblance relations and concrete content. In this chapter, we introduce the role of iconicity as a motivating ground for gesture formation, moving beyond a narrow definition to introduce the workings of representation in gesture more broadly.
To provide a theoretical foundation for the various modality-specific manifestations of iconicity in gesture that we will discuss in this chapter, we draw on Peircean semiotics and cognitive linguistic accounts of how iconicity is inherent to embodied conceptual and linguistic structures (e.g. Lakoff & Johnson, 1999; Taub, 2000, 2001; Wilcox, 2004). Peircean semiotics and cognitive linguistics recognize that repeated, similar experiences with the physical and social world are at the root of embodied patterns of sensing, acting, thinking, and communicating (e.g. Danaher, 1998; Mittelberg, 2008, 2019a, 2019b), and both approaches consider how these patterns play out in multiple modalities and sign systems. Building on these premises, we present different kinds and degrees of iconicity observable in gesture. The chapter is structured as follows: We first lay out the semiotic foundations of representation and iconicity, established by Peirce, including diagrammatic and metaphor iconicity, and apply them to gesture, while also discussing the role of abstraction, metonymy, and viewpoint (Section 2). In Section 3, we first clarify terminological issues in this area and then provide an overview of recent foundational approaches to iconicity and representation in gesture studies as well as a survey of the techniques speakers use to create gestural signs. In Section 4, we highlight recent applied and empirical research, and we close with final thoughts in Section 5.
2 Semiotic Foundations of Iconicity and Representation in Gesture
Largely since the 1980s, but aligned with earlier semiotic theories as well (e.g. those of Peirce and Jakobson), language has been shown to be situationally grounded and multimodal. That is, the multisensorial experience of humans as embodied beings in the world is a motivating factor in linguistic structure, conceptual knowledge, and language use (Gibbs, 2005). Human communication – particularly in the visuo-spatial modality of gesture and signed languages – is thus understood as being rooted in embodied patterns of experience and expression (e.g. Janzen & Shaffer, 2022; Perniss & Vigliocco, 2014). Building on Peirce’s well-known assertion that “we think only in signs” (Peirce, c.1895, CP 2.302), in this chapter, we are particularly interested in how speakers embody salient facets of their thinking, remembering, and imagining in gestural signs, and how their interlocutors interpret and understand bodily expressed meanings in the context of multimodal interaction. We start with Peirce’s model of the sign and what it can explain about gestural signs.
2.1 Peirce’s Sign Model and the Triad Icon–Index–Symbol
To understand iconicity, we first grapple with the notion of “sign” as defined by Peirce (c.1897, CP 2.228), who used the term representamen to mean “something which stands to somebody for something in some respect or capacity.” A representamen is a material sign carrier, for example, a spoken word, a sign in a sign language, or a gesture, such as the tree gesture described earlier. When perceived by an addressee, the representamen creates an “equivalent sign, or perhaps a more developed sign,” the interpretant of the original sign (Peirce, c.1897, CP 2.228). The interpretant is the cognitive response that is evoked in the mind of the person interpreting the sign, thus linking the representamen with what it is taken to stand for, the object. Semiotic objects encompass physical objects and actions as well as abstract notions and affective states, including concepts, relations, qualities, and feelings, and so on, or anything that can be represented by a sign. The affordances of the body determine in part what can be gesturally represented as an object, and how it is represented (Mittelberg & Hinnell, 2022). Some objects can be genuinely portrayed by the hands and body: For example, actions the hands routinely do – such as opening or closing a door – are easily enacted in iconic gestures. Larger objects that one normally cannot hold in one’s hands, such as a city skyline, need to be brought down to a much smaller scale to be iconically depicted in gesture, and other objects, for example, colors, simply do not lend themselves so freely to depiction via gesture. Finally, Peirce’s model of the sign describes the Ground of a representamen as the relevant aspect that it foregrounds in the object.
Iconic grounds imply that an object is not represented in all its aspects but in partial and abstracted ways (Peirce, c.1897: CP 2.228; Sonesson, 2014; see Section 2.3 on metonymy). In both gestural sign formation, or production, and sign interpretation, or processing, the three basic semiotic relations intertwine: similarity (iconicity), contiguity (indexicality), and conventionality (symbolicity), with one of them being predominant and thus determining the gesture’s primary function (Peirce, 1893: CP 2.275; e.g. Enfield, 2011; Fricke, 2012; Mittelberg, 2013).
While this chapter focuses on iconicity and representation in gesture, before moving on, we briefly introduce the other two main sign relations proposed by Peirce to better understand what characterizes iconic signs and how iconic dimensions interact with other semiotic modes in the gestural modality (for a more detailed account, see Mittelberg & Hinnell, 2022). Indexicality is often taken in gesture studies to be synonymous with pointing gestures (Fricke, 2007, this volume; McNeill, 1992). Points create a relation between the tip of the articulator – the finger or hand depending on the hand shape, or even nose in the case of nose points (Cooperrider & Núñez, 2012) – and the target, real or imagined, of the pointing. The third relation between representamen and object is symbolic. For Peirce, symbolic signs are primarily rooted in conventionality and habit (e.g. Peirce, 1902, CP 2.170) but not necessarily in arbitrariness, as posited by Saussure (1916/1986) regarding linguistic signs. Emblems, such as the thumb up gesture signaling approval, are truly symbolic signs in which conventionality is usually afforded by sociocultural conventions (e.g. Calbris, 1990; McNeill, 1992). The Peircean idea of habit is particularly suited when considering the gradually routinized correlations between recurring gestural forms, their action origins, and schematic meanings (Mittelberg, 2019b). Examples include the frequent use of certain gestural forms in a given context as in the case of recurrent gestures – such as the cyclic gesture (Ladewig, 2011) – which can fulfill various conventionalized pragmatic functions (e.g. Bressem & Müller, 2014a). (See also Ladewig, this volume, on recurrent gestures.)
2.2 Subtypes of Iconicity: Image–Diagram–Metaphor
To further characterize iconic signs, Peirce distinguished three subtypes of icons (Peirce, 1903, CP 2.276) – images, diagrams, and metaphors. With regard to gesture, image iconicity captures depictive gestures, for example, portraying the salient features of an object, as in the tree example above, or of the actions of a character (animate or inanimate). The degree of iconic substance can vary. For instance, Bouvet (1997) gives the example of a child “becoming” a helicopter, with his torso becoming the body of the helicopter and his arms representing the rotating blades. Whole-body enactments like this one, or the tree mentioned earlier, show a more iconic form than if an object (or motion) – such as the shape of a tall building or the winding path of a mountain trail – is briefly outlined in the air.
Diagrammatic icons in gesture are those that exhibit connections between two or more locations in gesture space. Rather than resembling their object as image icons do, diagrams are more aptly recognized as schematic spatial representations of relations between items (Peirce, c.1897: CP 2.228). For example, in motion-capture renderings of a speaker describing a travel itinerary by sketching out the path linking several destinations in gesture space, the movement trace becomes a digital, iconic sign of a diagrammatic gesture (see Mittelberg & Rekittke, 2021). Diagrammatic iconicity has been shown to underpin, among other things: tree diagrams illustrating kinship relations (Enfield, 2009; Gaby, 2016) or syntactic structures (Mittelberg, 2008); gestures that accompany contrastive expressions in speech (Hinnell, 2019); and specific visuo-spatial signs, for example, the sign in Auslan (Australian Sign Language) police catch thief, which “mirrors both the spatial and agentive relations between policeman and thief” (Hodge & Ferrara, 2022, p. 4; Johnston, 1996).
In gesture, metaphor iconicity captures representations in which a comparison underlies the gestural image (Peirce, c.1897, CP 2.228; see also Mittelberg, 2008, 2014). For example, in her study on aspect-marking gestures, Hinnell (2018) showed a gesture accompanying the utterance “that jackpot keeps getting higher and higher,” as shown in Figure 3.2. The gesture features both hands in a flat, outstretched form, facing the body, and moving over each other one after the other to form an upward-moving rotation. The gesture form moving upward is underpinned by the image schemas path, cycle, and verticality (Johnson, 1987) and the correlated conceptual metaphor more is up (Lakoff & Johnson, 1980), that is, the more of something there is, the higher it can pile up. The increase in the “jackpot” (an abstract amount of money) motivates the repeated upward-moving arm movements, which effectively parallel the linguistic iconic reduplication “higher and higher.”
Figure 3.2 Gesture motivated by more is up conceptual metaphor in “jackpot keeps getting higher and higher”
Other examples of metaphor iconicity include a schematic gesture that very naturally accompanies the spoken utterance describing a speaker’s habit of watching a sitcom series, “from where I was till like the end of the season,” in which the gesture manifests as a horizontal, relatively straight motion of the right hand moving from the speaker’s left to right. This gesture inherits its form and meaning in part from the embodied image schema source-path-goal and the conceptual metaphor time is space (for a more detailed analysis see Mittelberg, 2018). Other metaphoric gestures involve gesturing hands that seem to be describing or handling physical objects while the speaker is talking about abstract notions (e.g. Cienki & Müller, 2008; Müller, 2004), such as moral values (Cienki, 1998) and grammatical categories (Mittelberg, 2008; Streeck, 2009). These invoke the object image schema and relatedly the ideas are objects metaphor (Lakoff & Johnson, 1980). Gestures thus show a tendency to physically embody aspects of the source domain of a metaphorical construal (e.g. Müller, 2017).
Metaphor iconicity also motivates, at least in part, many gesture forms associated with pragmatic and recurrent gestures. For example, the Holding Away gesture (Bressem & Müller, 2014b; Bressem, Stein, & Wegener, 2017), which features a hand oriented vertically and facing away from the body, represents a physical barrier that the speaker places between herself and her interlocutor. In this case, the hand functions as a barrier as if stopping a physical object (via image iconicity reflecting the image schema barrier); but for the barrier to be meaningful as a Holding Away gesture in a discourse context between two speakers, there is metaphoric iconicity that holds as well, rooted in the metaphor communication is object transfer (Lakoff & Johnson, 1980). Thus, the barrier is erected (i.e., the speaker’s hand is raised) to ward off an incoming discourse object from the interlocutor (hence it is also called a “fend off” gesture; Wehling, 2017), which also allows the speaker to hold the floor. In the examples in Figure 3.3 from a multimodal corpus study (Hinnell, 2020), each speaker utters the discourse juncture but anyways while at the same time gesturing a fending off or Holding Away gesture. The speaker buys herself time to shift topics, while at the same time preventing her interlocutor from interrupting and claiming the floor. While this gesture exhibits schematic iconicity of a barrier, there is no direct semantic relation between this gesture’s form and the speech content; the gestural action thus does something in its own right.
Figure 3.3 Metaphor iconicity in Holding Away gestures with “but anyways”
While gesture analysis typically involves categorizing gestural signs, for example, by identifying all iconic or metaphoric gestures in a data set, gesture scholars have recognized the need to consider semiotic dimensions, rather than gesture categories, to do justice to the gradient multifunctionality observed in many gestures (e.g. Enfield, 2011; Kendon, 2004; McNeill, 2005; Müller, 1998b). Indeed, especially when following Peirce (e.g. c.1895: CP 2.302), the different semiotic relations – notably, iconicity, indexicality, metaphoricity, and conventionality – can only be seen as interacting in a given gestural sign alongside other features of the multimodal context in determining the locally predominant function of a specific gesture (see Mittelberg, 2008, 2013). Under this hierarchized view of semiotic layering, it is evident that predominantly symbolic signs actually often incorporate indexical and/or iconic dimensions. For example, a conventionalized “come here,” or beckoning, gesture combines all three meaning relations: It simultaneously “points” toward the intended recipient (indexical); it represents the path between the gesturer and the recipient (iconic); and it is conventionalized in cultures (symbolic): for example, in the USA, people beckon with the palm up, whereas in Mexico, speakers beckon with the palm down (Cooperrider & Goldin-Meadow, 2017, p. 121). Another basic cognitive-semiotic principle that is also involved in the gestural sign processes described so far is metonymy, to which we now turn.
2.3 Abstraction, Metonymy, and Viewpoint in Gestural Signs
The complex relationship between gestural representation and the speakers’ inner and outer world largely rests in the experientially motivated, schematic, and metonymic nature of iconic and metaphoric gestures. In this section, we explore processes of abstraction and metonymy in gestural signs and how these interact with the expression of viewpoint.
Gestures are by nature abstract(ed) and partial representations (Arnheim, 1969; Kendon, 2004; Müller 1998b; Streeck, 2009) and thus inherently metonymic (Mittelberg, 2019a). As illustrated earlier in this chapter, and likely intuitively known to the reader, only certain – for example, prototypical or locally relevant – aspects of a particular object or action are highlighted in the creation of an iconic and/or metaphoric gesture; other implied aspects often need to be completed, imagined, or otherwise inferred by the interpreting embodied mind. As emphasized by Streeck, rather than copying abstracted features, gestures are “tools that enable and accomplish the abstractions” (2009, p. 120), for example, abstracting path from motion events (p. 133). In the Introduction to this chapter, we gave the example of a tree swaying in the wind that one has seen on a walk. In that example, the whole-body gestural icon of a tree necessarily involves an abstraction from the specific, fully fledged tree that was seen, with its idiosyncratic shape, distinctive bark, leaves, and color, and how the scene was experienced multisensorially by feeling and hearing the wind moving the leaves and branches. Similarly, the lexical BSL sign TREE in Figure 3.1 (like the sign TREE in ASL, as Taub [2000] notes) is fully conventionalized and highly abstracted and schematized, losing many details: The five outstretched digits do not represent the number of branches in a specific tree, for example, nor does the pivoting action of the arm and hand represent the precise degree of swaying of the leaves of the tree in question.
In addition to such fully coded iconic signs, signed discourse often comprises iconic gestural elements exhibiting spontaneous abstraction and modification (Liddell, 2003; Perniss, Özyürek, & Morgan, 2015). Conversely, iconic and metaphoric gestures may reflect sedimented abstraction processes leading to rather schematic imagery with schematic meanings and pragmatic functions, as we saw in the barrier example (Hinnell, 2020; Mittelberg, 2019b). As emphasized by Mittelberg (2019a), metonymy is one of the central forces leading to the emergence of such strongly habitualized hand shapes and movement routines with retraceable action-based motivations, which are, to a certain extent, reminiscent of semantic bleaching and grammaticalization processes in language (Hopper & Traugott, 2003; Janzen & Shaffer, 2002). For example, Mittelberg (2017) has suggested that the manual action of giving is the experiential substrate of palm up open hand (PUOH) gestures that are observed with the German intransitive existential construction es gibt, translated as “there is/are” rather than as a ditransitive “give” (geben) verb. In such multimodally instantiated constructions, routinized gestures metonymically enact “reduced and more schematic variants of the full action of giving,” in which “the act of giving is reduced to an act of unimanual holding that exhibits a decreased degree of transitivity and iconicity, thus evoking, for instance, a scene of existence, or presence, rather than a scene of object transfer” (Mittelberg, 2017, p. 14).
Beyond structural and behavioral similarity, as highlighted in the tree example, metonymic abstraction in gesture (and sign language) further exploits contiguity relations that can be observed between the gesturer’s body and its environment (Mittelberg, 2013). Contiguity encompasses factual connections such as physical impact, contact, adjacency, but also spatial and temporal proximity or distance (Peirce, 1901: CP 2.306). Gestures naturally (re)establish contiguity relations that occur between hands and the material world they habitually get in touch with, for example, by holding, moving, or otherwise manipulating objects, tools, technical devices, and other artifacts. Examples include cases where speakers pretend to be holding and showing what they are talking about by either seemingly holding an imagined object with both hands or presenting something on an open palm (contact, adjacency). In the study mentioned above, a speaker made a PUOH gesture when saying in German that es gab ja die Analogie zur Musik (“there was the analogy to music”; Mittelberg, 2017, p. 7). The speech content draws attention to an invisible object (the analogy) which needs to be inferred from the visible open palm. Although the gestural enactment stems from a physical action, it is not iconic of the meaning conveyed in speech. Rather, the gesture contributes to the overall meaning of this multimodal performance by metonymically alluding to an imagined contiguous physical object that metaphorically stands for an abstract entity (Section 2.2).
In the interpretation of multimodal metaphor, metonymy may thus lead the way into metaphor (Mittelberg & Waugh, 2009; see Mittelberg & Waugh, 2014, on contiguity relations and ensuing distinct metonymic modes [Jakobson & Pomorska, 1983] in gesture).
The meaning of iconic and metaphoric gestures may also be anchored in a metonymic portrayal that activates a larger pragmatic context. If a speaker communicates to a colleague, for instance, that she will send her a message by raising her hands as if typing on an imagined keyboard, the observer does not expect, nor require, the gesturer to also gesture the keyboard itself in order to understand that the message is typed on a keyboard. The observer would also infer, then, that the speaker will be sitting at a desk, for example, typing at a computer. By dynamically abstracting salient characteristics, a quick gestural action, such as the typing hands, can metonymically evoke not only the fully performed action of hitting particular keys, but associated actions, persons, purposes, results, and mental states as well – aspects that are “metonymically linked in a pragmatically structured context of experience, or frame (Fillmore 1982)” (Mittelberg, 2019a, p. 2). Iconic gestures can thus “trigger an ensuing associative chain and a larger semantic network” (Mittelberg, 2019a). These inferential processes involving metonymy rest upon what Langacker (1993) calls reference-point phenomena, as highlighted by Cienki (2017) and Mittelberg (2019a).
Last, the partial and thus metonymic construal of discourse contents is also conditioned by a particular viewpoint. Iconic gestures, especially, tend to be shaped by one of the viewpoint strategies speakers typically adopt when recounting, for example, a scene they witnessed first-hand or saw in an animated cartoon (McNeill, 1992): character viewpoint by enacting their own previous behavior or the actions of another person or character; observer viewpoint by singling out, for instance, the motion path of a character; or dual viewpoint by combining the two, for example, imitating the body posture of a person walking up a hill while drawing the path he took in the air (e.g. Parrill, 2009; Sweetser, 2012). Given the kinesic affordances of the speaker, or signer, being able to employ several bodily articulators simultaneously (a characteristic that is very different from the more linear nature of speech), they can also impersonate two people at the same time, for example, by miming the manual actions of one person and the facial expressions of another (see Dudis, 2004, on body partitioning in ASL). Gestures and signs thus allow us to represent different construals of the same experience by expressing multiple viewpoints (Stec, 2012; Sweetser, 2023).
Having provided a basic semiotic characterization of co-speech gestures, and particularly of predominantly iconic and metaphoric gestures, we will now review various foundational perspectives on how these kinds of bodily signs contribute to multimodal meaning-making.
3 Perspectives on Gestural Representation, Iconicity, and Sign Formation
This section presents an overview of some of the prominent views on representation and iconicity in gesture, including gestural practices of sign formation. Before continuing our exposition, it seems useful to address some terminological issues, as the terms “representation” and “reference” are sometimes employed inconsistently in the gesture literature. It is outside the scope of this chapter to fully treat these complex questions; below, we provide some first points of orientation.
3.1 Terminological Considerations
As introduced above, in Peircean semiotics, iconic signs (and not only iconic ones) serve purposes of representation. With an iconic gestural form, a speaker may represent – that is, depict, portray, enact, imitate, mime, illustrate, demonstrate, or sketch – facets of her outer and inner world of experience in a subjective fashion. She can, for instance, depict a physical object that she has held in her hands many times, for example, her favorite cup; but she may also use an iconic gestural description of the shape of a (not yet existing) dress she is planning to design herself. In cognitive (and cognitive-linguistic) accounts of how meaning arises from multimodal descriptions, the idea of a mental representation of the objects being described is central and combines concepts and embodied schemata of multisensorial experience (see Section 2.2). Iconic (and metaphoric) gestures are assumed to actively partake in processes of conceptualization and imagination (see Sections 3 and 4).
Another way the term “representation” is used concerns semiotic practices of gestural sign formation. Such modes or techniques of representation reflect the different ways in which the hands and other body parts can create iconic signs (Müller, 2014). Each mode determines how a gestural form depicts an object and/or action, for example, what features are actually “picked out” and represented in a given gesture (see Section 3.3).
Reference is also a relevant notion here. In McNeill’s original classification (1992) and in subsequent widespread use in psychological studies, representational gestures (comprising iconics and metaphorics) and deictic gestures have been grouped together as “referring” or referential gestures, that is, relating to the referential content in speech. Here we highlight the problems that come with this broad grouping to show the necessity of a deeper understanding of the semiotic processes we introduce in this chapter. In brief, as opposed to most iconic gestural signs, deictic gestures typically do not depict their referent, but rather point at it; they refer, for example, to something by indicating a referent in space. In these cases, the referent may be an actual physical object in the immediate material environment, such as a chair, but also a mountain range in the far distance, or a location in gesture space (Fricke, 2007, this volume; and McNeill, Cassell, & Levy, 1993). So, importantly, such highly indexical gestures – which are primarily based on contiguity and not on similarity – usually do not represent content (but see Hassemer & McCleary, 2018).
Looking closely at the intricate mechanisms of representation and reference allows us to better understand the very nature of gesture. Importantly, gestures have, as compared to words, a different, namely dynamic, three-dimensional visuo-spatial modality and can thus establish different kinds of semiotic relationships to what the speaker is talking about; or even produce additional iconic structure as in the barrier example discussed earlier. Gestures thus show a natural propensity to create iconic (and indexical) grounds with a broad array of both static and dynamic phenomena (e.g. Mittelberg & Waugh, 2014). Furthermore, although gestural signs may exhibit advanced degrees of conventionalization, the way they signify is, for the most part, not based on highly coded form–meaning relationships that typically underpin reference in spoken and signed languages (e.g. Sweetser, 2009; see also Sandler, Gullberg, & Padden, 2019, on “visual language”). The terms “referent” and “referential gestures” are notably used in linguistic accounts of gesture (e.g. Calbris, 1990; Müller, 1998b); as in linguistics, one speaks of a referent of a word as the concept denoted by the word, which may or may not imply a physical referent object situated in the extralinguistic context.
Some scholars have come to question the idea of representation and reference in gesture, for example, suggesting that a gesture is often what it is taken to be about (McNeill, 2005; Merleau-Ponty, 1962; Mittelberg 2019a, 2019b) and/or emphasizing the role of enaction in sense-making (e.g. Di Paolo, Cuffari, & De Jaegher, 2018). Having discussed basic terminological aspects, we now turn to a brief overview of some of the foundational contributions toward the study of gestural iconicity and representation in recent decades.
3.2 Proposals on Representation and Iconicity in Gesture
In this section, we introduce the main tenets of influential proposals on representation and iconicity in gesture over the last 30 years. For more detail, including historical synopses, see, for example, Bressem (2013), Kendon (2004), Mittelberg & Evola (2014), Müller (1998b), and Müller, Ladewig, & Bressem (2013); see Hodge & Ferrara (2022) for a recent overview.
One of the most commonly cited typologies in current gesture studies, especially in psychology and psycholinguistic approaches, is David McNeill’s (1992) Peirce-inspired characterization of gesture types, which describes iconic gestures as gestures that represent relevant aspects of the meaning conveyed in the concurrent speech that relate to physical entities and actions. These relevant aspects may be illustrated through an isomorphic correspondence (Kita, 2000, p. 162) between gesture hand shape, trajectory, or some quality of the movement and the action, motion, person, or object that it represents. McNeill’s typology further distinguishes between the concrete and abstract natures of the entity being represented: Iconics represent a concrete object or action and metaphorics are those in which the visuo-spatial form presents an abstract notion, such as “knowledge, language itself, the genre of the narrative, etc.” (McNeill, 1992, p. 80). For McNeill, iconic and metaphoric gestures are representational gestures, while deictics (pointing) and beats (rhythmic) are not. He further illuminates the viewpointed nature of iconic gestures, distinguishing between character, observer, and dual viewpoint (McNeill 1992, 2005; Parrill, 2009; Sweetser, 2012; see Section 2.3 for more details).
Linguistic anthropologist Adam Kendon (2004) focused heavily on how individual and recurrent gesture forms, such as gesture families, become meaningful in their specific, culturally shaped contexts-of-use. Within the referential function of gesture, or what he called “visual action,” he delineates gestures that “provide a representation of an aspect of the content of an utterance” (p. 160, italics in original) and those gestures that contribute to the content of an utterance by pointing to an object of reference (deictic gestures). In dealing with representational gestures, he seeks to understand the techniques that are used to achieve representation (see Section 3.3) as well as the different contributions that representational gestures make to utterance meaning.
Geneviève Calbris was one of the earliest modern scholars to discuss matters of convention in representational gestures. In her description of mimic representation, Calbris (1990) examined how mimetic gestures can reproduce the shape and dimensions of an object, the way an object is handled or used, or the operation of an object. She emphasized that – regardless of their motivated, and hence iconic, nature – mimetic gestures always also integrate conventional dimensions in the sense that they reflect cognitive schemata or culturally engrained practices. For example, a French speaker’s gesture for calling someone on the phone conventionally involves a single hand as if holding the phone as an instrument up against one’s ear. Calbris pointed out that, in the days of rotary phones at least, in Italy, Neapolitans gestured the act of calling someone by miming the dialing of the number on a phone in front of them with small circular movements. Thus, she observed cultural differences not only regarding which features of a scene or a reference object get selected and then encoded for representation in the gestural modality but also how exactly a gesturer might mimic an action. Finally, Calbris also elaborated the schematicity of certain mimetic gestures resulting from the “powers of abstraction. […] Even in evoking a concrete situation, a gesture does not reproduce the concrete action, but the idea abstracted from the concrete reality” (Calbris, 1990, p. 115; Calbris, 2011; Calbris & Copple, this volume).
In her multifaceted research related to representation and gesture, Cornelia Müller distinguishes referential gestures denoting concrete entities from those denoting abstract entities (Müller, 1998b, p. 113). Her work on metaphor in gesture has shown that gestures have the capacity to activate, or awaken, conventionalized images and other aspects of underlying metaphorical construals, thus portraying their dynamic dimensions (e.g. Müller 2017). Adapting Bühler’s (1934 [1984]) Organon model of communication to multimodal interaction, Müller highlights that in gesture, too, the three sign functions proposed by Bühler – the representing (content-oriented), expressive (speaker-oriented), and appellative (interlocutor-oriented) function – typically interact to varying degrees (Müller, 1998b, p. 104; 2014). Müller further elaborates how gestures are forms of visual and manual thinking that are shaped by different “modes of gestural representation” (Müller, 1998a, 1998b, 2014, this volume), which are the focus of Section 3.3 on iconic sign creation.
A characteristic of gestural representation emphasized by many scholars is polysemy. Due to gestures’ schematic and partial way of representing or enacting (Section 2.3), a single gesture form is potentially polysemous in that it may create iconic relationships with different referents, thus taking on different, including metaphoric, meanings (e.g. Calbris, 2011; Cienki, 1998; Mittelberg, 2008). For example, in a certain discourse context, an arm and hand placed horizontally in front of the speaker’s body, with the flat hand facing downwards, may represent a rug the speaker is talking about, whereas the same gesture form in another speech context might represent more abstractly the flatness of a specific desert’s topography (Kendon, 2004, p. 160). In both cases, the gesture represents certain aspects of the propositional content of the utterance, which explains why iconic gestures have also been referred to as content gestures – as opposed to interactive gestures (Bavelas, Chovil, Lawrie, & Wade, 1992). Hence, it is often not evident what iconic gestures represent when focusing on their formal features and not considering the concurrent linguistic signs and other contextual factors (which is often a first step in gesture analysis; see Müller, this volume).
In this section, we have provided a brief synopsis of key elements in recent approaches to iconicity and representation in gesture. Beyond those mentioned here, other seminal scholars describe phenomena of representation but prefer to avoid the concept of iconicity. Streeck, for example, has a point in arguing that “representation actively organizes the world,” rather than simply “looking like” or “being like” something in the world (2009, p. 119). We now turn our attention to semiotic practices of gestural sign formation and thus to how gestures may reflect, construe, and also create facets of the interlocutors’ material, social, semiotic, and imaginative worlds for communicative purposes.
3.3 Modes and Techniques of Gestural Sign Formation
The ways in which gestures are formed have been investigated by many scholars already mentioned here (e.g. Calbris, 1990, 2011; Kendon, 2004; Müller, 1998a, 1998b, 2014; Streeck, 2009) as well as earlier ones (e.g. Ekman & Friesen, 1969; Mandel, 1977; Wundt, 1973). Their respective classification systems all attempt to distinguish a variety of different techniques or modes of representation or depiction. Here we explore these distinctions, largely according to the work of Kendon, Müller, and Streeck (see also Clark, 2016, on depiction, and Ferrara & Hodge, 2018, for a recent overview).
Kendon highlights several ways in which a gesture “may provide a representation of an aspect of the content of an utterance” (2004, p. 160): modeling, enacting, and depicting. In modeling, a body part is used as a model for some object, for example, a hand takes a form that “bears a relationship to the shape of the object the gesture refers to” (p. 160). Enacting, or pantomime, involves gesturing body parts that “engage in a pattern of action that has features in common with some actual pattern of action that is being referred to,” while in the depicting mode the gesturing hands (or other body parts) “‘create’ an object in the air” through sculpting or sketching, for example (Kendon, 2004, p. 160).
Müller’s original classification (1998a, p. 123; 1998b, p. 323) introduced four modes of representation in gesture, namely drawing (e.g. tracing the oval shape of a picture frame), molding (e.g. as if sculpting the form of a crown), acting (e.g. pretending to open a window), and representing (e.g. a flat open hand standing for a piece of paper). She later suggested that techniques of representation can be boiled down to two fundamental modes: “In the acting mode, the hand(s) re-enact(s) any kind of action or any kind of movements of the hand. In the representing mode, the hand(s) turn(s) into a manual sculpture of an object” (2014, p. 1696). For Müller, the “acting” mode consists of enacting action and enacting motion. Enacting action is further differentiated by whether an object is involved and, if so, whether that object is specified. For example, enacting action with no object could consist of waving or walking; enacting action with a specific object could represent turning a key; and enacting action with an unspecified object could involve presenting a “discourse object” with a PUOH gesture (Müller, 2004). Acting also includes enacting motion only or depicting motion as well as path and/or manner of motion (as in “rolling down”). Within the representing mode, Müller distinguishes representing objects and representing objects in motion (Müller, 2014, p. 1697; see also Müller, this volume).
For Streeck (2009), depiction is but one of the six gesture ecologies he identifies as “ways in which gestural activity can be aligned with the world” (p. 8). Within depiction, he specifies a range of different practices that begin to explain how gestures can “depict, analyze, and evoke the world” (p. 120). He describes the depiction of real and fictive motion, for example, a gesture showing the path of a car in motion and a gesture showing a cliff “falling off to your left,” respectively, the latter of which is actually a dynamic gesture depicting a static feature in the world (Streeck, 2009, p. 134). The technique of drawing enables the viewer to see a gestural trace left by a moving hand or finger, while handling involves a schematization through gesture of a practical action in the world, expressing a relationship between the speaker’s body and an object that the body normally handles, such as transporting (picking up, putting down) and grasping objects. Streeck’s notion of ceiving captures a more self-absorbed way of finding a gestural image for an emerging idea: “When ‘they think with their hands’, speakers rely on their bodies to provide conceptual structure” (Streeck, 2009, p. 152). Speakers can also incorporate parts of their environment in which the narrative setting is depicted, which he calls “indexing,” or “projective indexing” when it projects these marks onto oneself, for example, when a speaker brings her hands to her own hair when referring to someone’s “blond angel hair” (p. 143). Finally, for Streeck, mimetic gestures, or enactments, involve the depiction of physical acts or behavior, producing an “abstract, i.e. gestural, version of a real-life act” (p. 145). Enactments “organize experience by enacting, exaggerating, embellishing, and modulating patterns made from the same stuff from which their denotata are made” (p. 147).
The classifications presented here illuminate the different ways in which gestures are formed to represent the world. Importantly, a requirement for representation is that the gesture is recognized as a representation by someone (or by a system), however schematized or sketchy it may be. This recognition requires an understanding of the sociocultural practices of gesturing in a given community, as well as of the material and semiotic context in which the gesture occurs, including the verbal utterances and actions performed by the interlocutors (e.g. Enfield, 2011). Meaning emerges through “interaction between the meanings of these gestures and the meanings of their associated words” (Kendon, 2004, p. 163), and, according to Streeck, “talk […] narrowly constrains what recipients expect to see in a depictive gesture” (2009, p. 122). Hence, gestural representation allows us to imagine and understand the particularly relevant dimensions of what is being talked about, but also leads the mind to associate elements and ideas (as discussed in Section 2.3). A broad range of empirical studies has shed additional light on the phenomena discussed so far. In Section 4 we introduce a sampling of these.
4 Empirical Research Strands
There is much to be gained from understanding iconicity in gesture and iconicity in and of itself. However, representational gestures are also studied as a means of investigating a wide range of other questions in language research, including language evolution, language production and comprehension, first and second language acquisition, theories of embodied cognition, neurocognition, language impairments, cross-cultural and cross-linguistic variation, and many others. In this section, we provide a sampling of such studies, limiting the discussion to language acquisition and development, language and cognition, and computational modeling.
Representational gestures have played an important role in the study of language and cognition over the last century. They have been more widely studied than other types of gestures due partly to the fact that they are highly contextually driven and idiosyncratic, adding “semantic” (i.e. propositional) meaning to an utterance that reflects imagistic mental representation (Hostetter & Alibali, 2008; Kita, 2000; Kita & Özyürek, 2003; McNeill, 1992). This renders representational gestures “the most different from language” (Kita & Emmorey, 2023) compared to conventionalized pragmatic gestures, for example. Gestures produced alongside speech can “activate, package, and explore spatio-motoric representation” (Kita & Emmorey, 2023). That is, they can help us think and “fuel thought and speech” (McNeill, 2005, p. 3; see also Goldin-Meadow, 2003). When McNeill (1992) suggested that gestures provide a “window onto the mind,” he was suggesting that (primarily representational) gestures reveal thought, providing insight into cognitive functions and mental representations.
Representational gestures have been shown to help constitute thought (Kita, Alibali, & Chu, 2017). They are language-specific, that is, there is a close tie between the content of a representational gesture and the specific linguistic structure of the co-occurring speech utterance such that speakers gesture differently when the morphosyntax of the accompanying speech utterance is distinctive (e.g. Kita, 2000; Özyürek, Kita, Allen, Furman, & Brown, 2005). Thus, one focus of cross-linguistic variation studies has been how different languages encode different aspects of motion events in speech and gesture, for example, path and manner of movement, and how these strategies reveal patterns that correlate with typological differences (e.g. Kita & Özyürek, 2003; for an overview of cross-linguistic work on iconic and representational gestures, see Mittelberg & Evola, 2014). Other cross-linguistic studies focus on how such variation across languages can reveal culturally and linguistically specific differences in spatial thinking and speaking (Kita, Danziger, & Stolz, 2001; Özyürek, 2018a, 2018b).
The focus on language and thought has generated hypotheses and frameworks such as Slobin’s “thinking for speaking” hypothesis (Slobin, 1991, 1996; Stam, 2006; cf. Cienki & Müller, 2008; McNeill & Duncan, 2000, on thinking for speaking and gesturing), which has repercussions for research in both first and second language acquisition. To gain competency in a second language, for example, learners have to learn a new way of thinking for speaking, that is, to encode experience according to the semantics and morphosyntax of the target language, and representational gestures can be an important tool in assessing learners’ competencies in this regard (Stam, 2015).
There is also an interest in representational gesture that focuses on the acquisition of gesture during the earliest phases of language development (e.g. Andrén, 2010; Capirci & Volterra, 2008; Morgenstern, this volume). Findings from the longitudinal study by Capirci and colleagues (Capirci, Contaldo, Caselli, & Volterra, 2005) suggest that there is a continuity between the production of the first action schemes, the first gestures, and the first words produced by children; for example, gesture-word combinations precede two-word speech. Similarly related to the root of representational gestures in actions (e.g. Müller, 2014), data from a study of Thai and Swedish children (Zlatev, 2014) suggest that children’s first iconic gestures are instantiations of mimetic schemas, that is, bodily gestalts which arise locally through action imitation processes (as opposed to the more universal and abstract image schemas; Cienki, 2013; Mittelberg, 2018). Iconic gestures are also explored in studies that attempt to ascertain the developmental timeline for acquisition of representational gestures in the first two to four years of life. For example, Stefanini, Bello, Caselli, Iverson, and Volterra (2009) found the majority of iconic gestures between 27 and 90 months to be action-based. Questions that arise in this line of research include whether iconic gestures in very early stages of language acquisition are better understood as a conventionalized gesture form learned from adults, or whether they are indeed produced as iconic gestural representations by young children used after they start to speak with increasingly complex morphosyntactic constructions (Nicoladis, 2002, p. 244; see also Mayberry & Nicoladis, 2000). Finally, representational gestures have been shown to make word learning easier than arbitrary gesture forms do (Namy, Campbell, & Tomasello, 2004).
As to the role of gestures in aiding the acquisition of a second language, the findings are not yet conclusive (Reference Gullberg, Müller, Cienki, Fricke, Ladewig, McNeill and BressemGullberg, 2014, this volume). There is evidence that learners often use representational gestures to “elicit lexical help from interlocutors” (Reference Gullberg, Müller, Cienki, Fricke, Ladewig, McNeill and BressemGullberg, 2014, p. 1871), that representational and beat gestures are featured more frequently by instructors and caregivers in second language environments (e.g. Reference AllenAllen, 2000; Reference LazaratonLazaraton, 2004), and that they facilitate comprehension for the second language user (e.g. Reference Kelly, McDevitt and EschKelly, McDevitt, & Esch, 2009; Reference Macedonia, Müller and FriedericiMacedonia, Müller, & Friederici, 2011; Reference NicoladisNicoladis, 2007; Reference Sueyoshi and HardisonSueyoshi & Hardison, 2005), in a similar way as they have been shown to do for first language speakers (e.g. Reference Holler, Shovelton and BeattieHoller, Shovelton, & Beattie, 2009; Reference Rohlfing, Horst and TorkildsenRohlfing, 2019). Other studies have also shown that more advanced second language speakers of Spanish produce more representational gestures than beginner second language speakers, but both groups used fewer gestures overall than in their native language (Reference Gregersen, Olivares-Chuat and StormGregersen, Olivares-Cuhat, & Storm, 2009). Finally, in a study of Japanese learners of French, as proficiency increased, learners moved sequentially from producing predominantly representational gestures related to speech content toward discourse-level gestures (e.g. pragmatic gestures and beats) (Reference KidaKida, 2005).
The nature of iconicity has recently begun to be explored through computational modeling and robotics (see Jokinen, this volume, for an overview of research on communicative gesturing in robot–human interaction). For example, Bremner and Leonards (2016) explored the comprehension of iconic gestures made by a teleoperated robot and found that participants understood iconic gestures produced by the robot almost as well as those produced by a human. Social robots are also being used to investigate whether they can facilitate second language learning in children (de Wit, 2022), with further technological projects resulting from this work, for example, on how to improve the design of robot-performed iconic hand gestures (de Wit et al., 2022), an important endeavor given some evidence that synthetic gestures do not produce the facilitatory effects attributed to gestures by humans (Kopp, 2017). Computational modeling has also been put to use in the study of iconic gesture; for example, Bergmann and Kopp (2010) looked at systematic and idiosyncratic aspects of iconic gesture production and how these are interrelated by producing a computational model of iconic gesture formation.
The research on iconicity in gesture belongs, in part, to a broader resurgence of interest in the motivations behind language structure and its origins (Dingemanse, Blasi, Lupyan, Christiansen, & Monaghan, 2015; Holler & Levinson, 2014; see Liebal & Oña, 2018, and Perlman, Clark, & Tanner, 2014, on ape gesture). Iconicity in spoken and signed languages has been called “a powerful vehicle for bridging between language and human sensori-motor experience […] [I]conicity provides a key to understanding language evolution, development and processing” (Perniss & Vigliocco, 2014, p. 1). As such, it deserves attention from a wide range of angles, some of which we have introduced here, that continue to inform our understanding of the ways sentient and non-sentient beings represent their inner and outer worlds through gesture.
5 Conclusion
This chapter has highlighted the cognitive-semiotic principles that are at work in the dynamic creation and understanding of iconic (and metaphoric) gestures, which exhibit varying degrees of experiential motivation, routinization, and schematicity. By returning to the Peircean sign model to scope out the semiotic complexity of co-speech gestures, we have attempted to evidence the role of the different subtypes of iconicity as important dimensions of iconic gestural signs that nonetheless also interact with other sign–Object relations such as indexicality and conventionality. We also examined how abstraction, metonymy, and viewpoint jointly underpin the schematic forms of gestures and their potential meanings and functions.
As we hope to have shown, theoretical and empirical research into gesture, as discussed in this chapter, allows for deep insights into the very nature not only of iconicity, but also of meaning and representation more broadly. There remain, however, many issues concerning gestural representation, reference, and enaction that need to be teased apart more fully. The introduction to empirical research in fields both within and adjacent to gesture studies is indicative of the need to pursue the study of iconicity in order to further understand the polysemiotic and multimodal nature of language, whether primarily spoken or signed. Looking ahead, the ongoing study of iconicity and representation in gesture for their own sakes, and as they relate to fields as diverse as language acquisition, language evolution, and social robotics, will advance our understanding of the kinesic sign processes in which human language, cognition, and interaction are rooted.
1 Introduction
Multimodal utterances should be considered as composite wholes because hand movements can potentially instantiate structures and functions of language and speech, for example, indexical and deictic functions (Reference BirdwhistellBirdwhistell, 1970; Reference CalbrisCalbris, 1990, Reference Calbris2011; Reference EfronEfron, 1941/1972; Reference EnfieldEnfield, 2009, Reference Enfield, Müller, Cienki, Fricke, Ladewig, McNeill and Teßendorf2013; Reference FrickeFricke, 2007, Reference Fricke2012, Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and Teßendorf2013, Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014a, Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014b, Reference Fricke, Jungbluth and da Milano2015, Reference Fricke2021; Reference HjelmslevHjelmslev, 1943/1969; Reference Kendon and Ritchie KeyKendon, 1980, Reference Kendon, Versante and Kita2003, Reference Kendon2004; Reference MittelbergMittelberg, 2006, Reference Mittelberg, Cienki and Müller2008; Reference MüllerMüller, 1998, Reference Müller, Müller and Posner2004, Reference Müller2008, Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Teßendorf2013; Reference Müller, Bressem, Ladewig, Müller, Cienki, Fricke, Ladewig, McNeill and TeßendorfMüller, Bressem, & Ladewig, 2013; Reference PikePike, 1967; Reference WundtWundt, 1900/1904, Reference Wundt1900/1973).
Since gesture is an important way of directing the addressee’s attention in relation to the visible context of the utterance, it has a central role in deixis (Bühler, 1934/1982a, 1982b, 1934/1990; Levinson, 2004; for an introductory overview on deixis and pointing see Kita, 2003 and Fricke, 2014a). Some occurrences of verbal deictics, for example, the demonstrative this in the utterance I mean this book, not that one!, obligatorily require a directive pointing gesture to accompany them (Bühler, 1934/1990, p. 107). Fillmore (1997, pp. 62–63) termed such multimodal occurrences (both optional and obligatory ones) the “gestural use” of verbal deictics, in contrast to symbolic and anaphoric use. Although the term “deixis” is originally based on the idea of directing someone’s attention to something by means of pointing, directive pointing gestures are not the only type of co-speech gestures that contribute to deixis. Iconic gestures that form part of multimodal utterances, for example, may instantiate the targets to be pointed at and may function as the deictic object of the deictic relation (Fricke, 2014a). Pointing gestures and iconic gestures differ from verbal deictics with respect to their mediality. In contrast to verbal deictics in spoken language, which only occur in the linear dimension of time, gestures that accompany them additionally use three-dimensional space. Consequently, the question arises as to what is meant by the term “spatiality” when talking about language, indexicality, deixis, and space in multimodal utterances including co-speech gestures (Fricke, 2022).
Based on the Peircean concept of sign, we will distinguish between communication by spatial means, communication about space, space as a concept, and concrete space which does not function as part of a sign relation (Section 3). The main focus in this chapter is on different forms of deictic spaces which communication partners create by using different gestural and verbal means and different spatial concepts while talking face-to-face about space. Based on a Peircean semiotic approach to language and gesture, Section 2 on indexicality and deixis gives a brief overview of different traditions of research and also works out commonalities and differences between deixis and metonymy, both of which are covered by the concept of indexicality in Peircean semiotics.
2 Indexicality and Deixis in Gesture: Different Lines of Research Traditions
According to Peirce, the term indexicality encompasses various forms of context dependency, which are essentially based on contiguity or causality (Peirce, 1931–58, CP 2.306, 2.248, 3.361, 8.341). Smoke, for example, is an indexical sign for fire. Fire and smoke are not only related with respect to space and time; smoke is also caused by fire (Peirce, 1931–58, CP 2.300). Through the use of particular verbal expressions such as the deictics I, you, here, there or then and now, a dependency on the situational context of the utterance is established. Deictics such as these can only be interpreted by taking into account the situation in which they are expressed. Only our knowledge of the situation provides us with the information about who is speaking or who is being addressed, where here and there are located, and when then and now happened or will happen.
The same applies to pointing. Let us take a typical pointing gesture (the so-called G-form based on sign language hand shapes) with an outstretched arm and an index finger as an example and imagine someone standing in a park, surrounded by different trees. When the pointing person slowly turns around his or her own body axis without moving his or her arm, then, depending on the current direction of his or her body, the extension of the pointing vector would lead to different trees, serving as target points in the park. It is as if there were a straight line drawn between the point where the body is located, the instantiated origo or origin, and a particular tree, the instantiated deictic object.
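The geometry of this park scenario can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not a model proposed in the literature cited here: the function name `pointing_target`, the tree names and coordinates, and the angular tolerance are all hypothetical. The origo is a point, the pointing vector is a ray defined by the current body heading, and the deictic object is whichever tree lies closest to that ray.

```python
import math

def pointing_target(origo, heading_deg, trees, tolerance_deg=10.0):
    """Return the tree nearest to the pointing ray, or None if none is close.

    origo         -- (x, y) position of the pointing person (hypothetical coordinates)
    heading_deg   -- direction of the outstretched arm, in degrees
    trees         -- mapping from tree name to (x, y) position
    tolerance_deg -- assumed maximal angular deviation that still counts as a target
    """
    best_name, best_offset = None, tolerance_deg
    for name, (x, y) in trees.items():
        # Bearing of the tree as seen from the origo
        bearing = math.degrees(math.atan2(y - origo[1], x - origo[0]))
        # Smallest absolute angle between the pointing ray and that bearing
        offset = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
        if offset <= best_offset:
            best_name, best_offset = name, offset
    return best_name

# The arm stays fixed relative to the body; only the body heading changes.
trees = {"oak": (10.0, 0.0), "birch": (0.0, 10.0), "pine": (-10.0, 0.0)}
origo = (0.0, 0.0)
targets = [pointing_target(origo, h, trees) for h in (0, 90, 180, 270)]
print(targets)
```

With the origo held constant, the same gesture form picks out a different deictic object at each heading, which is the context dependency the example is meant to show.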
Another form of contiguity is found in metonymies. The context-dependency in this case is not based on a contactless, indicative function, but either on a part–whole relationship (pars pro toto, internal metonymy) or, alternatively, on relationships of adjoining and direct contact (external metonymy) (Reference Fricke, Mittelberg, Liedtke and TuchenFricke & Mittelberg, 2019; Reference Jakobson, Waugh and Monville-BurstonJakobson, 1990a, Reference Jakobson, Waugh and Monville-Burston1990b; Reference Jakobson and PomorskaJakobson & Pomorska, 1983; Reference MittelbergMittelberg, 2006, Reference Mittelberg2010, Reference Mittelberg2017; Reference Mittelberg, Waugh, Forceville and Urios-AparisiMittelberg & Waugh, 2009, Reference Mittelberg, Waugh, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014).
Imagine a person describing how to hand a piece of paper over to someone else. In a multimodal utterance, the respective iconic gesture depicting the paper may be executed via internal or external metonymy. For example, by using the gestural mode of representation whereby “the hand acts” according to Reference MüllerCornelia Müller (1998, Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Teßendorf2013, Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014), the speaker acts as if holding the piece of paper. By means of contiguity and direct contact (external metonymy) (Reference Mittelberg, Waugh, Müller, Cienki, Fricke, Ladewig, McNeill and BressemMittelberg & Waugh, 2014, p. 1754), the gripping fingers, which are not part of the piece of paper itself, stand for the paper. If the gesture, however, does not imitate gripping the paper, but embodies the piece of paper as a whole instead (the gestural mode “the hand represents” according to Reference MüllerMüller 1998, Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Teßendorf2013, Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014), for example, by using the left flat hand with the palm oriented upwards (the palm up open hand [PUOH] gesture), the hand stands “pars pro toto” for the piece of paper (internal metonymy) (Reference Mittelberg, Waugh, Müller, Cienki, Fricke, Ladewig, McNeill and BressemMittelberg & Waugh, 2014, p. 1750).
They can also constitute further concatenations of indexical signs: Both types of gestural metonymies can become the starting point of an additional metonymic relation as, for example, CONTAINER-FOR-CONTAINED (external metonymy) (Reference Mittelberg, Waugh, Müller, Cienki, Fricke, Ladewig, McNeill and BressemMittelberg & Waugh, 2014, p. 1754). With the order “I want another glass,” accompanied by an iconic gesture that imitates the gripping of the stem of a wine glass, the speaker refers to the content of the glass but not to the glass itself. In any restaurant or bar, a polite, attentive waiter or waitress would certainly not serve an empty wine glass.
Metonymic gestures can also contribute to deixis. Iconic gestures that form part of the multimodal utterance, like the metonymic grasping fingers in our example, may instantiate the deictic object of the deictic relation (I want another glass with a directing co-speech gaze at the gesture). The iconic gesture as deictic object represents a wine glass that stands for its content, the red or white wine (see Section 4 on deixis at signs vs. deixis at non-signs).
Indexicality and deixis have been generally characterized as introducing context-dependent properties into language (for an overview, see Levinson, 2004, and Fricke, 2014a). The term “deixis” is originally based on the idea of directing attention to something by means of pointing (Lyons, 1977, p. 636). However, linguistic deixis is neither limited to pointing nor can verbal deictics be derived from pointing gestures alone. According to Bühler, who is the founder of modern deixis theory in the European line of tradition, verbal deictic expressions always include a component of “naming”: “the simple reference to something to be found here or there, at a certain place in the sphere of actual perception, must clearly be distinguished from the quite different information that it is of such and such character” (Bühler, 1934/1990, p. 102). By emphasizing that pointing and naming “are able to complement each other” as parts of utterance formation (Bühler, 1934/1990, p. 102), Bühler has to be considered as an important predecessor of Kendon’s idea of “gesture-speech ensembles” (Kendon, 2004, p. 127).
Due to the fact that Bühler’s book Theory of Language (Reference Bühler1934/1990) was translated into English in its entirety only in the 1990s, its reception on the international stage was relatively late. As a consequence, two different lines of research in deixis theory have emerged: an initially primarily European tradition in the line of Bühler, and an Anglo-American line of tradition in which deixis and indexicality are largely equated and considered to be coextensive (Reference Levinson, Horn and WardLevinson, 2004, p. 97). Both lines of tradition differ mainly with regard to the following aspects (Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and BressemFricke, 2014a): (1) the scope of the notion of deixis, (2) the concept of origo or deictic center (concrete and fixed vs. abstract and movable), (3) the concepts of deictic reference and deictic space (perceivable entities vs. perceivable as well as imaginary entities), and (4) the role of the human body (marginalized vs. crucial).
In the European tradition in the line of Bühler, deixis is predominantly tied to the concept of origo (or deictic center), while in the Anglo-American tradition (under the influence of analytic or linguistic philosophy), context dependency or indexicality (in a broad sense) is equated with deixis. Following Fillmore, nearly the entire Anglo-American line of deixis theory limits the general term “deixis” to perceptual deixis and to the actual speaker and his spatio-temporal coordinates (Fricke, 2002, 2003, 2007, 2014a; see also Hanks, 2005, p. 196 on “spatialist” and “interactive” points of view in Anglo-American deixis theory).
The dichotomy between “deictic” and “intrinsic” also descends from the Anglo-American tradition (Reference Miller and Johnson-LairdMiller & Johnson-Laird, 1976). Only phenomena related to the intrinsic coordinates of the speaker (e.g. his left-right and front-back axis) are classified as deictic in this context. This means that the Anglo-American deixis theories tend to be very speaker-centered:
We will call the linguistic system for talking about space relative to a speaker’s egocentric origin and coordinate axes the deictic system. We will contrast the deictic system with the intrinsic system, where spatial terms are interpreted relative to coordinate axes derived from intrinsic parts of the referent itself. Another way to phrase this distinction is to say that in the deictic system spatial terms are interpreted relative to intrinsic parts of ego, whereas in the intrinsic system they are interpreted [relative] to intrinsic parts of something else.
Theoretically, phenomena that Bühler describes as a transfer or displacement of the origo and pointing at imaginary entities (deixis at phantasma, imagination-oriented deixis) are hardly taken into account. In the Anglo-American tradition, even more so than in the Bühlerian one, the focus lies on perceptual deixis (or demonstratio ad oculos in Bühlerian terms). This has only begun to change with William Hanks, who took Bühler’s Theory of Language on board and whose work also represents a turning point in research in that it emphasizes the communicative and interactive aspects of deixis (Hanks, 1990, 1992, 1993). Particular aspects of Hanks’s concept of “sociocentricity” (Hanks, 1990, p. 7) can also be found in Bühler’s approach. Bühler emphasizes that deixis has to be described in terms of the full model of human communication: “[…] the sender does not just have a certain position in the countryside as does the sign post; he also plays a role; the role of the sender as distinct from the role of the receiver” (Bühler, 1934/1990, p. 93). Along with pointing gestures as an indispensable part of instantiating the deictic function in human interaction, Bühler’s concept of a “tactile body image” (very similar to “image schema” in later approaches) connected with the transferable origo reveals him as an early predecessor of crucial concepts of embodiment in cognitive semiotics, linguistics, and gesture studies (Fricke, 2014a, p. 1812).
The following characterization of deixis connects perspectives of interaction and cognition from different lines in linguistic research on deixis (e.g. Reference BühlerBühler, 1934/1982a, Reference Bühler, Jarvella and Klein1982b, Reference Bühler1934/1990; Reference ClarkClark, 1996, Reference Clark and Kita2003; Reference Clark, Schreuder and ButtrickClark, Schreuder, & Buttrick, 1983; Reference DiesselDiessel, 2006; Reference Ehlich and SchweizerEhlich, 1985, Reference Ehlich and Ehlich2007; Reference EnfieldEnfield, 2003, Reference Enfield2009, Reference Enfield, Müller, Cienki, Fricke, Ladewig, McNeill and Teßendorf2013; Reference FrickeFricke, 2002, Reference Fricke and Lenz2003, Reference Fricke2007, Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014a, Reference Fricke, Jungbluth and da Milano2015; Reference GoodwinGoodwin, 1986, Reference Goodwin2000a, Reference Goodwin2000b; Reference Goodwin and Kita2003; Reference HanksHanks, 1990, Reference Hanks, Duranti and Goodwin1992, Reference Hanks and Lucy1993, Reference Hanks2005, Reference Hanks2009; Reference Hausendorf and LenzHausendorf, 2003; Reference HavilandHaviland, 1993, Reference Haviland and Kita2003; Reference Kendon, Versante and KitaKendon & Versante, 2003, Reference Kendon2004; Reference McNeill, Cassell and LevyMcNeill, Cassell, & Levy, 1993; Reference StreeckStreeck, 1993; Reference StukenbrockStukenbrock, 2014, Reference Stukenbrock2015; Reference Tomasello, Moore and DunhamTomasello, 1995, Reference Tomasello2008, Reference Tomasello2009) and serves as a starting point for further analyses in this contribution:
For mutual understanding in face-to-face interaction, speakers and their addressees need to be simultaneously engaged in perception, imagination, and other cognitive processes. Deixis assumes a particular function in the coordination of mental representations as well as social interaction: It can be understood as a communicative and cognitive procedure in which the speaker is directing the attention of the addressee by the words, the gestures and other directive clues that he uses; these diverse means of expression co-produce context as common ground.
As pointed out by Tomasello, achieving the goal of joint attention between interaction partners is always intention-driven: “Thus, to interpret a pointing gesture one must be able to determine: what is the intention in directing my attention in this way?” (Reference TomaselloTomasello, 2008, p. 4). The directive function of pointing can be instantiated by different articulators, for example, pointing with gaze (e.g. Reference GoodwinGoodwin, 1980; Reference HeathHeath, 1986; Reference KendonKendon, 1990; Reference KitaKita, 2003; Reference StreeckStreeck, 1993, Reference Streeck1994, 2002; Reference StukenbrockStukenbrock, 2015), with the lips (Reference EnfieldEnfield, 2001; Reference SherzerSherzer, 1973; Reference Wilkins and KitaWilkins, 2003), with the nose (Reference Cooperrider and NúñezCooperrider & Núñez, 2012), with the feet (e.g. Reference FrickeFricke, 2007), and with different kinds of hand configurations (e.g. different outstretched fingers pointing at an object or a lateral flat hand [PLOH] pointing in a particular direction) (e.g. Reference FrickeFricke, 2007, Reference Fricke2010, Reference Fricke, Jucker and Hausendorf2022; Reference HavilandHaviland, 1993, Reference Haviland and Kita2003; Reference Jarmołowicz-Nowikow, Müller, Cienki, Fricke, Ladewig, McNeill and BressemJarmołowicz-Nowikow, 2014; Reference KendonKendon, 2004; Reference Kendon, Versante and KitaKendon & Versante, 2003; Reference StukenbrockStukenbrock, 2015; Reference Wilkins and KitaWilkins, 2003).
With respect to deictic gestures, in different kinds of languages – for example, in Italian (Kendon, 2004; Kendon & Versante, 2003), in Tzotzil (Haviland, 2003), or in German as has been shown by a quantitative study (Fricke, 2007, 2014b) – the form differentiation between the PLOH and the G-form is simultaneously connected with a semantic differentiation also present within verbal deictics in certain languages (e.g. dieser [this] vs. hin/her [to/fro]). The meaning of the G-form can be paraphrased as “pointing to an object,” whereas the meaning of the PLOH gesture is directive (“pointing in a direction”) (Figure 4.1).
Figure 4.1 Two types of pointing gestures in German: G-form and PLOH (Reference FrickeFricke, 2007, p. 109; Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and BressemFricke, 2014b, p. 1623)
Moreover, these two forms and their respective meanings can be morphologically blended, showing a rudimentary kinesthematic compositionality analogous to that of the so-called phonesthemes (e.g. smog as a blend or morphological contamination of the two words smoke and fog) (see Reference FrickeFricke, 2012, Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014b, on gestural kinesthemes and their morphological complexity).
Why does the palm in Figure 4.2 not face downwards? This could be explained by assuming a morphological blending of the G-form (pointing to an object) and the PLOH (pointing in a direction) which can be paraphrased as “pointing to an object in a particular direction” (Reference FrickeFricke, 2007, p. 110; Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and BressemFricke, 2014b, p. 1624). Accompanying the German verbal deictic da [there], the index finger points to a spatial point instantiated by an object while maintaining the above-mentioned direction “straight ahead.”

Figure 4.2 Blending of G-form and PLOH
According to Reference Fricke, Müller, Cienki, Fricke, Ladewig, McNeill and BressemFricke (2014b, p. 1626), “the concept of kinesthemes does not only support and further elaborate the hypothesis of a ‘rudimentary morphology’ in co-speech gestures (Reference Müller, Müller and PosnerMüller, 2004, p. 3), especially in so-called gesture families, but also substantiates the category of ‘recurrent gestures’ located between idiosyncratic and emblematic gestures in Kendon’s Continuum” (e.g. Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and BressemBressem & Müller, 2014; Reference Ladewig, Müller, Cienki, Fricke, Ladewig, McNeill and BressemLadewig, 2014).
Focusing on spatial deixis in multimodal interaction, Section 3 addresses the dynamic aspects of space.
3 Language and Space: A Semiotic Approach Based on the Peircean Concept of Sign
The proposed semiotic approach to language and space is based on the fundamental Peircean assumptions that any entity can be interpreted as a sign or as a non-sign, and that the interpretation of an interpreter or addressee is independent of the sign producer’s intention. Take a cup, for example: In the default case, cups contain liquids and people use them for drinking. Apart from that, speakers can also use them spontaneously in creative ways, for example, to illustrate a car-crash scenario: “Two days ago, I was in a car crash. I was parked here (cup 1) and this idiot came speeding at me from the left (cup 2) and smashed into me.” In this scenario, the cups are dissociated from their standard use and gain a new context as part of a sign relation: They stand for something else, namely the two cars which the speaker is referring to, and which are not present in the utterance situation. Analogously, concrete space – like any other entity – can be interpreted as a sign in Peirce’s triadic model of the sign:
A sign […] [in the form of a representamen] is something which stands to somebody for something in some respect or capacity. It addresses somebody, that is, creates in the mind of that person an equivalent sign, or perhaps a more developed sign. That sign which it creates I call the interpretant of the first sign. The sign stands for something, its object. It stands for that object, not in all respects, but in reference to a sort of idea, which I have sometimes called the ground of the representamen.
Using the Peircean concept of sign as its starting point, the proposed analysis of spatial deixis in multimodal interaction addresses the dynamic aspect of space by integrating different dimensions (Fricke, 2007, 2012). In Peirce’s model, a sign is understood as a triadic relation between the representamen or sign vehicle (R), its object (O), and its interpretant (I) (cf. Peirce, 1931–58), as illustrated in Figure 4.3.
Figure 4.3 Space as a relatum in a Peircean triadic sign
Building upon the three relata – representamen, object, and interpretant – we can distinguish between: (1) communication by spatial means (space used as a representamen or sign carrier, e.g. gestures), (2) communication about space (space used as an object of the triadic sign, e.g. Potsdamer Platz in Berlin used as an object of verbal and gestural route descriptions), and (3) space as a concept (space used as an interpretant, e.g. a map-like vs. a sphere-like concept of space) (for gestures as expression of conceptualization, see Cienki, 2013a, 2013b). If space is not used to instantiate one of the three relata of the Peircean concept of sign, then concrete space is treated as a non-sign, that is, it is not interpreted as standing for something else (e.g. different regions with particular varieties and dialects in areal linguistics). Starting from this systematic distinction, we arrive at the schema of four subfields of semiotic space (Fricke, 2022) in Table 4.1. Each subfield represents a different aspect of language and space associated with different areas of research in linguistics and semiotics.
The four subfields are considered to be inherently dynamic. As a relatum of a particular sign configuration, space has to be thought of as a dynamic process of semiosis and not as a static entity. Despite their mutual influence, the four subfields can be distinguished analytically and constitute separate fields of research in linguistics and semiotics. One might ask: Why choose these particular four fields and not five or seven – or just one? The answer is that the underlying semiotic systematicity only allows for these specific four areas: The first distinction is between space as a sign and space as a non-sign (subfield 4). Within the triadic (or three-place) sign relation, space can only occupy three different places (subfields 1–3). We consider these four subfields of space to be primary. Other secondary types of space can be created by further concatenation of Peircean triangles in complex processes of semiosis (Fricke, 2007, 2012).
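The counting argument in this paragraph can be restated as a minimal Python sketch. The names `Relatum` and `subfield` are hypothetical labels introduced here for illustration only and are not part of the cited framework. The sketch merely enumerates the cases: space either occupies one of the three places of the triadic sign relation or none of them.

```python
from enum import Enum
from typing import Optional


class Relatum(Enum):
    """The three places of the Peircean triadic sign relation."""
    REPRESENTAMEN = "R"
    OBJECT = "O"
    INTERPRETANT = "I"


def subfield(place: Optional[Relatum]) -> str:
    """Map the place that space occupies in a sign relation to a subfield.

    `None` means that space instantiates no relatum, i.e. it is a non-sign.
    """
    if place is Relatum.REPRESENTAMEN:
        return "1: communication by spatial means"
    if place is Relatum.OBJECT:
        return "2: communication about space"
    if place is Relatum.INTERPRETANT:
        return "3: space as a concept"
    return "4: space as a non-sign (concrete space)"


# Three places in the sign relation plus the non-sign case: four subfields.
all_cases = [*Relatum, None]
for case in all_cases:
    print(subfield(case))
```

Because the enumeration has exactly three members and the non-sign case is the only alternative, no fifth primary subfield can arise in this simplified model; secondary types of space would correspond to chaining several sign relations.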
In deixis theory, the term “local [also spatial] deixis” is used in three different ways: first, when speakers use deictic expressions with optional co-speech pointing to refer to space itself or entities located in space; second, when the demonstratum – the immediate target object of deictic pointing, which is not necessarily identical with the reference object intended by the speaker – is spatial; and third, when the meaning of a deictic expression contains the respective semantic feature, for example, [+spatial] versus [+temporal]. Consequently, mainly the subfields “communication about space” and “space as a concept” are covered by deixis. Despite the ubiquity of co-speech pointing gestures, the subfield “communication by spatial means” has not yet been in the focus of deixis theory. With respect to deixis and other linguistic phenomena, the fourth subfield “space as a non-sign” is merely instantiated by different local areas and their particular linguistic varieties as the subject of cross-linguistic studies and areal linguistics.
In Sections 4 and 5, we will consider phenomena of space that correspond to the three places of the Peircean triadic sign: gestures as spatial representamens, Potsdamer Platz as an object of deictic communication, and different conceptualizations of space instantiating the interpretant. It will be demonstrated that the semiotic distinctions introduced above are indispensable for a clear-cut notion of deixis as well as of space.
4 Deixis at Signs and Non-Signs
When referring to space via pointing gestures and verbal deictics, the processes of semiosis can be simple (deixis at non-signs) or complex (deixis at signs, semiosis type O1 = R2) (Fricke, 2007, 2014a). Based on the Peircean concept of sign, the term “deixis at non-signs” refers to the default case in which both communication partners have perceptual access to the demonstratum, which in this case is identical with the reference object intended by the speaker. This means that the demonstratum of the pointing gesture or the verbal deictic is not interpreted as a sign, as in I mean this book, not that one! The term “deixis at signs” is used when the deictic object (demonstratum) is an entity that is interpreted as standing for something else, namely the reference object intended by the speaker, as in the example of the cups standing for cars. It is proposed that in this case the relata can concatenate by serving more than one semiotic function. In example (1), the demonstratum of the pointing gesture R1 is the flat hand of the addressee B, which is the object O1 of the first sign relation. But, at the same time, the flat hand functions in a second sign relation as the sign vehicle, or representamen R2, that stands for the intended reference object, a particular building at Potsdamer Platz (O2), which is not present in the utterance situation. In Section 6.3, it will be demonstrated how this kind of concatenation of Peircean triangles on the representamen-object axis can also be adopted for the analysis of metonymic and metaphoric use of space (Fricke, 2007).
(1)
A: [das iss die Arkaden/] “that is the Arkaden”
While giving directions in the absence of the route described, speaker A is pointing at the flat left hand of addressee B. This flat hand represents a particular building at Potsdamer Platz, namely the Arkaden, a glass-covered shopping mall. In contrast to deixis at non-signs, the demonstratum and the reference object intended by the speaker are not identical, but differ from each other: The flat hand the speaker is pointing at is interpreted as a sign for the intended reference object, the Arkaden. This relation is illustrated by the Peircean configuration of the sign processes in Figure 4.5: The demonstratum of the pointing gesture R1 is the flat hand of the addressee, which is the object O1 of the first sign relation. But, at the same time, the flat hand functions as the sign vehicle, or representamen R2, in a second sign relation that stands for the intended reference object, the Arkaden (O2), which is not present in the utterance situation. This example is part of a longer sequence of interaction during which both communication partners build up a shared map-like model of Potsdamer Platz collaboratively by the use of verbal and gestural means (see Section 5) (Fricke, 2007, p. 208; for collaborative use of gesture space, see also Furuyama, 2000 and McNeill, 2005, p. 161).
Example (2) illustrates the case of “deixis at non-signs.” This means that the entity that the pointing gesture or the verbal deictic refers to is not interpreted as a sign. Consequently, the demonstratum and the intended reference object are part of the same sign triad (Figure 4.7) and do not differ from one another:
(2)
A: [du kommst hier vorne raus an dieser Straße (.)] “you get out here right in front at this street (.)”

Figure 4.7 Deixis at non-signs as a Peircean sign configuration
The pointing gesture in example (2) is directed at a target point instantiated by the entity (the street) to which both the speaker and the addressee have perceptual access while communicating. The target object (demonstratum) does not stand for something else. Therefore, it is not interpreted as a sign according to Peirce but is identical with the reference object intended by the speaker. In other words: For both the speaker and the addressee, the street is the object of their conversation and corresponds to the object (O) of the Peircean triangle. But dissociated from its place in the semiotic three-place relation, the street is, in fact, the street and nothing else. Therefore, it belongs to the subfield “space as a non-sign” (see Table 4.1).
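Table 4.1 Schema of four semiotic subfields of space
| Subfield | Status of space | Aspect of language and space |
|---|---|---|
| 1 | Space as a sign: representamen | Communication by spatial means |
| 2 | Space as a sign: object | Communication about space |
| 3 | Space as a sign: interpretant | Space as a concept |
| 4 | Space as a non-sign | Concrete space |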
It is worth pointing out that the semiotic distinction between deixis at signs and deixis at non-signs introduced above allows for the resolution of the inherent contradiction in the widespread differentiation between demonstratio ad oculos and deixis at phantasma as proposed by Bühler (1934/1990). The following quotation exemplifies Bühler’s first main case of deixis at phantasma, which is conceived of as a kind of theater stage on which the speaker performs like an actor:
“Here I was – he was there – the brook is there”: the narrator begins thus with indicative gestures, and the stage is ready, the present space is transformed into a stage. We paper-bound people will take a pencil in hand on such occasions and sketch the situation with a few lines. […] If there is no surface to draw a sketch on, then an animated speaker can temporarily “transform” his own body with two outstretched arms into the pattern of the battle line.
Considering the final sentence of this quotation, we can observe that the battle line which is embodied by the speaker’s outstretched arms is perceptible and not imaginary. Although perceptibility is the distinguishing criterion for demonstratio ad oculos, this example is classified as deixis at phantasma. How does Bühler come to this classification? The answer can be found in the alternative interpretation: If classified as demonstratio ad oculos, pointing at a “real” battle line in a battle could not be differentiated from pointing at an embodied battle line in a narration (Fricke, 2007, 2014a). It should be noted that Bühler – like most other scholars in deixis theory – does not differentiate between the demonstratum of a deictic utterance including gestures and the reference object intended by the speaker. With regard to Bühler’s example, the intended reference object is imaginary but the perceptible demonstratum is not. Thus, the precise nature of the distinction that Bühler wishes to draw remains unclear in certain aspects. The distinction between deixis at non-signs versus deixis at signs (Fricke, 2002, 2003, 2007, 2014a) introduced in this section allows for a precise sign-based analysis of the distinction between demonstratio ad oculos and deixis at phantasma which avoids relying on non-linguistic ontological differences. Against the backdrop of Bühler’s characterization of deixis at phantasma, what all the examples he gives in the quotation above have in common is that the demonstratum, regardless of whether it is imaginary or not, is interpreted as a sign standing for something else, namely for the reference object (imaginary or perceptible) intended by the speaker.
This complex deictic relation can be illustrated by the concatenation of Peircean sign triads (Fricke, 2007, 2014a). Complex sign concatenation can also be applied to the analysis of metaphoric uses of deictic space, as will be shown in Section 6.3, in which particular forms of collaboratively created spaces stand for increasing or decreasing emotional consensus in face-to-face interaction.
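The difference between simple and complex deictic semiosis (O1 = R2) can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class `Sign` and the functions `reference_object` and `deixis_type` are hypothetical names introduced here, the interpretant is omitted because only the representamen-object axis is followed, and entities are represented by plain strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sign:
    """One Peircean triad, reduced to representamen (R) and object (O)."""
    representamen: str
    obj: str


def reference_object(pointing: Sign, signs: List[Sign]) -> str:
    """Follow O1 = R2 links until the object is no longer a representamen."""
    current = pointing.obj  # the demonstratum of the pointing gesture
    seen = {current}
    while True:
        nxt = next((s.obj for s in signs if s.representamen == current), None)
        if nxt is None or nxt in seen:
            return current
        seen.add(nxt)
        current = nxt


def deixis_type(pointing: Sign, signs: List[Sign]) -> str:
    """Deixis at signs if demonstratum and reference object differ."""
    if reference_object(pointing, signs) == pointing.obj:
        return "deixis at non-signs"
    return "deixis at signs"


# Example (1): A points at B's flat hand, which stands for the Arkaden.
pointing_1 = Sign("A's pointing gesture", "B's flat hand")
hand_as_sign = Sign("B's flat hand", "the Arkaden")
print(deixis_type(pointing_1, [hand_as_sign]))

# Example (2): A points at a street that both partners can perceive.
pointing_2 = Sign("A's pointing gesture", "the street")
print(deixis_type(pointing_2, []))
```

In this sketch, the demonstratum is always the object of the first sign relation, whereas the intended reference object is whatever remains at the end of the chain. The two coincide only when the demonstratum is not itself used as a representamen.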
5 Forms of Deictic Space
The different forms deictic space can assume are based on our everyday concepts of space in relation to the speaker and his primary origo. According to Fricke’s concept of origo-allocating acts, the primary origo is connected to the role of the speaker who, as the current holder of the primary origo, intentionally allocates secondary origos to his own body, or to other perceptible or imaginary entities (Fricke, 2002, 2003, 2007, 2014a). For a detailed discussion of this concept with regard to Bühlerian and Anglo-American deixis theory – for example Fillmore (1982, 1997), Hanks (1990, 1992, 2009), Levinson (1983/92, 2004), Lyons (1977), Miller & Johnson-Laird (1976), Tomasello (2008) – see Fricke (2007, 2014a). Analogous to deictic objects, these instantiations of secondary origos can also be interpreted as signs or as non-signs. In contrast to Bühler’s (1934/1990) deixis theory, this concept of origo is structured hierarchically and based on the assumption of an intention-driven agent who allocates and instantiates the origos provided by the deictic utterance. It allows for distinguishing between forms of deictic space that include the primary origo (sphere-like) and forms that exclude it (map-like and screen-like). Figure 4.8 illustrates the process of origo-allocation in a nutshell:
Figure 4.8 The origo-allocating act according to
By talking to somebody, a person acquires the speaker’s role, and with it the privilege to allocate local origos or to provide the local origo with intrinsically oriented entities. Such an entity can also be the speaker himself (Fricke, 2002, 2003). Therefore, it is important to distinguish between two different aspects of the speaker’s potential for deictic reference: first, the speaker who, in his role as speaker and as holder of the primary origo, allocates secondary origos intentionally; and second, the speaker who, as an intrinsically arranged entity, instantiates a secondary origo (example [4]). This distinction is important since the sphere-like space surrounding the speaker, the holder of the primary origo, may also contain the addressee’s body, the holder of a secondary local origo, as illustrated in example (3):
(3) A: The key is to your left. (Primary origo: speaker A; secondary local origo: the addressee’s body)
(4) A: The key is to my left. (Primary origo: speaker A; secondary local origo: the speaker’s body)
In Sections 5.1, 5.2, and 5.3, it will be demonstrated how forms of deictic spaces differ, first, with respect to the pattern of origo-allocation they allow for, and, second, with respect to the complex (deixis at signs) or simple (deixis at non-signs) processes of deictic semiosis involved. For a detailed discussion with regard to McNeill’s (1992) model of gestures as indicators of perspective taking (protagonist’s viewpoint vs. observer’s viewpoint) see Fricke (2002, 2007).
5.1 Sphere-like Spaces
Sphere-like spaces obligatorily surround the speaker as the holder of the primary origo and allow for both deixis at signs and deixis at non-signs. In the default case, they use the typical model of three-dimensional Euclidean space in which the speaker occupies the center. Optionally, addressees, bystanders, and any other entity can be incorporated into this conceptual model of space. Hence, secondary local origos allocated by the speaker can be instantiated either by himself, the addressee, or any other entity. Examples (5) and (6) are taken from video recordings of subjects providing each other with route descriptions, initially, at Potsdamer Platz, and, subsequently, in a room at the Technische Universität Berlin. In both examples, the speakers construct sphere-like spaces that stand for something else mentally, namely what a pedestrian would see when following a predetermined route at Potsdamer Platz. The speaker in example (5) (Figure 4.9) imitates the process of taking a photograph of the umbrella-like roof of the Sony Center from inside the building. Standing 300 m away from the building, she uses the gestural mode “the hand acts” (Müller, 1998, 2014) to create a sphere-like space surrounding her, which functions as a sign for the roof that is absent from her field of vision.
(5) B: (click sound that imitates the clicking of a release button.)
(6)
A: [links]1 [und rechts]2 [riesen Hochhäuser]3 “[left]1 [and right]2 [huge skyscrapers]3”
In example (6) (Figure 4.10), the speaker has allocated a secondary local origo to her own body and localizes the highest buildings at Potsdamer Platz to her left and right using three co-speech molding gestures with a deictic function. Her gaze moves along the vertical axis, thus directing the attention of the addressee to the height of the buildings. Gaze and hands collaborate in order to create the sphere-like characteristic of the deictic space with the speaker at its center.
5.2 Map-like Spaces
In contrast to sphere-like spaces, map-like spaces and the entities they consist of always stand for something else and do not include the primary origo instantiated by the speaker. The speaker can optionally allocate secondary local origos to entities of map-like spaces that fulfill the function of a sign. In example (7) (Figure 4.11), the speaker uses her right index finger in the gestural mode “the hand draws” (Müller, 1998, 2014) in order to create a map-like space by drawing a two-dimensional flat circle that locates the brook at Potsdamer Platz in relation to other objects in the vicinity.
(7)
A: [hier ist das Bächlein/ (..)] “here is the brook”
(8)
hier iss das Gewässer/ (..) und hier ist das Haus (..) ja/ (..) o.k./ (..) und hier sind wir (..) [wir gehn von da nach da] “here is the stretch of water (..) and here is the building (..) right (..) okay (..) and we are here (..) [we go from there to there]”
In example (8) (Figure 4.12), secondary origos are allocated to both the speaker and the addressee, who as imaginary pedestrians are virtually following a predetermined route to be taken in the future. The origo instantiations are embodied gesturally by the “walking” fingers of the right hand and expressed verbally by the use of the person deictic we. In contrast to example (7), the “walking” fingers are part of the map-like space and interpreted as signs by the communication partners. Although the co-speech gestures in this example touch the table surface, they do not belong to the category of classic bodily contact gestures (e.g. handshakes). A map-like space could also have been created by “drawing” and “walking” fingers used as contactless gestures, as will be shown in example (10): Accompanying her verbal utterance, the speaker is drawing a line in the air that represents a particular path as part of a two-dimensional map-like representation of Potsdamer Platz from a bird’s-eye view. In order to accomplish the task of a successful route description, the presence of the table surface simply invites the communication partners to use it in order to create a map-like space.
5.3 Screen-like Spaces
In contrast to a map-like space, which resembles a horizontally oriented map seen from a bird’s-eye view, a screen-like space uses the vertical dimension. The form of this kind of space creates the impression of a screen augmented by the dimension of depth, like a box (cf. McNeill, 1992). Similar to a map-like space, but in contrast to a sphere-like space, it does not include the primary origo. A screen-like space is shown in example (9) (Figure 4.13), in which the speaker A is modeling the semicircular shape of the Sony Center building at Potsdamer Platz. She creates a three-dimensional space in front of her from which she is excluded, as opposed to occupying the center of a sphere-like space (Figures 4.9 and 4.10) or constructing the two-dimensional bird’s-eye view of a map-like space (Figures 4.11 and 4.12).
(9)
A: [das rechte iss | äh das Hochhaus von äh | dem Sonygelände was halbrund iss] “[the right is | er the skyscraper from er | the Sony complex which is semicircular]”
As examples (7) to (9) demonstrate, both map-like and screen-like spaces exclude the primary origo, and they correlate with different gestural modes of representation according to Müller (1998, 2013, 2014): Gestures with the most map-like character correspond to two-dimensional drawings, and those with the most screen-like character correspond to three-dimensional modelings.
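The criteria that separate the three forms of deictic space in Sections 5.1 to 5.3 can be summarized as a small decision procedure. The following Python sketch is a deliberate simplification: the function name `classify_deictic_space` and its two Boolean features are assumptions made here for illustration, and real gesture data would rarely reduce to two clear-cut features.

```python
def classify_deictic_space(includes_primary_origo: bool,
                           uses_vertical_dimension: bool) -> str:
    """Classify a deictic space by two features named in Sections 5.1 to 5.3."""
    if includes_primary_origo:
        # The space surrounds the speaker as holder of the primary origo.
        return "sphere-like"
    if uses_vertical_dimension:
        # Excludes the speaker; a vertical, box-like space in front of her.
        return "screen-like"
    # Excludes the speaker; a horizontal plane seen from a bird's-eye view.
    return "map-like"


# Feature values below are one possible reading of examples (6), (7), and (9).
print(classify_deictic_space(True, True))    # example (6)
print(classify_deictic_space(False, False))  # example (7)
print(classify_deictic_space(False, True))   # example (9)
```

The order of the checks mirrors the hierarchy in the text: inclusion of the primary origo is decided first, and only origo-exclusive spaces are then divided into map-like and screen-like ones.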
6 Collaborative Creation of Deictic Space in Face-to-Face Interaction
The forms of deictic space introduced in Section 5 can be further subclassified with respect to their mode of creation and their temporal structure (Fricke, 2007, 2008, 2009). Sphere-like, map-like, and screen-like spaces can be created by speakers in collaboration with their addressees while turn-taking. Two main types of interactive gesture spaces can be distinguished: shared spaces and separated spaces. The gestures produced by the speaker and the addressee either temporally overlap or are executed in temporal succession. Table 4.2 summarizes the subclassification schema which the examples in Sections 6.1 and 6.2 refer to.
Table 4.2 Forms of deictic space: mode of creation and temporal structure (Fricke, 2007, p. 272)
| Mode of creation | Temporal structure: successive | Temporal structure: simultaneous |
|---|---|---|
| Separated | Example (12) | Example (13) |
| Shared | Example (11) | Example (10) |
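The two dimensions of Table 4.2 can be combined in a short Python sketch. It assumes, purely for illustration, that each partner's gesture is annotated with a start and end time in seconds; the names `temporal_structure` and `gesture_space_type` as well as all timing values are hypothetical and do not come from the recordings discussed in this chapter.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) of a gesture, in seconds


def temporal_structure(a: Interval, b: Interval) -> str:
    """Return 'simultaneous' if the two gesture intervals overlap."""
    overlap = max(a[0], b[0]) < min(a[1], b[1])
    return "simultaneous" if overlap else "successive"


def gesture_space_type(shared: bool, a: Interval, b: Interval) -> str:
    """Combine mode of creation and temporal structure as in Table 4.2."""
    mode = "shared" if shared else "separated"
    return f"{mode} and {temporal_structure(a, b)}"


# Hypothetical annotations: the partners gesture in one common space and
# their gestures overlap in time.
print(gesture_space_type(True, (0.0, 2.0), (1.0, 3.0)))

# Hypothetical annotations: each partner uses her own space, one after the other.
print(gesture_space_type(False, (0.0, 1.5), (2.0, 3.0)))
```

Whether a space counts as shared or separated is taken as given here, since that judgment depends on whether the positions of gesturally located objects are maintained across speakers, which cannot be read off timing alone.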
6.1 Shared Spaces
6.1.1 Shared and Simultaneous
In example (10) (Figure 4.14), the subjects A and B are sitting in a room at the Technische Universität Berlin, where A is describing a route at Potsdamer Platz that B is to take in the future. The speaker and the addressee share the same gesture space, which stands as a complex sign for Potsdamer Platz that is not present in the utterance situation. B’s hands make the form of a T, which represents a crossing. Speaker A uses her right hand to point at B’s hands and thus to localize her as an “imaginary” addressee projected into the future. In this way, both communication partners solve the task of reconstructing the route together very calmly and cooperatively.
(10)
A: 1[nein du bist jetzt eigentlich= (.) du gehst hier die Straße entlang (.) dann bist du hier/ (..)] und (.) äh (.)]1 2[überquerst hier/ (.) die Straße/ (.) die Ampel (.) bist auf der andern Seite (..)]2 3[und hier überquerst du dann wieder\]3 “1[no you are now actually= (.) you go here along the street (.) then you are here/ (..)] and (.) er (.)]1 2[cross over here/ (.) the street/ (.) the traffic lights (.) you are on the other side (..)]2 3[and here you cross over again\]3”
6.1.2 Shared and Successive
In the route description given in example (11) (Figure 4.15), A and B are at Potsdamer Platz, where they create a gesture space in temporal succession collaboratively. B is describing a route that leads along the back of the Stella Musical Theater, where she has never actually been. Earlier on in the conversation, the theater building was localized at the center of the space between A and B, such that B was facing the theater entrance. B’s hand draws a line that represents the path behind the theater. This visualization of the path is gesturally maintained during the subsequent direction given by A, who – in addition – allocates a secondary local origo to B. This is indicated by both A’s gesture and by a slight turn of her upper torso toward B. There is no temporal overlap between the gestures produced by A and B, hence the gesture space they create is said to be shared and successive.
(11)
B: 1[also ich bin hinter dem Theater langgelaufen\ (..) “So I walked along behind the theater” A: 2[genau du bist hinter dem Theater]1 lang/ …]2 “Right you walked along behind the theater”
6.2 Separated Spaces
6.2.1 Separated and Successive
Analogous to shared spaces that are created successively, separated spaces with the same temporal organization are characterized by the fact that the positions of the gesturally located objects are not maintained, as illustrated by example (12). By using successive iconic gestures, A and B locate their own respective Infobox building in their own gesture space with their own secondary local origo.
(12)
B: also [(.) hier iss die Infobox (.)] “so here is the Infobox” A: [ja (..)] “right”
6.2.2 Separated and Simultaneous
Example (13) (Figure 4.17) shows a rather difficult communication situation in which the communication partners create their own separate gesture spaces simultaneously: B is localizing a stretch of water with her left hand and a particular building with her right hand. A disagrees with B, claiming that the spatial relation between the building and the water is the reverse of that suggested by B. Her voice is slightly raised. B reacts to A’s objection and shouts “I don’t understand this.” In contrast to the example of a simultaneously created shared space (Figure 4.14), speaker and addressee split the gesture space and create separate models of Potsdamer Platz with conflicting localizations of the same entities simultaneously.
(13)
B: 1[wenn HIER das Gewässer iss\ (.) {(.)} “1[if here is the stretch of water\ (.) {(.)}”
A: {hm} (.)
B: 2[und DA das Haus\ (.) “2[and there the building\ (.)”
A: nein 3[nein HIER iss das Gewässer 4[und DA iss das Haus\ (..) “no 3[no here is the stretch of water 4[and there is the building\ (..)”
B: das verSTEH ich nich\]4]3 (..)]2]1 “I don’t understand this\]4]3 (..)]2]1”
6.3 Indexicality and Metaphor: Deictic Spaces as Metaphorical Signs for “Doing Emotion” in Face-to-Face Interaction
When comparing examples of shared and separated spaces that are simultaneously created, we can observe that some occurrences of separated spaces correlate with emotional antagonism due to conflicting communicative intentions and interactional goals (see example [13]), whereas shared deictic spaces (see example [10]) correspond to emotional consensus and joint communicative goals. From a semiotic point of view, these kinds of spaces metaphorically stand for something else; they are interpreted as indexical signs of different emotional states. If we assume that feelings are not simply “expressed” by means of vocal utterances and gestures that are specific to individuals but are culturally shaped and collaboratively created, then the deictic opposition [+origo-inclusive] versus [–origo-inclusive] in verbal deictics, like here versus there, we versus you, now versus then, and the respective differences in gesture-space construction (we-space vs. I-and-you-space) may indicate changes of emotional states in face-to-face interaction and also provide multimodal means for “doing emotion” in the ongoing conversation (cf. Fricke, 2010; Jungbluth, 2011; for impolite uses of pointing gestures see Müller, 1996). Further investigations of collaborative space construction therefore offer a promising perspective in fields of application like psychotherapy, coaching, rhetoric, teaching, or arbitration. Figure 4.18 shows the spectrum of gesturally created spaces between maximum emotional consensus and maximum antagonism:

Figure 4.18 Forms of gesture space and their correlation to increasing emotional distance in face-to-face interaction
To conclude: Expanding the perspective on the deictic object and taking into account the processes of its interactive construction as well as the possibility of complex semiotic concatenation within semiosis establishes promising links to current research on multimodal metaphors and metonymies (see Sections 2 and 6).
7 Theoretical Implications and Conclusion: Why We Need a Semiotic Approach for Analyzing Deixis, Gesture, and Space
Modern deixis theory following Bühler (1934/1990) justifies the necessary combination of gestural pointing and verbal deictics with the argument that only in this way are speakers able to successfully refer to entities in particular utterance situations. Consequently, we need to consider a multimodal utterance as a whole because hand movements can potentially instantiate structures and functions of language and speech, for example, deictic functions.
Moreover, a multimodal approach to grammatical and pragmatic functions needs to combine at least the following two perspectives: first, the perspective of mode neutrality as a tertium comparationis for comparative analyses and a unified approach to different modes and, second, the perspective of media specificity that, for example, reflects on the particular potential of signs and their modalities for representing something (for an overview, see Fricke, 2021).
Therefore, the definition of linguistic categories, such as any kind of “space” in deixis theory, should not depend on any linguistic “substance” (cf. Hjelmslev, 1969). The required tertium comparationis can be provided by semiotics and, notably, by the Peircean concept of sign applied to the notion of space, as demonstrated in Section 3. Furthermore, with reference to the Peircean model of the triadic sign, we have distinguished between spatial deixis by means of gesture (representamen), space as an object of communication (object), and space as a concept (interpretant). If space does not instantiate any relatum of the triadic sign, then the concrete space indicated by the speaker/gesturer is interpreted as a non-sign.
Based on the specificity of gestures as a spatio-temporal medium, different forms of deictic space, such as sphere-like, map-like, or screen-like spaces can be created by speakers in collaboration with their addressees while turn-taking. Two main types of interactive gesture spaces can be observed: shared spaces and separated spaces. These kinds of spaces can be used metaphorically for “doing emotion” between maximum emotional antagonism and maximum emotional consensus in face-to-face interaction. Consequently, they are interpreted as indexical signs of different emotional states.
To conclude: A semiotic approach to deixis, gesture, and space not only allows for a tertium comparationis with respect to the modality of the deictic signs under investigation but also provides us with tools for representing implicit semiotic processes like complex sign concatenation (e.g. deixis at metonymies and metaphors) that have not yet been revealed by linguistic deixis theory.
1 Introduction
Given that faces in dialogue are lively, eloquent, efficient, infinitely varied, and ubiquitous, it is well past time to start investigating them in more detail.
Facial gestures constitute an integral component of language use in face-to-face dialogue (Bavelas & Chovil, 2000). Facial movements contribute syntactic, semantic, and pragmatic information to speakers’ utterances. The face is also a key communicative resource for listeners to provide feedback and comments.
Historically, scientific inquiry about facial movements has been directed away from language and situated within an emotion framework. The relationship between facial movements and emotion is a topic that is best described as … complicated. The study of facial expressions comes bundled with historical assumptions, controversies, unsubstantiated claims, production in nonsocial settings, and posed facial configurations. The renaissance of emotion expression research in the 1960s was fueled principally by interest in facial expressions for basic emotions (e.g. anger, fear, happiness, and sadness). The facial expressions were theorized as distinctive facial movement configurations that are universal and biologically “hard-wired” to internal emotion programs (Ekman, 1997). However, evidence shows there is no one-to-one relationship between emotion experiences and their respective prototypical facial expressions (Duran, Reisenzein, Fernandez-Dols, & Russell, 2017; Reisenzein, Studtmann, & Horstmann, 2013). There is also considerable variability in facial patterns for an emotion category and across different instances for an individual (Barrett, Adolphs, Marsella, Martinez & Pollak, 2019).
Emotion researchers have long acknowledged that emotion expressions constitute only a minority of facial movements (Reference Ekman, Friesen and OstwaldEkman & Friesen, 1977; Reference Fridlund and GilbertFridlund & Gilbert, 1985; Reference Matsumoto, Ekman, Fridlund and DorwickMatsumoto, Ekman, & Fridlund, 1991). Facial expressions are differentiated from facial movements that serve conversational functions. Reference Ekman and FriesenEkman and Friesen (1969) classified language-related facial behaviors as emblems, illustrators, and regulators. In a later publication, Reference Ekman, Aschoof, von Cranach, Foppa, Lepenies and PloogEkman (1979) subsumed these categories under the heading conversational facial signals.Footnote 4 Reference ChovilChovil (1989, Reference Chovil1991a) classified conversational facial displays as: syntactic, semantic, nonredundant or redundant with spoken content, and listener comments. More recently, Reference Bavelas, Gerwing, Healing, Seyfeddinipur and GullbergBavelas, Gerwing, and Healing (2014a) organized facial displays as gestures using Reference KendonKendon’s (2004) specifications for functions of hand gestures (referential, pragmatic, etc.).
Whereas facial gestures such as brow actions that serve syntactic functions (e.g. word/phrase emphasis) are general across language use, the gestures discussed in this chapter are more specialized. They are a selection intended to illustrate different ways that facial movements relate to language in face-to-face dialogue.
Referential emotion gestures are symbolic representations of affective concepts and mimetic enactments of facial expressions. Listener co-narrative gestures are displays of affective understanding of the narrator’s past affective experience that contribute to the story-telling. Disgust co-speech gestures track the evolution from rejection of bad smell/tastes to dislike of something objectionable and negation in spoken utterances. Facial shrug, thinking face, and iconic mouth gestures have similar counterparts in hand gestures. Shrugging facially can take on a variety of meanings in addition to the classic “I don’t know” message. The thinking face marks temporary engagement in cognitive activities. Iconic mouth gestures provide information about physical features of an object or event.
Examples are numbered and square brackets around bolded text are used to indicate the positioning of the facial gesture in the utterance. Facial actions are described below the marked area of spoken content. Information on positioning and facial actions was not available for some examples.
2 Referential Emotion Gestures
“And I was just like, [$#%!].”
Emotion events incurred outside of the immediate setting are common topics of everyday conversations. Studies suggest that affective experiences are frequently shared later with others (Rimé, 1995, 2009). Narrative accounts can include descriptions of one’s own and/or another individual’s reaction. Semantic emotion content can be represented by affective lexicon (e.g. “worried” and “infuriated”), figurative expressions (e.g. “scared to death” and “about to explode”), and terms that refer to expressive actions (e.g. glared and frowned). Language, however, is not always well suited for describing complex facial expressivity, and enactment can be an efficient means of representation. In a study of language used in emotion event descriptions, Fussell and Moss (1998) reported that:
at several places in our [tape-recorded] transcripts it was apparent that speakers resorted to bodily representations of characters’ facial expressions and postures. Often, they prefaced these nonverbal displays by saying something such as: “The best way for me to communicate this is to show you.”
The use of facial movements to symbolically represent emotion concepts in utterances was discussed by Ekman (1997, 2004), who proposed that referential emotion gestures were transformed versions of emotional expressions. Ekman reported that a common formation consisted of a single movement from an emotional facial configuration.
Take, for example, a person who says he had been afraid of what he would learn from a biopsy report and was so relieved when it turned out to be negative. When the word afraid is said, the person stretches back his lips horizontally, referring facially to fear.Footnote 5
Referential gestures can take on more complex forms and affective meanings other than what Ekman observed in his data. In example (1) (Chovil, 1991a, p. 180), the emotion referent and facial gesture fall outside the traditional domain of emotion research. The speaker is talking about her young son’s constant questions (a behavior many parents can relate to). The emotion term “exasperating” is accompanied by an eye roll gesture.
(1) “ … . sometimes I find them amusing, other times I find them [exasperating].”
[Raised brows, eyes widened and eyes rolled]
Emotion referents can be represented gesturally without accompanying lexical descriptions. Example (2) (Chovil, 1989, p. 66) is from a narrative in which the speaker was recounting a past minor conflict with her father, who asked her to do something when she was very busy. After describing their brief argument and eventual agreement to his request, she uses a stylized “anger” display to portray her feelings at the time.
In the next example (Waksler, 2001, p. 133), the speaker is describing her first time in-line skating. The facial gesture is used with an arm gesture in a classic “panic” display that depicts the experience in a dramatic (and entertaining) way.
(3) “I’m speeding down the hill, and I don’t know how to stop, and I’m all [Flails arms with terrified expression].”
Facial gestures are also used to represent reactions and behaviors of others. In example (4) (Waksler, 2001, p. 136), the speaker demonstrates the response of an acquaintanceFootnote 6. The facial gesture in example (5) (Chovil, 1991a, p. 182) is used to show how the person “looked down” at the speaker in a confrontational interaction.
(4) “And she was all like [Facial expression of boredom, rolling eyes].”
(5) “And the guy just sorta looked [pause] you know, sorta looked down at me.”
Stec, Huiskes, and Redeker’s (2016) study of multimodal direct speech quotations identified enactments of characters’ facial expressions as being a typical component. Indeed, facial gestures were found to be more frequent than manual gestures. In example (6), gestures are used in a reconstructed dialogue between the speaker (younger self) and her mother about a Halloween costume change.Footnote 7
(6) Younger self: “Mom I don’t want to be a clown.”
[Disappointed look]
Mother: “Well um it’s kind of last minute, all the Halloween stores are closed.”
[Frustrated look]
Examples (7) (Stec et al., 2016, p. 2) and (8) (Chovil, 1989, unpublished data) are somewhat distinctive in that the facial gestures depict a “collective” affect (“they” and “we”) of the event rather than the emotion of one particular character.
(7) “and I like took a nap to it upstairs and then my friends like the next thing I know
[they’re like hey hey we’re going to go downstairs now the show’s going to start].”
[Wide alert eyes]
(8) “The phone rings, (brief pause) my brother’s on the phone, ‘We’re in an accident.’
We’re going [exaggerated intake of breath] ‘Ohhh, my gawd!’”
Gestural representations of semantic emotion referents depict meanings specific to the speaker’s communicative goal. The gestures appear to be typically stylized displays and incorporate iconic expressive actions that would be recognizable to the listener. Various researchers have noted the use of quotatives “like” or “all” to mark forthcoming enactments in utterances (Fox & Robles, 2010; Streeck, 2002; Waksler, 2001).
Referential gestures lend themselves well to experimental paradigms used in hand gesture research. Visibility and copresence can be used to investigate communicative functions of the gestures. Retelling of cartoons can elucidate how characters’ facial expressions are gesturally demonstrated (e.g. features selected) and their relation to verbal descriptions (Bavelas, Gerwing, & Healing, 2014a). Movie stimuli of human characters undergoing emotional events enable a more in-depth analysis of referential emotion gestures. Although facial gestures may be more common, a hand gesture occurs in the example below. The gesture appears to depict “hand-wringing,” an activity symbolic of worrying. The action did not appear in the movie:
For example, in one of the movie narrations, the speaker described the protagonist’s troubled state of mind by saying that she’s, “sitting there worrying.” The utterance accompanied a gesture in which the speaker’s two hands rotated around one another alternatingly, stroke agitation expressive of some kind of processing metaphor.
Narrative accounts of a character’s emotional experience (see Fussell & Moss, 1998) can be used to investigate how facial gestures relate to different types of emotion language and when showing “may be the best way to communicate this.”
3 Listener Co-Narrative Facial Gestures
Listeners are active participants in narrative story-telling, and facial displays are part of their communicative repertoire (Bavelas, Coates, & Johnson, 2000). Information conveyed facially can influence or guide the narrator’s construction of the story. For example, facially displaying puzzlement or confusion may lead narrators to elaborate on a particular aspect. Listeners also participate by becoming “for the moment, co-narrators who illustrate or add to the story” (Bavelas et al., 2000, p. 944).
In example (9) (Stec et al., 2016, p. 538), a woman was telling her friend that, as a young child, she had admired the artist Picasso but had been unimpressed when she saw an exhibition of his paintings. She imagines that if Picasso were hearing their conversation “he’s probably like rolling in his grave.” The friend responds with an enactment of the imaginary offended Picasso:
(9) “[Darn kids.]”
[Scrunches face in a scowl]
The response provides the narrator with feedback that the friend understood the intended meaning of her statement. However, unlike other possible responses (e.g. smiling and chuckling), the Picasso enactment contributes to the narrative story itself.
The listener in example (10) (Chovil, 1991a, p. 189) displays a “pain expression” in response to the speaker’s description of falling in a skiing incident. In example (11) (Chovil, 1989, unpublished data), the listener responds with a display of “disgust” to the speaker’s experience of swallowing foul, bad-tasting water after overturning her kayak.
(10) Narrator: “ … and um I fell and I did like a double back flip.”
Listener: [“Ooooo”]
[Eyebrows draw together and down, eyes squinted, mouth rounded]
(11) Narrator: “ … and I just got a big gulp of the Gorge [a polluted river]”
[Eyes opened widely, opened mouth, and inhales a big breath of air]
Listener: [“Auugh”]
[Eyebrows lowered, eyes tightly closed]
Although these responses have been labeled as facial motor mimicry (Bavelas & Gerwing, 2007; Chovil, 1989, 1991a), it is important to note the listeners were not mimicking a facial expression of the narrator. In the first example, the narrator did not produce a facial expression. The second listener’s facial display does not mirror the narrator’s facial demonstration of swallowing water. The listeners’ responses display an affect that is appropriate and specific to the event described in the narrative.
In examples (12) and (13) (Chovil, 1989, unpublished data), the listener co-narrative gestures are produced as the narrator is describing the critical incident. The synchrony suggests that listeners anticipate what the narrator is about to say, but the precise timing with the key points is remarkable.
(12) “He drove, he drove too close and we had one front wheel off the cliff, [my side and the truck was going over the edge.]”
[Listener’s eyebrows raised, eyes widened]
(13) “ … grill of a van [and it crashed into the side] just the back panel of the car … ”
[Listener’s eyes squinted, mouth pulled back/slightly up, followed by eyebrows raised up]
In example (14) (Chovil, 1989, unpublished data), the narrator is recounting an incident when she was injured by her horse. The listener displays three co-narrative gestures; the first two gestures have a similar facial movement pattern and mark details in the lead-up that are significant to the actual injury event.
(14) “ … and I was brushing off her leg and I was leaning down, sitting down and I had my face like rig[ht by her foot right?]”
[Listener eyebrows raised, eyes squinted, mouth corners drawn back and down]
“ … and she stomped her foot and she [lifted it up] and she hit here in the eye.”
[Listener eyebrows raised, eyes squinted, mouth corners drawn back and down]
The third gesture occurs later in the story when the narrator tells about going to find help. As the narrator reenacts calling out to her mother, the listener demonstrates “looking” and adds a direct speech quotation.
Narrator: so I wandered around across the street to find my mom. [“Mom, mom”].
Listener: “Rusty’s kicking me.”
[Eyebrows raised, eyes up looking off to the right]
Evidence from experimental studies that facial reactions to seeing or hearing about another person’s experience are sensitive to visual availability and copresence supports the view that the facial displays function as communicative acts (Chovil, 1991b; Bavelas, Black, Lemery, & Mullett, 1986). In narrative contexts, co-narrative facial displays convey understanding of the emotional import of the event – information not expressed in the narrator’s utterance. In this respect, the listeners participate by adding affective content to the story (Bavelas & Chovil, 1997). That facial displays can co-occur with the narrator’s speech gives a new meaning to dialogue as a collaborative shared endeavor (Clark, 1996).
4 Facial Co-Speech Disgust Gestures: From Rejection of Bad Smells/Tastes to Grammatical Negation
“He is looking at something which smells bad.”
Facial displays of disgust are noted for involving facial actions linked to unpleasant sensory stimuli. The facial action most closely identified with disgust is nose wrinkling, which is typically associated with foul odors (Rozin, Lowery, & Ebert, 1994). Darwin linked disgust expression to rejection of bad tastes and described it as characterized by a mouth opened widely “as if to let an offensive morsel drop out” (Darwin [1872/1965], p. 258). Mouth actions are observed in infant responses to unpleasant-tasting (e.g. bitter) liquids (Rozin & Fallon, 1987). Another facial action, narrowed/closed eyes, is suggestive of rejecting offensive visual stimuli (Darwin [1872/1965]). Researchers have proposed that disgust facial expressions possibly originated from a primitive biological mechanism to protect against ingestion of potentially toxic or harmful foods (Darwin [1872/1965]; Rozin & Fallon, 1987; Rozin, Haidt, & McCauley, 1999). For the most part, the common view of disgust expressions has stayed around the primordial dinner table – as reactions to bad smells/tastes.
In language use, disgust co-speech gestures convey meanings drawn from a semantic rejection theme “dislike of something objectionable.” The gestures can indicate the speaker’s affective evaluation or mark a semantic target referent as objectionable. In examples (15) and (16) (Chovil, 1991a, pp. 180, 184), the speakers were identifying disliked foods for a dinner menu plan. In the first example, the gesture occurs with the speaker’s verbally stated dislike of a particular food. In example (16), the facial gesture is produced concurrently with spoken content that only names a food. The disgust gesture identifies it as an undesirable food.
(15) “ … [I hate, I hate desserts with alcohol in them].”
[One side of upper lip raised, eyes narrowed, and brows raised]
(16) “[Basic steamed white rice].”
[Nose wrinkle and eyes narrowed]
In the next example (Antas & Gembalczyk, 2017, p. 20), the speaker is commenting on a hypothetical proposition of not being a performer. The spoken content defines the idea as objectionable. This information is complemented by a facial gesture that conveys the speaker’s affective rejection of the imaginary prospect – from his mind so to speak.
(17) [“No life without acting.”]
[Wrinkles face and squints eyes]
Darwin (1872/1965) observed that disgust expression was frequently accompanied by a hand gesture “as if to push away or to guard oneself against the offensive object” (p. 257)Footnote 8. In example (18) (Kendon, 1988, p. 135), the speaker is talking about some young people she knew of. The utterance is completed with simultaneous actions described as a facial expression of “disgust” and “rapidly moving both hands forwards, splaying her fingers to the fullest.”
(18) “Their parents are professors, but the kids are [disgust facial/hand gestures].”
Lack of contextual information limits interpretation, but the issue is probably socially unacceptable behavior as opposed to being smelly or unclean. The composite facial/hand gestures depict the target referent (“the kids”) as objectionable and unwanted (see Harrison, this volume, on gestures of negation).
Disgust can be expressed through vocalizations or emotive interjections. In English, disgust interjections have various forms such as “Eugh” (or “eew”), “Ugh,” and the conventionalized “Yuck” (Goddard, 2014).Footnote 9 Wiggins (2014) observed in her research on dinnertime conversations that adults predominantly used “Eugh,” whereas preschool children employed “Yuck.”
In example (19) (Chovil, 1989, p. 80), the speaker is commenting on her conversation partner’s suggested food item (thick slices of liver) for the dinner menu plan. The facial gesture is produced concurrently with the word “yeah” and together they convey agreement that liver is a food she also dislikes (or does not find appealing). The disgust interjection that follows marks a further sense of aversion to the food referent.
The speaker in the next example is commenting jokingly to her partner about the “appetizing” nature of their dinner menu of disliked foods (Chovil, 1991a, p. 186). The disgust gesture functions pragmatically to frame the gustatory interjection “Mmmmm” as jesting (nonserious). Together the actions convey the message “What an unappealing meal.”
Ekman (1976) identified the nose wrinkle as an emotion emblem gesture. The use of the nose wrinkle as conveying dislike is shown in example (21) (Clark, 1996, p. 181):
(21) I walked into a sports store and asked whether they had Merco squash balls. The clerk said, “No, we have Dunlops.” I responded with [nose wrinkle], laughed, and said “No thanks,” and he laughed and said “OK.”
Disgust expressive actions are the only emotion-related actions (that the author is aware of) that have been linked to the evolution of language. Jespersen (1917) proposed the negation marking stem ne originated as “a primitive interjection of disgust, accompanied by the facial gesture of contracting the muscles of the nose” (p. 6). Studies suggest that the nose wrinkle functions as a grammatical negation marker in signed language use (Antzakas, 2006; Meir, 2004) and appears to have a similar function in spoken language. In example (22) (Bavelas & Chovil, 1997, p. 335), an asymmetrical nose wrinkleFootnote 10 co-occurs with the negation “do not.” (It is important to note that the speaker had previously stated that she liked the food in question.) The facial gesture in example (23) (Chovil, 1991a, p. 183) accompanies “that’s not.”
(22) “I [don’t] eat it very often.”
[Asymmetrical nose wrinkle]
(23) “[That’s not] really nutritious.”
[Nose wrinkle]
Disgust co-speech gestures exemplify Bavelas and Chovil’s (2000) proposal that facial gestures range on a continuum of abstraction. At the most concrete level, disgust rejection messages relate to experiences with actual sensory objects. Co-speech gestures do not convey “bad smell/taste” reactions but rather use them as a metaphor to mark negative evaluation affect or to define a target referent as objectionable. Further along the continuum is the use of the nose wrinkle as a symbolic gesture for dislike. The most abstract category on the continuum – grammatical negation markers – reflects the evolution of disgust rejection into the linguistic structure of language.
5 Facial Shrug Gestures
“¯\_(ツ)_/¯”
Facial shrug gestures are characterized principally by mouth actions and quickly performed brief eyebrow raises. Ekman (1985) defined the prototypic facial configuration as “raising the eyebrows, dropping the upper eyelid, and making a horseshoe-shaped mouth” (p. 102). The “horseshoe” or downward turning of mouth corners has been referred to as the mouth shrug, a component of the prototypical shrug ensemble which includes hand and shoulder movements (Debras, 2017; Jehoul, Brône, & Feyaerts, 2017; Morris, 1994). Lips stretched back and lower lip pushed out are other facial actions that characterize facial shrugs (Chovil, 1989).
Facial shrugs are cousins to open palm actions that function as hand shrugs. The two forms of gestures appear to share themes such as absence of knowledge, nothing more to say, association with words such as guess, and marking resignation (Cooperrider, Abner, & Goldin-Meadow, 2018; Chu, Meyer, Foulkes, & Kita, 2014). Ekman (1985) classified the facial shrug as a facial emblem with the conventionalized meaning “I don’t know.” Morris (1994) proposed the mouth shrug carried a disclaimer message (“I don’t know,” “It’s nothing to do with me,” “I don’t understand”). Debras (2017) suggested the mouth shrug may be specialized to lack of knowledge meanings. However, facial shrugs can also convey other messages, such as that something does not matter (something that is “good enough”), and can mark resigning or conceding a point (Bavelas et al., 2014a, 2014c; Chovil, 1989, 1991a).
The three examples below are variations of “I don’t know” that relate to an absence of knowledge. The facial shrug in example (24) (Debras, 2017, p. 19) marks the speaker’s inability to comment on the topic of conversation as she lacks specialized knowledge on the subject matter.
(24) “[Facial shrug] I’m not an expert.”
[Mouth shrug]
In example (25) (Chovil, 1989, p. 74), speaker (B) facially shrugs that he is unable to provide the requested information. In this instance, the knowledge is not immediately available to be produced on the spot.
(25) Speaker A: “Um, what else?”
Speaker B: [Facial shrug]
[Mouth stretched back and twisted to one side]
The facial shrug in example (26) (Chovil, 1989, p. 74) relates to no knowledge of something existing. In trying to identify a disliked soup, the speaker shrugs “I don’t know of any” at the end of her statement about liking all soups.
(26) “Soups, I like almost every soup [Facial shrug].”
[Eyebrows raised]
In example (27) (Chovil, 1991a, p. 186), the speaker was telling about a personal minor conflict experience. He ends the narrative stating it had occurred only a few days prior, followed by a facial shrug as if to say “That’s about it. There isn’t anything more I can say.”
(27) “That was only a couple of days ago but ah [Facial shrug].”
[Lower lip pushed out and eyebrows raised]
In example (28) (Bavelas et al., 2014c, p. 20), the speaker is describing the ending of a scene from the movie Shrek 2. The facial shrug precedes “I guess … they take him” and was interpreted as indicating that the speaker was unsure whether her account coincided with the actual ending but that it was good enough.
(28) “and then, [Facial shrug] I guess … they take him.”
[One corner of mouth stretched back and eyebrows raised]
Shrugging that something could be possible is illustrated in example (29) (Chovil, 1989, p. 73). In a dinner menu planning task, one of the participants states he does not like cantaloupe. His conversation partner responds with a facial shrug as if to say “I suppose so, I find them okay though.”
(29) (First speaker) “ … I don’t like cantaloupe.”
(Second speaker) [Facial shrug]
[Eyebrows raised, right side slightly higher, eyes slightly upward/to the right]
In example (30) (Chovil, 1989, unpublished data), the speaker is relating how her husband’s recovery from being injured in a car accident has been a difficult time in their lives. The facial shrug functions to mark conceding that “at times it’s not too bad.”
(30) “Yeah it was rough but [Facial shrug] at times it’s not too bad.”
Perhaps the most interesting aspect of facial shrugs is that the mouth formations preclude physical articulation of speech (“I cannot speak”) and symbolically represent “I cannot say” meanings (Debras, 2017). What speakers cannot say is “anything at all, anything more, without possibility, with complete certainty or absoluteness.”
6 The Thinking Face
I noticed a young lady earnestly trying to recollect a painter’s name, and she first looked to one corner of the ceiling and then to the opposite corner, arching the one eyebrow on that side; although, of course, there was nothing to be seen there.
Darwin is describing a facial gesture that was coined by Goodwin and Goodwin (1986) as the “thinking face.” Thinking face gestures have been typically described as consisting of withdrawal of gaze, upward turning of the head, and looking off to one side. The gestures can also include lowering/raising eyebrows, closing eyes, or twisting the mouth to one side (Chovil, 1989, 1991a).
The thinking face is exactly what its name suggests – it signals the speaker is mentally engaged … for the moment. Thinking faces mark pauses where the speaker is thinking about what to say, recalling something from memory, or searching for words (Bavelas, Gerwing, & Healing, 2014b; Chovil, 1989, 1991a)Footnote 11. The gestures may be accompanied by verbal collateral markers (e.g. “um” or “uh”) that can also function to indicate a short delay in speaking (Clark & Fox Tree, 2002). Bavelas and Chovil (2018) found thinking face gestures predominantly occurred early in the narrative when the speaker was organizing the information.
In example (31) (Chovil, 1991a, p. 183), the facial gesture marks a pause at which the speaker is attempting to recollect her most recent minor conflict incident:
(31) “ … the last disagreement I had was um [ ] with my mother actually.”
[Raises brows and looks up]
The thinking face gesture in example (32) (Chovil, 1989, unpublished data) contributes information to the spoken content that the speaker is engaged in thinking – in this instance, about a disliked food to suggest for the main course in the dinner menu plan.
(32) “[Main course], um”
[Eyes looking up and to the left]
The gestures share a common function with word search hand gestures. The “cyclic” hand gesture (Ladewig, 2014) or hand rotating gesture marks that the speaker has not finished talking and provides an explanation for the temporary break. The coproduction of the thinking face and rotating hand gesture is shown in the example presented below. The speaker is narrating a story about a close call incident when he was tree-planting one summer (see Bavelas & Chovil, 2018, for full transcript). The coproduced gestures occur when the speaker is trying to describe the problem a particular river-crossing posed for the tree-planting team:
(33) So like it’s – we’re going, [ahhh, see there’s sort of a – (pause)] area of rapids and um (pause), like, [uh, well, words-words-words-words.] area like a, like a rapid river you know like its all high. {Listener interrupts}
As the speaker starts to describe the situation, he stops and performs a quick hand rotation in combination with tilting his head up slightly and looking off to one side. He then turns his head back to the listener as he resumes speaking. In the second cogesture occurrence, the speaker tilts his head up, looks upward to one side in combination with a rotating hand gesture performed in rhythm with saying “words-words-words-words” after which he returns his gaze to the listener. The speaker returns to describing the river at which point the listener interjects “Yeah, okay. It’s dangerous.” The speaker responds with “Yeah, okay you got the danger part down” and continues with the next part of the story. Goodwin and Goodwin (1986) showed that listeners not only recognize that the speaker is having difficulties but can also be a coparticipant. In this example, the listener “helped” by informing the narrator that a detailed description was not necessary in the narrative account.
7 Iconic Mouth Gestures in Signed Language
Facial actions have a key role in sign language use (see Wilcox, this volume), even more so than in spoken language with respect to grammatical or syntactic functions (Dachkovsky & Sandler, 2009; Liddell, 1980). Narrators also use referential emotion gestures in signed story-telling (Emmorey, 1999; Sutton-Spence & Napoli, 2010).Footnote 12 Sandler (2003, 2009, 2018) identified mouth gestures that function similarly to iconic hand gestures. In an analysis of the retelling of the cartoon “Canary Row,” Sandler (2009) showed how iconic mouth gestures provided information about dimension, shape, and motion. For example, description of Sylvester the Cat climbing up inside the drainpipe was complemented with a mouth gesture that provided information about the tight fit and narrowness of the drainpipe. Sandler (2003) provided some additional examples of iconic mouth gestures (examples [34], [35], [36]):
(34) Draining of water through a small opening (p. 400)
“He emptied the water out of the pool.”
(35) State of being filled to overflowing (p. 400)
“He loaded the wagon with grass.”
(36) Heavy (p. 401)
“carried a suitcase”
[Puffed cheeks]
Iconic mouth gestures utilize movements of the mouth and lips and changes in the cheek area to depict physical features and actions of the event being described. Their meanings are context-dependent and provide information not included in the signed semantic content.
8 Summary
We cannot afford to exclude anything until we are informed of what it is that we are excluding.
Facial gestures intersect with language in varied ways and in their techniques for representing meanings. Gestures are tightly interwoven and interdependent with speech. Meanings conveyed by gestures are specific to the context and contribute to the overall intended message. A distinctive feature of referential emotion gestures is displacement – speakers are referring to affective events experienced in the past or by another individual. Semantic emotion referents are represented by stylized facial portrayals and enactments of facial behaviors. The gestures provide a visual alternative or complement to lexical descriptions for conveying affective concepts and expressive actions. Listener co-narrative gestures convey empathic understanding with well-formed displays that “show how you felt.” Their placement and timing, synchronized with the narrator’s provision of descriptive information, add affective aspects to narrative stories. Disgust co-speech gestures are closely coordinated with accompanying speech and function to mark a semantic referent as objectionable or to convey subjective dislike of the referent. Rejection meanings are conveyed by facial actions associated with unpleasant sensory experiences. Facial shrugs utilize closed mouth/lip formations and brief eyebrow raises to convey a range of messages such as insufficient knowledge, indeterminacy, and marking qualifications. The thinking face gesture utilizes the act of “visually searching” to symbolically represent engagement in “cognitive searching” activities. Iconic mouth gestures demonstrate the adaptability of the face to represent physical properties of objects and events being described linguistically by the hands.
The use of facial movements as intentional communication appears to emerge within the same age range as intentional gestural pointing and vocalizations – about 8 to 10 months. Among the first to appear are visually directing negative affect displays to a specific receiver and voluntary smiles (Campos, Campos, & Barrett, 1989; Jones, Collins, & Hong, 1991; Jones & Hong, 2001). The ability to control and use facial movements as socially directed messages marks the point at which they leave their biological cradle and become an integrated component of language use. Facial gestures are meaningful symbolic acts and a valued space should be reserved for “gestures above the neck.”






