1 Introduction
Proposals that the semiotic system of gesture played a pivotal role in the evolution of language have been, and continue to be, influential (Żywiczyński, 2018).* This statement, however, illustrates not so much a specific theory as an axis of debate in the field of language origins, along which “gesture-first” proposals traditionally compete with “speech-first” theories (Fitch, 2010). Below this general characterization, there are many differences between gestural origin theories, including the understanding of basic notions such as the concept of gesture itself. Thus, we begin this chapter by discussing definitional issues concerning gesture, with differences both within and across fields. In the context of language origins, such differences in defining “gesture” have profound consequences for formulating and evaluating theories. Since our aim is to survey a considerable number of contemporary gestural theories of the origin of language, we do so by using a new typology, developed with the help of cognitive semiotics (Sonesson, 2007; Zlatev, 2015a), and in particular the notions of semiotic system and polysemiotic communication (Stampoulidis, Bolognesi, & Zlatev, 2019; Zlatev, 2019; Zlatev et al., 2023). In very brief terms, a semiotic system is a combination of signs or signals of a particular type, defined by characteristic properties, and the interrelations between these signs/signals. Universal human sign systems are language, gesture and depiction (the latter understood as forming marks on a two-dimensional surface that resemble three-dimensional or imaginary referents).
Signal systems, for example spontaneous facial expressions and non-linguistic vocalizations, are under less voluntary control than sign systems (Zlatev, Żywiczyński, & Wacewicz, 2020). Combinations of different sign and/or signal systems form the basis for polysemiotic communication or polysemiosis.
Applying this conceptual apparatus to gestural theories of language origins, the basic distinction relates to the question of whether the semiotic system of gesture played an exclusive role in early stages of language evolution or whether other semiotic systems were involved as well. A positive answer to the first question implies monosemiotic theories, which we review in Section 3; a positive answer to the second implies polysemiotic theories, to which we turn in Sections 4 and 5. The latter are more commonly known as “multimodal theories,” but we avoid this term due to its excessive ambiguity. Importantly, we distinguish between two kinds of polysemiotic theories of language evolution: (a) equipollent, where language and gesture are considered equally prominent from the onset, and (b) pantomimic, where gesture played the main but not exclusive role in breaking from predominantly signal-based to sign-based communication.
After reviewing the evidence for each of the three kinds of gestural origin theories, we conclude that the last kind, namely pantomimic theories, appears to offer the most viable account of language origins. Further, it has the benefit of accounting for the evolution of polysemiotic communication as a whole.
2 Different Ways of Understanding Gesture
The work of Adam Kendon is hugely influential within gesture studies, and so is his characterization of gesture as bodily “movements that partake of […] features of manifest deliberate expressiveness to an obvious degree” (Kendon, 2004, p. 14). “Expressiveness” implies that the communicator intends something by a gesture; “deliberate,” that this is done with a communicative intent; and “manifest,” that these features are to be discernible by an audience. As the notion of deliberateness is controversial, and is denied, for example, by McNeill (2012), gestures may be more generally defined as “expressive movements performed by the hands, the head, or any other part of the body, and perceived [predominantly] visually” (Zlatev, 2015b, p. 458, emphasis in original). How are gestures to be distinguished from other visually perceived communicative movements, such as adaptors and facial expressions (e.g. Żywiczyński, Wacewicz, & Orzechowski, 2017)? An intuitive proposal for delineating this “lower boundary” of gesture is presented by Andrén (2010), who argued that two dimensions of gestural meaning need to be distinguished: communicative explicitness (CE) and representational complexity (RC). Within each, there are three different levels, and at least one of the two dimensions should be on the third level for a bodily act to count as a gesture. For example, the bye-bye wave does not represent a goodbye but performs it, so it lacks the highest level of RC. Yet, it is typically produced with the communicative intent to be understood as bidding farewell, and thus has the highest level of CE. On the other hand, an act of symbolic play performed in solitude would also qualify as a gesture, since by definition it is on the highest level of RC (if it is to be “symbolic”), even though it completely lacks a communicative intent. Most iconic (i.e. resemblance-based) gestures used in face-to-face communication would have the highest levels of both CE and RC.
Other researchers of human gestural communication adopt a narrower definition of gestures. Kendon’s approach encompasses so-called orofacial gestures: communicative movements of the facial muscles and the tongue other than the articulatory movements of speech. These would not be regarded as gestural by many (e.g. Orzechowski, Wacewicz, & Żywiczyński, 2014), though see Chovil (this volume). McNeill (1992, 2012) would further limit (prototypical) gestures to spontaneous and idiosyncratic hand and arm movements that are functionally integrated with speech: a highly restrictive definition, grounded in his theory of the nature of gesture (Section 4.2).
On the other hand, within primatology, gestures are typically understood even more broadly than they are by Kendon, as concerning any communicative behaviors that involve body posture, facial expressions, and manual movements, and are mainly perceived visually (e.g. Pika, 2008a; Tomasello, 2008). Voluntary control (Byrne et al., 2017) and the related properties of flexibility (Bard, Maguire-Herring, Tomonaga, & Matsuzawa, 2019) and plasticity (Pollick & de Waal, 2007) are commonly used to differentiate gestures from other behaviors. For example, to initiate grooming, infant chimpanzees indicate the place where they want to be groomed, first by looking at this spot and then by touching it (Bard et al., 2019). This example also illustrates the receiver-directedness of ape gestures, further underlined by persistence: “[I]f the recipient’s response is not satisfactory, […] the signaller […] repeat[s] the produced signal” (Fröhlich, Sievers, Townsend, Gruber, & van Schaik, 2019; cf. Leavens, Hopkins, & Thomas, 2004; Tomasello, George, Kruger, Jeffrey, & Evans, 1985), and by elaboration: If the recipient fails to respond in a satisfactory way, a gesture different from the original one may be used.
These findings testify to significant cognitive and semiotic complexity in ape gestures, possibly differentiating them from ape vocalizations. At the same time, this evidence does not amount to full-fledged communicative intent, which implies not just “intention” in the sense of volition, but a Gricean (second-order) intention that the addressee recognize the primary intention (Zlatev et al., 2013), and it is not clear if apes can either produce or recognize gestures with such intentions. Further, unlike human gestures with the highest form of representational complexity (see above), ape gestures hardly refer to an object that is distinct from the communicator or the addressee; that is, they are “dyadic” (me–you) rather than “triadic” (me–you–referent) (Hurford, 2007), which prevents them from being full-fledged signs (Zlatev et al., 2020).
As pointed out in the introduction to this chapter, these definitional differences imply that any comparison and discussion of gestural origins theories needs to proceed with care, and preferably with the help of a uniform conceptual apparatus, such as the one we propose below. In each of the following three sections, we review and evaluate one of the three basic types of gestural theories: monosemiotic, polysemiotic-equipollent, and polysemiotic-pantomimic.
3 Monosemiotic Gestural Theories
3.1 General Features
Monosemiotic gestural theories claim that early stages of language evolution depended exclusively on gesture. They share the postulate of monosemiosis with vocal, or “speech-first,” theories, and face complementary difficulties (Section 3.7). The rise of modern language-evolution research in the latter part of the twentieth century was marked by the formulation of gestural hypotheses of language origin, starting with the work of Gordon Hewes and colleagues (Hewes et al., 1973). This both contributed to the methodology of language-evolution research, such as the use of converging evidence (Żywiczyński & Wacewicz, 2019, pp. 122–124), and defined dominant positions in the field. In the following subsections we review a number of such monosemiotic gestural theories, before proceeding with evaluation.
3.2 Hewes’ Gestural Primacy Hypothesis
Hewes put forward a theory of a gestural protolanguage, termed the Gestural Primacy Hypothesis (Hewes et al., 1973), and suggested how this protolanguage transitioned into speech (Hewes, 1977a). Hewes’ conception of protolanguage can be described as synthetic (as opposed to holistic) (Żywiczyński & Wacewicz, 2019, p. 187), as it takes protolanguage to have consisted of gestures as quasi-lexical units standing for objects and actions that could be combined into sequences while, overall, lacking syntactic and morphological structure. Some of Hewes’ arguments focus on the observation that in naturally occurring conversation, gestures usually accompany speech; but the bulk of the evidence he marshaled in support of his theory can be summed up as follows:
Anthropological data. Hewes analyzed logs of European travelers, who were apparently able to communicate with indigenous peoples by means of gestures, even about highly complex topics such as topography, dangers that awaited newcomers, politics, or religion.
Comparative data. After multiple failures to teach non-human apes elements of spoken language (Furness, 1916; Hayes & Hayes, 1952; Kellogg & Kellogg, 1933), there followed several successful attempts to teach them visually perceived forms of communication, including elements of American Sign Language (ASL) (Gardner & Gardner, 1969, 1971; Premack, 1970; Premack & Premack, 1974). Based on these findings, Hewes argued that there is continuity between ape and human gestural behaviors, and discontinuity in vocal behaviors (Hewes, 1977a, 1977b; Hewes et al., 1973).
Neurocognitive data. Hewes appealed to evidence from neuropathology that indicated the relative immunity of gestural communication in language-related disorders (e.g. Hewes, 1977a, pp. 132–133), and to research on handedness and lateralization, where he drew attention to the fact that right-hand dominance for manual actions coincides with left-hemisphere dominance for language processing and production (cf. Knecht et al., 2000).
Signed languages. Hewes claimed that sign(ed) languages are universal in the sense that they can appear from scratch thanks to a high degree of iconicity (Hewes, 1977a), which has since been confirmed by studies of emerging signed languages (Meir, Sandler, Padden, & Aronoff, 2010a; Senghas & Coppola, 2001; Senghas, Kita, & Özyürek, 2004).
The combined force of this evidence led Hewes to the conclusion that gestural protolanguage had constituted the first form of hominin communication on the path to modern language. In fact, he was the first to use the term “protolanguage” in the technical sense of a transitional system between, on the one hand, the signal-based communication of apes and, on the other, human language. The visionary character of Hewes’ project is also testified to by the fact that it relied on areas of research that have served as sources of evidence in language-evolution debates ever since.
When evaluating the Gestural Primacy Hypothesis, it should be noted that although Hewes did not expressly commit himself to a narrow understanding of gesture as the communicative action of the hands and arms (see Section 2), some of his key arguments indicate that he envisaged protolanguage as relying primarily on manual gesture. Such is the import of his discussion of the relation between handedness and language, and the way he used signed languages and comparative studies likewise favors a narrow definition of gesture. For example, the manual character of protolinguistic communication directly motivates Hewes’ explanation of the so-called volar depigmentation of the inner part of the hand in non-Caucasian populations: that it may be an adaptation for gestural communication, as it increases the visibility of the hands in the dark (Hewes, 1996). Apparently, he ignored the fact that volar depigmentation also affects the sole of the foot.
3.3 Stokoe and Research in Signed Languages
The second part of the twentieth century saw the beginning of modern research on signed languages, founded on the postulate that they are not qualitatively different from spoken languages (Emmorey, 2002; Stokoe, 1960; Stokoe, Casterline, & Croneberg, 1965). The pioneers of signed-language linguistics had a keen interest in language origins. For example, early works emphasize that gesture has a greater iconic potential than vocalization, which makes gesture a better candidate for a communicative system based on signs rather than signals (Zlatev et al., 2020) and hence a likely starting point for the evolution of language (Stokoe, 1960). Using insights from the emergence of signed languages, Stokoe argued that the spatial character of gestures could have facilitated the emergence of rudimentary syntax, as gestures are able to represent not only an action but also the agent who performs it and the patient affected by it (Armstrong, Stokoe, & Wilcox, 1995; Stokoe, 1991). It was proposed that nouns were derived from the shape and position of the hands and arms, verbs from their actions, and that, collectively, they gave rise to prototypical sentences (Armstrong & Wilcox, 2007).
Hence, Stokoe and his collaborators proposed theories consisting of the following evolutionary stages: (a) gestural protolanguage with holistic and iconically motivated signs, (b) gestural language with discrete but iconically motivated signs and combinatorial syntax, (c) the transition into speech, which promoted conventionalization and growth of syntactic complexity.
3.4 Corballis’ Manual Protolanguage
Corballis’ position on the gestural origin of language is somewhat ambivalent. On the one hand, he understands gesture broadly, as comprising a heterogeneous variety of forms of bodily action: from spontaneous hand movements accompanying speech, to glances, postures, and even orofacial gestures of the mouth area (Corballis, 2002, 2003). On the other hand, most of his arguments for the gestural origin of language focus on manual gesture. In presenting them, Corballis organizes and upgrades many ideas put forward by Hewes and Stokoe. For example, he reviews the evidence from attempts to teach non-human apes manual-visual and vocal forms of communication, but also points to the fact that primates in general, and apes in particular, acquire manual skills with relative ease. These include not only communicative but also praxic skills (mainly related to tool use), which contrasts with the difficulty with which they acquire vocal skills (Corballis, 2012). He also brings up the problem of the neural infrastructure of language, focusing on the nature of the primate mirror neuron system and its homology with language circuits in the human brain (Corballis, 2003). In his view, this can explain both the relative success in teaching apes to communicate gesturally rather than vocally and the correlation between handedness and cerebral asymmetry for language (Corballis, 2012). On this basis, he argues that “manual gesture [is] a natural communication medium” (Corballis, 2013, p. 203) for primates and, hence, that the evolution of language must have begun with some form of manual protolanguage.
In relation to modern human communicative capacity, he uses the standard lines of argumentation for the gestural origin of language, referring to (a) the use of the hands as the most natural way to represent events in space and time in the absence of a shared code, and (b) the ready invention of sophisticated signed languages by the deaf (Corballis, 2013). Taking stock of these two points as well as the postulate about the continuity of ape and human gestural behaviors, Corballis posits a scenario according to which language began with an internal capacity to engage in so-called mental time travel – “the mental reconstruction of personal events from the past (episodic memory) and the mental construction of possible events in the future” (Suddendorf & Corballis, 1997, p. 133). For him, the adaptive pressure for the evolution of this ability came with the uncertain and dangerous ecology of the Pleistocene era, which required long-term planning and a suite of other skills (Corballis, 2019). The type of gesture hominins inherited from the Last Common Ancestor with apes was naturally adept at communicating sequences of past and future events. From this, Corballis envisages the gradual evolution of a communicative system, in analogy to the historical emergence of signed languages: It first relied on pantomime – understood as holistic iconic gesture – and later developed conventional signs as well as syntax (Corballis, 2019). In this regard, his scenario resembles the trajectory of language evolution drawn by Stokoe and colleagues.
3.5 Arbib’s Mirror Neuron Hypothesis
The Mirror Neuron Hypothesis of Arbib (2005, 2012, 2016) is one of the most elaborate and empirically best-documented current theories of language evolution. As its name implies, Arbib envisages the mirror neuron system, and in particular the subsystem involved in grasping, as the basis for his evolutionary scenario. From this basis evolved capacities for complex action recognition and complex imitation, which allowed for the imitation of aspects of observed movements even if they are not part of the imitator’s current stock of actions, thus introducing new variants of actions into the “praxicon,” an individual’s repertoire of actions. When coupled with communicative intentions (according to Arbib, 2016, already present in non-human apes), this developed into the communicative system of pantomime, which Arbib understands as holistic, impromptu gesture that “allows the transfer of a wide range of action behaviors to communication about action and much more – whereby, for example, an absent object is indicated by outlining its shape or miming its use” (Arbib, 2012, p. 177). The impromptu character of pantomime resulted in its low replicability, as each pantomimic sign had to be invented and interpreted anew. The pressure for communicative effectiveness brought about the conventionalization and segmentation of pantomime, ultimately leading to the emergence of gestural “protosign.” In this way, holistic pantomime was transformed into a synthetic gestural protolanguage, as in the theories reviewed above.
Unlike conceptions of pantomime derived from mimesis theory (see Section 5), Arbib’s theory is consistently monosemiotic: The early stages of language evolution are limited to the semiotic system of gesture, first as holistic pantomime and then as conventionalized gestural protosign. When describing gestural protolanguage, Arbib is not as emphatic as Hewes, Stokoe, or Corballis about the importance of manual gesture at early stages of language emergence. However, the selection of the starting point of language evolution – the Mirror Neuron System for grasping and manual praxic actions attributed to our Last Common Ancestor with monkeys (LCA-m) – and his account of the transition of protolanguage into speech both imply a key role for manual gesture.
3.6 Tomasello’s Pointing and Pantomime
Tomasello (2008, 2009) refrains from articulating a detailed scenario of the evolution of language. The significance of his work for language origins lies in rich empirical evidence of a developmental and comparative nature, which focuses on the emergence of pro-sociality and shared intentionality as prerequisites for language. In terms of communication, uniquely human forms of social cognition are realized, according to Tomasello, in two kinds of gesture: pointing and pantomime. The first is manifest in informative-declarative pointing: pointing performed with the intention of providing the recipient with new information. “Pantomiming,” the term Tomasello prefers to “pantomime,” comprises iconic manual or whole-body gestures, which are used “(i) to indicate that this is the action I want you to perform, or that I intend to perform myself, or that I want to tell you about; and (ii) to request or otherwise indicate an object that ‘does this’ or an object that ‘one does this with’” (Tomasello, 2008, p. 67). Pantomiming is thus capable of expressing an open-ended range of meanings, which are action-orientated and displaced from the here-and-now. Such gestures do not have internal morphological structures analyzable into discrete component parts. Similarly, pantomimes themselves are not replacements for words but correspond to larger units that are at least proposition-size. Words can only complete communicative acts by being combined with other words, but pantomiming, in Tomasello’s account, can serve as a complete communicative act with its own illocutionary force.
3.7 Evaluation
Monosemiotic theories tend to focus on three lines of argumentation and corresponding evidence: (a) the gestural and vocal communication of non-human apes, (b) the expressive potential of gesture in contemporary interpersonal communication, and (c) neural links, such as those between hand and mouth. Let us consider each of these in turn.
As pointed out in Section 2, ape gestures are generally understood to be flexible, learned, and volitionally produced, unlike ape vocalizations, which are often taken to be instinctive, species-specific, and to involve little or no learning. Hence, supporters of gestural theories conclude that there is continuity between ape gesture and human communicative behaviors, and discontinuity between ape vocalizations and human speech. This argument has been backed up by ethological research documenting both the flexibility of ape gestures and the largely inflexible character of ape vocalizations, such as chimpanzee food cries (e.g. Corballis, 2002; Deacon, 1997; Hewes et al., 1973; Scherer, Johnstone, & Klasmeyer, 2003; but see Fröhlich et al., 2019). However, more recent ethological data have complicated this picture, as many primate calls demonstrate “audience effects,” whereby the intensity and rate of calling is regulated by situational context. For example, the manner of producing alarm calls depends on whether somebody is present or not, including the presence of specific recipients (Crockford, Wittig, Mundry, & Zuberbühler, 2012; Crockford, Wittig, & Zuberbühler, 2017). Food calls differ when the quantity of food is large or small (Brosnan & de Waal, 2002), and are “associated with audience checking, gaze alternation and goal persistence” (Fröhlich et al., 2019, p. 7). Primate vocalizations have also been shown to involve “functional reference,” productivity (though devoid of compositionality), and tactical deception (Slocombe, 2011).
Finally, there is growing evidence that naturally living apes combine gestures with vocalizations, touch, and other haptic behaviors (Fröhlich et al., 2019).
However, most researchers still accept that there is a qualitative difference between ape gestural and vocal behaviors, although it may not be as categorical as once assumed (Bard, Bakeman, Boysen, & Leavens, 2014; Hobaiter & Byrne, 2014; Pika, 2008b). The new research does not disqualify monosemiotic gestural theories, but it lends stronger support to polysemiotic approaches, which posit that vocalization may have played a non-negligible role in the evolutionary emergence of language.
Arguments about the expressive potential of gesture concentrate on differences between human gestures and the gestural behaviors of non-human apes, the most important being the ability of human gestures to denote objects, actions, and relations between them. This bears directly on the triadic nature of human gestures, in contrast to the dyadic design of ape gestures (see Section 2). There are two main categories of gestures that demonstrate human-specificity in this regard: pointing and iconic gestures. A number of proponents of gestural theories have considered the emergence of pointing an important stepping-stone toward language. Tomasello’s extensive work on pointing demonstrates that it represents an important watershed in the evolution of human cognition and communication: Non-human primates do not point to distal entities in their environments (Tomasello, 2000, p. 358), while human infants as early as the twelfth month of life perform spontaneous, informative pointing aimed at sharing attention with another person (Liszkowski, Carpenter, Henning, Striano, & Tomasello, 2004). Tomasello holds that this difference results from the lack of cooperative motivations in non-human apes, but of equal importance is the semiotic quality of pointing: Like other human gestures, pointing is triadic in the sense that the communicator intends to bring the attention of the addressee to a relevant object, and intends for the addressee to recognize this, rather than just to look in a given direction (Zlatev, 2008).
The other watershed in the evolution of language highlighted in gestural theories is that of iconic gestures. While there is very little evidence that iconic gesture is present in the gestural repertoires of non-human apes (Tanner & Perlman, 2017), it appears in different forms of human gestural communication, such as co-speech gestures (McNeill, 1992), the pantomiming of events (Zlatev, Wacewicz, Żywiczyński, & van de Weijer, 2017), and even signed languages (Klima & Bellugi, 1979). Arguments in gestural theories that appeal to the iconic-expressive potential of gesture have been corroborated by experimental-semiotic research, which investigates how people develop novel communication systems in the absence of a shared code (Galantucci, 2009). Studies comparing spontaneous gestures and vocalizations conclusively show that gesture is a much better basis for developing a sign-based communication system than vocalization (e.g. Fay, Arbib, & Garrod, 2013; Fay, Lister, Ellison, & Goldin-Meadow, 2014). Interestingly, such studies show that the combined use of spontaneous gesture and non-linguistic vocalization is not significantly better than the use of gesture alone (Zlatev et al., 2017). Although there are many studies showing that vocalization does have iconic potential (e.g. Ahlner & Zlatev, 2010; Blasi, Wichmann, Hammarström, Stadler, & Christiansen, 2016; Imai & Kita, 2014; Lockwood & Dingemanse, 2015; Perlman, 2017; Perlman & Cain, 2014), the conclusion that iconic gesture is superior to vocalization for communication based on signs, in the semiotic sense of the term, is now well established in the literature (cf. Żywiczyński, 2020). This conclusion has gained further support from research on emerging signed languages (Lepic, Börstell, Belsitzman, & Sandler, 2016; Meir et al., 2010a; Meir, Aronoff, Sandler, & Padden, 2010b; Sandler, Meir, Padden, & Aronoff, 2005; Stamp & Sandler, 2016). Such findings lend support to gestural origin theories in general, and are also consistent with the polysemiotic accounts discussed in Sections 4 and 5.
A third kind of evidence used in monosemiotic gestural theories derives from neuroscience. The most elaborate account of this kind is Arbib’s Mirror Neuron Hypothesis. To account for the transition from “protosign” to “protospeech,” Arbib proposes a mechanism of collateralization, whereby the activity of the area responsible for manual production gradually spilled over into the neighboring areas responsible for vocalization. In this way, the brain came to support protospeech through the invasion of the vocal apparatus by collaterals from the protosign system (Rizzolatti & Arbib, 1998; cf. Arbib, 2005, 2006). This collateralization hypothesis is supposed to explain not only the substantial degree of coupling between gestural and orofacial behaviors, but also the segregation between them, manifested in dissociations between limb apraxia, speech apraxia, and aphasia (Arbib, 2006; Vingerhoets et al., 2013). Arbib’s account is interesting, but it only suggests a neural mechanism that could have brought about the transition, while failing to identify any evolutionary pressure responsible for it (Fitch, 2010).
The issue of hand–mouth neural links is one of the standard arguments used in gestural hypotheses. It rests on the assumption, partly corroborated by neurocognitive research, that hand and orofacial movements are governed by the same, phylogenetically old brain circuits (Reference Żywiczyński and WacewiczŻywiczyński & Wacewicz, 2019). Such links appear to be rooted in mouth feeding behaviors (Reference Gentilucci and CorballisGentilucci & Corballis, 2006), and can explain motoric phenomena like differences in mouth opening depending on whether we hold a small or large object when speaking (Reference Gentilucci and CorballisGentilucci & Corballis, 2006) or the activity of articulators resulting from specific manual movements (Reference Higginbotham, Isaak and DomingueHigginbotham, Isaak, & Domingue, 2008). All this evidence can be seen as supporting gestural theories, but it is by no means limited to the monosemiotic type, and arguably provides even stronger support for the polysemiotic theories discussed in the following sections.
An inevitable feature of monosemiotic gestural theories is “the transition problem” (cf. Reference Żywiczyński and WacewiczŻywiczyński & Wacewicz, 2019): If language emerged as a gestural phenomenon, why is it that all modern languages are predominantly vocal, with the exception of signed languages? This problem is identified as the key challenge to monosemiotic gestural theories both by its proponents (Reference ArbibArbib, 2005; Reference CorballisCorballis, 2002; Reference Hewes, Andrew, Carini, Hackeny, Gardner, Kortland and WescottHewes et al., 1973) and critics (Reference BurlingBurling, 2005; Reference FitchFitch, 2010; Reference MacNeilageMacNeilage, 2008). The only viable strategy to tackle this problem is to point to potential selection pressures facilitating the development of vocal communication despite its original gestural basis. There are a variety of candidates for such selection pressures, the best-known of which are the following:Footnote 6
Speech enables communication in poor visibility or in the dark (Reference RousseauRousseau, 1755/2008).
The voice attracts attention more effectively (Reference RousseauRousseau, 1755/2008).
Speech does not engage the hands, thereby allowing their use in practical tasks, such as work or carrying objects during communication (e.g. Reference Carstairs-McCarthyCarstairs-McCarthy, 1996).
Speech allows one to teach manual activities, such as tool making (Reference Armstrong and WilcoxArmstrong & Wilcox, 2007).
The acquisition of speech begins in foetal life, which grants it a developmental advantage (Reference Hewes, Lock and PetersHewes, 1996).
Speech is more economical, as articulatory movements require less time and energy than hand, arm, and body movements (e.g. Reference Knight, Knight, Studdert-Kennedy and Hurford (Eds.)Knight, 2000).
Vocal communication facilitates continuous monitoring of the location of a child, which might have been important in hominins due to their hunter-gatherer lifestyle and the lack of constant physical contact between mother and child, in contrast to the great apes (Reference FalkFalk, 2009).
The voice is directed at everyone rather than only at a specific individual (Reference TomaselloTomasello, 2008).
While these proposals are suggestive and deserve more research, it is generally accepted that they can neither individually nor jointly resolve the transition problem for monosemiotic gestural theories. In general, they point to one or another deficit of the visual channel, and could thus equally be used as arguments for “speech-first” theories. Reference FitchFitch (2010) criticizes the majority of the arguments listed above, as it is easy to find a counter-argument in each case. Gestures may not be visible in the dark, but they are visible by firelight, and can be used in the tactile modality, as done by visually impaired signers. Further, the visual channel gains the advantage in long-distance or noisy communication, and successfully attracts attention in these situations. Although speech frees the hands and arms, gestures free the mouth, which was very significant in the Palaeolithic period, given that fossil data show that hominins intensively used their teeth to chew hard foods and perform various mechanical operations. Importantly, the argument concerning the energetic effectiveness of speech is convincing only to the extent that speech and gesture are independent of one another, as in the pantomimic theories discussed in Section 5, but not according to the equipollent theories (Section 4), since if speech is (almost) necessarily accompanied by gesticulation, this way of communicating would be at least equally, if not more, costly than gestures alone.
Further issues can be raised against theories that trace the beginnings of language exclusively to the sign system of gesture. Regarding Reference Armstrong and WilcoxArmstrong and Wilcox’s point (2007, see above), in teaching manual activities verbal instructions are less effective than demonstration or physical guidance of the learner’s hands. Problematic for the suggestion of Reference Hewes, Lock and PetersHewes (1996) are developmental data showing that spoken and signed languages are acquired at equal paces. Reference FalkFalk’s (2009) idea of vocal contact between mother and child does not require speech but merely the emission of sound. Reference TomaselloTomasello’s (2008) point is compelling as far as open information sharing is concerned, but gesture allows a more accurate choice of addressee, which is important in less cooperative contexts and, further, is less at risk of being discovered by enemies and predators (Reference Wacewicz, Żywiczyński, Smith, Smith and Ferrer-i-CanchoWacewicz & Żywiczyński, 2008).
There have been attempts to address the transition problem not by indicating adaptive pressures that could have affected the changeover from gesture to speech, but by highlighting the interaction between the two semiotic systems. Reference Goldin-Meadow, Smith, Smith and Ferrer-i-CanchoGoldin-Meadow (2008) points out that in modern face-to-face communication, (a) combinatorial-segmented information is usually transferred by speech, while (b) holistic-imagistic information is usually transferred by gesture. Further, while gestures can become lexicalized and grammaticalized to communicate (a), as in signed languages, speech is less predisposed to perform (b). This proposal can also be formulated along the following lines (Reference BrownBrown, 2012): Speech is less capable of iconic representation, for which there is solid evidence, as pointed out above, and this may have been the reason for the transfer from the hypothetical gestural protolanguage toward speech when the need for a larger vocabulary and more combinatorial structure arose. This argument is, in our view, the most promising for explaining the transition, but it should be noted that it presumes that at least some vocalization was already present for the shift from gesture to speech to commence. Hence, it is more properly seen as an argument in favor of pantomimic theories (see Section 5).
To sum up, there seems to be no convincing solution to “the transition problem” for purely monosemiotic gestural theories, given that the proposed solutions face counter-arguments or are underpowered in terms of evolutionary logic. These difficulties have contributed to a growing popularity of polysemiotic theories, which we review in the following sections. Two different kinds of these can be distinguished, which derive from different research backgrounds. Equipollent theories are primarily affiliated with modern gesture studies (e.g. Reference KendonKendon, 2004; Reference McNeillMcNeill, 1992, Reference McNeill2005), while pantomimic theories derive from mimesis theory (Reference DonaldDonald, 1991, Reference Donald2001; Reference ZlatevZlatev, 2008). The two positions share the assumption that gesture was not the only semiotic system at the origin of human language, but there are important differences between them regarding the role of vocalization, the evolutionary trajectory to modern-day human communication, and the end point of language evolution. Hence, we present and discuss them in two separate sections.
4 Equipollent Polysemiotic Theories
4.1 General Features
The defining property of equipollent origin theories is the postulate of an early integration of gesture and vocalization, with the foundational assumption that gesture and speech form two sides of a single cognitive-communicative system (Reference McNeillMcNeill, 2005) or process (Reference Kendon, Dor, Knight and LewisKendon, 2014). It should be noted that, while representative, the views of these scholars are not coextensive with those of gesture studies as a field. Even early research often viewed gesture as a functionally versatile category, including Reference EfronEfron’s (1941) study of “illustrators,” “batons,” and “ideographs,” and Reference Ekman and FriesenEkman and Friesen’s (1969) study of “regulators.” Further, there was much interest in pantomime (e.g. Reference KendonKendon, 1992; Reference Laudanna and VolterraLaudanna & Volterra, 1991), emblematic gestures and their cultural variability (e.g. Reference KendonKendon 1995, Reference Kendon2004; Reference Poggi, Zomparelli and PoggiPoggi & Zomparelli, 1987), and adaptive movements reflecting bodily needs, psychological stress, and arousal (e.g. Reference Dittman, Siegman and PopeDittman, 1972; Reference Ekman and FriesenEkman & Friesen, 1969; Reference Freedman, Seigman and PopeFreedman, 1972; Reference WaxerWaxer, 1977). It is only more recently that attention seems to have shifted specifically to co-speech gestures: gestures that are temporally and semantically integrated with speech (Reference KendonKendon, 2004; Reference McNeillMcNeill, 1992). It is from this more recent approach within gesture studies that the most prominent equipollent language-origin theories derive.
4.2 McNeill’s Growth Point
Probably the best known of this class of theories is the scenario proposed by McNeill, which zooms in on the key notion in his account of gesture, the Growth Point (Reference McNeillMcNeill, 1992, Reference McNeill2005, this volume). In McNeill’s model, speech and gesture are coexpressive, but at the same time semiotically distinct, and responsible for the transmission of different aspects of the message: speech for propositional content and gestures for imagistic content. According to McNeill, the semantically most prominent element of the utterance comes at the stroke (i.e. the most pronounced phase) of a gesture. In this way, the Growth Point, the basic unit of thinking, becomes externalized.
This idea is also central to McNeill’s theory of language evolution, the critical moment of which is the integration of gestural and vocal communication, both at the level of cognition and expression (Reference McNeillMcNeill, 2012). The claim is that language originated from the coming together of vocalization and gesture to form a propositional-imagistic dialectic. The critical element of this process was what he calls “twisting” of mirror neurons, whereby they began “to respond to one’s own gestures, as if they were from someone else” (Reference McNeillMcNeill, 2012, p. 65). To support this idea, McNeill paraphrases Reference Mead and MorrisMead (1934/1974): “[…] a gesture is a meaningful symbol to the extent that it arouses in the one making it the same response it arouses in someone witnessing it” (Reference McNeillMcNeill, 2012, p. 180). As this gestural system was coorchestrated with vocalization, the Growth Point emerged.
It should be noted that McNeill does not provide any evolutionarily realistic pressures that could have been responsible for any of these changes. In fact, he suggests two conflicting accounts of how speech started, deriving it either (a) from movements “originally for ingestion, [which] could be orchestrated in new ways, by gesture imagery” (Reference McNeill2012, p. 65), or (b) from the type of polysemiotic communication that is found in extant non-human apes, such as “chimp gestures with vocalization” (Reference McNeill2012, p. 195). Although McNeill refers to the “twisting” of mirror neurons and the voice-gesture integration as adaptations, he actually describes them as saltational leaps (Reference GoldschmidtGoldschmidt, 1982),Footnote 7 not unlike Chomsky’s idea of a lucky mutation giving rise to the operation of Merge, which first endowed our ancestors with a language of thought and then with the communicative use of it (Reference Berwick and ChomskyBerwick & Chomsky, 2015).
4.3 Kendon’s Languaging
Kendon also proposes that the emergence of language crucially depended on the integration of speech and gesture but opts for a more gradual and evolutionarily realistic scenario. First, his notion of a “speech-kinesic ensemble” is less categorical about both the definition of gesture (see Section 2) and the functional interplay between speech and gesture (Reference Kendon, Dor, Knight and Lewis2014). His proposal is formulated in terms of a “dynamic orchestration of communicative action,” which extends far beyond the category of co-speech gestures and embraces any deliberately communicative bodily movements (hence, the use of the term “kinesic”), including postural shifts, eye contact, or facial expressions (Reference KendonKendon, 2004, Reference Kendon2011). Likewise, “speech” is not confined to a purely linguistic capacity responsible for the transmission of propositional information, but extends to vocal means of expressing emotional-imagistic content, as in the case of paralinguistic features (e.g. emotional prosody) or iconic vocal phenomena, as in ideophones, phonesthemes, reduplication or word lengthening (Reference KendonKendon, 2008). Languaging, in line with his notion of utterance, is a dynamic process (Reference KendonKendon, 2004) that involves a tight integration between speech and gestures to the effect that “words and gestures labor together to produce virtual objects that serve as conceptual expressions” (Reference Kendon, Dor, Knight and LewisKendon, 2014, p. 168; see also Kendon, this volume). Unlike McNeill, who postulates a strict functional division between these two semiotic systems, Kendon submits that the interaction between them is dynamic and flexible, with one or the other being dominant depending on the social or environmental context, including factors such as the level of noise (Reference Kendon, Tannen and Saville TroikeKendon, 1985).
Applying this framework to language origins, Kendon posits that the beginning of the human-specific communicative system of languaging was marked by the coming together of speech and gesture (Reference KendonKendon, 2004). This is a gradualistic scenario that identifies multifarious factors both for the emergence of speech and gesture, and for their conjunction. On the one hand, Kendon highlights the importance of “communicative action” for the emergence of language, whereby vocal behaviors and gestures acquired representational functions. As the hominin ecology favored close-distance face-to-face interaction, they came to be used jointly and in the course of time merged into one process of meaning-making (Reference Kendon, Dor, Knight and LewisKendon, 2014). On the other hand, he points to “the original praxic nature of language” (Reference KendonKendon, 2017, p. 165), when he speculates that many gestures derive from object-handling actions.
However, Kendon’s evolutionary theory does not spell out a clear solution to the problem of why speech is the dominant system in the ensemble of languaging, the analog to the transition problem for monosemiotic theories. Rather, Kendon appeals to various arguments, starting from more physiologically oriented ones, such as hypotheses concerning orofacial movements as an evolutionary “bridge” between manual gesture and speech, or the neural links between hand and mouth, appealing to Arbib’s Mirror Neuron Hypothesis (see Section 3.5), to more semiotic ones, such as the role of various forms of sound symbolism in bootstrapping vocal signs.
4.4 Evaluation
Equipollent theories stress the polysemiotic, or more specifically bisemiotic, nature of modern language as well as its evolution. They argue for tight integration between the semiotic systems of speech and gesture, to the effect that they form one overarching system, such as McNeill’s “imagery-language dialectic” or Kendon’s version of the notion of “languaging.” The postulate of gesture-speech equipollence in modern human communication is accompanied by what McNeill refers to as the equiprimordiality of gesture and speech, whereby language is thought to have begun with the integration of vocalization and gesture, which jointly assumed representational and communicative functions. In this regard, McNeill posits a saltationist scenario of the sudden emergence of Growth Point, while Kendon argues for a long-drawn and multicausal process of voice and gesture integration. This latter approach is similar to views articulated by Reference Goldin-Meadow, Smith, Smith and Ferrer-i-CanchoGoldin-Meadow (2008) and Reference SandlerSandler (2013), whose theories may also be regarded as equipollently polysemiotic.
Such theories effectively obviate “the transition problem” that burdens monosemiotic gestural theories (cf. Reference KendonKendon, 2011), which is an advantage. However, they struggle to explain the adaptive pressures that would have brought about a strong form of gesture-speech integration, as well as the dominant role of speech in hearing populations. As they take language to derive from semiotic systems that became integrated in the course of hominin evolution, both McNeill and Kendon reject the possibility that language and gesture may have had independent evolutionary trajectories. Their evolutionary theories can be seen as growing out of the strong uniformitarianFootnote 8 conviction that, as Kendon puts it, the way language is today “it must have been in its beginnings” (Reference Kendon2011, p. 103). But it is hardly uncontroversial to claim that language (or languaging) necessarily involves the integrated use of the vocal and gestural channels (e.g. Reference Vigliocco, Perniss and VinsonVigliocco, Perniss, & Vinson, 2014).
There are in fact stronger and weaker positions on gesture-speech integration among equipollent theories. McNeill puts forward a more extreme version, whereby speech is impossible without gesture: “the core is gesture and speech together. […] They are united as a matter of thought itself. Even if for some reason a gesture is not externalized (social inappropriateness, physical difficulty, etc.) the imagery it embodies can still be present, hidden but part of the speech process” (Reference McNeillMcNeill, 2012, p. 19). A weaker version of the thesis, more consonant with Kendon’s position, has been elaborated by Kita, who argues that speech and gesture are distinct psychological processes that interact (e.g. Reference Kita and ÖzyürekKita & Özyürek, 2003, p. 30). According to such an account, both speech and gesture are governed by separate but tightly interrelated production mechanisms: an “action generator” responsible for the production of gesture and a “message generator” responsible for speech (Reference Kita and McNeillKita, 2000).
There are other issues with this type of theory as well. On the one hand, proponents of equipollent theories argue for the division of labor between the two parts of the integrated system; on the other hand, they obliterate the distinction between them, downplaying, for instance, the fact that speech performs the dominant role in the transfer of referential information in the “speech-kinesic ensemble.”Footnote 9 Equipollent theories also disregard the fact that the compositional nature of language can manifest itself not just in speech but also in various other subsystems such as writing involving the visual modality, the tactile modality (e.g. Braille), or haptic modality (e.g. Tadoma, the tactile lipreading of the deaf-blind). The growing amount of signed-language literature, particularly on emerging signed languages, has sometimes been used to support the equipollent position (e.g. Reference SandlerSandler, 2013), but, as Kendon rightly points out, it actually goes against it, by challenging the idea of a watertight integration between speech and gesture (Reference KendonKendon, 2009).
A final problem with equipollent gesture-speech theories is an exaggerated focus on co-speech gesture (see Section 2). McNeill’s “gesture continuum” (Reference McNeillMcNeill, 1992, Reference McNeill2005), for example, includes a variety of gestures that are produced in the absence of speech, such as “language slotted gestures,” emblems (produced at least sometimes without speech), pantomime, and the signs of signed languages; however, it zooms in on co-speech gestures (or “gesticulation,” in Kendon’s terms) and describes them as the prototypical form of gesture. Accordingly, much of contemporary gesture studies centers on co-speech gestures (e.g. many of the chapters in the two-volume set Body - Language - Communication, edited by Reference Müller, Cienki, Fricke, Ladewig, McNeill, Teßendorf and BressemMüller et al., 2013–Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014) and in this way, explicitly or implicitly, embraces the view that co-speech gesture is gesture par excellence and the ancillary view that speech is language par excellence. Such an attitude is, as we have seen, not unproblematic when it comes to language evolution. Arguably, it also overemphasizes the links between speech and gesture on the developmental level (Reference LevyLevy, 2011) and the neurocognitive level (Reference Demir‐Lira, Asaridou, Raja Beharelle, Holt, Goldin‐Meadow and SmallDemir-Lira et al., 2018) as well as ignores evidence of the dissociation between them, such as the developmental evidence for partial independence between language and gesture (e.g. Reference AndrénAndrén, 2010), or limited impairment of gestures in many forms of aphasia (e.g. Reference Whiteside, Dyson, Cowell and VarleyWhiteside, Dyson, Cowell, & Varley, 2015). Issues such as these have motivated the third general type of theories in our discussion, to which we turn in Section 5.
5 Pantomimic Polysemiotic Theories
5.1 General Features
A number of polysemiotic pantomimic theories claim that the unique features of human communication derive from a general cognitive capacity, whose appearance eventually led to the emergence of language and gesture, but also to other semiotic systems, such as music and dance, ritual, and depiction (forming marks on a two-dimensional surface that resemble three-dimensional referents, as in drawing and painting). The most common candidate for this is mimesis (Reference DonaldDonald, 1991, Reference Donald, Hurford, Studdert-Kennedy and Knight1998, Reference Donald, Tallerman and Gibson2012, Reference Donald, Hatfield and Pittman2013; Reference Zlatev and PetersZlatev, 2019). According to Donald, the original function of mimesis was to facilitate tool production, as it evolved as an adaptation in late australopithecines or early Homo ca. 2 million years ago. Gradually, it was exapted for communication, allowing the use of the body as a representational device, whereby bodily movements could stand for something other than themselves (Reference Zlatev, Persson and GärdenforsZlatev, Persson, & Gärdenfors, 2005). As shown below, the breadth of the notion of mimesis gives rise to a number of related theories of language origins, which share the feature of viewing the evolution of language as an integral part of the evolution of human thought and culture in general, and at the same time focus on (polysemiotic) pantomime as an essential step in the evolutionary process.
5.2 Gärdenfors’ Pedagogy
Gärdenfors links the emergence of human-specific communication to pedagogy (Reference GärdenforsGärdenfors, 2017; Reference Gärdenfors and HögbergGärdenfors & Högberg, 2017). In his scenario, the original form of communication that emerged from mimesis was demonstration, in which a teacher performs an action for the benefit of a student, and which is defined by Reference GärdenforsGärdenfors (2017) as follows:
(D1) The demonstrator actually performs the actions involved in the task.
(D2) The demonstrator makes sure that the learner attends to the series of actions.
(D3) The demonstrator intends that the learner perceives the right actions in the correct sequence.
(D4) The demonstrator exaggerates and slows down some of the actions in order to facilitate for the learner to perceive important features.
Thus, demonstration both resembles praxis and differs from it, since its main goal is not practical per se, but for the student to understand how to perform the actions in question. Hence, demonstration is a true representational sign, and according to the definition of Reference AndrénAndrén (2010) also a gesture (see Section 2). According to Gärdenfors, the minimal “symbolic distance” between the expression and what it represents makes demonstration the likely candidate for being the first form of mimetic communication. The only difference between this and pantomime is feature (D1): Whereas in demonstration the teacher actually performs the actions involved in the task (e.g. striking a stone to produce a stone tool), in pantomime the teacher pretends to perform the actions, making pantomime a form of pretense. This semiotic breakthrough allowed for the eventual emergence of language (Reference GärdenforsGärdenfors, 2017) and ritual (Reference GärdenforsGärdenfors, 2018).
This description indicates that gesture performed the dominant semiotic role in pantomime. However, it does not entail that pantomime was monosemiotic, and, indeed, this is not the case in a popular definition of pantomime in the field of language evolution as: “a non-verbal, mimetic and non-conventionalised means of communication, which is executed primarily in the visual channel by coordinated movements of the whole body, but which may incorporate other semiotic resources, most importantly non-linguistic vocalisations” (Reference Żywiczyński, Wacewicz and SibierskaŻywiczyński, Wacewicz, & Sibierska, 2018, p. 315). In other words, pantomime should be understood as a hybrid, polysemiotic system, combining both signs and signals, and a number of different sensory modalities (Reference Zlatev, Żywiczyński and WacewiczZlatev et al., 2020).
This is also apparently how Reference GärdenforsGärdenfors (2017) understands pantomime, given his frequent mention of physical objects and “vocal gestures.” Further, given that pantomime tends to represent events in an undifferentiated way, Gärdenfors outlines the transition to language in terms of its differentiation, accompanied by lexicalization and grammaticalization. He also assumes that the main referential role shifted to the vocal aspects of pantomime over time, without, however, explaining why.
5.3 Zlatev’s Mimesis Hierarchy
Building on the work of Reference DonaldDonald (e.g. 1991), Reference ZlatevZlatev (e.g. 2008, Reference Zlatev2014, Reference Zlatev, Etzelmüller and Tewes2016) attempted to make the concept of mimesis more explicit, as well as to develop a theory of its transition to language, and more recently to polysemiotic communication as such (Reference Zlatev and PetersZlatev, 2019). In the process, Zlatev has proposed a number of related definitions of the concept of bodily mimesis, one of which is the following:
[…] an act of cognition or communication is an act of bodily mimesis if: (1) it involves a cross-modal mapping between exteroception (e.g. vision) and proprioception (e.g. kinesthesia); (2) it is under conscious control and is perceived by the subject to be similar to some other action, object or event, (3) the subject intends the act to stand for some action, object or event for an addressee, and for the addressee to recognize this intention; (4) it is not fully conventional and normative, and (5) it does not divide (semi)compositionally into meaningful sub-acts that systematically relate to other similar acts, as in grammar.
On this basis, Zlatev proposes an evolutionary (and in part developmental) model known as the Mimesis Hierarchy (Reference ZlatevZlatev, 2008). The rudimentary form of protomimesis, based on requirement (1), is found in activities like emotional and attentional contagion (e.g. contagious laughter), and is common for all primates. The more advanced form of dyadic mimesis (based on 1 and 2) involves volition and imitation, but not true representation or sign-function; it is common for all great apes. Only at the next level (based on 1, 2, and 3), referred to as triadic mimesis, do mimetic acts gain a clear sign-function, as well as Gricean communicative intentions (i.e. that the addressee should understand that a communicative act is being performed for their benefit). This level, also in agreement with Gärdenfors, is claimed to be uniquely human (Reference Zlatev, Persson and GärdenforsZlatev et al., 2005). Further, point (4) distinguishes mimesis from a conventionalized protolanguage and point (5) distinguishes it from language proper.
This provides a convenient conceptual apparatus, but it does not address key questions such as what drove the evolutionary process, or more specific aspects of how the transition from triadic mimesis (i.e. pantomime) to protolanguage and language took place, including the shift from gesture to vocalization in terms of dominance. Reference Zlatev, Etzelmüller and TewesZlatev (2016) addresses these gaps, but in a somewhat schematic manner. With respect to evolutionary pressures, Zlatev appeals to an increase of prosociality in hominins, in the manner of Tomasello (see Section 3.6). As for the ecological pressures behind this, alloparenting (Reference HrdyHrdy, 2009), a reproductive strategy unique to humans among the great apes, is evoked. Concerning the gradual transition to vocalization, this is sought in the nature of pantomime itself: a hybrid system that is polysemiotic (i.e. combines various sign and signal systems) and multimodal (i.e. involves different sensory channels). The dominant semiotic system of pantomime is claimed to have been highly iconic gesture, understood in terms of the following properties (Reference Zlatev, Żywiczyński and WacewiczZlatev et al., 2020):
a) use of primary iconicity (Reference Sonesson, Rauch, Carr and GeraldSonesson, 1997), where the similarity between the gesture and what it represents is largely sufficient for establishing the referent, as opposed to secondary iconicity, where appreciation of this similarity comes only later;
b) use of the whole body, rather than the hands only (cf. Reference Żywiczyński, Wacewicz and SibierskaŻywiczyński et al., 2018);
c) use of a first-person perspective, where the gesturer adopts the perspective of the agent who performs the represented actions (cf. Reference Zlatev, Andrén, Zlatev, Andrén, Johansson-Falck and LundmarkZlatev & Andrén, 2009);
d) use of the enacting “mode of representation,” with the body of the gesturer mapping onto the (human) body of the referent (Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and BressemMüller, 2014);
e) use of the peripersonal space, where gestures stand for actions in the space immediately surrounding one’s own body (Reference Brown, Mittermaier, Kher and ArnoldBrown, Mittermaier, Kher, & Arnold, 2019).
The transition towards language consisted in a transition from (a) primary to secondary iconicity, where resemblance is less important than convention, (b) whole-body gestures to manual gestures, (c) first-person to third-person perspective, and related to this (d) “tracing” and “embodying” modes of representation, where the hands represent specific features of the referents in different ways, and can consequently (e) denote objects that are more displaced in space and time. Such a gestural system is clearly much less iconic than that of pantomimic gesture, but nevertheless retains considerable iconicity, exceeding that of vocalization. Thus, Reference Zlatev, Etzelmüller and TewesZlatev (2016) makes use of the arguments presented by Reference BrownBrown (2012, see Section 3.7), to motivate the gradual transition from gesture to vocalization when the need for less iconicity and more “arbitrariness” arose.Footnote 10
But while language (realized as speech, writing, or signing) may be the dominant system in modern human communication when it comes to expressing propositions and narratives, it is very seldom used alone, but rather alongside other semiotic systems such as gesture and depiction (e.g. Green, 2014): polysemiotic communication. An advantage of the mimesis/pantomime approach is that it can help explain this, as pantomime consisted of gesture, vocalizations, and “protodrawing,” when gestures left marks on surfaces such as sand (Zlatev, 2019; Zlatev et al., 2020; Zlatev et al., 2023).
5.4 Similar Accounts
Perhaps less elaborated but similar polysemiotic theories have been suggested by others as well. Levinson (2006), for example, has proposed the Interaction Engine hypothesis, according to which what evolved in our ancestors was a sociocognitive adaptation allowing “joint attention, common ground, collaboration and the reasoning about communicative intent” (Levinson & Holler, 2014, p. 369), which transformed face-to-face interaction. Levinson argues that such communication was polysemiotic in the sense that it incorporated facial expressions, body movements, and affective vocalizations (cf. Fröhlich et al., 2019) but still lacked representational signs. In the next stage, iconic gesture accompanied by simple referential vocalizations emerged, the latter of which gradually assumed the dominant role in the transfer of meaning (Levinson & Holler, 2014).
A similar scenario is proposed by Collins (2014), according to whom the communication system of early Homo consisted of a majority of relatively involuntary, non-representational signals (“indices”) and a smaller inventory of voluntary, representational signs. The latter were initially more or less evenly distributed between the bodily and vocal channels, but gesture had a leading role. With Homo erectus and Acheulean culture,Footnote 11 there was a sharp increase in the proportion of bodily-gestural signs, but for some unspecified reason their importance began to decrease from ca. 1 million years ago, and with the evolution of Homo sapiens, the relative importance of gesture and speech was reversed compared to what it had been at the onset of language evolution.
5.5 Evaluation
Like the equipollent type, pantomimic theories of language origins alleviate the transition problem of the monosemiotic gesture theories by drawing attention to the polysemiotic nature of modern human communication and arguing that the evolutionary beginnings of language must have been similarly polysemiotic. However, the two types disagree about the nature of polysemiotic communication, both in modern communication and in evolution.
First, pantomimic theories adopt a more complex view of human-specific communication, comprising not just speech and gesture but also other semiotic systems like depiction and music. Some of these theories propose that all of these systems developed from the fountainhead of (bodily) mimesis. The most important shared characteristic of these semiotic systems is that they consist of (representational) signs, which distinguishes them from most forms of animal communication, which are based on signals (Zlatev et al., 2020). Modern human communication indeed involves combinations of such semiotic systems and is clearly polysemiotic (Zlatev, 2019; Zlatev et al., 2023). For example, the Bayaka nomads of the Western Congo Basin incorporate whole-body, silent pantomime as well as vocalizations of the hunted game into their hunting narratives (Lewis, 2014), while inhabitants of central Australia speak and gesture as they draw in the sand while narrating (Green, 2014). Considering these traditional societies with their traditional technologies, how much more polysemiotic is modern communication, mediated by all the current (electronic) media that we have at our disposal? The interactions between these systems are flexible and context-specific, not unlike the way Kendon describes the polysemiotic process of languaging (Section 4.3).
Second, proponents of pantomimic theories gain some support from “uniformitarianism” when they argue that the original human-specific system was likewise polysemiotic, though less differentiated than present polysemiosis, with “bodily mimesis,” or an “interaction engine” serving as the springboard. A clear difference from the equipollent polysemiotic theories is the claim that the division of labor between semiotic systems was different at the onset of the evolutionary process, with gesture serving the leading role, and speech playing the dominant role at its end. In this respect, pantomimic theories can use all the arguments from monosemiotic theories (Section 3) without encountering the transition problem to the same extent. However, they do face a version of it, as alluded to repeatedly above: How exactly can one explain the reconfiguration in the division of labor between semiotic systems from “more gesture” to “more speech”?
While we do not believe that a conclusive response to this question has been given, it appears that pantomimic polysemiotic theories, by emphasizing the flexible and context-dependent relation between semiotic systems, are in the best position to answer it compared to the other types. Further, speech and gesture have different intrinsic potentials for iconic representation; contexts in which iconicity was less effective as a communication strategy, and in which a less iconic, more conventional, and systematically structured form of sign use would have been more effective, would have constituted an evolutionary pressure towards speech. One such context, suggested originally by Donald (1991), could have been a culture dependent on relatively complex narratives, or myths, where events are represented not just sequentially but also in counter-iconic orders, reflecting causal and logical relations that require more systematic and conventionalized means of representation.
6 Conclusions
In this chapter we have provided a survey of a number of gestural theories of the origin of language, using a new typology that divides theories first in terms of whether they view “gesture alone” as the starting point of the emergence of language (monosemiotic theories) or gesture in combination with other semiotic systems (polysemiotic theories). The latter were then divided based on whether gesture and speech are considered to play a (more or less) equal role from the start to the present (equipollent) or whether gesture first dominated, but the vocalizations that were there from the start eventually gained dominance (pantomimic).
To sum up, while there is much of value in monosemiotic gestural theories, in particular with regard to the role of iconicity in “bootstrapping” a shared system of signs, they were all shown to have difficulty in explaining the transition from gesture to speech. In tackling this problem, many proponents of such theories tend to overemphasize the role of speech in modern human communication, while also overemphasizing the role of gesture at the beginning of the evolutionary process. Another problematic move is the stress they put on the continuity between ape and human gestures, whereby they find themselves hard-pressed to fully account for the differences between these two phylogenetically distinct forms of gesture.
As for equipollent polysemiotic theories, they are at least to some degree able to account for the role of co-speech gesture in modern human communication but tend to disregard other forms of gestural communication (and, even more so, other semiotic systems). They find it difficult to provide an evolutionarily satisfactory explanation of why gesture and speech differ semiotically (rather than just postulating that they are two sides of a “dialectic”), or indeed of why they are so closely connected.
We concluded with pantomimic polysemiotic theories, which in a way capitalize on the strengths of the others: Like monosemiotic theories, they claim an advantage of gesture at an initial evolutionary stage, and like the equipollent theories (perhaps most of all, that of Adam Kendon) they proclaim the advantages of flexible polysemiosis. As such, they appear to be best able to explain modern communication: not only speech, but language in general; not only co-speech gesture, but a wide variety of gestures; not only language and gesture, but also other semiotic systems, like depiction and music. In several cases, they are capable of proposing plausible evolutionary scenarios, informed by research in ethnography and non-human communication. At the same time, they are able to explain the differences between human-specific and animal communication, including ape gestures. Finally, their notion of polysemiosis alleviates the transition problem of monosemiotic theories. However, it does not eliminate it completely, and pantomimic polysemiotic theories still have to propose a mechanism responsible for reconfiguring the original system of pantomime into modern language and modern polysemiotic communication. While there are some attempts in this direction, this is still very much work in progress; we have indicated some promising ideas, including the role of narrative.
1 Introduction
The purpose of this chapter is to capture the role and use of gesture in first language development and its integration in the child’s multimodal communicative system. It presents an overview of theories and methods that have triggered and facilitated the study of gestures in language development. The chapter is primarily focused on production and on gestures used by neurotypical hearing children who acquire spoken languages.Footnote 1 The main issues are illustrated with detailed analyses of examples extracted from longitudinal data in English and French (Morgenstern & Parisse, 2012).Footnote 2
The human communication system develops in a space of shared meanings in which adults socialize children into language in situated activities; consequently, this overview highlights the crucial role of caregivers in adult–child interactions. We thus first focus on the role of gestures in adults’ communicative input and then follow children’s development into the use of the adult multimodal communicative system. We use the word “multimodal” to refer to a variety of semiotic resources used within the audio-vocal and visual-gestural modalities (such as speech, gesture, gaze, facial expressions). In our perspective, humans have created language by interacting via different types of meaning-making resources which have collectively been sedimented through experience and use into what is called “language.” As speech is the primary mode of communication children are progressively mastering, their use of gesture will be presented before they begin speaking, then as they produce their first words, and finally once speech is mastered. At the end of the developmental process, speech becomes clearly predominant but is both complemented and supplemented by other semiotic resources according to variables such as linguistic context, situation, interlocutor, activity, or discourse genre (Cienki, 2012).
2 Historical Background and Evolution of Theoretical Approaches and Methods
Child language research is one of the first fields in which spontaneous interaction data were systematically collected, initially through diary studies (Ingram, 1989; Morgenstern, 2009) and later through audio and video recordings shared worldwide thanks to the CHILDES project (MacWhinney, 2000). This data-centered method has allowed many researchers to confirm that, in the course of their development, children make their way through successive transitory multimodal systems, each with its own internal coherence (Cohen, 1924). This phenomenon can be observed at all levels of linguistic analysis. The starting point of language-acquisition scholars’ interest in gesture or visible bodily action could be summarized in de Laguna’s famous assertion that “in order to understand what the baby is saying you must see what the baby is doing” (De Laguna, 1927, p. 91). Children’s productions are like evanescent sketches of adult language and can only be analyzed in their interactional context, taking into account the interlocutors’ interpretations and reactions, shared knowledge, actions, gestures, facial expressions, postures, and head movements, as well as the words produced by the children (Morgenstern & Parisse, 2007; Parisse & Morgenstern, 2010).
Children’s language development has long been described by starting with the first vocalizations and the maturation of phonic abilities, without paying much attention to motor skills, actions, gaze, and gestures. However, the first diaries of observers of child language already contained dazzling insights about the multisensory qualities (through hearing, sight, touch, and sometimes taste and smell) and forms of expression (speech, gesture, gaze, facial expressions) that make language acquisition a multimodal process. During the second half of the nineteenth century, child language development was studied through researchers’ observations of their own children. The detailed follow-ups on children’s language production, anchored in their daily lives, were a source of fascinating links between motor and psychological development, cognition, affectivity, and language. The “founding fathers” of the study of child development and language had great intuitions about gestures and their relation to language. In his notes on his son’s development, Darwin (1877) highlighted the importance of observing the transition from uncontrolled body movements to intentional gestures. Darwin (1872) was mainly interested in the expressiveness of the baby, and his diary entries therefore focused on the expression of emotions. He first emphasized the different functions of intonation. In addition, according to him, certain habitual movements become automatic and are associated with communicative functions. This can be illustrated by the first bodily manifestations of negation (gestures of avoidance or rejection, which consist of turning the head or the body away, or pushing things away with the hand). Darwin’s insights have been confirmed by more contemporary research.
Mimetic patterns of imitable actions, shared representations of objects that can be manipulated, anchor the acquisition of the child’s first gestures (Zlatev, Persson, & Gärdenfors, 2005). There is now evidence from brain imaging studies that the use of language involves motor representations concerning more than just the movement of the vocal apparatus (Arbib, 2012). It is through subtle shaping of daily actions and practices with objects in the environment that manual-gestural communication in social interactions leads the child to adopt conventional forms of symbolic gestural and verbal language. Romanes (1889) also provided interesting ideas on gesture in his own diary study of his child. He compared human and animal gestures and mentioned the gestural language of the deaf as a sign of the universality of symbolic gestures.
Despite those early observations, gesture was not in the foreground of studies on child language during the first half of the twentieth century. However, thanks in particular to the work of Bruner (1975, 1983), actions and gestures were gradually considered by researchers who analyzed early development. They saw gesture as a system of communication that precedes the verbal system and then becomes complementary to it. According to Werner and Kaplan (1963, p. 66): “Linguistic representation emerges from, and is rooted in, non-linguistic forms of representation.” For Bates (1976), children’s first gestures have properties that, in her time, linguists specifically attributed to speech. Thirteen-month-old children (Bates, Bretherton, & Snyder, 1988) produce manual gestures that are considered the equivalent of nouns (e.g. brushing their hair when seeing a brush is equivalent to saying “brush”).
Research in language acquisition has now developed tools, methods, and theoretical approaches to analyze children’s situated multimodal productions, as they provide evidence for links between motor and psychological development, cognition, affectivity, and language. Those links can only be established by conducting both quantitative and qualitative analyses, in both natural settings and experiments.
As of the end of the twentieth century, thanks to video data linked to transcripts with specialized software (CLAN, ELAN, PHON),Footnote 3 detailed coding and analyses of multimodality have been possible and have opened whole new fields of research. It is especially the case for researchers who study language with a usage-based perspective, in its natural habitat, that is daily discourse: “the prototypical kind of language usage, the form in which we are all first exposed to language – the matrix for language acquisition” (Levinson, 1983, p. 284). We are now able to document in detail how the visual and vocal modalities come together in daily interaction and progressively shape children’s language. Video-recording tools have notably advanced the detailed analysis of the organization of human action and interaction (Mondada, 2019). These tools have shaped new avenues of research on language in interaction, as it is deployed in multiple ecologies, both in time (the moment-to-moment unfurling of an interaction) and over time (multiple recordings over several years of the same children in their family environment). Sacks, as he was grounding what became the Conversation Analysis framework, encouraged the use of video-recordings (Sacks, 1984, 1992) so as to capture, analyze, and share sequences that unveil the structure of everyday practices.
Building on Conversation Analysis, an integrative, multimodal approach to language was developed thanks to contributions from Goodwin (1986, 2013) and Levinson (2006). The recognition of sign languages also played an important role in considering the gestural dimension. An abundance of work on gesture and on the complementary role of semiotic forms is emerging and enriches the research already carried out on the development of children’s language. Specialists in linguistic anthropology (Haviland, 1998), confronted with a multiplicity of cultures, many of them with purely oral traditions, have helped linguists become aware that the formal apparatus constructed to describe languages is largely based on written texts (Linell, 2005), whereas the most common uses of language are in face-to-face interactions (Goffman, 1963). Semioticians (Kress, 2010) also insist on the importance of taking into account the different simultaneous channels (auditory, visual, tactile) with which we conceptualize the world around us and express ourselves.
Research on spontaneous data (Morgenstern, 2014, 2019) is characterized by the researchers’ attempts to capture how children become competent interlocutors as they learn to deploy the various semiotic resources at their disposal in relevant ways, for example, by coadapting to their conversational partners in various situations and environments. Children’s use of speech and gesture differs from that of adults and dynamically changes over time. Children’s multimodal communicative expression is thus analyzed in longitudinal data with an ethnographic approach in line with Kendon’s call for studies of use in context inspired by David Efron (1941/1972) and Wilhelm Wundt (1921/1973). Children’s communicative profiles are shaped by their local environment (family home) and their microcultural norms. Through children’s everyday interactions in their ecological circumstances, we can understand both how language is “experienced” (Ochs, 2012) and how experience is “languaged” in situated activities. Multimodal approaches include some combination of verbal content and the accompanying prosody, facial expressions, posture, gesture, as well as gaze, according to a dynamic deployment of the “scope of relevant behaviors” (Cienki, 2012, 2017), ideally taking into account multiple factors such as age, context, and affordances of the situation or interlocutors. When children’s gestures are analyzed in interaction, we try to capture what the modality of gesture affords its child users as a means of communication among the other semiotic resources they have at their disposal and as their motor, cognitive, and social skills evolve in time.
Video-recording has also had an invaluable impact on experimental methods focused on gesture development (Congdon, Novack, & Goldin-Meadow, 2018). Gesture had often been overlooked in standard psychological observation and research until experimenters started using cameras and making detailed annotations of children’s comprehension and production of gestures during the tasks they were being asked to do. Thanks to a wealth of studies, “gesture research has fundamentally changed the way psychologists think about language, learning, and reasoning” (Congdon et al., 2018, p. 497). With video-recorded data, time is suspended. Researchers can not only relive/replay the experimental sessions as many times as they wish, but also change the annotation schemes, enrich them, and revisit them; and when the data are shared, the studies can be replicated more faithfully.
In both naturalistic and experimental data, gesture types, patterns, variations, and formal components can be coded, quantified, or analyzed in fine detail. Investigative experiments are often informed by naturalistic field methods and enable researchers to test what occurs in ecological settings. Researchers make an indispensable contribution to determining which hypotheses should be tested. Since young children’s productions in experimental settings might be influenced by a variety of performance-related factors (Airenti, 2015), and since observers have to overcome great challenges when video-recording data in naturalistic or home environments, experimental results and spontaneous speech data must both be collected in order to capture children’s multimodal communication system and its development. New avenues include the use of motion capture (Dodane, Boutet, Didirkova, Ouni, & Morgenstern, 2019) and also involve computational modeling (Kaplan, Oudeyer, & Bergen, 2008). Those possibilities might further extend our knowledge about gesture–speech relations (Abramov et al., 2018) but have not yet been fully deployed for studying the development of multimodal communication.
3 Gesture and Scaffolding in the Adult Input
Children’s language gradually develops into rich multimodal production using the variety of semiotic resources at their disposal, through constant exposure to the adult input. It is thus crucial to account for the multimodal input surrounding children in order to understand how they co-construct their communicative skills thanks to adult scaffolding. Given that parents provide their children with other forms of expression along with speech, we will refer to “child-directed communication” rather than the usual expression “child-directed speech” to underline the crucial function of the child’s multimodal input.
Scholars have wondered for centuries about how children construct meaning. In his reconstruction of his developmental process, Saint Augustine, back in the fifth century, stressed the link between surrounding events, adults’ actions, gestures, and words in his own acquisition of language (Augustine, 1996, I.8). Word learning is facilitated by multimodal cues produced by adults (gazing, pointing, touching, manipulating) (Booth, McGregor, & Rohlfing, 2008). Adults provide redundant sensory information and positively affect infants’ attention. Symbolic gestures, especially pointing (Iverson, Capirci, Longobardi, & Caselli, 1999), have been shown to facilitate children’s comprehension by providing additional cues. The role of pointing and eye gaze in the construction of joint attention has been a key topic of research in child language development (Baldwin, 1995; Tomasello, 2003). As shown in example (1), video (1) (Table 14.1) from our longitudinal data (Morgenstern et al., 2018),Footnote 4 the caretaker’s gestural information reinforces the vocal information. In the examples, “nb” indicates the utterance number and the column “Part.” indicates the relevant participant in the interaction.
Table 14.1 Example (1). Video (1). Ellie, 10 months.Footnote 5 https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%201-Ellie-0–10-finished%20-%20gesture.mp4 (Mother is taking care of her child (CHI) Ellie; Grandmother is filming.)
| nb | Part. | Actions and gestures | Vocal and verbal productions |
|---|---|---|---|
| 1 | MOTHER | Palm down lateral movement with both hands in front of her from center to exterior. Gaze at CHI. | Ellie, you finished? |
| 2 | ELLIE | Moves right arm up and down, left hand is grasping the tablet of her highchair. Gaze at camera. | |
| 3 | MOTHER | Hands in rest position, intent gaze at CHI. | That’s hello, you’re waving |
| 4 | MOTHER | Palm down lateral movement with both hands in front of her from center to exterior. Gaze at CHI. | Are you finished? |
| 5 | ELLIE | Same movement as previous but with a wider range and excitedly. Gesture and prosody of vocal production are synchronized. Big smile at the camera. | Ha ha ha … ah ha ah ! ah! |
| 6 | MOTHER | Palm down lateral movement with both hands in front of her from center to exterior. Gaze at CHI. | Are you finished Ellie? |
| 7 | ELLIE | Very briefly produces the same gesture as her mother, with two arms, smaller range and in the reverse direction, from exterior where her arms were positioned, to center, eyes gazing down. | |
Ellie has finished her meal, and her mother provides both a spoken and a gestural expression of her state (Figure 14.1, turn 1).
Bahrick and Pickens (1994) explain that intersensory redundancy grounds how we detect stimuli that belong together and constitute a unitary event. They show that redundancy across the senses foregrounds core information for infants. Ellie’s mother presumably expects the gesture to help her daughter understand the meaning of her utterance. By linking gesture and speech, the mother grounds the word “finished” and facilitates the relation between sign and referent as she relies on both auditory and visual cues. She actually connects her own production of what Bressem and Müller (2014) call a “recurrent gesture” (palm down lateral, arms sweeping toward the exterior), used to express completion (Ladewig, 2014), to Ellie’s behavior. The child’s movement seems to be addressed to her grandmother, who is filming (the child waves her right arm quite excitedly, vocalizes, and gazes at the camera, turn 2), and it is interpreted as another type of gesture (an emblem) by her mother (“that’s hello,” turn 3). The child’s voice and arm movements are synchronous in turn 5: the gestural strokes and prosodic nuclei are perfectly aligned. Not only is Ellie’s body in harmony with her own voice (turn 5), despite a lack of semantic content, but after two repetitions by the mother of the same multimodal production (turns 4 and 6), Ellie seems to echo her mother’s gesture in a more sketchy performance, with a reverse movement from periphery to center, probably because of the positioning of her arms at the beginning of the gesture excursion (Figure 14.2, turn 7). In this way, Ellie’s body is in resonance with her mother’s.
Figure 14.2 Ellie’s resonance with her mother’s recurrent gesture (turn 7)
Caregivers, in a variety of cultures, synchronize words, actions, and gestures as they show objects, embody actions, or illustrate properties (Zukow-Goldring, 1997). These practices scaffold the referential process by linking words and gestures to objects and events (Rader & Zukow-Goldring, 2010). Studies show that the nature and the frequency of maternal gesture influence the development of children’s communicative repertoires (Rowe & Goldin-Meadow, 2009). Caretakers also use gestures in playful scripts or songs and nursery rhymes, such as “bye bye” (waving hands), “peek-a-boo” (playfully hiding face with hands), “bravo” (clapping hands), and the French “ainsi font font font les petites marionnettes” (a song that is accompanied by hand gestures representing puppets), in which they “teach” their children specific conventional gestures along with words.Footnote 6
Parents spontaneously use gestures in everyday communication. All those gestures derive from the culture in which the children are being raised and have very strong social and symbolic values. Caretakers embody their communicational intent more, and rely on redundant multimodal combinations, when the children are younger. Gestures are used to attract the child’s attention to particular events, objects, or words, to highlight, reinforce, and disambiguate the spoken content, and function as a support system secondary to speech (Kelly, 2011). This could be considered as a particular feature of language addressed to infants and young children. “Gestural motherese” is quite specific in terms of the type of gestures used: It involves more deictic and recurrent (semi-conventional) gestures and fewer representational gestures (Özçalışkan & Goldin-Meadow, 2005).
However, not only do we need to integrate all semiotic resources to capture adult–child communicative behavior, but all actions and manipulations of artifacts could also be included. In real life and real time, humans communicate while they are involved in other activities such as eating, cooking, cleaning, drawing, or digging. It is thus crucial to analyze interactions in the context of what could be called “multiactivity” (Haddington, Keisanen, Mondada, & Nevile, 2014). As we adopt a multimodal, dynamic, and situated approach to language use in interaction, the analysis of both actions and symbolic gestures in their multimodal and interactive context is particularly relevant to apprehend how adults and children distribute meaning. Actions and gestures produced by young children are often subtly integrated into adult–child interaction, interpreted, and reformulated into spoken language by their parents. Indeed, body actions such as children lifting their arms to be picked up, sometimes repeated several times a day, are interpreted and reformulated into spoken language by adults and thus become ritualized requests, just as reduplications such as papapa are interpreted as referring to the child’s father in French and reformulated into the conventional word “papa.” As children progressively gain control over their bodies, their behaviors are more and more often interpreted by those around them as meaningful and are linked to specific affordances and contexts: Bringing the palms of the hands together is interpreted as clapping and takes on a praising function or expresses glee (Aronsson & Morgenstern, 2021); bringing the hand to the mouth or extending the arm toward a piece of fruit is interpreted as a request. Children’s body movements thus take on the status of conventional gestures and become intentional communicative signals (Tomasello, 2008).
Reference Goldin-Meadow, Mylander and FrankGoldin-Meadow, Mylander, and Frank (2007) have shown that when a mother reformulates the meaning of her infant’s gestures into words, those words for referents are more likely to enter the child’s spoken repertoire than words for non-reformulated referents are.
Past studies, our own research (Reference Morgenstern, Inbal, Estigarribia, Tice and KurumadaMorgenstern, 2014, Reference Morgenstern, Mazur-Palandre and Colon2019), and our examples illustrate that as children’s language skills develop, adults tend to give primacy to speech in their own productions but subtly use the visual-gestural modality to supplement and complement the verbal production according to their communicative needs. The use of multimodal child-addressed communication to reinforce verbal productions, along with actions, interpretations of children’s bodily movements into words, and adjustments to the child’s age as well as cognitive and linguistic skills, are crucial features of caretakers’ communication with young children.
4 Children’s Gestures before They Produce Spoken Language
As shown in Section 3, gestures provide a means for infants to enter communication prior to their own production of spoken language. Adults’ use of multimodal utterances seems to enhance children’s early comprehension of what adults are trying to communicate (Reference Pfandler, Lakatos and MiklósiPfandler, Lakatos, & Miklósi, 2013; Reference Wu and Gros-LouisWu & Gros-Louis, 2014). More specifically, young children assign meanings to nouns (Reference Clark and EstigarribiaClark & Estigarribia, 2011), to verbs (Reference Goodrich and Hudson KamGoodrich & Hudson Kam, 2009; Reference Ozçalışkan, Gentner and Goldin-MeadowÖzçaliskan, Gentner, & Goldin-Meadow, 2014), and to prepositions (Reference McGregor, Rohlfing, Bean and MarschnerMcGregor, Rohlfing, Bean, & Marschner, 2009) through the mediation of gesture.
Along with vocal productions, eye-gaze is the first semiotic resource used by children to enter communication. From three months of age, infants increasingly and reliably look in the direction of another person’s attention as signaled by the direction of that person’s gaze (Reference BrunerScaife & Bruner, 1975). From nine months, they follow adults’ gaze along with pointing gestures (Reference Brooks and MeltzoffBrooks & Meltzoff, 2005). Young children then progressively learn to use their gaze to monitor and guide others’ attention.
In early childhood vocal-motor coordination develops with neural maturation, shown in the rhythmic qualities shared by early hand-banging movements and canonical babbling (Reference MasatakaMasataka, 2003). Reference Iverson and FaganIverson and Fagan (2004, p. 1063) suggest “a link between more speech-like vocalizations and manual activity that may be a precursor to coordinated manual movement of the sort involved in adult gestures” (see example [1], video [1] for an illustration).
Meaningful manual actions precede and pave the way for the development of language and share a semantic link with gestures and words (Reference Capirci, Contaldo, Caselli and VolterraCapirci, Contaldo, Caselli, & Volterra, 2005), as we have shown in example 1. Between nine and ten months old, children use ritualized requests like open-close grasping motions or pulling an open hand to obtain something (Reference Bates, Benigni, Bretherton, Camaioni, Volterra and BatesBates, Benigni, Bretherton, Camaioni, & Volterra, 1979). While infants’ first vocalizations imitate the prosodic patterns that are most salient and frequent in their environment and that have pragmatic and affective functions (Reference Esteve-Gibert and PrietoEsteve-Gibert & Prieto, 2013), gesturing, thanks to children’s finer hand-motoric skills, allows them to express more specific semantic functions and thus communicate meanings that they might not be able to express with vocal means (or at least that are not captured in their vocal productions by adults). According to Reference Goldin-MeadowGoldin-Meadow (1999, p. 423), gesture may thus serve as a “way-station on the road to language.” It has also been shown that “late-bloomers” (children who seem to have a delay in their first word production, but catch up by the age of three) can be differentiated from late-talkers thanks to their gestural performance (Reference Capone and McGregorCapone & McGregor, 2004); gesture can compensate for verbal expressive deficits.
Conventional gestures enter the child’s repertoire at around 10–11 months old. The most frequent and usually the first gesture used is pointing accompanied by gaze. It has a variety of functions and uses (Reference Bates, Benigni, Bretherton, Camaioni, Volterra and BatesBates et al., 1979; Reference Liszkowski, Carpenter, Henning, Striano and TomaselloLiszkowski, Carpenter, Henning, Striano, & Tomasello, 2004) and a variety of forms, the most common being index-finger pointing (Reference AndrénAndrén, 2010; Reference BatesBates, 1976; Reference Franco and ButterworthFranco & Butterworth, 1996; Reference Morgenstern, Morgenstern and Goldin-MeadowMorgenstern, 2022). Deictic gesture development indicates how children gradually distance themselves from objects physically, without using touch and manipulation, and enter symbolic communication (Reference Capone and McGregorCapone & McGregor, 2004).
Children’s pointing used in interaction is integrated in the dialogue by adults as (proto)speech acts, and can be treated as requests or declaratives according to context (Reference BatesBates, 1976; Reference MarcosMarcos, 1998). When children can point not only to request an object but also to refer to it, that reveals their ability not only to enter joint attention but also to create a disconnection from the immediate context. Pointing at a location is progressively used to refer to an absent entity (Reference Le GuenLe Guen, 2011) and predicts to some degree the ability to use language.
The headshake is another gesture produced in early childhood in a wide range of cultures. It is taken up by children at first to mark refusal and rejection (Reference Beaupoil-Hourdel, Morgenstern, Boutet, Larrivée and LeeBeaupoil-Hourdel, Morgenstern, & Boutet, 2015; see also Reference GuidettiGuidetti, 2005). In example (2), video (2) (Table 14.2), Ellie is one year and two months old. She shakes her head several times, and the situation illustrates how the conventional gesture can be seen on a continuum with the action of avoidance.
Table 14.2 Example (2). Video (2). Ellie one year and two months. Between action and gesture: headshakes. https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%202-Ellie-1–02%20anticipation-grounded%20in%20routines.mp4
| nb | Part. | Actions and gestures | Vocal productions |
|---|---|---|---|
| 1 | MOTHER | Is preparing to wipe CHI’s mouth with a tissue. | That’s where they’re supposed to go. |
| 2 | ELLIE | Attentively watches her mother (MOT) and starts to shake her head as MOT’s hand approaches her face, but not very vigorously. | |
| 3 | MOTHER | MOT manages to wipe what she wanted from CHI’s face. | |
| 4 | MOTHER | MOT prepares a spoonful of food and lifts it up. | |
| 5 | ELLIE | CHI turns head and gaze away toward her right, lifts both her arms. | Da |
| 6 | MOTHER | Holds the spoon in place. | Are you done? |
| 7 | ELLIE | Gazes at MOT, then gazes down. | |
| 8 | MOTHER | | Are you sure? |
| 9 | GDMOT (grandmother) | Filming | Are you finished Ellie? |
| 10 | ELLIE | Looks at GDMOT, shakes her head. | |
| 11 | GDMOT | Filming | Oh, ooh! All done? Oh dear! |
| 12a | ELLIE | Continues to shake her head with a severe gaze at GDMOT. | |
| 12b | GDMOT | Laughs | |
| 13 | MOTHER | Brings up the spoonful to CHI’s mouth. | |
| 14 | ELLIE | Turns head away. | |
| 15 | MOTHER | Puts back spoon in bowl. Takes bowl and spoon away. | No, OK. |
| 16 | ELLIE | Very quick palm up open hand with left hand then opens arms. | |
| 17 | MOTHER | Wipes highchair tablet with tissue. | Do you want to go out then? |
| (…) | | | |
| 28 | MOTHER | | Are you actually still hungry? |
| 29 | ELLIE | Smacks her lips and gazes smilingly at MOT. | |
| 30 | MOTHER | Takes bowl and tries giving her a spoonful. | Oh you’re already done with breakfast aren’t you? |
| 31 | ELLIE | Turns head away very decisively. | |
From her past experiences, Ellie knows that when her mother prepares a wet tissue and moves it toward her face, she is about to wipe her face.
In turn 2, Ellie is already preparing to evade the wiping, but not too vigorously; maybe past experience has shown her that it is an unavoidable ritual and that, as unpleasant as it is, it is still bearable. Her avoidance is therefore not a complete rejection. However, in turn 5, she is more adamant in her refusal of more food. Her movement allows her to avoid the food physically but also to communicate her negation through a headshake and a vocal production “da,” which could be associated with the word “done” as reformulated in turn 6 in her mother’s recast “are you done?”.
Both her grandmother with verbal questions (turns 9 and 11) and her mother with the attempt at giving her another spoonful (turn 13) check that the meal is really over. This is made clear not only through Ellie’s headshake, which avoids and refuses the food once again (turn 14), but also through her bringing both arms up in what is interpreted by her mother as a request to get out of the highchair (turn 17 “do you want to go out then?”).
Figure 14.3 Ellie’s preparation for avoidance
In the same video, when her mother says the word “hungry” (turn 28), Ellie represents the idea with a playful smacking of her lips, which is an action sometimes performed by family members when saying “hungry” (Figure 14.6, turn 29), even though she has demonstrated previously and confirms at the end of this sequence that she wants breakfast to be over, as formulated by her mother (turns 30 and 31).
Figure 14.5 Request to get out of the chair
Figure 14.6 a, b Smacking lips
What have been called “representational gestures” in the literature are used around the age of 12 months, before the 25-word milestone (Reference Capone and McGregorCapone & McGregor, 2004). A child can manually represent the action of holding a glass and drinking, or use her hand to comb her hair. Reference Goodwyn and AcredoloGoodwyn and Acredolo (1993) argue that a gesture or a word is symbolic if it refers to multiple exemplars (including pictures or in the absence of the referent), if it is produced spontaneously (without following the model of an adult), and if it is not part of a well-rehearsed routine. The status of those gestures, often performed in pretend play, is however very different from that of the coverbal representational gestures used with speech later on.
Within a few months, children’s gestural repertoire is enriched, especially in cultures or families where the input is gesturally varied thanks to both the general multimodal features of face-to-face interaction and to specific use of gestures in infant-directed communication (as shown in Section 2).
Reference Iverson, Capirci and CaselliIverson, Capirci, and Caselli (1994) demonstrate that 16-month-old children show a preference for either words or gestures, but by 20 months, types and tokens of spoken words increase significantly. Reference Butcher, Goldin-Meadow and McNeillButcher and Goldin-Meadow (2000) longitudinally observed three boys and three girls during the transition from one-word to two-word speech. During the first session, most of the subjects (five out of six) produced the majority of their gestures without speech. During the following sessions, there was a decline in the proportion of gestures produced without speech. By the end of the observation period, the children mainly used gesture-speech combinations. This is when gesture-speech integration begins, with speech becoming progressively predominant in hearing populations.
5 Children’s Gestures as They Enter Verbal Language
Gestures and speech develop in hearing children as they manipulate more and more words. The child’s multimodal communication skills emerge in their first cross-modal combinations and multimodal constructions.
Speech and gesture together form an integrated system (Reference McNeillMcNeill, 1992). However, multimodal productions are quite rare at first, or are mostly gestalt communicative acts in which body movements and vocal productions are not fully controlled. Children first use the audio-vocal and visual-gestural modalities together to communicate about the same element. In example (3), video (3) (Table 14.3), when she is one-and-a-half years old, Ellie waves her right arm, shakes her head, and says “no” at the same time, but the headshake is pursued vigorously for a while. The gestural modality is amplified and maybe not as finely controlled as it will be when Ellie gets a little older and uses her semiotic resources with more expertise.
Table 14.3 Example 3. Video 3. Ellie, one year and six months. Refusal. https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%203-Ellie-1–02-Tangerine%20or%20ball.mp4 Ellie is in her highchair. Her bowl of food with chicken and fish is almost finished.
| Nb | Part. | Actions and gestures | Vocal and verbal productions |
|---|---|---|---|
| 1 | AUNT | Placing a piece of chicken on the fork in order to feed it to Ellie | Would you like a big piece Ellie? |
| 2 | ELLIE | Shakes her head and moves right arm with palm lateral toward her center and back. | No, no. |
| 3 | GDMOT | Filming | She’s always liked her fish though. |
| 4 | AUNT | Bent over the plate | Hu |
| 5 | GDMOT | Filming | XXXFootnote 7 Ellie, fish. |
| 6 | ELLIE | Still slowly shaking her head from side to side without interruption. | |
First gestures are tightly connected to first words. Reference Iverson and Goldin-MeadowIverson and Goldin-Meadow (2005) found that pointing precedes and predicts lexical acquisition during the early stages of language learning. The authors have shown that there is an increase in pointing gestures during the period when children’s vocabulary expansion is the largest. At least in Western cultures, parents often respond to children’s pointing by labeling, which in turn helps children integrate those words into their verbal repertoire (Reference Bruner, Sinclair, Jarvelle and LeveltBruner, 1978; Marcos, 2003). Productive use of gesture and speech is linked: Children who produced more meanings in gesture at 14 months showed faster growth in productive vocabulary use (word types) between 14 and 46 months (Reference Rowe, Raudenbush and Goldin-MeadowRowe, Raudenbush, & Goldin-Meadow, 2012).
At that age, children still often use gestures rather than words to express themselves, as Ellie does emphatically in example (4), video 4 (Table 14.4).
Table 14.4 Example 4. Video 4. Ellie, one year and 11 months. Echoing speech with gesture. https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%204-Ellie-1–11-stir.mp4
| nb | Part. | Actions and gestures | Vocal and verbal productions |
|---|---|---|---|
| 1 | GDMOT | Filming | Is that enough sugar Ellie? Is there enough sugar there? |
| 2 | ELLIE | Ellie gazes at GDMOT, then picks up big bowl full of sugar. | |
| 3 | MOTHER | Or do we need to put more in? Ellie, I think we need to put more in here. | |
| 4 | ELLIE | Ellie continues to pick up heavy bowl of sugar and starts pouring. | |
| 5 | MOTHER | Puts her hand on Ellie’s as the child is pouring the sugar and finishes for her. | Stop, stop! |
| 6 | GDMOT | Filming. Laughs loud. | Too late. |
| 7 | ELLIE | Emphatic shoulder shrug. | |
| 8a | MOTHER | Prepares flour and smiles. | She just like gave up. |
| 8b | ELLIE | Laughing. | |
When her grandmother says “too late” (turn 6) after Ellie has poured too much sugar in the bowl to make her cake, Ellie deploys a beautiful shrug, which could be categorized as a recurrent or pragmatic gesture (Reference DebrasDebras, 2017), and which is part of the child’s cultural repertoire of gestures: She lifts her shoulders as high as she can, slightly opening her two arms with a radiant smile on her face (Figure 14.7). Her mother reformulates the same meaning into speech by saying “she just like gave up.”
Figure 14.7 a, b Smiling, lifting shoulders, and opening arms
After that period, use of two simultaneous modalities for two different elements precedes the onset of two-word speech. This might be linked to children’s cognitive ability to produce simultaneous information before they can express the elements sequentially. Therefore, what Reference Levelt, Bellugi and Studdert-KennedyLevelt (1980) calls the “linearization problem” – in speech, we can only order the information successively in a linear format and we need to organize the elements and select what comes first – does not affect children who, just like adults, have multimodal resources to express themselves. By combining gesture with speech synchronously, they can form a predicative structure. In example (5), video (5) (Table 14.5), Ellie is pointing at various elements in the room and saying a color each time. The combination of gesture (pointing at Bob’s trousers) and word (black) in answer to the question “what color are Bob’s trousers?” could be glossed as “the trousers are black,” a multimodal utterance in which the subject, trousers, is designated through the gesture and the predication, “black,” is expressed with a word. Ellie expertly creates a multimodal construction with pointing + word, even though she does not yet appear to have mastered the association between color adjectives and the actual color of objects she points at, as the extract illustrates (she says black but the trousers are actually blue).
Table 14.5 Example 5. Video 5. Ellie, one year and nine months. Pointing + adjective. https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%205-Ellie-1–09-pointing%20and%20word.mp4
| nb | Part. | Actions and gestures | Vocal and verbal productions |
|---|---|---|---|
| 1 | GDMOT | Filming | What color are Bob’s trousers? |
| 2 | BOB | Gazing at Ellie | What color are my trousers? |
| 3 | ELLIE | Walks up to Bob. Points at the trousers with index on Bob’s leg. | Black |
| 4 | BOB | Black or are they blue? | |
| 5 | ELLIE | Moving her arms up. | Blue. |
In another session, at one year and 11 months, Ellie and her grandmother are playing with dolls called Maddie and Susie. The grandmother is taking care of Maddie while Ellie is taking care of Susie. The grandmother takes a bottle and pretends to feed Maddie. But Ellie shouts “Susie bottle!” and simultaneously produces a headshake. The headshake is often called a co-verbal gesture, but at least in this case, the spoken words could just as well be called “co-gestural.” The two modalities, verbal and gestural, are integrated to construct a multimodal utterance which could be interpreted as “no Grandma, the bottle is for Susie and not Maddie.” The predominance of the verbal channel characterizing Ellie’s communicative productions at the end of her second year leads us to subordinate gesture to speech from this age on in our terminology. However, in Ellie’s productions, each modality is used with a specific function, and they are assembled to form a negative assertion. The two modalities do not seem to be organized into a hierarchy; rather, her linguistic community leads the child to favor the verbal channel for practical reasons in her daily communicative practices, in a world in which we speak as we cook, eat, draw, clean, or drive, and can skillfully combine bodily actions and spoken communication.
Thus, around the age of two, speech comes to be used more than gestures to refer to objects (Reference Capirci, Contaldo, Caselli and VolterraCapirci et al., 2005; Reference Iverson, Capirci and CaselliIverson et al., 1994). During their second year, children also start making combinations. They produce many more [gesture+word] and [word+word] combinations than [gesture+gesture] combinations (Reference Capirci, Contaldo, Caselli and VolterraCapirci et al., 2005; Reference Capirci, Iverson, Pizzuto and VolterraCapirci, Iverson, Pizzuto, & Volterra, 1996; Reference Goldin-Meadow, Morford, Volterra and ErtingGoldin-Meadow & Morford, 1990; Reference Stefanini, Bello, Caselli, Iverson and VolterraStefanini, Bello, Caselli, Iverson, & Volterra, 2009). In most cases when gestures are combined, one or both of the gestures are deictic rather than representational or pragmatic (Reference Capirci, Iverson, Pizzuto and VolterraCapirci et al., 1996; Reference Morford and Goldin-MeadowMorford & Goldin-Meadow, 1992; Reference VolterraVolterra, 1981), as in example (4).
Studies have also shown that [gesture+speech] combinations reliably predict the onset of [word+word] combinations (Reference Butcher, Goldin-Meadow and McNeillButcher & Goldin-Meadow, 2000; Reference Capirci, Iverson, Pizzuto and VolterraCapirci et al., 1996; Reference Iverson, Capirci, Volterra and Goldin-MeadowIverson, Capirci, Volterra, & Goldin-Meadow, 2008). Interestingly enough, a majority of all types of gestures (deictic or representational) are coordinated with speech rather than used in isolation during this transition period (Reference AndrénAndrén, 2010). Even though speech and gesture have coevolved (Reference Levinson and HollerLevinson & Holler, 2014) and codevelop throughout infancy, hearing children who are bathed in multimodal input and capture language with all their senses are progressively directed toward the predominance of vocal communication, mediated by adult scaffolding. When children are between 18 and 30 months old, there still seems to be a symbiotic relation between speech and gesture. According to Andrén’s analyses of five Swedish children (Reference Andrén2010), a majority of the gestures observed were coordinated with speech. Gesture seems to play an important role in the productive use of multiword speech, as there is an increase in all types of gestures in association with speech between 24 and 30 months old. But speech becomes more dominant, confirming that it is the “typifying medium par excellence” in hearing dyads (Schutz, 1953, p. 10), as there is a shift to a more productive or more generalized mode of communicating in which multiword utterances become more common than one-word utterances associated with gestures.
6 Children’s Gestures When Verbal Language Is Mastered and Dominant
Once the multimodal communication system has been mastered, gesture and speech work more subtly together in later childhood. As speech develops, gestures become more and more diverse and elaborate, especially in their relation to speech. Representational gestures tend to appear more and more with verbs and adjectives, rather than with nouns (Reference Capone and McGregorCapone & McGregor, 2004) as shown in example (6), video 6, Table 14.6.
Table 14.6 Example 6. Video 6. Ellie, four years and two months. Big. https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%206-Ellie-4_02-this%20big.mp4 Ellie is co-narrating for her grandmother, with her mother’s help, her visit to a zoo.
Intriguingly, the visual modality is also used to represent what cannot be expressed in words: the exact dimension (or supposed dimension) of a baby gorilla and her mother. Figures 14.8 and 14.9 illustrate how Ellie gestures to demonstrate sizes. As she is co-narrating the story with her mother for her grandmother, she is constantly visually checking with her mother (who is filming at the time) that her gesture is “correct.”

Figure 14.8 “I think it was this big”
Figure 14.9 a, b Size readjustment: “this big?”
After her mother and grandmother’s comments (turns 5b and 6), Ellie adjusts her gesture (Figure 14.9, turn 7).
Then Ellie spreads her arms much wider; the amplitude of the space between her hands is now accompanied by her arms extending in a V shape and her hands bending wide (Figure 14.10, turn 9) to indicate the “huge” (turns 10a and 10b) size of the mother gorilla.
Figure 14.10 “the Mum is this big”
Communication thus remains multimodal in face-to-face interaction throughout the rest of our life span. As representational gestures increase and diversify even more, beats come into use (Reference CollettaColletta, 2004), as well as more metaphorical gestures and abstract deictics.
However, in his data, Reference AndrénAndrén (2010) found that as children’s language skills developed, conventional (recurrent) gestures were more likely to be used with speech than representational gestures. This is in line with Kendon’s observations (2008) on speakers with a rich repertoire of conventionalized gestures which are “fully integrated into the flow of everyday discourse” (p. 360). Andrén also found in his spontaneously recorded family interactions that the complexity in speech was symmetrically related to the complexity in gesture.
Studies have shown that, from the age of five, children’s motoric skills become more controlled, and gesture-speech integration thus becomes more subtle and complex (Reference Alibali, Kita and YoungAlibali, Kita, & Young, 2000). Children’s gestures and speech become more closely aligned and complementary in terms of semantics, syntax, and pragmatics as they produce more complex multimodal utterances (Reference Morgenstern, Inbal, Estigarribia, Tice and KurumadaMorgenstern, 2014, Reference Morgenstern, Mazur-Palandre and Colon2019). Reference CollettaColletta (2004) also showed that multimodal story-telling skills (linguistic, prosodic, and gestural skills in narration) develop together and simultaneously. Once they have mastered the complexity of speech, children can still resort to multiple semiotic resources, and to conventionalized arbitrary as well as non-arbitrary mappings grounded in their sensory and affective experiences, to fully express the various facets of their inner thoughts, desires, and feelings, and to produce comments, explanations, or narratives. Reference Colletta, Pellenq, Nippold and ScottColletta and Pellenq (2009) conducted multimodal analyses of explanations produced by French children aged three to 11 years. The authors found an increase in all observed measures: duration, number of syllables, number of clauses, and use of connectives, as well as use of co-speech gestures. Reference CollettaColletta (2009) showed that children aged nine years and over relied more on gesture and gaze resources and delivered truly embedded narratives, acting as narrators rather than only recounting facts and events they witnessed. But multimodal language production is found to be closely tied to context and genre. Reference Alamillo, Colletta and GuidettiAlamillo, Colletta, and Guidetti (2013) compared explanations and narratives produced by the same group of children.
The task had effects on the use of both language and gesture: Cohesion markers were more often used in narratives, while gestures and subordinate markers were more frequent in explanations. Reference ÖzyürekÖzyürek (2014) also showed how the different multimodal layers of integrated processing depend on context, pragmatic knowledge, and the communicative intent of the speaker.
Though a rich range of emotional facial expressions is discriminated in early childhood, their production, especially of those expressing stance in interaction, develops at different rates and continues all the way to adolescence (Reference Odom and LemondOdom & Lemond, 1972). More recent work conducted in different regions of France indicates that the use of facial expressions is a complex developmental process influenced by several factors, including age, gender, regional differences, and type of expression (Reference Grossard, Chaby, Hun, Pellerin, Bourgeois, Dapogny and CohenGrossard et al., 2018).
As children learn to modulate their expression with a rich palette of multimodal tools, they can progressively use the various degrees of iconicity (Reference EmmoreyEmmorey, 2014) that our semiotic resources can offer. They can use the most abstract, indirect relationships that have been generalized and conventionalized in language, as well as transparent imitation and embodied direct relationships that support their ability to refer to entities in their absence (displacement) and enable their interlocutors to capture their meaning. But gesture is not solely guided by our visual modality and by imagistic relationships between form and meaning. As we have illustrated in our examples, gestures are very much linked to actions. There is a continuum between actional/manipulative and communicative/symbolic gestures, which also explains why blind children use gestures (Reference Iverson and Goldin-MeadowIverson & Goldin-Meadow, 1998). Proprioception and sensorimotor skills are an integral part of children’s entrance into and mastery of multimodal communication.
In example (7), video (7) (Table 14.7), we illustrate Madeleine’s multimodal skills at the end of our longitudinal data.Footnote 8 As she is about to be seven years old, the little girl is able to recount both the content of the events and the discourse she has witnessed. In her productions related to the act of portraying (Reference StreeckStreeck, 2008), she has acquired the skills to show the situation reported in both the vocal modality and the gestural modality. She depicts her mother finding out on her phone’s agenda that she has a business meeting around the time of Madeleine’s birthday party.
Table 14.7 Example 7. Video 7. Madeleine, six years and 11 months. https://repository.ortolang.fr/api/content/cup-morgenstern/head/video%207-Mad-6_11-mince.mp4
The first instance of reported speech attributed to Madeleine’s mother (turn 1) is not introduced by a quotative verb (turn 2): She uses non-segmental markers to indicate that the viewpoint has changed. This involves a change in voice and accentuated gesturing with specific facial expressions. The observer understands her perfectly (turn 3). Madeleine’s use of gaze to change perspective is particularly interesting: She stops gazing at the observer as she takes on the role of her mother, in a personal transfer reminiscent of what Reference CuxacCuxac (2000) describes in sign language narratives (turn 6). The alternation of gaze is very consistent throughout the sequence. Her gaze at her interlocutor indicates that she is in the discourse space. When her gaze leaves the observer and is either on her hands “holding” the phone (Figure 14.11) as she embodies her mother, or up in the air as she makes exaggerated facial expressions, she expresses that she is entering narration space. But gaze management becomes even more complex. As she plays the role of her mother addressing herself, Madeleine, the little narrator, looks into the observer’s eyes (turns 6, 9, Figure 14.12) and thus makes the observer transfer into her own role when the event took place: The observer becomes Madeleine, while Madeleine acts as her own mother.
Figure 14.11 Narrative space

Figure 14.12 Discourse space
Madeleine’s voice has become that of her mother, expressed by the subtle pitch variations in her prosody. Madeleine’s body embodies that of her mother with her gestures and facial expressions.
We also notice in this passage how some multimodal constructions are used automatically by Madeleine, such as the rather sophisticated recurrent gesture involving a hand configuration performed with both hands (index extended), a particular localization and a cyclical movement that accompany the verb “régler” in French (to settle the problem, Figure 14.13).
Figure 14.13 a, b, c [régler + recurrent gesture]
At seven years old, Madeleine has become an expert multimodal communicator who masters the different functions of each modality and handles multimodal constructions to express both her own perspective and the perspectives she is able to attribute to others.
More studies of children across languages and cultures are needed to fully capture in detail the development of the various categories of gesture and find out more about similarities and variations between individual children’s pathways to master the complexity of multimodal communication.
7 Conclusion
From an early age, children have cognitive skills that allow them to analyze the language input that surrounds them and thus structure their own practices. Without mastering the complex uses of each word, each gesture, each intonation pattern, and each multimodal construction, they can still construct meaning.
Children use all the resources provided by their bodies to express themselves when they are brought up in an environment that is favorable to multimodal communication and language development. They construct a shared repertoire of gestures and words with their interlocutors. But they constantly use the multimodal resources at their disposal and progressively enrich the complexity of their production, as the examples from Ellie and Madeleine’s longitudinal data have illustrated.
More specifically, throughout infancy, children learn the meaning of gestures in multimodal utterances produced around them. Child-directed communication is very often more emphatically multimodal than is typical in adult-addressed language. Children are also socialized to the use of gestures through specific routines, songs, and rituals in which gestures are integrated or even focused on. Coordination of gaze, gesture, facial expressions, posture, and speech for communication can already be observed in children’s early productions, in gestalt-like multimodal communicative acts. However, this orchestration of semiotic resources used for interaction develops steadily throughout childhood and into adulthood. Children learn to dissociate the uses and specific functions of each modality and to master the dynamic multimodal communicative system used around them and with them in their daily interactions.
1 Introduction
Most people in the world speak more than one language – many learning new languages throughout life, inside and outside of classrooms, for reasons of study, work, migration, religion, or pleasure. Studies of second language (L2) acquisition (SLA) or foreign language acquisition (FLA) examine how this comes about, querying how a new language emerges and develops in the mind in the presence of one or several existing ones. SLA studies track developmental trajectories and “outcomes,” and explore the factors assumed to influence acquisition, such as the nature of the languages that come into contact (e.g. the difference between being a Russian, Dutch, or Japanese learner of English), learners’ age (child vs. adult), skill levels or proficiency, individual cognitive capacities such as working memory and language-learning aptitude, the learning situation (classroom vs. naturalistic settings), the type of instruction (form- vs. meaning-based), the amount of input/exposure, and patterns of output/use in conversation and interaction (see Gass & Mackey, 2012 for overviews). SLA is thus a vast field of study with linguistic, psychological and neurocognitive, social, anthropological, and pedagogical subfields. Until quite recently, gestures were not seen as scientifically relevant in any of these subfields. However, as embodied views of cognition and language gain ground (e.g.
Barsalou, 2008; Glenberg & Kaschak, 2002), and evidence grows that gestures are an integral part of language production and comprehension (Clark, 1996; Goldin-Meadow, 2003; Kendon, 1972, 2004; McNeill, 1992, 2014; Volterra, Beronesi, & Massoni, 1990), gestures are becoming important to SLA concerns as part of the cross-linguistic, psycholinguistic, and sociolinguistic variation to consider.
Over the past two decades, gesture and SLA has become a thriving field of study in its own right (see Gullberg, 2006b, 2008; Gullberg & McCafferty, 2008; Stam, 2012; Stam & Buescher, 2018 for overviews). The work largely concentrates on two broad domains: gestures as a window onto language acquisition issues, and gestures as a medium of acquisition, where the effect of gesture on acquisition is examined. In principle, a third domain could be the study of the acquisition of gestural repertoires in and of themselves (cf. Gullberg, 2014), but very little such work exists.
The current overview will focus on traditional SLA, that is, on speakers whose language skills are still emerging (traditionally “L2 users” or “L2 learners”) and whose learning is at stake, leaving bilingualism aside (but for a review of gestures and bilingualism, see Gullberg, 2012a). Moreover, the review will focus on manual gestures rather than on the full repertoire of behaviors that are part of multimodal communication (gaze, body orientation, head movements, facial expressions, etc.). This chapter summarizes current research on what gestures reveal about SLA and L2 development, and on how gestures affect SLA in interaction and in instruction. It closes with a possible research agenda outlining further issues to explore. Methodological issues for gestures and SLA will not be discussed (but see Gullberg, 2010, 2012b for overviews). A final terminological point is that we will follow Kendon (2004) and refer to gesture functions (representational/referential, pragmatic) rather than gesture “types” wherever necessary.
2 Gestures in Acquisition
2.1 The Influence of Other Languages in SLA, Cross-Linguistic Influence
In contrast to child language acquisition, adults already have languages in place when they learn new ones. A key issue in SLA studies is how those established languages influence the emergence of new ones in speakers’ minds, or more generally, how languages in contact in one mind influence each other. Foreign accent is the textbook example of how a first language (L1) influences a second (L2). The study of cross-linguistic influence (CLI) (Jarvis & Pavlenko, 2008) is a huge research area in SLA, often seen as a main reason why L2 learners do not achieve “nativelikeness.” Learners are assumed to continue to rely on categories and structures from the L1 rather than restructure towards those of the L2. Traditionally, similarities across languages have been assumed to facilitate learning (“positive transfer”), and differences to cause difficulties (“negative transfer,” “interference”), a view no longer adhered to in these simplistic terms (Jarvis & Pavlenko, 2008; Odlin, 2003).
Cross-linguistic comparisons between structures in the L1 and the L2 are key to this line of study. The growing body of work showing that native speakers of different languages also gesture systematically differently as a function of how their languages encode and express meaning (see Kita, 2009 for an overview) has opened an opportunity to use gesture analysis as a tool to reveal more about the nature of learners’ representations. Critically, the argument for SLA studies is that gestures may reveal whether L2 speakers continue to use conceptual representations and categories from the L1 (producing L1-like gesture patterns) rather than show restructuring towards L2 representations (producing L2-like gesture patterns). Crucial to this argument is the assumption that gestures reflect conceptual-semantic elements (e.g. path and manner of motion) as well as their morphosyntactic organization (e.g. word order, number of clauses).
A few linguistic domains with well-documented cross-linguistic differences have been studied for bimodal CLI effects. The vast majority of studies target voluntary and caused motion, drawing on Talmy’s (1991, 2000) distinction between verb-framed languages, which encode path of motion in verb roots (e.g. French traverser “cross”), and satellite-framed languages, which instead encode path of motion in satellites (e.g. prepositions, like English across), leaving the verb to express manner of motion (e.g. crawl, run, sashay). Many studies have shown that L2 learners often do not gesture about motion like native speakers of the target language but continue to gesture in L1-typical ways even as they speak the L2. L1 traces can be found in gestural timing: Learners may temporally align their gestures with different spoken elements than native speakers do (e.g. with verbs vs. locative phrases; e.g. Choi & Lantolf, 2008; Stam, 2006). Traces can also be found in gestural forms, with learners targeting different semantic content in gestures than native speakers do (e.g. path, manner, objects; e.g. Gullberg, 2009; Özyürek, 2002). Findings are often discussed in terms of Slobin’s notion of “thinking for speaking” (Slobin, 1996), that is, the idea that linguistic categories influence what information speakers select for expression when speaking. L1-like gesture patterns suggest that L2 learners continue to think for speaking in the L1 rather than the L2.
Similar results are found with expressions of time and verbal aspect. Mandarin Chinese makes use of vertical time metaphors (e.g. shàng “above,” xià “below” for earlier [past] and later [future]), which are often accompanied by vertical gestures. English speakers instead express time in other metaphors (e.g. before, after) and often gesture on a lateral spatial axis (Gu, Mol, Hoetjes, & Swerts, 2017). Mandarin learners of L2 English occasionally also produce vertical time gestures in L2 English, revealing a lingering L1 influence. The SLA of verbal aspect is well studied in speech (Bardovi-Harlig, 2000), but less is known about the expression of aspect in gesture (e.g. Duncan, 2002). A recent analysis contrasts so-called bounded gestures, involving a “pulse of effort,” with unbounded gestures, involving “smooth movement” (Cienki & Iriskhanova, 2018, p. 3). In this study, native speakers of French aligned perfective aspect (passé composé) significantly more often with bounded gestures, and imperfective aspect (imparfait) with unbounded gestures, whereas L1 Russian speakers aligned bounded gestures with both aspects. Russian L2 learners of French continued to produce bounded gestures with both aspects, but also increased their use of unbounded gestures, suggesting the start of a shift towards a French pattern (Denisova, Cienki, & Iriskhanova, 2018).
A recent line of work examines CLI between gestures and sign languages, specifically whether L1 co-speech gestures influence L2 sign acquisition. It has been argued that gestures affect the acquisition of sign hand shapes and iconic signs adversely, since hearing learners draw on their more varied and unrestricted co-speech gestures and therefore pay less attention to the (phonological and) articulatory details of signs (e.g. Janke & Marshall, 2017; Ortega & Morgan, 2015a, 2015b). In contrast, other studies have suggested that gestures may help L2 sign learners acquire properties of sign prosody (Brentari, Nadolske, & Wolford, 2012). This is clearly a domain open to exploration.
The CLI of gesture on gesture is also examined. It is often assumed that members of some cultures gesture more than those of others (e.g. Scheflen, 1972). Despite a noteworthy lack of data comparing gesture rates cross-culturally under comparable conditions, preconceived ideas are rife (cf. Sekine, Stam, Yoshioka, Tellier, & Capirci, 2015). Native speakers’ gesture rates are occasionally reported, but typically in specific tasks and limited domains (e.g. Gullberg, 1998; Iverson, Capirci, Volterra, & Goldin-Meadow, 2008; Pettenati, Sekine, Congestrì, & Volterra, 2012; Yoshioka, 2005). Nevertheless, a few studies have examined whether gesture rates transfer. One study explored transfer from what the authors assumed to be gesture-frequent languages, Spanish and French, into a language with a lower gesture frequency, English (Pika, Nicoladis, & Marentette, 2006). All bilinguals gestured more than the monolingual English speakers, but it remains unclear why, since there was no L1 baseline data for the Romance languages. In contrast, So (2010) established that monolingual Mandarin Chinese speakers gestured less than American English speakers. Chinese-English bilinguals gestured as frequently in English as English monolinguals, and more often than Chinese monolinguals in Chinese, specifically producing more representational gestures. It remains unclear why the influence only affected representational gestures and not non-representational ones. This line of study clearly needs solid baseline data, but also clearer theorizing about why gesture rates should transfer, and why certain gesture functions transfer more than others.
In sum, influence from the L1 on the L2 is typically found both in speech and in gesture. Importantly, most studies suggest that gestures are more conservative than speech, meaning that they can reveal enduring CLI even when speech may have shifted to target L2 structures. However, whether learners show persistent CLI in speech and gesture (Özçalışkan, 2016), or early shifts of gesture patterns and evidence of further learning (Gullberg, 2009; Lewis, 2012; Stam, 2015), depends on factors such as learners’ proficiency in, exposure to, and usage of the L2 over time. Control over these factors is vital in SLA research in order to meaningfully assess patterns and possible development.
Traditionally, CLI studies have only examined influences from the L1 on the L2. However, prompted by psycholinguistic insights that all the languages one knows are typically active at any time (Van Hell & Dijkstra, 2002), studies have now begun to examine influences from the L2 on the L1 in speech and gesture (cf. papers in Cook, 2003). For example, Japanese speakers with intermediate knowledge of L2 English talk and gesture significantly differently about manner of motion in their L1 Japanese than monolingual Japanese speakers do (Brown & Gullberg, 2008). The distribution of manner across speech and gestures shows traces of English speech-gesture patterns, although the speakers’ L2 skills are very modest. Even more strikingly, within speakers, the speech-gesture patterns are indistinguishable even when they are speaking the two different languages. The results thus reveal an influence of the L2 on the L1, but also an influence of the L1 on the L2. Similar bidirectional shifts have been found in Mandarin and Japanese intermediate learners of L2 English (Brown, 2015; Iwasaki & Yoshioka, 2020), and in the acquisition of an L2 sign language, where L1 co-speech gestures are affected by sign, as reflected in increased overall gesture rates, an increased number of iconic gestures, and a greater number of hand shape types (e.g. Casey, Emmorey, & Larrabee, 2012; Gu, Zheng, & Swerts, 2019). Not surprisingly, the longer the experience with the L2, the more likely the influence on the L1, both for speech-gesture and for sign (Stam, 2015; Weisberg, Casey, Sevcikova Sehyr, & Emmorey, 2020).
Obviously, demonstrable influences of an L2 on an L1 even at modest levels of skill in the L2 raise important theoretical and practical challenges for views of the monolingual native speaker norm (cf. Davies, 2003).
Although CLI is a flourishing area of study, many things remain unknown. We know remarkably little about CLI effects beyond the linguistic domain of motion. We must explore other linguistic domains if gesture analysis is really to contribute to SLA studies of CLI. It is an obvious challenge to first establish L1 baselines in new domains, especially for gesture, but that is a challenge we must meet. We also need more longitudinal studies to further our understanding of how the speech-gesture ensemble develops with increasing proficiency and time, and why gestures are more “conservative” (cf. Stam, 2015). Furthermore, CLI has so far only been studied in bimodal language production. We know nothing about whether native interlocutors perceive and care about learners’ “manual accents” and non-target-like L2 gestures the way they do about foreign accent in speech. Although some studies show that learners’ production of gestures has a positive effect on assessments of their skills (Gullberg, 1998; Jenkins & Parra, 2003), studies have not directly examined native perception of “foreign gesture” or its potential interactional consequences (but see Hooijschuur, Hilton, & Loerts, 2017).
2.2 General Learner Phenomena
SLA studies are not only interested in CLI. Research also examines learners’ language use as a variety in its own right, as well as properties determined by general learning mechanisms, referred to as interlanguage or learner varieties (Perdue, 2000; Selinker, 1972). In this perspective, gesture analyses shed light on details of general developmental patterns at a given level of skill.
One line of work considers the relationship between fluency, complexity, and accuracy (e.g. Housen, Kuiken, & Vedder, 2012), a field of SLA research that has hitherto focused exclusively on speech. In gesture studies, however, the relationship between fluency, proficiency, and gesture rates has generated much work. The general expectation is that the lower the proficiency, the higher the gesture rate – either because gestures can function as communication tools and problem solvers in L2 production (cf. Section 3.1), because (representational) gestures facilitate lexical access (Rauscher, Krauss, & Chen, 1996), or because greater cognitive load is associated with increased gesture production regardless of linguistic challenges (e.g. Alibali, Yeo, Hostetter, & Kita, 2017; Melinger & Kita, 2007). And indeed, many studies show that L2 learners and bilinguals typically produce more gestures overall than native speakers and monolinguals do (see Gullberg, 2012a; Nicoladis, 2007, for overviews). However, the link to fluency and proficiency is not straightforward. For example, gestures are overwhelmingly produced with fluent rather than with disfluent speech, both in L1 and L2 production (e.g. Graziano & Gullberg, 2018). Furthermore, gesture rates may be modulated by task demands (e.g. Aziz & Nicoladis, 2019; Lin, 2020), the languages involved (e.g. So, 2010), and individual communicative style (e.g. Gullberg, 1998; Nagpal, Nicoladis, & Marentette, 2011). Different gestures may also be differentially affected.
Some studies suggest that low-proficiency learners mainly produce representational gestures (e.g. Kida, 2005), whereas others find more representational gestures with increasing proficiency and fluency (Gregersen, Olivares-Cuhat, & Storm, 2009; Gullberg, 1998), or even U-shaped developmental patterns (Zvaigzne, Oshima-Takane, & Hirakawa, 2019). Some suggest that pragmatic and deictic gestures are more frequent in early L2 production (Gullberg, 1998; Isaeva & Fernández-Villanueva, 2016). To understand the interplay between language skills/proficiency and bimodal behavior, we clearly need more detailed studies that track gesture use, task demands, linguistic complexity, fluency, cognitive capacities such as working memory (cf. Cook & Fenn, 2017), and (critically) independently established proficiency or skill levels. More detailed analyses of the temporal relationship between gestures and elements in speech (including disfluency) would also help elucidate the relationship. At this stage, it would be ill-advised to consider gesture rate a reliable diagnostic of L2 proficiency.
Another concern in SLA is how L2 learners at early stages of proficiency organize information about who does what to whom (“reference tracking”) to create coherent and intelligible discourse (Perdue, 2000), especially when they do not yet master pronouns or word order patterns in the L2. When introducing and referring back to entities, native speakers typically create referential chains of lexical noun phrases for new entities followed by pronouns for known ones (a girl – she). New entities are often accompanied by anchoring or localizing gestures, whereas known ones are not (Azar, Backus, & Özyürek, 2019; Debreslioska & Gullberg, 2020; Foraker, 2011; Levy & McNeill, 1992). In contrast to this pattern, early L2 learners with different L1s often use overly explicit chains of lexical noun phrases to refer to the same entity whether new or known (girl – girl) (Hendriks, 2003; Williams, 1988). They also anchor entities with gestures, and typically gesture about them at every mention regardless of whether they are new or known, creating over-explicit bimodal reference tracking (Gullberg, 2003, 2006a; So, Kita, & Goldin-Meadow, 2013; So, Lim, & Tan, 2014; Yoshioka, 2008; Yoshioka & Kellerman, 2006). These patterns are similar regardless of learners’ L1 and L2, suggesting that this is a general learner phenomenon. Moreover, since the patterns persist even when addressees cannot see learners’ localizing gestures, they do not seem to be a strategy for disambiguation (Gullberg, 2006a). That said, the nature of the languages in contact may affect the timing of the anchoring gestures.
For example, gestural reference tracking with lexical noun phrases in languages like Swedish and French may be different from that in languages like Japanese, Mandarin Chinese, or Turkish, which typically drop arguments (nouns, pronouns) in native production. This makes it challenging to assess whether anchoring gestures in clauses with dropped arguments serve other functions than they do in languages with overt arguments (cf. So et al., 2013; Yoshioka, 2008). Moreover, over-explicit reference tracking is not necessarily found in L2 sign language (Frederiksen & Mayberry, 2019), again suggesting language-specific issues to explore. Much remains to be clarified regarding bimodal reference tracking in different L1–L2 pairings and at different proficiency levels.
3 Gestures in L2 Interaction and Instruction
3.1 Teachers’ and Learners’ Gestural Practices in and outside of Language Classrooms
An important domain in SLA studies is how learners and their native and non-native interlocutors combine speech and gestures in interactive practices to promote communication, comprehension, and learning. All forms of didactic talk – by adults to children (child-directed speech) and by native speakers to L2 users (“foreigner/teacher talk,” Ferguson, 1971) – appear to display an increased use of gestures (e.g. Adams, 1998; Allen, 2000; Iverson, Capirci, Longobardi, & Caselli, 1999; Lazaraton, 2004). Teachers, instructors, and parents clearly think that seeing gestures facilitates learners’ comprehension (cf. Kellerman, 1992; Sueyoshi & Hardison, 2005) and possibly also learning. Studies of gesture in communicative practices are often – but not always – qualitative in nature, focusing on interaction. The focus is typically not on measurable effects on learning.
A range of studies investigates how learners themselves use speech and gesture as resources to enable communication in the L2 when linguistic skills are limited. Studies have focused on how speech and gesture are jointly deployed to avoid communicative breakdowns, resolve misunderstandings, and handle repairs. In SLA studies, the field of communication strategies examines how learners solve lexical, grammatical, and pragmatic problems (Kasper & Kellerman, 1997). Studies have focused both on interactive and cognitive aspects of this process and identified strategies such as circumlocutions, word coinage, avoidance, and gestures. For example, French and Swedish learners produce representational gestures to resolve lexical issues in joint solutions with their interlocutors; representational and deictic gestures to handle grammatical difficulties; and addressee-directed and pragmatic gestures (“thinking” or “cyclic” gestures; Gullberg, 2011; Ladewig, 2014) to manage difficulties arising from non-fluent speech, turn-taking, and repairs (Gullberg, 1998, 2011). Conversation analysis shows that gestural word searches and repair sequences in particular are deeply interactive (Fornel, 1991; Harrison, Adolphs, Gillon Dowens, Du, & Littlemore, 2018; Olsher, 2008). Learners also talk and gesture to themselves to solve problems and to rehearse and internalize new knowledge, a practice known as private speech and private gesture (Lee, 2008; McCafferty, 1998; McCafferty & Rosborough, 2014).
Such talk is often accompanied by beat-like gestures, both during the problematic sequence and at the resolution of the problem (Hauser, 2014, on solution strokes). Interestingly, when teachers themselves are non-native speakers, they too become more fluent the more they gesture (Sato, 2020).
Another line of work focuses specifically on instructed SLA activities in classrooms. Studies in the traditions of Conversation Analysis or Sociocultural Theory exemplify how both learners and teachers use gestures in classroom interaction to support the teaching and learning of vocabulary, grammar, pronunciation, pragmatics, and even writing (Eskildsen & Wagner, 2013; Kim & Cho, 2017; Kimura & Kazik, 2017; Lazaraton, 2004; Matsumoto & Dobs, 2017; Smotrova, 2017). Studies also examine how gestures are used to regulate classroom interaction (Cekaite, 2009; Tabensky, 2008), raise awareness of elements to be learned (Hilliard, 2020; van Compernolle & Williams, 2011), and enable joint coproduction of the L2 (Mori & Hayashi, 2006; Olsher, 2004). Students’ reactions to teachers’ gestures vary (Sime, 2006), but learners clearly often pick up and reuse teachers’ and other students’ gestures, suggesting that these gestures play a role in the internalization and entrenchment of new lexical and grammatical elements (Belhiah, 2013; Clark & Trofimovich, 2016; Eskildsen & Wagner, 2015; Smotrova & Lantolf, 2013). Finally, gestures play a role in learners’ explorations of their L2 identities, especially after lengthy use of the new language (Nardotto Peltier & McCafferty, 2010; Tian & McCafferty, 2020).
Teachers’ gestures are the focus in studies of pedagogical practice. Tellier has identified three broad functions of teachers’ gestures, namely to inform, to animate (meaning to enliven and guide), and to assess (Tellier, 2006, 2014; cf. Quinlisk, 2008). Interestingly, when using gestures to inform and explain vocabulary, teachers modulate their behavior depending on the skill level of the student. They produce more representational gestures of longer duration and greater spatial expanse when explaining vocabulary to non-native than to native students (Tellier & Stam, 2012), and often produce gestures both before and after speech pauses to highlight a new word (Stam & Tellier, 2017). This is a form of gestural “teacher talk” (Sinclair & Brazil, 1982).
The gestural assessment function is often studied under the SLA heading of (corrective) feedback and recasts, that is, cases where a teacher provides feedback on learners’ production, for example through recasts that provide a correct reformulation (see Nakatsukasa & Loewen, 2017 for an overview). Many studies in this domain are descriptive microanalytical studies (e.g. van Compernolle & Smotrova, 2014), but there are also intervention studies using pre-test/post-test designs, with results showing both positive (Nakatsukasa, 2016) and more modest effects of gestural recasts on learning (Nakatsukasa, 2021), possibly depending on the linguistic domain being taught.
Overall, instructed SLA settings remain underexplored with regard to multimodality and gesture, in particular concerning the effects of different kinds of instruction, such as a focus on formal aspects of language (“focus on form”) versus a focus on communication (“focus on meaning”) (Doughty, 2003). Moreover, the acquisition of vocabulary continues to dominate these studies, whereas other linguistic domains should be examined. There is a budding literature on gestures in the teaching of grammar and pronunciation, but these are big domains where many subareas remain to be explored.
3.2 The Effects of Seeing and Producing Gestures on SLA
The studies reviewed above have all assumed that gestures benefit learners’ L2 development and use, but, with the exception of the pre-test/post-test studies, they have not typically measured actual effects. Outside of language studies, in contrast, a considerable body of research now deals with the measurable effects of seeing and producing gestures on memory and learning (see Reference Cook, Fenn, Breckinridge Church, Alibali and KellyCook & Fenn, 2017, for an overview). Results typically show that gestures promote learning. For example, learners who see and produce gestures learn more about math and science than those who do not, both in the short term and in the long term. Moreover, those with lower working memory capacity typically benefit more from gestures than those with higher capacity.
Studies examining possible effects for language learning and specifically SLA have been scarcer. However, following the pioneering studies by Reference AllenAllen (1995), Reference TellierTellier (2008), and Reference Kelly, McDevitt and EschKelly, McDevitt and Esch (2009), an explosion of behavioral and neurocognitive investigations now examines how gestures during training affect language learning in child and adult L2 learners.
Overall, studies find beneficial effects of gesture perception and, even more so, of gesture production in different linguistic domains. Vocabulary in real and artificial languages is better retained when presented with gestures, especially if learners produce gestures themselves during explicit training (e.g. Reference Andrä, Mathias, Schwager, Macedonia and von KriegsteinAndrä, Mathias, Schwager, Macedonia, & von Kriegstein, 2020; Reference García-Gámez and MacizoGarcía-Gámez & Macizo, 2019; Reference Kelly, McDevitt and EschKelly et al., 2009; Reference Krönke, Mueller, Friederici and ObrigKrönke, Mueller, Friederici, & Obrig, 2013; Reference Macedonia, Müller and FriedericiMacedonia, Müller, & Friederici, 2011; Reference MorettMorett, 2014, Reference Morett2018). Even without explicit training, seeing gestures implicitly helps learners to link meaning to word forms in an unknown language (Reference Gullberg, Roberts and DimrothGullberg, Roberts, & Dimroth, 2012). Gesture training improves L2 pronunciation of vowel and syllable durations, word stress, and intonation (e.g. Reference Ghaemi and RafiGhaemi & Rafi, 2018; Reference Gluhareva and PrietoGluhareva & Prieto, 2017; Reference Iizuka, Nakatsukasa and BraverIizuka, Nakatsukasa, & Braver, 2020; Reference Kushch, Igualada and PrietoKushch, Igualada, & Prieto, 2018; Reference Li, Baills and PrietoLi, Baills, & Prieto, 2020; Reference Yuan, González-Fuente, Baills and PrietoYuan, González-Fuente, Baills, & Prieto, 2019; Reference Zhang, Baills and PrietoZhang, Baills, & Prieto, 2018), and the learning of phonological distinctions such as lexical tones in Mandarin Chinese (Reference Baills, Suárez-González, González-Fuente and PrietoBaills, Suárez-González, González-Fuente, & Prieto, 2019; Reference Morett and ChangMorett & Chang, 2015). Even L2 grammar may benefit from gesture training, such as in learning about prepositions (e.g. Reference NakatsukasaNakatsukasa, 2016). 
The behavioral findings are bolstered by neurocognitive evidence highlighting that gestures are integrated with language in processing (see Reference Kelly, Breckinridge Church, Alibali and KellyKelly, 2017, for an overview), also offering possible explanations for the learning effects. It is suggested that gestures add a depth of encoding since sensorimotor brain networks are activated and grow larger the more sensory modalities are connected to a word (Reference Macedonia, Repetto, Ischebeck and MuellerMacedonia, Repetto, Ischebeck, & Mueller, 2019).
Importantly, studies also examine effects of different kinds of gestures. Most studies have investigated the effects on vocabulary learning of representational gestures depicting concrete content (e.g. size, shape, movement) (e.g. Reference Kelly, McDevitt and EschKelly et al., 2009; Reference Macedonia and KnöscheMacedonia & Knösche, 2011; Reference MorettMorett, 2014; Reference PorterPorter, 2016; Reference So, Sim Chen-Hui and Low Wei-ShanSo, Sim Chen-Hui, & Low Wei-Shan, 2012; Reference TellierTellier, 2008). Studies have also examined the effect of gestures depicting abstract content (e.g. a lateral sweeping movement to depict duration), but here results are more mixed (Reference Baills, Suárez-González, González-Fuente and PrietoBaills et al., 2019; Reference Hirata and KellyHirata & Kelly, 2010; Reference Hirata, Kelly, Huang and ManansalaHirata, Kelly, Huang, & Manansala, 2014; Reference Li, Baills and PrietoLi et al., 2020). Studies have also examined the effect of non-representational gestures, such as prominence-marking manual beats (Reference Kushch, Igualada and PrietoKushch et al., 2018), hand-clapping (Reference Iizuka, Nakatsukasa and BraverIizuka et al., 2020; Reference Zhang, Baills and PrietoZhang et al., 2018), and head nods (Reference Zheng, Hirata and KellyZheng, Hirata, & Kelly, 2018).
Although gestures generally seem to boost SLA, various caveats apply. Importantly, the semantic content of gestures should match that of vocabulary items (Reference García-Gámez and MacizoGarcía-Gámez & Macizo, 2019; Reference Huang, Kim and ChristiansonHuang, Kim, & Christianson, 2019; Reference Kelly, McDevitt and EschKelly et al., 2009; Reference MacedoniaMacedonia, 2019). Non-matching gestures may hinder acquisition more than the total absence of gestures. Another important issue is task demands (Reference Kelly and LeeKelly & Lee, 2012; Reference Morett and ChangMorett & Chang, 2015) and learners’ developmental levels. For example, representational gestures help both children and adults to retain new vocabulary, but beats may only help adults (Reference So, Sim Chen-Hui and Low Wei-ShanSo et al., 2012).
The importance of these details is highlighted by some (seemingly) contradictory evidence for effects in the domains of phonology and phonetics. Hirata and colleagues (Reference Hirata and KellyHirata & Kelly, 2010; Reference Hirata, Kelly, Huang and ManansalaHirata et al., 2014; Reference Kelly, Hirata, Manansala and HuangKelly, Hirata, Manansala, & Huang, 2014) have found that while lip information improves English learners’ ability to perceive long and short vowels in Japanese, manual sweeping gestures indicating duration have limited effects both on the perception of vowels and on the retention of vocabulary containing the distinction. However, the same sweeping gesture has recently been shown to improve the pronunciation of the vowel distinction, but not performance on a perceptual task (Reference Li, Baills and PrietoLi et al., 2020). In both cases, then, effects are limited in the perceptual domain, but there is gestural boosting in L2 production. Reference Kelly, Breckinridge Church, Alibali and KellyKelly (2017) has suggested that some linguistic components are more deeply connected to gestures (e.g. concrete semantics, pragmatics) and may therefore be more susceptible to gestural effects on learning than others (e.g. syntax, phonology). Although an interesting suggestion, it may be premature to discount the effect of gestures on the acquisition of whole linguistic domains until more careful distinctions have been made between perception/production benefits and different task difficulties. The effect of cognitive task complexity also remains largely unexamined in the gesture-learning literature (but see Reference Nicoladis, Pika, Yin and MarentetteNicoladis, Pika, Yin, & Marentette, 2007). Given the great variation of tasks used in these studies (e.g. perceptual tasks: multiple choice, word-meaning associations, word recognition, discrimination; production tasks: free and cued word recall, foreign to native/native to foreign translation, imitation), there is much to explore and systematize here.
Aspects that have not yet been properly addressed include how much training is needed for gestures to have an effect (Reference MacedoniaMacedonia, 2019), and the corresponding longevity of the effects. Most studies find effects still present one to two weeks after training, but proper longitudinal studies are rare (but see Reference Macedonia and KlimeschMacedonia & Klimesch, 2014). Further, most studies are intervention studies involving explicit classroom-like instruction. Few studies have investigated the potential of gestures for incidental or implicit L2 learning, that is, learning in the absence of overt instruction. The difference between explicit and implicit language learning is an important research domain in SLA studies (see Reference HulstijnHulstijn, 2005 for an overview), where the role of attention and differences between procedural (implicit) and declarative (explicit) learning, memory, and knowledge are vital areas of study (e.g. Reference ParadisParadis, 2009; Reference Schmidt and RobinsonSchmidt, 2001; Reference UllmanUllman, 2001). No SLA studies of implicit/explicit learning include gestures. Since implicit L2 learning is arguably more common in the world than explicit classroom instruction, the role of gestures in implicit L2 learning clearly merits study. A further issue concerns attention to the linguistic details, including the parts of speech to be learned, and the role of similarity. It is well known in SLA that truly new distinctions may be easier to learn than distinctions that are similar to those in the L1 (e.g. Reference Flege, Burmeister, Piske and RohdeFlege, 2002; Reference IjazIjaz, 1986). Very little attention is paid to these details, and potential (unintended) cross-linguistic effects therefore remain unexplored. Finally, the nature of the gestural enhancement receives surprisingly little detailed attention in these studies.
With the exception of Reference MorettMorett (2018), who has shown that learners’ own spontaneous gesture production has greater effects on word recall than viewing someone else’s non-spontaneous gestures, very little information is provided about how training gestures are selected, their articulatory and temporal features relative to speech, and so on. All these aspects are likely to play a role but remain underdescribed and underexplored.
4 A Research Agenda for Gestures and SLA?
The study of SLA and gesture has made considerable progress in the past 20 years. However, much remains unexplored. Across all domains reviewed above, there is an obvious need to move beyond the lexicon (and the domain of motion) and consider the acquisition of morphosyntax, (sustained) discourse, phonology, phonetics, figurative language, idiomatic expressions, and pragmatics, both in production and in comprehension, both offline and online. The SLA field is rife with claims built on speech alone, ranging from the effects of the languages in contact and individual differences in cognitive makeup to language use, type of instruction, and so on. These claims could and should be tested taking gestures into account (and preferably on a wider language sample than has been examined so far).
Fundamental questions about the study of SLA and gesture are when, why, and how speech and gestures change in L2; why L2 speech seems to change more readily than L2 gestures; under what conditions gestures do change, whether through imitation, through changes in language use, or both. We clearly need more longitudinal work to improve our understanding of these vital issues. Moreover, while we know very little about whether, when, and how L2 learners produce language-specific co-speech patterns, we know even less about the SLA of conventional, quotable gestures, or emblems. If communicative fluency and cultural appropriateness are seen as important to SLA, then the acquisition of gestural repertoires of linguistic and cultural communities should be included. Although a few studies investigate L2 users’ comprehension of culture-specific quotable gestures (e.g. Reference JungheimJungheim, 2006; Reference Molinsky, Krabbenhoft, Ambady and ChoiMolinsky, Krabbenhoft, Ambady, & Choi, 2005; Reference Wolfgang and WolofskyWolfgang & Wolofsky, 1991), it remains largely unknown whether L2 learners themselves produce them. For example, do L2 speakers learn to respect handedness taboos (e.g. Reference Kita and EssegbeyKita & Essegbey, 2001), or to produce appropriate gestural backchannelling, shifting from head toss to headshake (Reference Morris, Collett, Marsh and O’ShaughnessyMorris, Collett, Marsh, & O’Shaughnessy, 1979)? Since emblems function like idiomatic expressions, they may be subject to the same acquisition difficulties as spoken idiomatic expressions (e.g. Reference IrujoIrujo, 1993). However, since they are often assumed to be inherently “salient” in the absence of co-occurring speech, they may also be easier to acquire than spoken idiomatic expressions. A contrastive study of idiom acquisition in speech vs. gesture could illuminate whether the visual modality enjoys an advantage in learning.
Further, the majority of work on SLA and gesture probes L2 production. We know much less about L2 gesture perception – both about learners’ perception of the gestures around them in classrooms, study abroad, and immersion contexts, and about the actual effect of gestures on L2 comprehension (but see e.g. Reference Sueyoshi and HardisonSueyoshi & Hardison, 2005, and Reference Drijvers and ÖzyürekDrijvers & Özyürek, 2020, for somewhat conflicting findings) – and about native speakers’ perceptions of learners’ (foreign) gestures. Both domains need elucidating.
We are beginning to get a handle on the measurable effect of explicit gesture training on SLA and memory, but we still know virtually nothing about implicit effects of gesture processing in SLA, that is, the learning effect of gestures that are not part of explicit training, but just normal language use. We know little about the difference between instructed and uninstructed multimodal SLA, about different kinds of classroom instruction (focus on form vs. on meaning), and the role gestures may play in classrooms with literate vs. illiterate L2 learners (e.g. Reference Tarone and BigelowTarone & Bigelow, 2005).
A final methodological point is worth making. We should all aim to provide much more detail on gestures themselves in studies of gesture and SLA, and on their relationship to co-occurring speech. A surprising number of published papers provide no or only the most minimal information about the articulatory, spatial, and temporal properties of the gestures under study, and their temporal relationship to speech. Moreover, it is often assumed that gestural function (“gesture type”), semantic content, and coexpressivity between speech and gesture are easily established when they are not; or that gestures are monofunctional when in fact they are deeply multifunctional. We need to be more attentive to all these details. Data sharing and the creation of open multimodal learner corpora are obviously challenging in gesture studies, but that is precisely why it behooves us all to be more explicit than we typically are in the interest of replicability.
5 Conclusions
The study of gestures and SLA can now be said to be a field in its own right. This field considers gestures both as a tool to study acquisition, and as a phenomenon to be studied per se. The double nature of gestures as interactive, addressee-directed phenomena on the one hand, and as internal, speaker-directed ones on the other, makes them deeply relevant to issues of L2 acquisition. Theoretically, this new field still needs to integrate concerns from SLA and gesture studies, from language studies and cognitive (neuro)science. Currently, cognitive learning science devotes more attention to gestures than SLA studies do, but this state of affairs will change as the body of work grows, as terminologies and methods become more unified, and as the multimodal view of language gains further ground. The challenge for us all is to shift theories and models of language acquisition away from monomodal monolingual perspectives toward multimodal multilingual ones. It is high time.
1 Signed Languages
Signed languages are natural human languages used by communities of deaf people as their native or primary language. “Sign language” is a broad category requiring clarification in two ways. First, many linguists prefer the term “signed language” rather than “sign language.” The term “sign language” implies that “sign” itself is a language, leading to confusion between the way a language is produced, its modality or medium of expression, and the name of a language. The term “signed” is best used to describe the modality in which a language is produced, parallel to spoken and written. The terms speaking, writing, and signing describe the ways in which a language can be expressed: We speak a language (such as English), we write a language (such as Chinese), and we sign a language (such as Argentine Sign Language). To put it another way: “Speech” is not the name of a language (students do not take classes in “speech language,” they take classes in Spanish or Japanese). Likewise, “sign” is not the name of a language, although we persist in the pernicious tradition of using the term “sign language” in this way: We say that students are taking “sign language” classes and assume, incorrectly, that this names the language. American Sign Language (ASL) is the name of a signed language. In fact, ASL and other signed languages can be written; linguists and deaf community members have developed orthographies for this purpose (Reference Stokoe, Casterline and CronebergStokoe, Casterline, & Croneberg, 1965). There are many signed languages: British Sign Language, Italian Sign Language, Japanese Sign Language, Iranian Sign Language, Taiwanese Sign Language, and so forth. Although their names contain the word “sign,” they are the names of particular languages in the general category of signed languages. Some deaf communities have adopted names for their languages that do not include the word “sign,” such as Auslan (Australian Sign Language), Libras (Língua Brasileira de Sinais), and others.
In fact, the word “sign” and its translation is not even used in some spoken languages to name the local signed language, but rather the equivalent of “gesture” – which makes understanding the relation between sign and gesture even more challenging in those languages, for example, Dutch Nederlandse Gebarentaal, German Deutsche Gebärdensprache, and Russian Russkij zhestovyj jazyk. In summary, many linguists believe it is more accurate to use the term “signed” for the set of all the world’s languages produced in the signed modality, just as we use the term “spoken” for the set of the world’s spoken languages and “written” for those languages with orthographies.
The second way that the term “sign language” – even if replaced with the term “signed language” – must be clarified is that the term often includes natural signed languages used by large communities, such as American Sign Language, British Sign Language, and Chinese Sign Language; village sign languages which arise when a number of deaf children are born into an insular indigenous community, such as San Juan Quiahije Chatino Sign Language (Reference Mesh and HouMesh & Hou, 2018); newly emerging languages such as Nicaraguan Sign Language (Reference Senghas and CoppolaSenghas & Coppola, 2001) and Al-Sayyid Bedouin Sign Language (Reference Meir, Sandler, Padden, Aronoff, Marschark and SpencerMeir, Sandler, Padden, & Aronoff, 2010); and International Sign (IS), an emerging pidgin that has arisen among signers from different language communities primarily in Europe and is often used as a lingua franca at international conferences (Reference WhynotWhynot, 2016). As one might expect, these different categories of signed languages have different histories, sociolinguistic characteristics, and potentially quite different stages of development, conventionalization, and recognition in education. Signed languages in developed countries with large communities of signers have often become accepted in educational settings (although, as we will see, historically this has not always been the case). In these situations the signed language may be learned by deaf children as a second language in school. Because the school systems in urban communities can draw from a larger population, deaf children of hearing parents often come into contact with deaf children of deaf parents who have learned the local signed language as their first language. However, even in otherwise developed countries such as Japan or China, lesser-studied signed languages often are not adopted by educational systems.
Village signed languages often occur in naturally isolated settings and smaller communities. As a result, they are sometimes shared with hearing people living in the same community who come into contact with deaf people. Because of the setting, the probability of acquiring more deaf users is relatively low for these signed languages. If deaf children leave these communities to attend school, they often abandon their village signed language and acquire the urban signed language.
A number of myths and misunderstandings have pervaded our understanding of the first class of signed languages, the natural signed languages of large communities of deaf people. One pervasive misunderstanding, held throughout much of history, is that signed languages are merely depictive gestures and not linguistically structured. Signed languages are not simply holistic gestures. There is, however, a complex relationship between signed languages and gesture that scholars are only now beginning to understand (Reference WilcoxWilcox, 2004, Reference Wilcox, Pizzuto, Pietrandrea and Simone2007, Reference Wilcox2009). This relationship will be discussed in Section 3.
Another common misunderstanding is that signed languages are merely representations of spoken languages – that ASL, for example, is a signed representation of spoken English. Signed languages are independent languages with their own lexicons and grammars. Related to this misconception, many people believe that signed languages are invented languages. They are not. Signed languages, like spoken languages, are naturally developing human languages. There are, however, sign systems created to represent spoken/written language, such as Seeing Essential English and Signing Exact English.
Following from the belief that signed languages are invented is the assumption that they are languages with a shallow historical depth. The full story of the age of signed languages is quite complex and depends on the language, the question of emergent village signed languages, and the region of the world in which the language is used. Since signed languages were not regarded as true languages for most of recorded history, it is quite difficult to ascertain their age. We know that signed languages are mentioned in Talmudic law, and in the writings of Aristotle, Quintilian, and many others. In his lessons, for example, Quintilian taught that not only a movement of the hand, but even a nod, may express our meaning, and he noted that such gestures are used by deaf people instead of speech. Deaf people and signed language are also mentioned in ancient Egyptian writings from the 19th Dynasty, ca. 1350–1200 BCE (Reference ErmanErman, 1971).
We do have a few historical accounts of signed language communities, such as that provided by Pierre Desloges in 1779, as reported in Reference Lane, Lane and GrosjeanLane (1980, pp. 123–124). Desloges became deaf at the age of seven from smallpox. As an adult, he wrote a treatise describing how he learned to sign:
Like a Frenchman who sees his language attacked by a German who knows only a few words of French, I felt obliged to defend my own language against the false imputations [that it is not a language]. […] For a long time I was unaware of sign language. I only used scattered signs, isolated, without an orderly sequence and without linkages. I was quite unacquainted with the skill of combining them to sketch clearly defined scenes whereby we can represent our various ideas, communicate them to our deaf companions, and converse with them in an orderly and extended discussion. The first person who taught me this very useful skill was a deaf-mute from birth, of Italian nationality, who knew neither how to read nor write; he was a servant in the home of one of the actors in the “Comédie Italienne.” […] There are deaf-mutes from birth, workers in Paris, who know neither reading nor writing, and who never went to the lessons of the Abbé de l’Epée, but who were so well instructed in religion, solely through the medium of sign, that they were judged worthy of the sacraments of the church. There is no event in Paris, in France, and in the four corners of the world that is not a topic of our conversations. We express ourselves on all topics with as much orderliness, precision, and speed as if we enjoyed the faculties of speech and hearing.
Although very little can be definitively claimed about the history of signed language over the course of centuries, we can certainly say that as long as communities of deaf people have existed, they have used signed languages to communicate with each other.
2 Relation of Sign and Gesture: Historical Background
The relationship between signed languages and gesture has been the subject of debate among scholars of language and philosophers for centuries. One way in which this question has been manifested is in the centuries-long debate about language origins. (See more on this topic in the chapter by Żywiczyński and Zlatev, this volume.) The philosopher Étienne Bonnot de Condillac, for example, suggested that language began as a gesture language or langage d’action. The term “language of action” was later even used to describe signed languages. Jean-Jacques Rousseau, Johann Herder, Wilhelm von Humboldt, and other philosophers of language caricatured and ridiculed this position, arguing instead that language could not have arisen from such natural, animalistic beginnings. Herder, for example, proposed that the fundamental linguistic act was naming; critically, this naming was based on an unemotional sense of curiosity, a desire for pure knowledge. Furthermore, Herder argued that the naming had to have been audible, unaccompanied by any visible movement. The reasoning behind this argument was that “when a man is under the influence of an emotion (such as fear of an enemy) and yet suppresses, from rational grounds, any movement which might reveal it, he is acting from reason, not from passion” (Reference WellsWells, 1987, p. 40). Language thus is essential to reason – “without language man has no reason, and without reason no language” (ibid.). From this perspective, the best evidence for reason is that we ignore our body, our senses, our emotions, and our passions.
Thus, the accepted wisdom was that true language is spoken. Signs were regarded as nothing more than natural gestures evoked by emotion rather than reason. This view came sharply into focus during the Milan Conference of 1880. At this time a great debate was taking place between educators who supported the use of signing in the education of deaf children and those who supported speech, the so-called oral method. Supporters of speech, such as Marius Magnat, the director of an oral school in Geneva, maintained that signed languages lacked any features of language and thus were not suited for educating deaf children (Reference LaneLane, 1984, pp. 387–388):
The advantages of articulation training [i.e., speech] […] are that it restores the deaf to society, allows moral and intellectual development, and proves useful in employment. Moreover, it permits communication with the illiterate, facilitates the acquisition and use of ideas, is better for the lungs, has more precision than signs, makes the pupil the equal of his hearing counterpart, allows spontaneous, rapid, sure, and complete expression of thought, and humanizes the user. Manually taught children are defiant and corruptible. This arises from the disadvantages of sign language. It is doubtful that sign can engender thought. It is concrete. It is not truly connected with feeling and thought. […] It lacks precision. […] Sign cannot convey number, gender, person, time, nouns, verbs, adverbs, adjectives, he claims. […] It does not allow [the teacher] to raise the deaf-mute above his sensations. […] Since signs strike the senses materially they cannot elicit reasoning, reflection, generalization, and above all abstraction as powerfully as can speech.
Statements made by Giulio Tarra, the president of the Milan conference, reveal even more starkly the confusion that equated speech with language, and sign with gesture.
Gesture is not the true language of man which suits the dignity of his nature. Gesture, instead of addressing the mind, addresses the imagination and the senses. Moreover, it is not and never will be the language of society […] Thus, for us it is an absolute necessity to prohibit that language and to replace it with living speech, the only instrument of human thought. […] Oral speech is the sole power that can rekindle the light God breathed into man when, giving him a soul in a corporeal body, he gave him also a means of understanding, of conceiving, and of expressing himself. […] While, on the one hand, mimic signs are not sufficient to express the fullness of thought, on the other they enhance and glorify fantasy and all the faculties of the sense of imagination. […] The fantastic language of signs exalts the senses and foments the passions, whereas speech elevates the mind much more naturally, with calm and truth and avoids the danger of exaggerating the sentiment expressed and provoking harmful mental impressions.
The debate between those who supported signed language in the education of the deaf, and those who argued that only speech should be used, provides us with a clear view of how they considered the relationship between sign and gesture. It is no surprise that the relationship was framed in terms of mind–body dualism as espoused by one of the leading philosophers of the time, René Descartes. Summarizing the descriptions of sign and speech from Magnat and Tarra, we see that language is of the mind; it is associated with speech, with the acquisition of ideas and the expression of thought; it elicits reasoning, reflection, abstraction, generalization, and rationality. Speech has precision (perhaps meaning it has grammar); its users exhibit calm, prudence, and truth, such that it humanizes its users. Speech is of the soul, the spirit, because it originated from the breath, the aspiration, of God. Sign, on the other hand, is of the body; it is merely gesture. Signs corrupt deaf children, making them less human, more animalistic. Signs are concrete, and thus they cannot engender thought. Signs lack parts of speech and grammar. Because signs are gesture, they are associated with the corporeal body and with the senses. Signs strike the senses materially. They foment the passions – they are, in both senses of the term, sensual. Signs glorify fantasy and imagination. It is no accident that the root of imagination is image, and the iconic nature of signs, the antithesis of abstraction, is considered to be one of their most damning features. Whereas speech originated as the breath by which God gave humans a soul, sign is of the corruptible body, the flesh, and the material world.
The historical significance for signed language linguistics was tremendous, because it set the worldview for nearly 100 years. Scholars were left with the following assumptions firmly entrenched in our understanding of language, speech, gesture, and sign:
Speech is of the mind; signs are of the body.
Language is equivalent to speech; gesture is not language.
Signs are gesture and therefore are not language.
In one form or another, these assumptions persist today. Much of the early work of sign linguists was motivated by a perceived need to distinguish sign from gesture. Linguists sought to demonstrate, for example, that signed languages have the same design features that spoken languages have (Hockett, 1982). The pioneering work of William C. Stokoe (1960) was directed at demonstrating that ASL exhibits duality of patterning – that is, that it has a level equivalent to the phonology of spoken languages, with meaningless units that he called cheremes, by analogy with phonemes, which are combined to form meaningful units, or morphemes. Stokoe initially identified three classes of cheremes: hand shapes, movements, and locations. He later simplified the analysis to two classes – that which acts and its action (Stokoe, 1980).
Other linguists worked to document the complex grammar of signed languages (Klima & Bellugi, 1979; Siple, 1978) and the nature of iconicity (Engberg-Pedersen, 1996; Frishberg, 1975; Mandel, 1977). Sociolinguists documented the historical relation of ASL to French Sign Language (Woodward, 1976b, 1978) and sociolinguistic characteristics of the deaf community (Lucas, 1989; Woodward, 1974, 1976a).
3 Sign and Gesture in Acquisition
A substantial body of research has investigated the relation between sign and gesture in first language acquisition. One area of research has examined deaf children who have no exposure to a signed language either at school or in the home. These children often develop a system of idiosyncratic gestures, known as homesign, to communicate with parents or siblings (Morford, 2003). These gestures exhibit many of the same properties seen in signed or spoken languages. Homesigners develop systematic ways of indicating negation and questions. Specific hand shapes and movements become associated with specific meanings. Homesigners use these gestures consistently across settings, rather than creating new gestures for each new setting. They also use gestures to refer to generic entities and events, not just specific instances. However, homesign also displays important differences from more conventional spoken or signed languages. Homesigners do not appear ever to use complex syntactic structure. Homesign also does not seem to develop phonological structure that is independent of morphological structure (Morford & Hänel-Faulhaber, 2011).
Research on homesign appears to contradict strong claims that language cannot be acquired after a critical period. While this may be true for spoken languages, deaf children who use homesign do acquire signed language once they are exposed to signing deaf adults (Morford, 2003). However, as adults, these signers display deficits when compared to other members of the signed language community. Morford and Hänel-Faulhaber (2011) propose two possible explanations. One explanation focuses on the differences between homesign and conventional languages, noting that one area of deficit is the acquisition of more native-like phonological structure and complex syntax. The second points out that homesign is not acquired in a shared language community, so homesigners have reduced receptive language exposure. As a result, late learners also exhibit receptive processing deficits such as slower sign recognition (Morford & Hänel-Faulhaber, 2011).
A second area compares the acquisition of gesture in hearing children and sign in deaf children. A broad summary of the results suggests links between early actions, gestures, and words, and points to the importance of multimodal communication and the interplay between gestures and spoken words (Volterra, Capirci, Rinaldi, & Sparaci, 2018). Early work examined the role of gestural performatives and representational gestures (Bates, Camaioni, & Volterra, 1975). Performatives include ritualized requests, showing off, showing, giving, and pointing. Performatives were typically classified as deictic gestures, the content of which can only be interpreted by reference to the extralinguistic context. The second type, representational gestures, are used by children to refer to objects, persons, locations, or events through hand, body, or facial movements (Capirci, Iverson, Montanari, & Volterra, 2002). Representational gestures differ from deictic gestures by iconically representing attributes or actions of specific referents, and their meaning does not change across contexts.
Early development of these two types of gesture, and of signs, was often attributed to different underlying cognitive systems, one gestural and the other linguistic. A more recent conclusion suggests “not a clear-cut separation, but a continuity between co-speech gestures produced by hearing children and early signs produced by children exposed to a sign language” (Volterra, Capirci, Rinaldi, & Sparaci, 2018, p. 217). These researchers conclude that the traditional dichotomy – gestures as gradient, variable, and iconic; signs as categorical, invariable, and arbitrary – should be replaced with a multimodal approach to the study of both spoken and signed languages. This conclusion mirrors recent research questioning any clear-cut distinction between gestural and linguistic systems (Talmy, 2018; Wilcox & Martínez, 2020; Wilcox & Occhino, 2016a).
4 The Relation of Sign and Gesture: Current Views
As is so often the case in the history of ideas, the pendulum of science swings between two poles. Space and time were seen as distinct by Newton; Einstein merged them into space-time. Physicists now debate whether time exists at all. Such is also the case with sign and gesture. As we have seen, for centuries signed languages were considered to be gesture and not language. The early work of sign linguists was directed at demonstrating that signed languages are languages and not gesture. In recent years, there has emerged a position which holds that signed languages are integrations of linguistic and non-linguistic or gestural systems. Finally, there are even those who are beginning to question the usefulness of the term “gesture” itself.
One contribution to the paradigm shift that permitted sign and gesture to be reexamined was the publication of Gesture and the Nature of Language (Armstrong, Stokoe, & Wilcox, 1995), which made the case that scholars could examine gesture and language as related phenomena. The authors did not simply claim that gesture and signed language might be related. Rather, they put forward the hypothesis that all language is gestural, and that the origins of human language can be traced to visible gestures. The ideas in this work were based in part on an understanding of gesture as a functional unit, an equivalence class of coordinated movements that achieve some end (Studdert-Kennedy, 1987). Gesture was seen more broadly as articulatory movements of any part of the body. This concept was derived from the work of the cognitive psychologist Ulric Neisser (1967, pp. 156, 161):
To speak is to make finely controlled movements in certain parts of your body, with the result that information about these movements is broadcast to the environment. For this reason the movements of speech are sometimes called articulatory gestures. A person who perceives speech, then, is picking up information about a certain class of real, physical, tangible (as we shall see) events that are occurring in someone’s mouth. […] Since articulatory events are motions of certain parts of the body, speech perception has something in common with perceiving other bodily motions, like those of dancers and athletes. In particular, the perception of facial expressions, nonverbal cues, “body language,” and the like must be continuous with it. There is every reason to believe that speech perception begins just as one aspect of the general perception of other people’s movements.
The ideas presented in Gesture and the Nature of Language were also informed by cognitive linguistics, and in fact the work was instrumental in bringing together a nexus of gesture–language–sign research informed by cognitive linguistics. Cognitive linguistics dramatically changes how we view language, and it does so in a way that allows linguists to learn from gesture researchers, research on animal communication, and the study of general perceptual and cognitive abilities – options that were precluded under prior theories.
Although linguists have demonstrated that signed languages are not simply unanalyzable depictive gestures, recent research has begun to explore the relation between signed language and gesture. This work can be classified into three major themes: (1) the historical development by which gestures used in a hearing community are incorporated into a signed language, thus becoming part of the sign linguistic system; (2) the claim that some signs and sign constructions are fusions of linguistic and gestural material, so-called “sign-gesture fusions”; and (3) an analysis of signs and gestures based on cognitive linguistic theory, and in particular, the theory of cognitive grammar (see below).
The first approach is based on grammaticalization theory (Bybee, 2006; Hopper & Traugott, 2003). Grammaticalization is the process by which lexical material becomes grammatical material. Working within this framework, linguists have demonstrated that some of the gestures used in the surrounding language community are incorporated into a signed language as lexical signs. These lexical signs then grammaticalize, forming grammatical signs. Grammaticalization of gestures and signs has been described for a number of signed languages (Pfau & Steinbach, 2011; Shaffer, Jarque, & Wilcox, 2011; Wilcox, Rossini, & Antinoro Pizzuto, 2010; Xavier & Wilcox, 2014).
An example occurs with the gesture meaning “to leave or depart” (Figure 16.1), common in the Mediterranean region (Morris, Collett, Marsh, & O’Shaughnessy, 1979). This gesture appears to have been incorporated into old French Sign Language (LSF) as the sign PARTIR “to depart.” ASL is genetically related to LSF, which has been a historical source for ASL since the early 1800s (Wilcox & Occhino, 2016b; Woodward, 1978). An old ASL sign derived from the old LSF sign PARTIR appears, with both lexical and grammatical meanings, in films of older signers from 1913. The lexical sign DEPART is used in utterances that can be translated from ASL to English as, “At that time, the president of Gallaudet, Edward Miner Gallaudet, departed. A few days prior, he departed/left (to go) to Philadelphia.” We also find the sign, with slightly reduced movement, used to mark future (Janzen & Shaffer, 2002). In the same series of 1913 films, a deaf person giving a lay sermon uses the sign when saying (again translating from ASL to English), “When you understand the words of our Father, you will do that no more.”
Figure 16.1 Depart gesture
Wilcox and Wilcox (1995) demonstrated that gestural forms often serve as the source for lexical signs, which then grammaticalize to modals. The ASL modal sign CAN, for example, is derived from the ASL lexical form STRONG. Long (1918) pointed out that the sign for “strong” is very similar to CAN, noting that the difference lies in the way the hands are moved. For “strength” they are moved somewhat sidewise with a slight circular motion. The source for the lexical sign STRONG is a gesture indicating strength, commonly expressed by moving the fists in an outward or downward motion so as to indicate upper body strength.
An example demonstrating a longer diachronic and cross-linguistic chain occurs with the modal meaning “necessity” (Figure 16.2). The notion of necessity is expressed by the ASL modal form glossed as MUST. The history of this form is more complex. It appears in modern ASL as a downward-moving bent index finger. In LSF the form IL FAUT “it is necessary” is similar, except that the index finger is straight and the palm orientation is to the side rather than down. In nineteenth-century LSF, the straight index finger is used but the entire hand points down. Ultimately, the form appears to have had as its gestural source a downward-pointing finger. This gesture was used in classical antiquity to indicate “in this place” and “insistence.” Dodwell (2000, p. 36) discusses this gesture from ancient Roman times, which he calls an imperative and which “consists of directing the extended index finger towards the ground.” The gesture was ascribed a modal sense by Quintilian, who noted that “when directed towards the ground, this finger insists” (ibid.). Insistence is semantically related to the modal notion of necessity.
Figure 16.2 LSF sign IL FAUT and ASL modal sign MUST
Wilcox (2004, 2009) proposed two routes by which gesture is incorporated into a signed language. The first route, described above, begins when a manual gesture enters a signed language as a lexical sign and develops through grammaticalization into a grammatical morpheme. The second route proceeds along a distinctly different path. The source is not the manual gesture itself; rather, it is the way that a manual gesture is produced – the sign’s manner of movement – as well as various facial, mouth, and eye gestures that may accompany a manual gesture or sign. Here, too, we see gesture as a source. For example, Quintilian observed that when the hand is thrown out gently it promises and declares assent; when it is moved more quickly, it is a gesture of exhortation or sometimes of praise.
Upon entering the linguistic system, these manner-of-movement and facial gestures follow a developmental path from paralinguistic (prosody or intonation) to grammatical marker. As an example of the grammaticalization of manner of movement, in many signed languages the manner in which a modal sign is produced indicates the strength of the modal. Jarque (2006) notes that manner of movement is used to indicate differences in modal strength and to mark deontic versus epistemic function in Catalan Sign Language. Modal strength is also marked in Italian Sign Language by weak or strong articulation of the base movement.
Manner of movement also appears as a marker of verb aspect. Pizzuto (1987) observed that temporal aspect can be expressed in Italian Sign Language via systematic alterations of the verb’s movement pattern, specifying, for instance, the “suddenness” of an action by means of a tense, fast, short movement (e.g. the distinction between to meet and to suddenly/unexpectedly meet someone). Conversely, a verb produced using an elongated, elliptical, large, and slow movement specifies that an action is “repeated over and over in time” or “takes place repeatedly in time” (e.g. to constantly telephone or to always be on the telephone). Similarly, in ASL, manner of movement marks verb aspect (Klima & Bellugi, 1979).
The second route also characterizes the grammaticalization of facial displays. Facial displays play a significant role cross-linguistically in signed languages. In addition to expressing emotion, facial gestures mark a variety of grammatical functions such as interrogatives, topics, adverbials, conditionals, imperatives, and more. These facial displays often begin as gestural expressions. For example, brow furrow is well documented as a display of physical or mental exertion. Darwin (1872, p. 221) noted that brow furrow marks “the perception of something difficult or disagreeable, either in thought or action.” Brow furrow marks a number of grammatical meanings across several signed languages, including wh-questions, imperatives, and root or deontic modality.
Recently, some sign linguists have proposed that certain signs are not purely linguistic; rather, they claim that these signs and sign constructions are better described as “language-gesture fusions” (Fenlon, Cooperrider, Keane, Brentari, & Goldin-Meadow, 2019; Hodge & Johnston, 2014). One such claim concerns pointing signs. Pointing signs are quite common across signed languages, functioning as deictic and anaphoric pronouns, possessive and reflexive pronouns, demonstratives, locatives, body part signs, and indicating verb agreement. The gesture-language fusion claim is most clearly made in the case of personal pronouns. Meier and Lillo-Martin (2013) observe that the first-person pronoun in ASL is fully specified phonologically: a point to the center of the signer’s chest. However, they claim that the locations to which non-first person signs point cannot be enumerated in a listing of sublexical phonological units: Signers point to an open-ended set of locations. Since location is a phonological prime, having an open-ended phonological inventory of locations is not possible. Thus, they conclude (Meier & Lillo-Martin, 2013, p. 163) that “first-person points, but not non-first points, can be specified entirely in terms of the phonological units that form lexical signs. […] In contrast the locations in space of non-first person points appear to be gestural inasmuch as the direction of pointing is – when the signer is referring to an individual who is present at the conversation – determined by the referent’s physical location in the environment.” This same argument is extended to the marking of verb agreement, in which agreement is marked by location in space.
The gesture-fusion claim requires that linguists identify criteria by which gesture can be reliably and objectively distinguished from language. One proposal (Sandler, 2009) is that gestures are holistic and synthetic; they lack hierarchical and combinatoric properties; they are idiosyncratic – different speakers, or even the same speaker, may use different gestures to represent the same image; and they are context-sensitive, their interpretation dependent on the linguistic context. Linguistic structure, on the other hand, is componential, combinatoric, and hierarchically organized. Linguistic signs such as words are highly conventionalized in form, meaning, and distribution. These criteria, however, pose problems. Language is both conventional and unconventional – innovation by definition establishes new, unconventional expressions. So is gesture: Emblems or recurrent gestures are conventional, while idiosyncratic or innovative gestures are, by definition, not conventional. Language is also gradient. As Bybee (2010, p. 2) observes, “All types of units proposed by linguists show gradience, in the sense that there is a lot of variation within the domain of the unit (different types of words, morphemes, syllables) and difficulty in setting the boundaries of the unit.”
Another option has been to propose a modality-free definition of gesture (Okrent, 2002), based on degree of conventionalization (How conventionalized must something be in order to be considered linguistic?), site of conventionalization (What kinds of conventions are linguistic conventions?), and restrictions on combination (What kinds of conditions on the combination of semiotic codes are linguistic conditions?). Answers to these questions are, however, not offered, nor is it clear how they could be objectively answered. For these and other reasons, not all sign linguists accept the language-gesture fusion claim (Quer, 2011; Wilbur, 2013). Dotter (2018) presented a strong rejection of the claim that gestural components are combined with language elements in essential areas of signed language grammar.
As we have seen, the relation between gesture and sign has a complex history, informed by different perspectives on sign itself and on gesture. Müller (2018) offers a comprehensive overview of the history of gesture studies and sign linguistics, with special attention to the relation between gesture and sign. First, Müller reconstructs the history of gesture studies, focusing on the seminal work of Kendon, McNeill, and Goldin-Meadow. She traces Kendon’s view that gesture and sign are one gestural medium of expression, or “utterance visible actions,” detailing the functional similarities between gesture and sign. Müller points out that McNeill, on the other hand, claims that gesture and sign exhibit sharp discontinuities. Müller attributes this position to McNeill’s decision to restrict the concept of gesture to spontaneously used gestures. A third position described by Müller is that presented by Goldin-Meadow and Brentari (2017), which strengthens the discontinuity view, framing the distinction such that sign and speech are categorical, while gesture is imagistic. Müller rejects the assumptions that gestural equals imagistic and that there is a clear-cut boundary between the categorical and the gestural. She concludes by suggesting the need to view gesture and sign dynamically, both in historical relation and in multimodal interaction. Although Müller essentially aligns with Kendon’s continuity position, she rejects Kendon’s proposal of the term “utterance visible actions,” preferring instead to retain the term “gesture,” understood as “deliberate expressive movements.” Ultimately, she suggests that “the question of how gesture and sign relate critically depends on the notion of ‘gesture’ employed” (Müller, 2018, p. 16).
Concerning the historical relation between sign and gesture, Müller concludes that research such as that discussed above suggests a “dynamic, continuous and ongoing process of historical change, where no cataclysmic break is involved, and no sudden rupture transforms gesture into sign from one moment to another” (Müller, 2018).
Recently, Wilcox and his colleagues (Wilcox & Occhino, 2016a) have proposed an account of pointing that does not require invoking gesture. Their approach uses a dynamic usage-based model, specifically cognitive grammar (Langacker, 1987, 2000, 2008). Cognitive grammar claims that grammar and lexicon form a continuum of symbolic assemblies composed of phonological structures, semantic structures, and the symbolic links between the two. Phonological, semantic, and symbolic structures are abstracted from usage events: “instances of language use in all their complexity and specificity” (Langacker, 2008, p. 547). Symbolic assemblies vary along two dimensions: schematicity and complexity. Schematicity pertains to level of detail or precision; schematic elements are elaborated or instantiated by more specific elements. Symbolic structures combine to form higher-level symbolic structures, or symbolic assemblies. Through repeated combination, symbolic assemblies of high complexity may be formed.
In this analysis, pointing is a complex symbolic assembly, a construction consisting of two component structures: a pointing device and a Place (Figure 16.3). Both of these are symbolic structures: They consist of a semantic pole and a phonological pole. The pointing device serves to direct attention; this is its schematic meaning. The schematic phonological pole of a pointing device is any articulator capable of directing attention. This may be a pointing finger, but eye gaze and even body torso orientation may serve as the phonological pole of a pointing device. The Place (the term appears with an initial capital letter to indicate that it names the entire symbolic structure) is the entity at which attention is directed. The schematic semantic pole of a Place structure, which must be specified in an actually occurring usage event, is the referent. The schematic phonological pole of a Place structure is a location in the spatial surroundings. Places appear in a number of constructions, including both deictic and anaphoric expressions (Wilcox & Occhino, 2016a), as well as reported dialogue and so-called “agreement” constructions (Wilcox, Martínez, & Morales, 2022). Place structures thus unify two distinct types of phonological locations: those internal to language, and those external locations which in other analyses are regarded as gestural. Places also subsume the traditional distinction between deixis and anaphora. Talmy (2018, p. 1) writes that, “Broadly, an anaphoric referent is an element of the current discourse, whereas a deictic referent is outside the discourse in the spatiotemporal surroundings.
This is a distinction made between the lexical and the physical, one that has traditionally led to distinct theoretical treatments of the corresponding referents.” Talmy offers an account in which language engages the same cognitive system for both speech-internal and speech-external referents. Places provide the comparable symbolic resource for languages produced in visual space. For signed languages, a referent at a location in the spatiotemporal surroundings is not outside of the discourse; rather, both deictic and discourse referents occupy phonological locations in the spatiotemporal surroundings.
Ruth-Hirrel and Wilcox (2018) apply the pointing device and Place analysis to speech-gesture constructions. They focus primarily on complex symbolic assemblies consisting of pointing constructions and beats that accompany speech. Beat gestures have been formally characterized as “biphasic movements of the hands” (Biau & Soto-Faraco, 2015). These movements typically involve a “simple flick of the hand or finger up and down, or back and forth” (McNeill, 1992, p. 15); however, beats may also be performed using other body parts, such as the head or eyebrows (Krahmer & Swerts, 2007). Researchers often claim that beats have no semantic content (Alibali, Heath, & Myers, 2001; Biau & Soto-Faraco, 2013; Özçalışkan & Goldin-Meadow, 2009). Researchers acknowledge that beats serve emphatic functions and are closely tied to information structure, comparing beats to an “all-purpose highlighter” superimposed on more objective content (McNeill, Levy, & Duncan, 2015).
In the cognitive grammar analysis proposed by Ruth-Hirrel and Wilcox, beats are symbolic structures; thus, they have both phonological and semantic import. Langacker (2001) expands on the notion of symbolic structures, noting that such structures incorporate multiple channels. The semantic pole consists of several conceptualization channels, including speech management, information structure, and objective content. Speech management includes such functions as holding the floor and turn taking. Information structure includes emphasis, discourse topic, and given versus new information. Objective content is the conceptualization of the situation being described by a linguistic expression. The phonological pole consists of several vocalization channels. The core vocalization channel for speech is segmental content. Other channels include intonation and gesture.
Ruth-Hirrel and Wilcox claim that beat gestures are symbolic structures, but significantly they are phonologically and conceptually dependent structures, requiring autonomous structures for their expression. Semantically, beats are dependent structures in the information structure channel, making reference to some more objective autonomous content, the information that is emphasized or highlighted. Phonologically, beats are expressed as manner of movement, requiring an autonomous gesture carrier for their articulatory expression. This is canonically specified by the movement of a hand, since manner of movement is dependent on movement (there is a further level of dependency, because movement requires some entity such as a hand). It is also possible for the movement of any more substantive and autonomous structure, such as a head, to serve as the phonological carrier for a beat.
Ruth-Hirrel and Wilcox show that simple beat gestures, as well as beat gestures coexpressed with pointing gestures, are used to direct attention to meanings in speech that are associated with salient components of stancetaking acts. Their account reveals both that beats have meaning and that there is a symbolic motivation for the apparent “superimposing” of beats onto pointing gestures and for their integration with speech.
5 Sign and Gesture Revisited
One result of the contemporary linguistic analyses of signed languages and gesture has been a blurring, or indeed the loss, of any categorical distinction between sign and gesture. Noting these difficult issues, Kendon (2017, p. 30) concluded that “‘gesture’ is so muddied with ambiguity, and theoretical and ideological baggage, that its use in scientific discourse impedes our ability to think clearly about how kinesic resources are used in utterance production and interferes with clarity when comparing signers and speakers.” Kendon has gone so far as to propose abandoning the categories gesture and sign altogether, instead focusing on a comparative semiotics of what he terms visible bodily action as it is used in utterances by speakers and by signers.
The dynamic usage-based approach offers a new way to reframe our understanding of sign and gesture (Occhino & Wilcox, 2017; Wilcox & Occhino, 2016a). The world comes to us unlabeled. We perceive not “sign” or “gesture” but perceptible usage events – Kendon’s visible bodily actions or Neisser’s articulatory events. These actions, as perceptual events, are categorized by language learners. In the present case of sign and gesture, we can restrict our focus to deaf language learners. Stokoe (1960, pp. 6–7) adopted this user-centered deaf viewpoint from the very start:
To take a hypothetical example, a shoulder shrug, which for most speakers accompanied a certain vocal utterance, might be a movement so slight as to be outside the awareness of most speakers; but to the deaf person, the shrug is unaccompanied by anything perceptible except a predictable set of circumstances and responses; in short, it has a definite “meaning”.
Having both a perceptible form and a meaning (in cognitive grammar parlance, we would say having a phonological and a semantic pole), the shoulder shrug poses a categorization problem to be solved by the deaf observer: How does this symbolic structure fit into his emerging and dynamic understanding of communicative performance? The units that compose an individual’s linguistic knowledge (i.e. grammar) are related to actual expressions that are perceived in usage events by the process of categorization. In the theory of cognitive grammar, the process works in this way (Langacker, 2000). A particular target of categorization activates a variety of established units, the activation set. For any given usage event, taking into consideration all the linguistic and contextual factors, the language user must search the activation set for the member that will categorize the target. Members of the activation set must in a sense compete; the winner becomes the most active member of the set and the active structure which categorizes the target. A number of factors determine which member will become the active structure. One factor is degree of entrenchment of the member, which influences its inherent likelihood of activation and thus of being selected. A second is contextual priming, both phonological and semantic. A third is amount of (phonological or semantic) overlap between the target and a potential categorizing structure. These three factors are primarily linguistic, but we must also include many other factors in order to fully understand the categorization process (Wilcox & Occhino, 2016a).
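The competition among the three primarily linguistic factors can be illustrated with a toy computational sketch. Everything here is an illustrative assumption rather than part of cognitive grammar itself: Langacker’s account does not specify a numerical model, so the additive scoring, the Jaccard measure of overlap, and all names and weights (`Unit`, `categorize`, the feature sets) are hypothetical devices for making the competition concrete.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    """An established symbolic unit in the observer's grammar (toy model)."""
    entrenchment: float   # inherent likelihood of activation (assumed 0..1)
    priming: float        # contextual priming, phonological and semantic (assumed 0..1)
    features: frozenset   # toy stand-in for the unit's form/meaning content

def overlap(a: frozenset, b: frozenset) -> float:
    """Toy measure of phonological/semantic overlap (Jaccard similarity)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def categorize(target: frozenset, activation_set: dict) -> str:
    """Return the 'active structure': the member of the activation set that
    wins the competition on an (assumed) additive combination of the three
    factors named in the text."""
    scores = {
        name: unit.entrenchment + unit.priming + overlap(target, unit.features)
        for name, unit in activation_set.items()
    }
    return max(scores, key=scores.get)

# Hypothetical example: a deaf observer categorizing a perceived shoulder shrug.
grammar = {
    "SHRUG (symbolic unit)": Unit(0.8, 0.5, frozenset({"shoulder-raise", "uncertainty"})),
    "incidental movement":   Unit(0.2, 0.1, frozenset({"shoulder-raise"})),
}
winner = categorize(frozenset({"shoulder-raise", "uncertainty"}), grammar)
# The well-entrenched, fully overlapping unit wins the competition.
```

The design point the sketch captures is that categorization is selection under competition, not table lookup: the same perceptible event could be categorized differently by an observer whose units have different entrenchment or priming values.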
In the current case we must include such factors as individual variability (age of exposure to signed language, level of hearing loss); social variability (whether the observer comes from a deaf or hearing family, type of education, accessibility of signed language in the general environment); and cultural variability (society’s attitude toward sign and gesture in general, for example, Navajo culture and Neapolitan culture exhibit very different attitudes toward gesturing; the categories provided by a culture for naming such perceptual events). As van Hoek (1997) notes, these categorizing judgements determine if the construction is an instantiation of a particular schema or an extension from that schema. With a small amount of conflict, the construction may be judged to be an acceptable innovation, but a significant conflict will cause signers to judge the construction to be anomalous, that is, not a part of their grammar. Thus, the process of categorizing a target structure, be it a shoulder shrug, a facial display, or any other visible bodily action, is none other than the linguistic process of judging whether a perceived structure is well formed with respect to others, that is, whether it is a part of the observer’s dynamically changing grammar.
In all cases, the key to answering the question “Is it a sign or a gesture?” in the context of visibly perceptible usage events lies with the observer, not the observed. It requires that we stop assuming that these events form natural categories – that is, categories that exist in nature, independent of language users. Instead we must reframe the question and adopt an approach which acknowledges that deaf observers categorize perceptual events.
Clifford Geertz (1973, p. 6) offered an example of how a visible perceptual usage event is categorized:
Consider two boys rapidly contracting the eyelids of their right eyes. In one, this is an involuntary twitch; in the other, a conspiratorial signal to a friend. The two movements are, as movements, identical; from an I-am-a-camera “phenomenalistic” observation of them alone, one could not tell which was twitch and which was wink, or indeed whether both or either was twitch or wink. Yet the difference, however unphotographable, between a twitch and a wink is vast; as anyone unfortunate enough to have had the first taken for the second knows. As [Gilbert] Ryle points out, the winker has not done two things, contracted his eyelids and winked, while the twitcher has done only one, contracted his eyelids. Contracting your eyelids on purpose when there exists a public code in which so doing counts as a conspiratorial signal is winking. That’s all there is to it: a speck of behavior, a fleck of culture, and – voilà! – a gesture.
In the context of understanding sign and gesture, we might rephrase Geertz and say: the same speck of behavior with a fleck of cultural, contextual, and background knowledge and the act of categorization by the deaf observer, and – voilà! – sign (in this case, a grammatical facial display). Like Stokoe’s shrug and Geertz’s wink, the visible bodily actions of usage events are the very stuff from which language is made. The labeling of visible bodily actions as sign or gesture is, as Geertz would say, a matter of determining what counts as what. Language, gesture, and sign are historical-cultural constructs, folk classifications that may or may not be relevant to deaf language users. The relevant question to be examined is the dynamically emerging knowledge – or as a linguist would call it, the grammar – of deaf language users. The key is to not forget the observer. Geertz is helpful even here: We must see things, he tells us (1974), “from the native’s point of view” – in this case, from the point of view of the individual deaf language user observing and categorizing visible bodily actions. The categorization of these usage events is an individual user’s cognitive activity. The linguist’s task is to discover the user’s categories.
Adopting a user-based cognitive linguistic perspective may produce a paradigm shift in the study of signed language and gesture. As Geertz pointed out, “small facts speak to large issues, winks to epistemology” (Geertz, 1973, p. 23). For linguists, the small facts are intricately complex usage events; the larger epistemological issue is the construction of a grammar. If linguists are to understand how language is constructed by users from usage events, we must begin by compiling thick descriptions of actual discourse usage events in all of their expressive and conceptual complexity. This demands that linguists incorporate cultural, social, and historical data into our linguistic theories.
Ultimately, the question of what is sign and what is gesture may not be a scientific but an ethnoscientific one. The answer lies not in finding observable, I-am-a-camera photographable differences between “sign” and “gesture” independent of who is doing the observing and classifying; rather, it is a matter of what counts as sign or gesture from the deaf person’s point of view. This, in turn, depends on received folk classifications that are handed down and change over time, that vary across cultures and across the individuals doing the classification. Whether deaf and hearing people share the same folk classifications of what counts as sign and what counts as gesture – if indeed they even have such categories – is an open question, although research suggests that the answer is likely to be quite complex (Kusters & Sahasrabudhe, 2018). From winks and shrugs and facial displays to depictive shapes and movements of the hands; from whether a hand is moved gently or with a sudden, quick movement; from highly innovative signed constructions expressing an actor’s action in chopping down a tree and the tree’s personified emotional reaction to being chopped to highly conventional lexical signs – these are the photographable raw data. As linguists and gesture scholars, our task is to meticulously describe this input to the categorization process and explore how the process plays out, how grammars are dynamically constructed, grow, and change. Echoing Kendon, it appears to this sign linguist that the label gesture, with its historically laden ideological baggage, contributes little to the task at hand.
1 Introduction
When, how, and why do people gesture? The Cambridge Handbook of Gesture Studies offers many answers to this question from diverse theoretical and methodological perspectives on the relation between language and gesture. The topic introduced in this chapter pulls together several strands of research that have highlighted gesture’s relation to notions and processes that are traditionally seen as “grammatical.” In particular, rich observations on gesture’s link with negation have featured in the work of several key thinkers and texts, and can therefore be said to have played a role in shaping contemporary gesture studies. Rather than emphasizing the spontaneity and idiosyncrasy of co-speech gestures, for instance, studies of gesture’s association with negation have shed light on regularities in gesture form, function, and linguistic organization, and in turn, offered evidence for the multimodality of grammar, the embodiment of cognition, and our bodies’ “potential for language” (Müller, 2013, p. 202).
As any linguist knows, negation involves lexical and grammatical patterns that determine word order, operate on the semantics of the utterance, and play a role in the conceptualization and pragmatics of speech acts, such as rejections, disagreements, denials, and objections (Horn, 1989). What is also well known in gesture studies is that central to these patterns is an array of gestures that speakers perform in association with their linguistic and pragmatic expression of negation (Calbris, 1990, 2011; Harrison, 2018; Kendon, 2004; Lapaire, 2006a). The headshake might immediately come to mind (with its notorious cultural variations; Harrison, 2014a), as well as various gestures of the hands striking outwards from the body or towards the addressee in a holding motion.
Over the last ten to fifteen years, increasing attention has been paid to these gestures associated with negation. Their forms and functions characteristically recur among members of a given linguistic and cultural community in clearly identifiable discourse contexts with relatively stable meanings. However, these gestures are somewhat distinct from “emblems” or “quotable gestures” (as discussed in Payrató, this volume). They are often considered to be exemplars of the recurrent gesture category (Harrison & Ladewig, 2021; Ladewig, 2014, this volume). Studies of gestures associated with negation have arguably helped in paving the way for a reconceptualization of the nature of gesture and its relation to linguistic structures, as well as of our understanding of common ground between spoken and signed languages, the multimodality of language, and the embodiment of cognition.
With the wider issue of grammar–gesture relations in the background, the first task of this chapter is to chart the territory of gestures associated with negation: what is known about their forms, organizational properties, and functions related to linguistic negation (explicit and covert), discourse-pragmatics, and interaction, as discernible from their study in a wide variety of linguistic communities (Section 2). Then, the range of discourse domains, interactive contexts, and methodological perspectives in which studies of gestures associated with negation have been conducted will be considered (Section 3). The empirical and theoretical contributions that such studies have made can then be evaluated by reporting the uptake of relevant research findings across different areas of the linguistic, social, and cognitive sciences (Section 4). Throughout these sections, the aim is to not only show what has been discovered, but also to disclose areas that seem ripe for further development and which raise open empirical questions.
2 Gestures Associated with Negation
In their entry for the Stanford Encyclopedia of Philosophy, Horn and Wansing (2020) write that “Negation is a sine qua non of every human language.” As a linguistic universal, all languages have grammatical forms and structures that express negation, the subtleties of which have been widely documented and debated in decades of linguistic, pragmatic, and psycholinguistic research (e.g. for an accessible introduction to English negation, see Huddleston & Pullum, 2005, Ch. 8; for Mandarin Chinese negation, see Li & Thompson, 1989, Ch. 12). What had not been as closely studied until relatively recently were the forms, meanings, and structures of gestures that can be observed when people express negation in face-to-face spoken communication, and more specifically, the intricate ways in which linguistic and gestural forms and structures relate during the expression of negation. The current section aims to convey what is known about recurrent gestures associated with negation by reporting where in the world they have been observed so far, the typologies that have been established in the literature, and our understanding of the forms, functions, and organizational properties of these gestures.
2.1 Geographical Coverage
Focusing on spoken languages, the following coverage of gestures associated with negation compiles and abstracts from different kinds of research – typologies, descriptive studies of individual gestures, experiments, etc. – conducted along different themes and subjects in relation to particular languages (Table 17.1). This coverage is also visualized with a Google map, with pins representing the location of studies or of the linguistic communities under study (Figure 17.1). The main criterion for inclusion in this overview was that researchers observed and discussed the relation between gesture and negation in the locations where the studies were conducted; distinctions concerning different forms and form variants of gestures/gesturing were set aside for the present purposes. Distinctions concerning regional uses of language have been made where research permits, such as in different varieties of Spanish. Several papers listed in this review are discussed in more detail in later sections of the chapter.
Table 17.1 Widespread observations of gestures associated with negation, classified by language family
Figure 17.1 Geographical coverage of the attested relation between gesture and negation in spoken languages
Figure 17.1 is no doubt incomplete, being restricted to the studies published in the languages that I can read (or could confidently cross-reference) and those that I could locate. Several gestures described in A world guide to gestures by Morris (1994) are associated with negation and either located geographically or described as having “widespread” locality. They include the Head Shake (Widespread; p. 144), Palm Thrust (Greece; p. 191), Palms Front (Worldwide; p. 195), Palms Wipe (Widespread; p. 198), and others. Similarly, Darwin’s (1872) The expression of the emotions in man and animals is a rich resource for culturally specific observations of various head and hand gestures associated with negation that Darwin noted on his voyages or in correspondence with peers (see Cooperrider, 2019). However, the observations by Morris and Darwin could not be easily pinned to the map above.
Disclaimers aside, Table 17.1 and Figure 17.1 combined reveal a number of research hotspots as well as data deserts, which may help identify gaps for future studies to fill. Observations of gestures associated with negation in locations not covered here will add incrementally to established findings and, moreover, hopefully extend or challenge the picture of these gestures that is emerging. It is this picture to which we can now turn.
2.2 Forms, Organizational Properties, and Functions of Gestures Associated with Negation
This section’s first port of call is the landmark work on gestures with pragmatic functions in Italian and English by Kendon and his studies on the Open Hand Prone gesture family (Kendon, 1995, 2002, 2004, 2017), while research into the semiotics of French gesture by Calbris then introduces an alternative perspective on similar gesture forms (Calbris, 1990, 2003, 2005, 2011, 2013; Calbris & Copple, this volume). These landmark observations provide the departure point for various lines of research that have subsequently added to our understanding of gestures associated with negation.
2.2.1 Context-of-Use, Kinesic and Semantic Core, Underlying Action
In a highly influential study by Kendon (2004), associations between gesture and the expression of negation were observed as part of characterizing the use of recurrent gestural forms in the Open Hand Prone family. In this family of forms, “the forearm is always in a prone position so that the palm of the hand faces either toward the ground [‘ZP’ gestures] or away from the speaker [‘VP’ gestures], depending upon how the elbow is bent” (p. 248). Analyzing examples of these gestures in a video corpus of Italian speakers (supplemented with several examples of English speakers), Kendon (2004) observed that gestures in the Open Hand Prone family occurred in discursive contexts “where something is being denied, negated, interrupted or stopped, whether explicitly or by implication” (p. 248), as well as “in contexts where a speaker gives an extreme positive evaluation of something” (p. 249). For each example, Kendon scrutinized the specific form of the gestures and their timing in relation to the verbal utterance, and analyzed their potential semantic and pragmatic contributions to the discursive context-of-use (on this methodology, see Kendon, 2004, p. 226; Müller, 2004).
Kendon’s (2004) findings led him to argue that the formational core of gestures in the Open Hand Prone family expresses a “core semantic theme” of “halting, interrupting, or indicating the interruption of a line of action” (p. 281). This theme, he proposed, was derived from the manipulatory action of the hands that motivates the core form of the gesture (see work by Calbris, Streeck, and Müller below). Gestures with the palm oriented downwards and “swept” laterally (ZP gestures) “perhaps derive from the action of cutting something through, knocking something away or sweeping away irregularities on a surface, as in rubbing out any marks or traces of something,” whereas for gestures with the palm raised vertically and oriented toward the addressee (VP gestures), “the actor engages in a schematic act of stopping something or holding something back” (Kendon, 2004, p. 263). In contexts that seemed to be expressing meanings that are positive, Kendon argued that the performance of these gestures with their core semantic themes may be making explicit the underlying expression of an implied negative. Observations of headshakes in these contexts were also included (Kendon, 2002). This research was foundational in flagging up gestures with the Open Hand Prone formation as candidates for studying relations between gestures and negation.
2.2.2 Analogical Links between Gestures and Negation
In a study that has been similarly influential on our understanding of gestures associated with negation, the French semiotician Calbris arrived at the association between gesture and negation from a different perspective to that of Kendon. In the context of characterizing the symbolic import of certain physical components of gestures in French, Calbris discovered the salience of a number of such components to the expression of concepts related to negation (Calbris, 1990, 2003, 2005, 2011, 2013).
During Calbris’ career-long study of symbolic relations between gestures and notions, negation played a central role in developing and illustrating several key constructs, namely “gesture variants,” “kinesic ensembles,” “polysemous gestures,” and “polysigns” (Calbris, 2011, p. 24). While “gesture variants” refer to the finding that one notion may be expressed by several different gestures, which may be performed simultaneously resulting in a “kinesic ensemble” (or “cumulative variant”), the construct of “polysemous gestures” refers to gesture forms that may convey a range of different meanings, sometimes expressed simultaneously resulting in a “polysign.”
Defining negation “as an act of the mind that consists in refusing a relation, a proposition, an existence, and as the process of refusal” (Calbris, 2005, p. 2; my translation), Calbris has examined the different gestures that French speakers use to express negation and developed an explicit typology of them (Calbris, 1990, 2005, 2011, 2013). This typology is presented succinctly in Calbris (2005), which contains nine “variantes gestuelles de la négation” (gestural variants of negation), namely: three head gestures (backwards toss, lateral sweep, and shake), three gestures with the vertical palm shaped flat (either raised, swept laterally, or oscillated), two with the extended index finger (raised, oscillated), and one involving the level hand moved horizontally.
Focusing on Calbris’ treatment of the variants involving the vertical palm and the level hand will illustrate further constructs while distinguishing her approach from Kendon’s (on this distinction, see Calbris, 2011, pp. 282–284). For Calbris, the Palms Forward (Kendon’s “VP”) and Level Hand (Kendon’s “ZP”) are not only gesture variants of negation, but also examples of “polysemous gestures,” negation being only one of their meanings, because the gestures can express a variety of notions on different occasions of use. Thus, what Calbris calls the Palms Forward gesture illustrates a semantic derivation from a singular analogical link to a physical action of the outwards-turned hand: self-protection, which depending on the context, “may express the notions of ‘opposition,’ ‘prudence,’ ‘refusal of responsibility,’ ‘stopping,’ ‘requesting someone to wait,’ ‘agreement,’ ‘refusal-negation,’ ‘objection,’ ‘restriction,’ or ‘perfection’” (Calbris, 2013, p. 666). While this is a similar idea to the notion of a core semantic theme being derived from the underlying action motivating the gesture form (Kendon, 2004), the gesture that Calbris calls the Level Hand illustrates a different relation, not of a core semantic theme, but one of plural motivation. Which component of the Level Hand is salient or profiled (its movement trajectory or its shape configuration) determines the analogical link established with a physical correlate, which in turn is the basis for the gesture’s meaning and array of subsequent semantic derivations, including: superlative, perfection, determinism, certainty, negation, refusal, cutting, and equality (Figure 17.2) (see also Calbris & Copple, this volume, Section 2.2.2).
Figure 17.2 Plural motivation of the “Level Hand” gesture
Spelling out this network of plural motivations for the Level Hand leads Calbris to a different analysis than positing a core semantic theme derived from a single schematic action as per Kendon (2004). This different position can be exemplified for the case of the gesture’s occurrence in contexts where positive assertions are being made. Whereas Kendon posits the expression of an implied negative in such contexts, which is made explicit through the performance of the gesture (enacting a “cutting,” “knocking,” “sweeping,” “rubbing”), Calbris posits a different analogical link based on singling out a physical component of the gesture’s form. As per Figure 17.2, the Level Hand’s expression of a superlative or perfection derives from the meanings of “quantity” and “totality,” which are motivated by the horizontal movement of the gesture. This is not an enactment or representation of a manual action but conveys the concept of “everywhere.” Similarly, the expression of negation by the Level Hand gesture is not related to the gesture’s origin in an action of knocking or sweeping away (as per Kendon), but to the meaning of “stop-refusal” represented as an obstacle, conveyed by the resistance of the downward-facing palm. “In short, the interpretation of a co-speech gesture supposes not only an appreciation of the contextual situation but also a physical understanding of the gesture and of the underlying symbolic system” (Calbris, 2011, p. 284, emphasis in original).
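The structure of plural motivation can be made concrete with a small data-structure sketch: each salient form component establishes its own analogical link to a physical correlate, which grounds a distinct set of meaning derivations. The encoding below is an illustrative assumption, not Calbris’s formal notation, and it includes only links stated in the surrounding discussion; the dictionary keys and the helper `meanings_for` are hypothetical names.

```python
# Toy encoding of "plural motivation" for the Level Hand gesture:
# form component -> analogical link (physical correlate) -> derived meanings.
LEVEL_HAND = {
    "horizontal movement": {
        "physical_correlate": "covering an expanse ('everywhere')",
        "derived_meanings": ["quantity", "totality", "superlative", "perfection"],
    },
    "downward-facing palm (resistance)": {
        "physical_correlate": "obstacle / stop-refusal",
        "derived_meanings": ["negation", "refusal"],
    },
    "hand edge": {
        "physical_correlate": "edge that cuts",
        "derived_meanings": ["cutting"],
    },
}

def meanings_for(component: str) -> list:
    """Meanings that become available when a given form component is profiled."""
    return LEVEL_HAND[component]["derived_meanings"]
```

The contrast with a core-semantic-theme account falls out of the shape of the data: here the meanings fan out from several independent component-to-correlate links, rather than all deriving from one underlying action.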
It seems that some gestures associated with negation may have a single, identifiable origin in a manual action, such as the Vertical Palm’s connection to “schematic stopping.” This may be the case for a number of other gestures that have been shown to express negation. The “brushing aside” gesture observed in Spain (Teßendorf, 2014, 2016) and the “dusting off palms” gesture observed in Nigeria (Will, 2018) seem to be candidates. Variations in the performance of other gestures associated with negation, such as the Level Hand (ZP), may not be variations on an underlying action motif, but variations on analogical links that are based on fundamentally different signifiers (i.e. movement through the visual field, finger tips that draw lines, palms that resist/cover, and the edge that cuts).
Several aspects of the work discussed so far have been the subject of further studies of gestures associated with negation, including the relation between gestures and physical actions, the connection to negation in the linguistic utterance, and the occurrences of these gestures with the expression of explicit and implicit negation.
2.2.3 Relations to Aspects of Physical Action
In their typology of recurrent gestures identified in a corpus of German speakers, Bressem and Müller (2014a, 2014b) distinguished a family of recurrent gestures clearly associated with the linguistics and pragmatics of negation. These gestures were crucially shaped by a shared formational characteristic – movement of the hand away from the body – which, according to Bressem and Müller (2014b), revealed a common origin in an underlying “action scheme” motivating meanings and functions associated with negation. Specifying this scheme further, Bressem and Müller (2014b) stated that “Gestures may reproduce perceptually salient aspects of instrumental actions and extract distinctive elements of the action by comparing, selecting, and recombining physically pertinent elements” (p. 1600).
The action scheme underpinning all gestures included in the Away family, Bressem and Müller (2014b) have argued, is “the effect that actions involving the clearing of the body space have in common: Something that was present has been moved away – or something wanting to intrude has been or is being kept away from intrusion” (p. 1596). Following the convention of referring to a recurrent gestural form based on the assumed underlying action, the recurrent gestures in the Away family were named the “brushing away gesture,” the “throwing away gesture,” the “holding away gesture,” and the “sweeping away gesture.”
Bressem and Müller articulated this relation between a physical action scheme and gestures based on analyses of spoken language use among adults and have subsequently argued that it could be the basis for a multimodal construction in German (Bressem & Müller, 2017; see Section 4.2 below). Building on this line of work, Gawne (2021) has observed a gesture among some speakers of Syuba in Nepal with a rotated “away trajectory” (thus similar to ‘brushing away’) that is “used only with grammatically negative forms” (p. 4) and “used to indicate the absence of someone or something” (p. 9). This relation between aspects of actions and gestures has been the focus of other research into gestures associated with negation, such as investigations of children’s early language development and of the connections between grammar and gesture.
2.2.4 Developmental Pathways into Multimodal Negation
While the research discussed so far has been based on video recordings of adult speakers, the multimodality of negation has caught the attention of French language acquisition researchers for its potential as a rich case study for investigating multimodal language development. For this, researchers have adopted a usage-based, video corpus methodology to explore children’s acquisition of negation with longitudinal data from a range of linguistic and interactive contexts (e.g. Beaupoil-Hourdel, 2015; Beaupoil-Hourdel et al., 2016; Morgenstern et al., 2018).
Central to this body of research is the Paris Corpus (Morgenstern & Parisse, 2012). Hour-long video recordings were made at monthly intervals of five monolingual French children, from just a few months of age up to several years, filmed with high-quality video and audio equipment in naturally occurring settings for child–parent interaction, then transcribed and annotated by research teams in the CHAT format (Morgenstern & Parisse, 2007). Articulating the notion of a “multimodal pathway,” five stages of development in these children’s expression of negation have been identified, with careful consideration of the caregiver input that was also transcribed in the data (Beaupoil-Hourdel, Morgenstern, & Boutet, 2016). With qualitative and quantitative analyses, the researchers have characterized this pathway as a gradual transition between “non-symbolic actions” (of rejection, avoidance), “symbolic actions” (gestures), and multimodal linguistic constructions similar in complexity to those of adult speakers (Morgenstern et al., 2018). By making these distinctions, the findings also demonstrate a gradual specification of the functional roles of the gestural and linguistic components being combined in the children’s expression (Beaupoil-Hourdel et al., 2016). The researchers have proposed to view the acquisition of multimodal negation as evidence for the children’s development of linguistic and cognitive skills (Beaupoil-Hourdel, 2015).
Though focusing more on patterns of multimodal negation than on individual gestures, these studies have highlighted the salience of “PalmUp-shrug” and “indexWave” gestures in the acquisition of negation (Beaupoil-Hourdel & Debras, 2017; Beaupoil-Hourdel & Morgenstern, 2021; Blondel et al., 2017; Morgenstern, Beaupoil-Hourdel, Blondel, & Boutet, 2016). Building on previous descriptions of similar forms and their association with the expression of notions related to negation (e.g. Calbris, 1990; Kendon, 2004; Streeck, 2009), studies in this line of research have advanced our understanding by proposing kinesiological analyses of these gestures, central to which is an intrinsic frame of reference. Describing a gesture relative not to an observer’s frame of reference but to its own articulatory physiology, these analyses view gestures as muscular impulses constrained by the biomechanics of the human skeleton (cf. Boutet, 2008, 2010, 2015, 2018).
3.2.5 Kinesiological Perspectives on Negation
In his “formal analysis of gestural negation,” Boutet (2015) observes that previous descriptions of gestures associated with negation have adopted an egocentric frame of reference. They have restricted the description of such gestures to three-dimensional space (x, y, z) and overlooked an essential characteristic of the gestural negation system related to the physiology of human bodies. The bodily articulation of a given gesture from finger to shoulder can involve seventeen reference points (“degrees of freedom”). Taking the example of gestures described by Kendon and others as “Vertical Palm” and “Horizontal Palm,” and adopting the perspective of the gestural articulator, the negative meanings would derive not from the action motif said to motivate the form of the gesture (i.e. the hand sweeping aside for the HP or schematically stopping for the VP), but from the physiological configuration that remains constant or “invariant” across all manifestations of this gesture: the pronation of the palm (Figure 17.3).
Figure 17.3 Invariant feature in different orientations of the palm: a. pronation/palm down, b. pronation/palm forward, c. pronation/palm sideways
Explaining this invariance (Figure 17.3), Boutet points out that from an egocentric reference point, “the forearm and the arm are in a different position for the three gestures,” a distinction that has been important in previous work adopting that frame of reference; from an allocentric perspective, however, “the hand does not change position relative to the forearm, being in a position of pronation across the three cases” (Boutet, 2015, p. 118). In this approach, the invariance of the position of the hand (relative to the articulatory segments of the forearm and arm, not to the observer) is what gives the different gestures their shared semantic theme, allowing the analyst to relate embodiment to negation without the intermediary of an underlying action. Boutet’s analysis here is similar to Streeck’s, who also posits the invariant in these motions as their conceptual core, attributing form variations such as the orientation of the gesture and the position of the hand (supine-prone) to the speaker’s embodied adaptations to the interactional context in which the gestures are made (Streeck, 2017, Ch. 5). From Boutet’s (kinesiological, intrinsic) perspective, the role of the palm’s orientation, either away from the speaker’s body or facing down, in distinguishing “different gestures” may have been exaggerated in the previous literature (see further discussion in Beaupoil-Hourdel & Morgenstern, 2021, and in Boutet & Cienki, this volume).
Additional explanations for the form and organization of gestures associated with negation have been proposed based on studies that attend to the grammatical constructs of negation.
3.2.6 Kinesic Organization in the Grammar–Gesture Nexus
Building on sensory-kinesic approaches to negation (Lapaire, 2006a; see Section 4.2), with a video corpus of spoken English and form-based methods of gesture analysis (Müller, Bressem, & Ladewig, 2013), in Harrison (2018) I reported on the collection and analysis of a corpus of English spoken utterances that all exhibited what Horn (1989) describes as “the traditional criteria for negativity – the presence of a negative particle, its appearance in a specified syntactic location, and so forth” (p. 34). A second criterion for inclusion in the corpus was that the utterance involved the performance of a gesture from the Open Hand Prone gesture family (which I assumed to be potentially related to the expression of negation; Calbris, 1990; Kendon, 2004). Based on grammatical and gestural microanalysis of over eighty examples, I proposed a number of distinctions in the form–function relations of such gestures, as well as a principle governing their temporal organization with speech.
First, I distinguished between three form variations of Palm Down gestures associated with different kinds of negation or negative speech acts and, by analyzing several examples, argued that their performance with these linguistic utterances supported the view that each gesture reproduced a variation on a manual action. As illustrated in Figure 17.4, the associations observed were between clausal negation and “sweeping away” for the Palm Down Across, exclusions and “clearing aside” for the 2-Palms Down Mid, and rejections and “cutting through” for the 2-Palms Down Across.
Figure 17.4 Three Horizontal Palm gestures based on different underlying actions: PDAcross (“sweeping away”), 2PDmid (“clearing aside”), and 2PDAcross (“cutting through”)
Close study of these gestures at the level of the utterance revealed how speakers may organize the different phases of the gestural action (i.e. preparation, stroke, and holds) in relation to the grammatical structures in the accompanying speech, which can be viewed as creating “sync points” for gestures associated with negation (Harrison, 2018, Ch. 3). Specifically, this research has shown how the manual gestures were organized in relation to the node and scope of negation (Harrison, 2010) and to negative polarity items and negative focus (Harrison, 2013), as well as how manual, head, and linguistic elements were orchestrated in kinesic ensembles (Harrison, 2014b). Extending notions of lexical and conceptual affiliation between speech and gestures (McNeill, 2006), these examples of negation, expressed and organized multimodally, were offered as supporting evidence for a grammatical affiliation between speech and gesture (Lapaire, 2006a). The organization of gestures in the Open Hand Prone family in relation to the node and scope of negation has offered a basis for cross-linguistic comparison. One study found a similar pattern in French, whose syntax for negation at the sentence level overlaps to some extent with that of English (Harrison & Larrivée, 2016). In a study of negative utterances in the Syuba language of Nepal (Gawne, 2021), however, the holding of gesture over the scope of negation was not found to occur. Syuba is “a verb final language,” which may explain the difference in gesture organization because there is “less content within the scope of the negation that follows the verb” (Gawne, 2021, p. 18).
Gesture studies of other key grammatical phenomena relating to negation will be discussed in the sections below, including implicit/covert negation, single versus double negation, and quantification, negation, and scopal ambiguity. These studies develop further explanations for aspects of gestures associated with negation.
4 Explanations for the Occurrence of Gestures Associated with Negation
What is driving the relations between gestures and negation? What is the conceptual and functional role of gestures associated with negation in spoken language production and perception? A growing body of published studies converges on answers to these questions. Linguistic, cognitive-semantic, functional, embodied, cultural, and psycholinguistic explanations can be identified.
4.1 Gestures as Semantic Operators
Several researchers have observed that gestures associated with negation may also be performed in contexts of use where the spoken utterance exhibits no visible or “surface” manifestation of linguistic negation. These observations have led some researchers to propose explanations for why such gestures occur.
Kendon’s (2002) analyses of the headshake in a corpus of examples from speakers of Italian and English revealed several such contexts. His examples included utterances structured with certain adverbs (e.g. “only”), evidentiality markers (e.g. “obviously”), statements without exceptions, superlatives, and intensified expressions (e.g. declaring that something was “marvelous” or “wonderful”; pp. 172–173). Considered alongside examples of the headshake in contexts of explicit verbal negation, Kendon (2002) concluded that “it seems that we can interpret [the headshake] as an operator that does semantic work similar in many ways to the work done by the various verbal particles of negation” (p. 180). How the headshake was placed in relation to speech further revealed it to be used as “an expression in its own right,” operating somewhat freely in relation to different parts of the verbal utterance “according to the rhetorical needs of the moment” (p. 180). Taking a more cognitive perspective on the expression of negation, other researchers have related these operations to the manifestation of a cognitive domain.
4.2 Manifestation of a Cognitive Domain
Based on a typology of gestures originally identified as co-occurring with grammatical negation in a twenty-hour corpus of TV interviews conducted in Israeli Hebrew (Inbar & Shor, 2017), Inbar and Shor (2019) present a subsequent study in which they report a “typology of verbal utterances that do not contain markers of grammatical negation and that may be accompanied by gestures associated with grammatical negation in spoken Israeli Hebrew” (p. 87). They identify six “patterns,” which, building on previous work, they call the “headshake,” “sweeping away,” “holding away,” “hands up gesture,” “finger wagging,” and “shoulder shrug.” Finding that “all the contexts revealed are connected to negation on some cognitive level” (Inbar & Shor, 2019, p. 93), they argue that when these gestures occur with utterances that show no surface form of negation, the gestures are a manifestation of an underlying cognitive domain of negativity. According to Inbar and Shor (2019), such utterances include those with “words with [a] negative meaning component” (p. 88); they are found in “contexts of intensification” (p. 88), with “indefinite modifiers and hedging expressions” (p. 90), “discourse particles that imply negation or restriction” (p. 90), and “conversational implicatures arising from the verbal utterance” (p. 92).
By analyzing the semantic import of the gestures in these contexts, Inbar and Shor (2019) argue that the “gestures indicate a higher abstract notion, namely ‘negativity’ rather than negation” (p. 94). They account for the occurrence of these gestures with utterances containing no surface expression of negation as the gestural manifestation of the speaker’s underlying cognitive domain. Gestures thus seem to offer a diagnostic of the cognitive domains at play, which may underpin, or have been “bleached” from, the explicit linguistic expressions.
4.3 Functional Load Sharing and Encoding Strategies
The prevalence of Open Hand Prone gestures with utterances showing no verbal negation has been the basis for alternative perspectives, such as the “functional load sharing” proposed by Wegener and Bressem (2019). Their data were observations of the “sweeping away” and “holding away” gestures in a six-hour corpus of narratives, procedural texts, and interviews among fifteen speakers (mainly male) of Savosavo, a non-Austronesian language spoken on Savo Island in the Solomon Islands. In this study, the search domain for the sweeping away and holding away gestures was identified as utterances containing lexemes of explicit and implicit negation. Wegener and Bressem (2019) found “rather few instances of gestures associated with negation co-occurring with explicit verbal negation,” and instead, “the ‘sweeping away’ gesture, is used mostly with implicit (lexical or pragmatic) negation/negativity and only rarely accompanies explicit verbal negation” (para. 1). The researchers propose a functional explanation for these findings, which can be called the “sharing the load” view: “When explicit verbal negation is used, it bears the main functional load. When explicit verbal negation is absent, negation/negativity is emphasized and made visible through gestures” (Figure 17.5).
Figure 17.5 Functional explanation: “sharing the load”
Wegener and Bressem’s diagram illustrates “the interplay between verbal and gestural negation,” and they argue that the link between gesture forms attested to occur with negation (such as Vertical and Horizontal manifestations of the Open Hand Prone family) may not be as tight or as frequent as suggested in the literature; on that basis they advise “don’t look for negation gestures where verbal negation is used” (ibid., Results). As with Inbar and Shor (2019), the assumption that the “sweeping away” and “holding away” gestures are expressions or manifestations of negation is central to this explanation, though Wegener and Bressem propose to view the relation between the gesture form and negation as a more explicit one of encoding. While this view does not entertain Calbris’ (2011) notion of polysemous gestures, the functional perspective finds some support in a rich chain of psycholinguistic experiments.
4.4 Psycholinguistic Perspectives
Functional explanations such as load sharing and encoding strategies are consistent with a line of psycholinguistic research that has examined the interaction between negation and gesture using experimental methodologies. Explicitly building on naturalistic observations that negation receives multimodal expression in spoken language usage, researchers have designed experiments to investigate the role of gesture and prosody in the perception, interpretation, and comprehension of negative utterances (Brown & Kamiya, 2019; Ferré & Mettouchi, 2020; Li et al., 2016; Prieto et al., 2013; Prieto & Espinal, 2020; Tubau et al., 2015). The results of these studies offer evidence that gestures associated with negation (coupled with certain prosodic patterns) may function as “cues” that guide the addressee’s interpretation of negative meaning, which might otherwise be ambiguous if only the verbal sentence were taken into account.
One chain of these studies has been conducted by Prieto, Borràs-Comes, Tubau, and Espinal. In their 2013 article, they focus on the negative particles ningú (Catalan) and nadie (Spanish), which in response to a question that includes a sentential negative (such as “Who did not eat dessert?”) can be interpreted as meaning either “nobody” (single negation) or “everybody” (double negation). In a first step of the study, utterances reflecting both of these meanings were elicited on camera from native speakers. Representative patterns of prosody and gesture that co-occurred with either single negation or double negation (identical in Catalan and Spanish) were taken as the stimuli for a subsequent perception study. These gestural patterns included the headshake and two-handed horizontal palm for single negation (“nobody”) and a shrug with headshake or nod for double negation (“everybody”). Software was then used to create versions of each type of negation integrated with either congruent or incongruent gestural/prosodic patterns, which were presented to naïve participants in auditory-only (AO), visual-only (VO), and audiovisual (AV) conditions. The results established that “prosodic and non-verbal cues (i.e., gestural patterns) crucially affect the interpretation of isolated n-words” (p. 147).
Subsequent studies have built on this finding and, applying similar experimental paradigms, have supported the crucial role of prosodic and gestural patterns in the interpretation of answers to negative yes/no-questions in Catalan (Tubau et al., 2015) and of rejections of negative assertions/questions in Mandarin Chinese (Li et al., 2016). Brown and Kamiya (2019) focused on sentences in English that are ambiguous in scope. Scopal ambiguities are known to arise in sentences that include both a quantifier (e.g. many, most, all) and a negative particle (e.g. not, -n’t), which can be “semantically ambiguous sentences with multiple interpretations” (ibid., p. 4). The researchers found that gestures play a facilitative role in the interpretation of the scopal ambiguities notoriously associated with negation. Brown and Kamiya (2019) specify that “speakers may manipulate the features of gestural form, placement, and length potentially to help listeners resolve the ambiguities arising from scopal interactions between quantification and negation” (p. 27).
Finally, a line of quasi-experimental research has developed in studies focusing on the production and perception of refusal gestures in Japan, aiming to understand the implications for learners of Japanese as a second or foreign language (Jungheim, 2004, 2008, 2013). Jungheim’s starting point was a specific speech act – refusal – and its nonverbal and culturally specific dimensions, which include what he calls the Hand Fan gesture (following Morris, 1994). Unlike the Vertical Palm gestures described above as being oriented towards the addressee or object of negation, the Hand Fan “is performed high in the central gesture space near the face with the palm facing to the left or right depending on which hand is used,” creating a fan-like motion in front of the face (Jungheim, 2004, p. 135). Native speakers of Japanese perform Hand Fans with refusals, but Jungheim (2004) showed that learners of Japanese as a second language mainly used Vertical Palm gestures instead, and also bowed more than Japanese native speakers. Although learners struggled with the complexity of these “refusal-routines,” they could still accurately interpret refusal gestures performed by Japanese speakers (Jungheim, 2008). Having reviewed various perspectives on gestures associated with negation, we can now turn to work that has considered the theoretical implications and practical applications of this research area.
5 Theoretical and Applied Contributions
Studies of gestures associated with negation are helping to conceptualize the relationships between gesture and language. This can be seen in the uptake of findings concerning gestures associated with negation in discussions of the multimodal nature of grammar (Section 5.1), the embodiment of cognition (Section 5.2), and the relation between gesture and sign (Section 5.3).
5.1 Multimodality of Grammar
Cognitive grammarian Lapaire has long pondered the question: “What is the relationship between gesture and grammar?” (Lapaire, 2011, p. 88). When gesture studies focus on the spontaneity of gestural expression, without attending to recurrent gestures, McNeill’s (2005) conclusion that gestures are “certainly not part of ‘grammar’” may seem warranted (p. 21). However, this position has become problematic for researchers investigating aspects of gesture that appear closely connected to grammar. Such studies are found in areas of applied cognitive grammar (Lapaire, 2002, 2005, 2006a, 2006b, 2016), cognitive linguistic gesture studies (Cienki, 2012, 2015, 2017), multimodal grammar (Fricke, 2012, 2013, 2014), and multimodal construction grammar (Schoonjans, 2017, 2018; Steen & Turner, 2013; Zima & Bergs, 2017). Gestures associated with negation have been integrated with these different developments to greater and lesser degrees.
Lapaire originally posed his question when considering the challenges that a perspective combining Merleau-Ponty’s phenomenology and cognitive linguistics was presenting to traditional grammatical analysis. Salient among these challenges was the overwhelming evidence found by cognitive linguists for the pervasive role of bodies in all aspects of language structure, including the conceptual organization and expression of grammar (Johnson, 1987; Lakoff, 1987; Lakoff & Johnson, 1980; Sweetser, 1990). Treating grammatical notions, processes, and structures not as “mental phenomena” but as “revealers” of embodied meaning, and drawing on his own gesture studies, Lapaire has proposed cognitive-etymological, body motion-based, and manual-haptic models of core grammatical phenomena, including epistemic modality (Lapaire, 2006b, 2013), temporal experience/relations (Lapaire, 2016), and – most relevant to the current chapter – negation (Lapaire, 2006a). Lapaire’s analyses highlight how grammar and gesture share in schematicity, imagery, meaning, and conventionality, leading him to view at least some gestures as “co-grammatical” (Lapaire, 2013), that is, as an embodied dimension of grammar.
Another theorization of the relation between grammar and gesture through the lens of multimodality has led Fricke to propose a “multimodal grammar” (Fricke, 2012, 2014). One of Fricke’s main claims is that multimodality is not only a feature of individual utterances or constructions but also a property of linguistic systems, and thus of grammar in general. The multimodality of grammar means, for example, that structures and processes typically identified as “grammatical” may be more general organizational principles that also determine the form and function of the other modes participating in multimodal expressions. In addition to grammar’s multimodality, studies of gestures associated with negation shed light on the embodiment of cognition.
5.2 Embodiment of Cognition
Gestures are rooted cognitively not only in conceptual structures inside the brain, but also in “human actions and movement experiences more generally (that may culturally vary) in connection with aspects of their practical uses” (Müller, 2017, p. 291, original emphasis). The sensory-kinesthetic experience of a particular movement pattern that characterizes a number of recurrent gestures associated with negation offers a case in point. Several researchers have observed that gestures related to negation may exhibit a movement away from the body (Bressem & Müller, 2014a, 2017; Calbris, 2011; Harrison, 2014b, 2018; Lapaire, 2006a). In addition to “encoding” a conceptual schema or metaphor of “negation as distance” (Chilton, 2014), the movement of recurrent gestures away from the body, co-timed with the verbal expression of negation and repeated across contexts, constitutes for the speaker a real-time, embodied experience of negation as distance (Bressem & Müller, 2017; Harrison, 2014b, 2018; Müller, 2017). The movement of the hand away from the body results in a proprioceptive experience of something no longer being present, and it has been argued that such dynamic, real-time experience has become a basis for the linguistic meanings and functions of gestures related to negation.
This feature of gestures provides evidence that cognition is not only centrally processed in the brain, but is also shaped by certain interactive behaviors and sensory-motor actions. As Müller and Ladewig (2013) specified, such findings advocate a view of cognition in which “gestures in and of themselves are embodied and dynamic conceptualizations” (p. 298; see also Streeck, 2009, 2017).
5.3 Relations between Gesture and Sign
Concerning the forms involved in the configuration of signs for negation, one landmark typology of the signs that express negation is Zeshan (2004). Similarities of form can be found between examples in this typology and in typologies of gestures associated with negation in spoken languages (Boutet, 2015; Harrison, 2018; Lapaire, 2006a; Mesh & Hou, 2018). For instance, linguistic signs related to the expression of negation in various sign languages involve a Vertical Palm with a lateral oscillating movement (Zeshan, 2004). Searching for “no” on the website www.spreadthesign.com provides video clips of signers from China, Ukraine, India, and Estonia performing a sign very similar to the action of wiping away and the Vertical Palm oscillate gesture. Several examples of a sign for negation are based on the Vertical Palm oscillate form in Chinese Sign Language (Yang & Fischer, 2002) and Indonesian Sign Language (Palfreyman, 2019). The “away” movement mentioned earlier is also prominent. In American Sign Language, Bembridge (2016) explains how “The predicates KNOW, WANT, LIKE, HAVE, and GOOD […] are customarily negated through a reverse in the orientation of hand or hands […] a twisting outward or downward movement” (p. 4; see also Liskova, 2012).
Against a background of research bringing gesture and sign into comparative perspective (Harrison, 2018, Ch. 7; Kendon, 2004, 2008; Müller, 2018), several studies have explicitly compared features of negation across signed and spoken languages by considering gestures. For example, Schoonjans (2017) found similarities in the form and organization of “downtoning” stance markers in German multimodal speech and German Sign Language (DGS). Mesh and Hou (2018) identified five “negative conventional gestures” used with clausal, emphatic, and imperative negation (termed WAG, TWIST, PALM-UP, PALM-DOWN, DEAD) by both speakers and signers in a municipality in Oaxaca, Mexico – something which facilitates communication between hearing and deaf people within the community. In Harrison (2018), I studied uses of the Vertical Palm form by a teacher of French Sign Language (LSF) and identified interactive functions shared with speakers, as well as grammatical functions not observed in the spoken language data. In language acquisition research, a developmental perspective on gestures and their relations to signed language has been proposed (Blondel et al., 2017; Morgenstern et al., 2016; Morgenstern et al., 2018).
While gesture and sign are often positioned at opposite extremities of a gesture–sign continuum (McNeill, 1992, 2005), comparative studies of spoken language and signed language negation bring them closer together. Whether we are dealing with sign-like gestures, gesture-like signs, both, or something else will require more research and discussion in the future.
6 Conclusion
This chapter has offered an overview of gestures associated with negation from empirical and theoretical perspectives, developing and challenging a number of themes that have become widely acknowledged in research on gestures associated with negation specifically, and on recurrent gestures and grammar–gesture relations more widely. Given the unquestionably linguistic nature of negation and its centrality to all languages, the area of research discussed in this chapter further attests that gesture is an essential part of human communication, fundamentally intertwined with language at every level. Theories of language and grammar must ultimately be able to account for these relations between gesture and linguistic concepts.

