Sign advantage: Both children and adults' spatial expressions in sign are more informative than those in speech and gestures combined

Expressing Left-Right relations is challenging for speaking-children. Yet, this challenge was absent for signing-children, possibly due to iconicity in the visual-spatial modality of expression. We investigate whether there is also a modality advantage when speaking-children's co-speech gestures are considered. Eight-year-old child and adult hearing monolingual Turkish speakers and deaf signers of Turkish Sign Language described pictures of objects in various spatial relations. Descriptions were coded for informativeness in speech, sign, and speech-gesture combinations for encoding Left-Right relations. The use of co-speech gestures increased the informativeness of speakers' spatial expressions compared to speech-only. This pattern was more prominent for children than adults. However, signing-adults and children were more informative than child and adult speakers even when co-speech gestures were considered. Thus, both speaking- and signing-children benefit from iconic expressions in visual modality. Finally, in each modality, children were less informative than adults, pointing to the challenge of this spatial domain in development.


Introduction
Children, from early on, see and interact with the objects surrounding them (e.g., a fork next to a plate). They also need to communicate about these objects and the spatial relations between them to function and navigate successfully in the world. To do so, children need to learn how to map the linguistic expressions in their specific languages to spatial relations. Previous work has shown that children learning different spoken languages show considerable variability in learning to encode spatial relations (e.g., Bowerman, 1996a, 1996b; Johnston & Slobin, 1979). However, it is not known whether the development of spatial language use can be modulated by visually motivated form-meaning mappings (i.e., iconicity; Perniss, Thompson, & Vigliocco, 2010) as in the case of sign languages and/or co-speech gestures. Speakers and signers can use iconicity to map the relative relations of objects in real space onto sign and gesture space in an analogue manner (e.g., Emmorey, 2002; Perniss, 2007). In this study, we aim to investigate whether such iconic affordances of visual expressions provide an advantage for children compared to the use of arbitrary expressions in speech. To do so, we focus on encoding Left-Right relations, which have been found to be challenging for children learning spoken languages.
In this paper, by taking a multimodal approach to the development of spatial language, we investigate whether iconic expressions provide linguistic and expressive tools for children and adults to convey more spatial information than speech alone. To do so, we study child and adult signers of Turkish Sign Language (Türk İşaret Dili, TİD) and child and adult speakers of Turkish. In the sections that follow, we first describe what is known about the linguistic expressions of locative relations, specifically focusing on Left-Right in speech, sign, and co-speech gestures. Next, we review the literature on the development of such expressions in different modalities. Based on this literature, we derive a set of predictions on whether the visual modality of expression modulates the development of spatial language use in childhood and whether these patterns carry into adulthood.

Linguistic encoding of locative relations
The linguistic encoding of locative spatial relations requires the mention of Figure and Ground objects as well as the spatial relation between them. In a spatial configuration, the Figure refers to the smaller and foregrounded object, which is located with respect to a backgrounded, and usually bigger, object known as the Ground (Talmy, 1985). Figure 1 depicts various locative spatial relations between the pen (Figure) and the paper (Ground). Descriptions of locative spatial relations can vary in requiring an external perspective, which may be viewer- or environment-centered (Levinson, 1996, 2003; Majid, 2002; Pederson, Danziger, Wilkins, Levinson, Kita, & Senft, 1998; see also Li & Gleitman, 2002). In this study, we are interested in the viewer-centered spatial relations that are especially likely to manifest in cases where Ground objects do not have intrinsic features, and thus require speakers to consider a viewpoint in using spatial terms. For instance, in Figure 1a, the spatial relation between the objects is independent of the viewpoint of the observer. However, in some spatial relations, such as Left-Right (Figure 1b) or Front-Behind (Figure 1c), the spatial relation between the objects depends on the viewpoint of the observer (see Martin & Sera, 2006 for a discussion; see also Landau, 2017; Levinson, 2003). For Front-Behind, informational cues such as visibility (in the case of Front) and occlusion (in the case of Behind) provide information for the asymmetrical relationship that helps distinguish the two spatial relations from each other (Grigoroglou et al., 2019). Left-Right, however, offers no such informational cues: Left and Right remain two categorically distinct but symmetrical spatial layouts. The current study focuses on the encoding of Left-Right relations, in which language users need to be explicit in their descriptions to be informative.

Linguistic encoding of space in speech, sign and co-speech gestures

Speech
In encoding locative spatial relations, speech transforms visual and three-dimensional experiences into categorical linguistic forms that have an arbitrary relationship to their meaning. For instance, in order to describe the spatial relation between the pen and the paper in Figure 1b, English speakers might rely on prepositional phrases with Left or Right, depending on their viewpoint. Alternatively, in order to describe the spatial relation between the objects in the same picture, English speakers may use general spatial terms such as Next to. However, the latter description might be under-informative in certain contexts (for example, when distinguishing between two categorical layouts, such as Left versus Right) because it fails to specify the exact spatial relation between the objects compared to expressions using Left-Right spatial terms.
In this study, following Sümer (2015), we focus on descriptions in Turkish. For describing the picture in Figure 1b in an informative way (i.e., to distinguish Left from Right), Turkish speakers use Sol 'Left' or Sağ 'Right'. Alternatively, Turkish speakers can use a general relational term Yan 'Side'. This general relational term in Turkish (unlike Next to in English) can be used to refer to any side of an object, including its Front and Back. Thus, when Yan 'Side' is used, it is rather under-informative and cannot distinguish one viewpoint-dependent relation from another. Therefore, in Turkish, Left-Right relations are most informatively described when specific spatial terms are used. It should be noted that Turkish speakers typically describe viewpoint-dependent relations from their viewpoint (Sümer, 2015). More information regarding the descriptions in Turkish is provided in the coding section.

Sign
In encoding locative spatial relations, sign languages incorporate linguistic forms that bear iconic links to their meanings. The most frequent iconic form for describing spatial relations, including Left-Right, is through the use of morphologically complex classifier constructions, as shown in Figure 2d (Emmorey, 2002; Janke & Marshall, 2017; Perniss, Zwitserlood, & Özyürek, 2015a; Supalla, 1982; Zwitserlood, 2012). In these constructions, the location of the hands encodes the location of the objects with respect to each other, while the handshape encodes objects' shape information (Emmorey, 2002; Perniss et al., 2015a; Supalla, 1982; Zwitserlood, 2012). To illustrate, while describing the spatial relation between the cup and the toothbrush, signers first introduce the lexical signs for the cup (Figure 2a) and the toothbrush (Figure 2c), and later they choose classifier handshapes to indicate the size and shape of these two objects (e.g., Figure 2d). More specifically, signers choose a round handshape to represent the round nature of the cup and an elongated handshape (i.e., index finger) to represent the shape of the toothbrush. Later, they position their hands in the signing space in a way analogous to the spatial relations in the picture. Thus, the representation of spatial relations between objects in the signing space maps onto the exact spatial relation between the objects in real space from a specific viewpoint (mainly the signer/viewer viewpoint). For instance, if the toothbrush were located to the right of the cup, the signer would position the index-finger handshape for the toothbrush to the right of the classifier handshape for the cup, which would then be to its left from her viewpoint. This allows a diagrammatically iconic expression of the relative locations of objects (Perniss, 2007). In addition to classifier constructions, signers can use other linguistic forms, albeit less frequently, to express the spatial relation between objects. These include relational lexemes (Arık, 2003; Sümer, 2015), tracing the shape of the objects and locating them in the signing space (Perniss et al., 2015a), pointing to indicate the object's location in the signing space (Karadöller, Sümer, & Özyürek, 2021), and lexical verb placements (Newport, 1988) (see the coding section and Figures 9 and 10 for more details). Even though the handshapes in these forms are not iconic themselves, all of these forms, similar to classifier constructions, give information about the relative spatial locations of the objects with respect to each other from the signer's viewpoint in a diagrammatically iconic way. In this sense, they are almost always informative in conveying object locations and differ from the under-informative expressions in spoken languages (e.g., Yan 'Side' in Turkish or Next to in English), which fail to distinguish between the two symmetrical layouts.

Co-speech gestures
Visual modes of expression allowing iconic and analogue encodings are not specific to sign languages. These types of expressions can be found in spoken languages in the form of co-speech gestures (Kendon, 2004; Kita & Özyürek, 2003; McNeill, 1992, 2005; Özyürek, 2018). Co-speech gestures can be used to indicate locations of objects in gesture space in an analogue manner due to their iconic affordances. Therefore, spoken expressions accompanied by gestures might convey more spatial information than speech alone. For instance, when describing locations in space, speakers sometimes encode space in an ambiguous way (e.g., Here or There) in speech while also using gestures to indicate relative locations of entities in space (McNeill, 2005; Peeters & Özyürek, 2016). Figure 3 exemplifies the use of a directional pointing gesture used with speech. In this example, although speech fails to give information regarding the exact spatial relation between the objects, the directional pointing gesture to the right gestural space indicates that the fork is on the right side. In this sense, even a pointing gesture can map the location of an object in real space to gesture space in a diagrammatically iconic way in relation to the speaker's body. In such descriptions, gestures might serve as a helpful tool during communication by disambiguating information conveyed in speech (McNeill, 1992; see more examples in the coding section) and thus contribute to the linguistic encoding of the spatial relation.
Figure 3. An example from a Turkish speaker using a pointing gesture towards the right while mentioning "Side" in speech.
Notes. The underlined word denotes the speech that the gesture temporally overlaps with. The description is informative only when both speech and gesture are considered.
Learning to encode Left-Right is considered to be a two-step process. First, children develop a conceptual understanding of their own Left-Right and map relevant spatial terms to refer to their own body (Howard & Templeton, 1966). As a next step, they map these spatial terms onto other people's left and right hands/legs (Howard & Templeton, 1966; Piaget, 1972). Encoding Left-Right relations between objects appears even later (e.g., Sümer et al., 2014). Even though speaking-children use Left-Right spatial terms to encode spatial relations between objects around ages 8-10, they still use them less frequently than adults and often provide incorrect or missing information in their speech-alone descriptions (see Abarbanell & Li, 2021 and Sümer et al., 2014 for the use of alternative spatial terms, such as Front, for describing Left-Right). This has been attributed to the symmetrical nature of Left-Right, which makes it hard to distinguish Left and Right from each other.

Sign
Recent research on sign language acquisition raises the possibility that the abovementioned development of learning to encode Left-Right in speech might not be a reflection of a challenge in conceptual development. Rather, it might be due to the difficulty of mapping arbitrary and categorical terms onto Left-Right relations. If this is the case, iconic affordances of sign languages can facilitate children's encoding of Left-Right relations. Empirical support for this claim comes from a study conducted by Sümer et al. (2014) showing that TİD signing-children can produce expressions of Left-Right relations in adult-like ways earlier than Turkish-speaking-children when only speech is considered. Importantly, this advantage has not been found for other spatial relations, such as In-On-Under (Sümer & Özyürek, 2020). The advantage found for signing-children in encoding Left-Right cannot be explained by morphological complexity, lexical diversity, or other typological differences between Turkish and TİD, as these were similar across expressions used for Left-Right and In-On-Under. Instead, this advantage seems to be best explained by the iconic affordances of sign languages that allow iconic mappings of the spatial relations onto the signing space (Emmorey, 2002) and that possibly ease the encoding of cognitively challenging spatial relations. This possibility has been supported by the early use of classifier constructions as well as relational lexemes that directly map relations onto the right or left side of the body (Karadöller et al., 2021; Sümer, 2015; Sümer et al., 2014 for TİD; Manhardt, Özyürek, Sümer, Mulder, Karadöller, & Brouwer, 2020 for Sign Language of the Netherlands). See Figure 9 in the coding section for a body-anchored encoding of Left in TİD.

Co-speech gestures
Similar to sign language encodings, iconic affordances of co-speech gestures also convey visually motivated expressions of space along with speech (see Özyürek, 2018 for a review). A few studies have found gestures to be an important indicator for the development of spatial communication (e.g., Sauter, Uttal, Alman, Goldin-Meadow, & Levine, 2012; Sekine, 2009). In one of these studies, Sekine (2009) investigated route descriptions of children (e.g., from school to home) in three age groups (4, 5, and 6 years). The results showed a correlation between spatial information used in speech (use of Left-Right terms and mention of landmarks in the route) and the spontaneous use of spatial gestures. Another study investigating descriptions of the spatial layout of hidden objects in a room found that 8-year-olds rarely encoded the spatial location of objects in speech but often used gestures to convey the locations of objects when prompted to use their hands (Sauter et al., 2012). Based on this previous research, it is plausible to argue that children's gestures might convey information about spatial relations in cases where speech is under-informative. This, however, has not been investigated for the expression of Left-Right relations, in which speaking-children are known to show delayed acquisition in their speech.

Present study
As the above research shows, the visual modality of expression (i.e., sign and gesture) seems to be privileged in providing more spatial information compared to speech, possibly due to the affordances of iconic form-meaning mappings (Goldin-Meadow & Beilock, 2010; Sommerville, Woodward, & Needham, 2005). However, the role of visual modality as a modulating factor in spatial language development has not been fully examined. Until now, researchers have typically studied sign or speech independently. Moreover, the limited work studying both sign and speech has compared sign to speech alone. However, comparing sign to speech and speech-gesture combinations could more realistically approximate the development of spatial language use, as it would capture all semiotic tools available to spoken languages, including both arbitrary/categorical (i.e., in auditory-vocal speech) and iconic/analogue (i.e., in visual-spatial co-speech gestures) expressions (Goldin-Meadow & Brentari, 2017; Özyürek & Woll, 2019). Hence, here we investigate, for the first time, how deaf child and adult signers acquiring sign language from birth and hearing child and adult speakers express Left-Right relations in sign, speech, and speech-gesture combinations to provide informative expressions.
We defined informativeness in terms of whether a participant's description distinguishes symmetrical Left-Right relations (see Grigoroglou & Papafragou, 2019a for a similar approach in the domain of events). In the present study, participants engaged in a communicative task in which they saw displays with 4 pictures presenting different spatial configurations of the same two objects (see Karadöller, Sümer, Ünal, & Özyürek, 2022; Manhardt et al., 2020; Manhardt, Brouwer, & Özyürek, 2021 for a similar procedure). Within one display, the only distinguishing feature of the pictures was the spatial configuration between the objects (see Figure 4 for examples of displays). One of the pictures in the display was the "target picture" to be described to a confederate addressee who had to find it on her tablet among the same four pictures displayed in a different arrangement. A detailed description of the stimulus materials, procedure, and coding is provided in the methods section.
We chose Turkish and TİD because there is a strong tendency for Turkish speakers, especially children, to use under-informative descriptions (e.g., Yan 'Side') while describing Left-Right relations between objects, even at the age of 8 (Sümer, 2015). We focused on age 8 to build on this previous work in Turkish and TİD. Moreover, although not directly studied for the domain of space, Turkish has been found to be a high gesture culture in general (Azar, Özyürek, & Backus, 2020). Due to these features of Turkish and based on previous work showing that gestures can be used as a tool to convey spatial information by children (Sauter et al., 2012; Sekine, 2009), we investigate whether signing-children still have an advantage in describing Left-Right relations in informative ways compared to speaking-children even when their multimodal expressions are taken into account.

Predictions
We grouped our predictions into two clusters. First, we compared sign to speech (i.e., Unimodal Descriptions). Then, we compared sign to speech by also taking into account gestures (i.e., Multimodal Descriptions). In each section, we also compared the development of spatial expressions of children to adults.
In unimodal descriptions, we expected an overall effect of modality, such that signers would produce informative descriptions in sign more frequently than speakers would do so in speech. This would be due to the affordances of visual modality that allow iconic/analogue expressions (Goldin-Meadow & Brentari, 2017; Özyürek & Woll, 2019; Taub, 2001; Taub & Galvan, 2001). Regarding developmental differences between children and adults in the two groups, there are two possibilities. One possibility is that speaking-children would produce informative expressions less frequently than adults, but signing-children would produce informative descriptions equally frequently as signing-adults. This would be in line with previously reported developmental patterns for speaking-children (Clark, 1973) and signing-children, who have been found to produce adult-like expressions of Left-Right relations starting from 4 years of age (Sümer, 2015). Alternatively, signing-children, similar to speaking-children, might produce informative descriptions less frequently than adults despite the advantage of the visual modality. This latter possibility would indicate a universal challenge in conceptual development of the spatial domain, specifically for Left-Right, regardless of the modality of expression (Clark, 1973).
Turning to multimodal descriptions, we first predicted that iconic affordances of gestures might facilitate expressing spatial relations in informative ways. In line with this, we expected speaking-children to use co-speech gestures that complement their under-informative speech more frequently than adults, who would be mostly already informative in their speech (Alibali & Goldin-Meadow, 1993; Church & Goldin-Meadow, 1986; Perry, Church, & Goldin-Meadow, 1992; Sauter et al., 2012).
Next, when comparing speech-gesture combinations to sign, if co-speech gestures help with the informativeness of speakers' expressions, we expected modality differences between speakers and signers and developmental differences between speaking-children and speaking-adults to disappear (Goldin-Meadow & Brentari, 2017; Özyürek & Woll, 2019). However, it is still possible for signers to produce informative descriptions more frequently than speakers. Such a finding could be due to the fact that iconic forms in sign are conventional linguistic tools (Brentari, 2010; Emmorey, 2002; Klima & Bellugi, 1979), unlike co-speech gestures that are learned and used flexibly as a composite system together with speech (Kendon, 2004; McNeill, 1992, 2005).

Method
The methods reported in this study have been approved by the Ethics Review Board of the Radboud University Nijmegen, and Survey and Research Commission of the Republic of Turkey Ministry of National Education.
Signing participants consisted of TİD signing-children (N = 21; 12 Female; Mean Age = 8;5; SD Age = 1.29; Age Range = 6;8 - 11) and adults (N = 26; 21 Female; Mean Age = 29;10; SD Age = 8.34; Age Range = 18;2 - 48;7). Data from an additional 6 signing-children and 4 signing-adults were excluded from the study due to failure to follow the instructions (n = 7), problems with the testing equipment (n = 1), or disruption during the testing sessions (n = 2). All signing participants were profoundly and congenitally deaf and acquired TİD from birth from their deaf signing parents. They did not receive speech therapy and were exposed to written Turkish when they started the school for the deaf. We determined the sample size based on convenience. Working with special populations poses certain challenges in reaching participants. Here, we report data from signers who had been exposed to sign language from birth by their signing deaf parents. This group represents 10% of the deaf population in the world (Mitchell & Karchmer, 2004) and in Turkey (İlkbaşaran, 2015). Hence, the number of participants in each group reported in this study (speaking-children, signing-children, speaking-adults, and signing-adults) was determined based on the total number of deaf children attending the deaf schools in İstanbul that we could collect data from. We collected data from all students who matched our criteria (e.g., age, absence of comorbid health issues). Finally, to our knowledge, the current sample incorporates the largest number of deaf signers who have been exposed to sign language from birth by their parents in comparison to previous studies conducted in the field. We could not, however, balance gender within the adult group, as we were limited by the number of deaf adults living in İstanbul. We collected data from almost all deaf adults who met our criteria and were willing to participate. Participation was voluntary. At the end of the study, all children received a gender-neutral color pencil kit and adult participants received monetary compensation for their participation.
https://doi.org/10.1017/S0305000922000642 Published online by Cambridge University Press

Materials
Stimuli consisted of 84 displays. Each display had 4 pictures presented in a 2 x 2 grid. Individual pictures in each display showed the same two objects in various spatial configurations. Ground objects (e.g., a jar) were always in the center of the pictures and did not have "intrinsic" sides determined by their shape (e.g., a picture frame has an intrinsic front, but a jar does not). Figure objects (e.g., a pencil) changed their location in relation to the Ground objects. In each display, the target picture to be described was indicated by an arrow. Experimental displays (n = 28) consisted of Left-Right spatial configurations between objects (e.g., the pencil is to the left of the cup). In half of the experimental displays, only the target picture contained a Left or Right spatial configuration between objects and all non-target pictures contained spatial configurations other than Left-Right (i.e., Non-contrast displays). In the remaining half of the experimental displays, one non-target picture contained the contrastive spatial configuration (i.e., contrast picture; if the target picture contained a Left spatial configuration, the contrast picture contained a Right spatial configuration, or vice versa) and the remaining pictures contained spatial configurations other than Left-Right (i.e., Contrast displays). See Figure 4 for example displays. The rationale for having Contrast displays in addition to Non-contrast displays was to increase the need for informativeness in describing the spatial relation between the objects in the target pictures. With this manipulation, we aimed to test whether participants use more informative descriptions for Contrast than Non-contrast displays in order to distinguish the target picture more clearly from the other pictures in the display (see Manhardt et al., 2020, 2021 for a similar procedure).
In addition to the experimental displays, we included 56 filler displays to avoid attention to the Left-Right spatial configurations. Filler displays consisted of target pictures in Front (n = 14), Behind (n = 14), In (n = 14), and On (n = 14) spatial relations between objects.
All visual displays were piloted to ensure that both children and adults could identify and name the objects in the display. Within all 84 displays, Figure objects (e.g., pen) were presented only once. Ground objects (e.g., cup) were presented 4 times but always with other Figure objects (e.g., cup-pencil, cup-egg, cup-fork, cup-chocolate). The same Ground objects were never presented twice in a row. Moreover, the same relation between the objects as a target picture was not presented more than twice in a row to avoid biases to one type of spatial relation. There were two sets of displays with the same Ground objects but with different Figure objects. All other configurations were similar across the two sets. The order of the displays and the locations of the pictures in each display were randomized for each participant.
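The display counts and ordering constraints described above can be illustrated with a minimal sketch. The paper does not say how the constrained order was generated, so the rejection-sampling shuffle below is only one possible approach. The ground labels, the even Left/Right split within the experimental displays, and the treatment of Left and Right as separate relations for the "twice in a row" rule are assumptions made for illustration.

```python
import random
from collections import Counter


def build_displays():
    """Build 84 display records using the counts reported in the Materials section."""
    displays = []
    # 28 experimental displays: half Contrast, half Non-contrast.
    # Assumption: Left and Right targets are split evenly within each half.
    for i in range(28):
        relation = "Left" if i % 2 == 0 else "Right"
        kind = "contrast" if i < 14 else "non-contrast"
        displays.append({"relation": relation, "kind": kind})
    # 56 filler displays: 14 per filler relation.
    for relation in ("Front", "Behind", "In", "On"):
        displays.extend({"relation": relation, "kind": "filler"} for _ in range(14))
    # Each Ground object appears 4 times, so 84 displays need 21 Grounds.
    # The labels are hypothetical placeholders, not the actual stimuli.
    for idx, display in enumerate(displays):
        display["ground"] = f"ground_{idx % 21:02d}"
    return displays


def order_is_valid(order):
    """Check the two ordering constraints stated in the text."""
    # The same Ground object never appears twice in a row.
    for prev, cur in zip(order, order[1:]):
        if prev["ground"] == cur["ground"]:
            return False
    # The same target relation appears at most twice in a row.
    for a, b, c in zip(order, order[1:], order[2:]):
        if a["relation"] == b["relation"] == c["relation"]:
            return False
    return True


def constrained_shuffle(displays, rng, max_tries=100_000):
    """Reshuffle until both constraints hold (simple rejection sampling)."""
    order = list(displays)
    for _ in range(max_tries):
        rng.shuffle(order)
        if order_is_valid(order):
            return order
    raise RuntimeError("no valid order found within max_tries")


if __name__ == "__main__":
    trial_order = constrained_shuffle(build_displays(), random.Random(0))
    print(Counter(d["kind"] for d in trial_order))
```

Rejection sampling is used here only because it is easy to verify; a greedy or backtracking construction would serve equally well.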

Procedure
The description and familiarization tasks were originally designed as part of an eye-tracking experiment; however, for the purpose of this paper, we report only the description data.

Description task
Participants were presented with the description task after the familiarization task was completed (see details below). Trials started with a fixation cross (2000 ms), followed by a display of 4 pictures (1000 ms). Next, an arrow appeared (for 500 ms) to indicate the target item and then disappeared, and the 4 pictures remained on the screen (2000 ms) until visual white noise appeared. Participants were instructed to describe the target picture to an addressee sitting across the table immediately after the appearance of the visual white noise. This was done to prevent children from pointing towards the screen to show the pictures or objects in a picture while describing. Participants were instructed that the addressee would choose the target picture on her tablet based on the participant's description. They were also aware that the addressee had the same 4 pictures but in a different arrangement in the display and without the arrow. The addressee was a confederate and pretended to choose a picture on her tablet based on the participant's description. Participants moved to the next trial by pressing the ENTER key on the keyboard. Having an addressee, albeit as a confederate, was especially important considering previous reports on children's tendencies to be under-informative in the presence of an inattentive listener or in the absence of a listener (Bahtiyar & Küntay, 2009; Girbau, 2001; Grigoroglou & Papafragou, 2019b). See Figure 5 for the timeline of a trial in the description task.
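As a quick check on the trial timeline, the phase durations reported above can be laid out as data and converted to onset times. This is only an illustrative sketch: the experiment itself was run in Presentation, and the phase names below are hypothetical labels rather than identifiers from the original script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str          # hypothetical label for the trial phase
    duration_ms: int   # duration as reported in the Procedure section


# Phases shown before the participant starts describing the target picture.
TRIAL_PHASES = (
    Phase("fixation_cross", 2000),
    Phase("four_pictures", 1000),
    Phase("arrow_on_target", 500),
    Phase("four_pictures_after_arrow", 2000),
)


def onset_times(phases):
    """Return the onset (in ms from trial start) of each phase and of the white noise."""
    onsets = {}
    elapsed = 0
    for phase in phases:
        onsets[phase.name] = elapsed
        elapsed += phase.duration_ms
    # Visual white noise follows the last phase and cues the description.
    onsets["visual_white_noise"] = elapsed
    return onsets


if __name__ == "__main__":
    for name, onset in onset_times(TRIAL_PHASES).items():
        print(f"{name}: {onset} ms")
```

Under these durations the white noise, and hence the cue to start describing, begins 5500 ms after trial onset.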
At the beginning of the description task, participants engaged in practice trials (n = 3), and these trials were repeated if necessary. During practice trials, when participants failed to understand the task instructions, the experimenter repeated them. Both during the practice trials and throughout the experiment, the addressee did not give feedback on whether or not the description was correct, in order to avoid biasing the responses in the upcoming trials, but pretended to have found the right picture. When there was missing spatial information in a participant's description, the addressee only asked for the location of the Figure object. In such cases, speaking participants were asked for the location of the Figure object in Turkish (e.g., Kalem nerede? 'Where is the pencil?'), and signing participants were asked for the location of the Figure object in TİD using the lexical sign WHERE and the lexical sign of the Figure object found in the target picture. In order to provide consistent feedback, no other instructions were given to the participants. The addressee asked for such a clarification only once. Even if participants provided a description with missing spatial information in the second round, the addressee did not ask for further clarification and pretended to choose a picture on her tablet. Moreover, we did not provide explicit instructions for speaking participants to gesture or not. Thus, all of the gestures were spontaneously produced. Hearing adult speakers of Turkish were present as an addressee and as an experimenter for speaking participants, and deaf adult signers of TİD were present as an addressee and as an experimenter for signing participants.

Familiarization task
The familiarization task was introduced before the description task. This task aimed to introduce the general complexity of the displays with the 2 x 2 grid with two objects in various spatial configurations to each other. Participants were randomly presented with one of the two sets of displays that they did not receive during the description task.

Corsi Block Tapping Task
Participants received the computerized version of the Corsi Block Tapping Task in forward order. We administered this task to ensure that speaking and signing participants have similar spatial working memory spans (Corsi, 1972). This was especially important as previous studies showed mixed evidence for the visuospatial abilities across signers and speakers (see Emmorey, 2002; Marshall, Jones, Denmark, Mason, Atkinson, Botting, & Morgan, 2015).
All tasks (Familiarization Task, Description Task, and Corsi Block Tapping Task) were administered on a Dell laptop using Presentation software (NBS 16.4; Neurobehavioral Systems, Albany, CA). Instructions for the tasks were given orally to speaking participants and in sign to signing participants, in order to avoid misunderstandings of written instructions by signers. We applied the same procedure to speakers to keep the experimental procedure identical across groups. The description task was video-recorded from the front and side-top angles to allow for speech, sign, and gesture coding.

Annotation and coding
All descriptions produced in the description task were annotated for Target Pictures. Descriptions with speech, gesture, and sign were coded using ELAN (Version 4.9.3), a free annotation tool (http://tla.mpi.nl/tools/tla-tools/elan/) for multimedia resources developed by the Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands (Wittenburg, Brugman, Russel, Klassmann, & Sloetjes, 2006).
We coded descriptions in speech, gesture, and sign. Next, we formed informativeness categories, first based on the information conveyed in speech alone, then by considering co-speech gestures together with speech, and finally in sign. We operationalized informativeness as whether a participant's description provided a uniquely referring expression that distinguished the spatial relation between the objects in the target picture from the other pictures in the display. In almost all descriptions, participants used their own perspective in encoding the spatial relations between objects.

Speech
Speech data were annotated and coded by the first author, who is a hearing native speaker of Turkish. We did not conduct reliability coding for speech because speech coding involved the presence of spatial terms (e.g., Left) that were unambiguously heard and identified by a hearing native speaker of Turkish. We grouped participants' descriptions into two categories based on whether the linguistic form used to encode the spatial relation was informative in uniquely identifying the target picture when only speech was considered.
(1) Informative in Speech: This category consisted of descriptions that included the spatial terms Sol 'Left' or Sağ 'Right'. The specific spatial terms used in these descriptions provided uniquely referring expressions that distinguished the target picture (Figure 6a).
(2) Under-informative in Speech: This category consisted of all remaining descriptions, as they failed to provide enough information to uniquely identify the target picture. These descriptions included the following sub-categories:
(2a) Descriptions with a general relational term (Yan 'Side'; Figure 6b) that failed to provide uniquely referring information distinguishing the actual spatial relation (e.g., which object is on the left side and which object is on the right side).
(2b) Descriptions with specific spatial terms other than Left-Right (e.g., Ön 'Front'; Figure 6c). These were especially frequent in children's descriptions, as children tended to encode Left-Right with other spatial relations, especially Front (Abarbanell & Li, 2021; Sümer et al., 2014). We did not want to treat these descriptions as incorrect since we could not be sure whether children used these spatial terms because they had difficulty mapping Left-Right spatial terms onto Left-Right relations. A few cases also included spatial terms other than Front to describe target pictures in Left-Right spatial relations. Based on our definition of informativeness, none of the descriptions that encoded Left-Right with other spatial terms was informative enough for the addressee to pick the correct picture on her tablet, and they were thus considered under-informative.
(2c) Descriptions with a missing spatial relation, where participants only labeled the objects but not the spatial relation between them (Figure 6d).
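The speech coding scheme above can be summarized as a small decision procedure. The following is a minimal sketch, not the coding tool used in the study: the function name, the lower-case term sets, and the label strings are hypothetical, and the precedence (any Left-Right term makes the description informative, and Side is checked before other terms) is an assumption made for illustration.

```python
# Illustrative sketch of the speech coding scheme (hypothetical helper, not the authors' tool).
LEFT_RIGHT_TERMS = {"sol", "sağ"}  # Sol 'Left', Sağ 'Right'
GENERAL_TERMS = {"yan"}            # Yan 'Side'


def code_speech(spatial_terms):
    """Return (category, sub_category) for the spatial terms found in one spoken description."""
    terms = {term.lower() for term in spatial_terms}
    if terms & LEFT_RIGHT_TERMS:
        # Assumption: a Left-Right term is sufficient for a uniquely referring expression.
        return ("Informative in Speech", None)
    if not terms:
        return ("Under-informative in Speech", "2c: missing spatial relation")
    if terms & GENERAL_TERMS:
        return ("Under-informative in Speech", "2a: general relational term")
    return ("Under-informative in Speech", "2b: other specific spatial term")


if __name__ == "__main__":
    for example in (["sol"], ["yan"], ["ön"], []):
        print(example, "->", code_speech(example))
```

Each description is reduced to the set of spatial terms it contains, so descriptions that only label the objects are passed in as an empty list.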

Speech and gesture
We further coded spontaneous co-speech gestures, identified by their strokes (see Kita, van der Hulst, & van Gijn, 1998), that conveyed information about the location of the two objects or the spatial relation between them. We did not take into account other types of gestures, such as beat gestures. This coding was done per description, regardless of the type of speech used in the description. To ensure reliability, 25% of the gesture data (5 children and 5 adults) were coded by another hearing native speaker of Turkish. There was substantial agreement between the coders on the type of spatial gesture used to localize the Figure (88% agreement, kappa = 0.77) and Ground (92% agreement, kappa = 0.79) objects. All disagreements were discussed until 100% agreement was reached.
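For readers less familiar with the two reliability measures reported above, the sketch below shows how percent agreement and Cohen's kappa relate. It is an illustration only: the function names and the eight toy codes are made up and are not the study's data.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of items on which two coders assigned the same label."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    observed = percent_agreement(codes_a, codes_b)
    n = len(codes_a)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of each coder's marginal proportions, summed over labels.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy gesture-type codes from two hypothetical coders (not the study's data).
    coder_1 = ["pointing"] * 4 + ["placement"] * 4
    coder_2 = ["pointing"] * 3 + ["placement"] * 4 + ["pointing"]
    print(percent_agreement(coder_1, coder_2))
    print(cohens_kappa(coder_1, coder_2))
```

Because kappa discounts agreement expected by chance, it is lower than the raw agreement percentage, which is the pattern seen in the values reported above.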
For each description, we coded gestures separately for Figure and Ground objects.
(1) Informative in speech: This category only involved Left-Right spatial terms as described above. Some of the descriptions with Left-Right spatial terms also included accompanying spatial gestures. However, for these descriptions, spatial gestures did not add to the informativeness of the description and were considered redundant. That is, speech was already informative even without considering the gestures. Thus, they did not form a new category (see Figure 7).
(2) Informative in speech-plus-gesture: This category consisted of descriptions that included the general spatial term Yan 'Side' in speech together with spatial gestures. In these descriptions, the spatial information missing from speech with Yan 'Side' was conveyed via spatial gestures (see Figures 3 and 8 for examples). Thus, these descriptions were informative only when the spatial gestures were considered.
(3) Under-informative even when gestures are considered: This category consisted of descriptions with specific spatial terms other than Left-Right (e.g., Front; Figure 6c) as well as descriptions with a missing spatial relation, where participants only labeled the objects but not the spatial relation between them (Figure 6d). These descriptions remained under-informative even when gestures were considered together with speech. That is, gestures did not add to the informativeness of the description beyond speech.
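The three multimodal categories can likewise be read as a mapping from the speech form and the presence of a spatial gesture to a category. The sketch below is a hypothetical illustration: the function name and the four speech-form labels are assumptions, and treating Side without any spatial gesture as under-informative is an inference from the definitions above rather than something the text states explicitly.

```python
# Illustrative sketch of the multimodal categories (hypothetical helper and labels).
SPEECH_FORMS = {"left-right", "side", "other", "missing"}


def code_multimodal(speech_form, has_spatial_gesture):
    """Map one description's speech form and gesture use onto a multimodal category."""
    if speech_form not in SPEECH_FORMS:
        raise ValueError(f"unknown speech form: {speech_form!r}")
    if speech_form == "left-right":
        # Speech is already informative; any accompanying gesture is redundant.
        return "Informative in speech"
    if speech_form == "side" and has_spatial_gesture:
        # The gesture supplies the Left-Right information missing from Side.
        return "Informative in speech-plus-gesture"
    # Assumption: Side without a spatial gesture is grouped here as well.
    return "Under-informative even when gestures are considered"


if __name__ == "__main__":
    for form, gesture in [("left-right", True), ("side", True), ("side", False), ("other", True)]:
        print(form, gesture, "->", code_multimodal(form, gesture))
```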

Sign
Sign data were annotated by a hearing L2 signer of TİD and coded by another hearing L2 signer of TİD. Annotations and coding were checked by a trained native deaf signer of TİD. We did not conduct reliability coding for sign because the final dataset included only the linguistic forms that were unambiguously approved by this signer. We coded descriptions for the presence of a spatial relation between the objects and for the type of linguistic form used to localize the Figure object in relation to the Ground object. Signers used five different linguistic forms. These included classifier constructions (Figure 2d), which are one of the most common forms for localizing the Figure object in relation to the Ground object in sign languages in general (Emmorey, 2002) and also in TİD (Arık, 2003; Karadöller et al., 2021; Sümer, 2015; Sümer et al., 2014). Classifier constructions allow signers to encode information about the entities through handshapes that classify the objects (e.g., Emmorey, 2002; Janke & Marshall, 2017; Manhardt et al., 2020; Perniss et al., 2015a; Zwitserlood, 2012). Signers also used four other forms: relational lexemes, which are the lexical signs for spatial terms used in sign languages (Arık, 2003; Manhardt et al., 2020; Figure 9); tracing the shape of the Figure object in the signing space (Karadöller et al., 2021; Figure 10); pointing to the location of the Figure object in the signing space (Karadöller et al., 2021); and placing a lexical verb to locate an object in the signing space (see Karadöller, 2021).
We grouped participants' descriptions into two categories in terms of whether or not the description was uniquely informative (i.e., which object is where relative to the other based on diagrammatical iconicity) in identifying the target picture depending on the linguistic form that is used to encode the spatial relation in sign.
(1) Informative in Sign: This category included all of the linguistic forms mentioned above, as they described the exact spatial relation between the objects and distinguished the target picture uniquely from the other pictures in the display.
RH: CUP RULER LEFT
LH: ----RULER ----
'There is a cup. There is a ruler. The ruler is to the left.'
(2) Under-informative in Sign: This category included descriptions with an incorrect spatial relation (e.g., describing the pen as in front of the paper although the target picture showed the pen to the left of the paper) and descriptions with a missing spatial relation (e.g., only labeling the Figure and Ground object but not the spatial relation of the Figure object in relation to the Ground object; Figure 11).

Results
Data presented in this section were analyzed using generalized mixed-effects logistic regression modeling (glmer) with random intercepts for Subjects and Items. This mixed-effects approach allowed us to take into account the random variability due to having different participants and different items. All models were fit with the lme4 package (version 1.1.17; Bates, Mächler, Bolker, & Walker, 2014) in R (version 3.6.3; R Core Team, 2018) with the optimizer bobyqa (Powell, 2009). We did not include random slopes in any of the models because all of our models tested between-subjects effects that cannot be added as random slopes.
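As a reading aid, the structure of such a random-intercept logistic model can be written out as follows. The notation is assumed here for illustration and is not taken from the original analysis scripts: y_ij is the binary informativeness code for participant i on item j, x_i is a contrast-coded between-subjects predictor, and u_i and w_j are the random intercepts for Subjects and Items.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(y_{ij} = 1) &= \beta_0 + \beta_1 x_i + u_i + w_j,\\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad w_j \sim \mathcal{N}(0, \sigma_w^2).
\end{aligned}
```

In lme4 formula notation this would correspond roughly to y ~ x + (1 | Subject) + (1 | Item) with a binomial family. Models with two predictors add a second main effect and the interaction term to the fixed part.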

Multimodal descriptions
First of all, we tested whether children's spatial gestures convey information that is missing from Under-informative speech more often than adults' gestures do. To test this, we compared the frequency of descriptions that were Informative in speech-plus-gesture across children and adults (see light green bars in Figure 12). We used a glmer model to test the fixed effect of Age Group (Children, Adults) on whether the descriptions were Informative in speech-plus-gesture (1) or not (0) at the item level. The fixed effect of Age Group was analyzed with centered contrasts (-0.5, 0.5). The model revealed a fixed effect of Age Group (β = 2.94, SE = 0.70, p < 0.001): children (Mean = 0.45; SD = 0.46) produced descriptions that were Informative in speech-plus-gesture more frequently than adults (Mean = 0.08; SD = 0.26).
Finally, we investigated whether the frequency of informative descriptions differed across modalities and age groups (see light and dark green bars compared to yellow bars in Figure 12). We used a glmer model to test the fixed effects of Modality (Informative in speech and Informative in speech-plus-gesture combined versus Informative in sign) and Age Group (Children versus Adults), and the interaction between them, on binary values for the presence of an informative description (Present = 1, Absent = 0) at the item level. The fixed effects of Modality and Age Group were analyzed with centered contrasts (-0.5, 0.5). The model revealed a fixed effect of Modality (β = -1.30, SE = 0.37, p < 0.001): signers (Mean = 0.94; SD = 0.27) produced informative descriptions (in sign) more frequently than speakers (Mean = 0.85; SD = 0.36) did (in speech and speech-plus-gesture combined). The model also revealed a fixed effect of Age Group (β = 2.25, SE = 0.37, p < 0.001): adults (Mean = 0.97; SD = 0.18) produced informative descriptions more frequently than children (Mean = 0.82; SD = 0.38). There was no interaction between Modality and Age Group (β = -1.12, SE = 0.74, p = 0.127).
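Because the two levels of a centered contrast (-0.5, 0.5) are one unit apart, a coefficient from these models can be read as the difference in log-odds between the two groups. The sketch below illustrates this reading for the Age Group coefficient of the first model (β = 2.94, SE = 0.70). The helper names are hypothetical, the derived quantities are not reported in the text, and the odds ratio is conditional on the random effects, so this is an illustration of the scale rather than an additional result.

```python
import math


def odds_ratio(beta):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(beta)


def wald_z(beta, se):
    """Wald z statistic: the coefficient divided by its standard error."""
    return beta / se


if __name__ == "__main__":
    # Reported Age Group effect for Informative in speech-plus-gesture descriptions.
    age_beta, age_se = 2.94, 0.70
    print("odds ratio:", round(odds_ratio(age_beta), 1))
    print("Wald z:", round(wald_z(age_beta, age_se), 2))
```

On this reading, the coefficient corresponds to an odds ratio of roughly 19 between the two age groups. Which group is coded as +0.5 is not stated in the text, so the direction has to be taken from the reported means.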

Discussion
In this study, we investigated whether the modality of expression influences the informativeness of children's and adults' Left-Right expressions. To our knowledge, this is the first study to investigate the adult-like uses of Left-Right expressions from a multimodal perspective by considering spatial co-speech gestures and comparing descriptions in sign not only to speech but also to speech-gesture combinations. Overall, the results showed that, for speaking-children, gestures increase the informativeness of spatial descriptions compared to speech alone. However, sign expressions were still more informative even when gestures were considered. Finally, across both modalities children were less informative than adults.
Sign has an advantage over speech both for children and adults
Unimodal comparisons across modalities revealed that signers produced informative descriptions more frequently than speakers regardless of age group. This can be attributed to the facilitating effect of iconicity, whereby sign language expressions provide more information than expressions in speech. Nevertheless, the fact that expressions in sign were more informative than those in speech should not be taken to indicate that signers have a more developed conception of Left-Right relations between objects than speakers do. Rather, it implies that the iconic affordances of sign language expressions allow for more direct encoding and thus increase the informativeness of the expression (see Sümer, 2015; Slonimska, Özyürek, & Capirci, 2020; Taub, 2001 for adults). These findings contribute to our understanding of the development of spatial language use in speech and sign. That is, spatial language development may depend on several linguistic factors that vary across languages (see Johnston & Slobin, 1979). Despite these linguistic differences between TİD and Turkish, signing- and speaking-children encode In-On-Under at similar ages (Sümer & Özyürek, 2020). By contrast, they differ in encoding Left-Right. This difference points to a complex interplay between the modality of expression and the cognitive and linguistic development of space.
Gesture enhances informativeness of spoken expressions and more for children than for adults
When we considered speakers' multimodal descriptions, we found that both children and adults used spatial gestures that disambiguated descriptions with Side. This trend was more prominent for children than for adults. This is reminiscent of previous findings from other domains showing that gestures can help clarify the meaning of Side. For instance, Cook, Duffy, and Fenn (2013) showed that using gestures to refer to the two sides of an equation when teaching math operations helps children solve math equations more accurately. It seems that the use of gestures is an important tool for clarifying under-informative speech (see also Kelly, 2001). Moreover, we extend this finding to situations where participants were not explicitly instructed to gesture (Sauter et al., 2012), since all of the gestures elicited in our study were spontaneous. Here, the frequent use of spontaneous gestures by Turkish speakers could reflect Turkish being a high gesture culture (Azar et al., 2020), which raises possibilities for further investigations in other high (e.g., Italian) and low (e.g., Dutch) gesture cultures.
Moreover, we showed that children could communicate Left-Right spatial relations between two objects informatively through co-speech gestures before communicating them informatively in speech. This corroborates previous literature on gestures preceding speech, which has already been established for several other domains (Alibali & Goldin-Meadow, 1993; Church & Goldin-Meadow, 1986; Perry et al., 1992; Sauter et al., 2012). It seems that by age 8, children have some conceptual understanding of Left-Right spatial relations, yet they fail to map arbitrary/categorical linguistic forms in speech onto these conceptual representations. In these instances, gestures could act as a medium for representing already established spatial concepts that fail to surface in speech. Such instances were very rare in adult language, as adults could already map Left-Right spatial terms onto these concepts. Together, our findings highlight the importance of considering children's multimodal encodings in assessing their pragmatic (i.e., informativeness; Grigoroglou et al., 2019) and cognitive development (Hermer-Vazquez, Moffet, & Munkholm, 2001).
Signed descriptions are more informative even when gestures are considered both for children and adults
When spatial gestures were considered together with speech and compared to expressions in sign, signers continued to be more informative than speakers. This can be attributed to sign languages having iconic expressions as obligatory and conventional linguistic forms (Brentari, 2010; Emmorey, 2002; Klima & Bellugi, 1979). Conversely, co-speech gestures are used flexibly and only as a composite system together with speech (Kendon, 2004; McNeill, 1992, 2005; see Perniss et al., 2015b for a discussion). Differences in the way co-speech gestures and signs are used during acquisition might underlie the sign advantage.
Visual modality conveys more information than speech alone
The enhanced informativeness of both sign and gesture compared to speech adds further evidence for the importance of the body's interaction with the world in shaping language and cognition (embodied cognition; Chu & Kita, 2008; Goldin-Meadow, 2016). It seems that children map their bodies' interaction with the spatial relations between objects more easily onto iconic gestures/signs than onto speech (see Sümer, 2015 for a discussion). Both signers and speakers described the spatial relation between objects predominantly from their own perspective. In those descriptions, participants might have mentally aligned the locations of the objects that they saw on the computer screen with the left and right sides of their body. This alignment might have made it easier to encode the object locations in the sign/gesture space by placing the hands than to map them onto abstract spatial terms in speech. Moreover, signers used specific linguistic forms that are body-anchored (i.e., relational lexemes), although these were few in frequency compared to classifiers. These signs have been found to facilitate learning to encode spatial relations earlier in sign than in speech, especially for Left-Right (Sümer, 2015; Sümer et al., 2014). Thus, having body-anchored lexical signs in the sign language lexicon might have allowed signers to encode more information about Left-Right spatial relations between objects using their own body as a reference. Overall, the visual modality allows a more direct mapping of visual/bodily experience onto linguistic labels, which results in more informative descriptions of Left-Right relations between objects.
Left-Right remains a challenging spatial domain for children even when visual modality is considered
Contrary to our initial expectation, developmental differences in the informativeness of Left-Right expressions did not disappear even when we considered the visual modality of expression in sign or co-speech gestures. Children were less informative than adults in both unimodal and multimodal descriptions. This suggests an intricate interplay between language development, cognitive development, and the visual modality of expression. On the one hand, spatial gestures contribute to the informativeness of descriptions more for children than for adults. On the other hand, this contribution is not sufficient for speaking-children to reach adult levels of informativeness. Together, these results support the general claim that Left-Right is challenging for children regardless of the modality of communication (Abarbanell & Li, 2021; Clark, 1973; Rigal, 1994, 1996; Sümer, 2015; Sümer et al., 2014).
It is possible to attribute the differences between children and adults to the development of the pragmatic knowledge required to provide informative descriptions. One way to investigate whether children and adults differ due to differences in pragmatic knowledge (as opposed to ease of encoding with iconic expressions) is to compare contrast and non-contrast trials. However, participants did not change the way they described the pictures between contrast and non-contrast displays. Another way would be to use a non-confederate addressee and investigate various description/selection instances. For example, investigating possible changes in participants' description patterns after an incorrect picture selection by the addressee could reveal important insights into the development of children's pragmatic knowledge compared to that of adults. This can be investigated in future research.

Future directions
First of all, the current study focused on manual articulators in sign and co-speech gestures in order to maintain similarity to previous research on the development of encoding space. In addition to the manual articulators, head/torso movements and eye-gaze direction may contribute to our understanding of the role of multimodal communication. We call for further research to establish systematic and conventional ways of integrating those aspects into the investigation of the development of multimodal communication.
Secondly, the current study did not directly assess participants' actual knowledge of Left-Right terms, which could be used to strengthen our claims about the modulating effect of the visual modality on the acquisition of Left-Right language.
Finally, even though our findings on spatial language use suggest an advantage for sign over speech or speech-gesture combinations, this advantage seems to be limited to tasks in which language is explicitly used. We did not, for instance, find differences in visual-spatial working memory between signers and speakers. Similarly, we did not find differences in participants' spatial memory accuracy when we asked them to remember the picture that had been described in different modalities (Karadöller et al., 2021, see also Karadöller, 2021 for a discussion). Together, these findings call for investigations of other indices of cognition, such as visual attention during the planning of a description, for which some research has demonstrated variations based on modality (see Manhardt et al., 2021).

Conclusion
In summary, the visual modality of expression, in sign or gesture, can modulate the development of spatial language use in signers and speakers. However, the facilitating effect of sign in conveying informative spatial descriptions was stronger than that of co-speech gestures. Having obligatory and conventional iconic expressions as linguistic forms in sign languages, unlike co-speech gestures that are used flexibly as composite utterances with speech, might have facilitated the development of informativeness in signers' Left-Right descriptions. Finally, both signing- and speaking-children were less informative than adults even with the advantage of the visual modality allowing iconic descriptions. This corroborates earlier claims pointing to the challenge of this spatial domain in conceptual and linguistic development (Clark, 1973; Johnston, 1985, 1988). The results of the present study call for investigations in other languages, bilinguals, and cultures (e.g., low-gesture cultures) and on different aspects of language development to unravel how cognitive and linguistic (e.g., modality of expression) factors interact and determine the outcomes for developmental milestones.

Figure 2. Informative description from a TİD signer by using a classifier construction in encoding the spatial relation between the cup and the toothbrush.

Figure 5. Timeline of a trial in the description task.

Figure 6. Examples from Turkish speakers describing the spatial relation between the pencil and the cup using (a) Left-Right spatial terms, (b) general relational term Side, (c) spatial terms other than Left-Right, (d) missing encoding of spatial relation between the objects.
Figure and Ground objects. These gestures included either directional pointing gestures indicating the location of the Figure or Ground object in an analogue way (Figures 3 and 7) or iconic hand placement gestures indicating the location of the Figure and/or Ground object in the gesture space (Figure 8). Both spatial gesture types, like the linguistic structures used in sign to represent space, give spatial information about the Left-Right relations between objects from the viewpoint of the speaker and help identify the target picture uniquely from the other referents in the display. As a next step, we considered these spatial gestures on top of what had been conveyed in speech and redefined the informativeness categories for speakers, creating multimodally informative categories:

Figure 7. Informative in Speech description from a Turkish speaker using a specific spatial term (Left) together with a directional pointing gesture to the left. Note. Underlined words denote the speech that the gesture overlapped with. The description is informative even when only speech is considered.

Figure 8. Informative in speech-plus-gesture description from a Turkish speaker using a general spatial term (Side) together with iconic hand placement gestures. Notes. Participant introduced gestures sequentially. Gesture indicating the basket (RH) was performed when the participant mentioned the basket in her speech. Gesture indicating the newspaper (LH) was performed when the participant mentioned the newspaper. Both gestures remained in the gesture space until the end of the sentence. The description is informative only when information in both speech and gestures combined is considered.

Figure 11.Under-informative description in sign with missing spatial relation between objects.

Figure 9. Informative in sign description from a TİD signer by using a relational lexeme for Left in encoding the spatial relation between the cup and the ruler.

Figure 10.Informative in sign description from a TİD signer by tracing the shape of the Figure object on the signing space in encoding the spatial relation between the cup and the ruler.
Figure and Ground object but not the spatial relation of the Figure object in relation to the Ground object, Figure 11) spatial relation.
used a glmer model to test the fixed effects of Modality (Informative in speech versus Informative in sign) and Age Group (Children versus Adults), and an interaction between them on binary values for the presence of informative descriptions (Present = 1, Absent =
[Figure legend: Informative in Speech; Informative in Speech-plus-gesture; Informative in Sign]

Figure 12. Proportion of Informative descriptions across Age Groups and Modality.