No effects of modality in development of locative expressions of space in signing and speaking children

Abstract Linguistic expressions of locative spatial relations in sign languages are mostly visually motivated representations of space involving mapping of entities and spatial relations between them onto the hands and the signing space. These are also morphologically complex forms. It is debated whether modality-specific aspects of spatial expressions modulate spatial language development differently in signing compared to speaking children. In a picture description task, we compared the use of locative expressions for containment, support, and occlusion relations by deaf children acquiring Turkish Sign Language and hearing children acquiring Turkish (age 3;5–9;11). Unlike previous reports suggesting a boosting effect of iconicity, and/or a hindering effect of morphological complexity of the locative forms in sign languages, our results show similar developmental patterns for signing and speaking children's acquisition of these forms. Our results suggest the primacy of cognitive development guiding the acquisition of locative expressions by speaking and signing children.


Introduction
Humans can perceive as well as talk about various aspects of the spatial world around them. Learning to talk about spatial relations is a pivotal milestone of language development as children early on need to communicate about the relations between the entities in their physical surroundings with their caregivers (e.g., to ask for a cookie on the table; a toy in the basket, etc.). To do so, children need to figure out the appropriate spatial linguistic forms available in their languages. However, which aspects of the language map onto what type of spatial relations and in what ways (e.g., lexical, grammatical) vary not only across spoken languages but also between sign and spoken languages (Emmorey, 2002;Talmy, 2003;Levinson & Wilkins, 2006). This paper investigates how children acquiring a sign (Turkish Sign Language; TİD) versus a spoken language (Turkish) tune into the language-specific encodings of their respective languages and whether this is modulated by the modality of the language being acquired (visual/spatial vs. auditory/vocal).
Spatial relations in speech are expressed by arbitrary and categorical form-meaning mappings such as spatial prepositions like in, on for English or morphological case markers (-de/da 'at'), which can be attached to object nouns (e.g., masa-da 'at the table'), or to spatial nouns (üst-'top') together with a genitive case marker, which then functions as a postposition (e.g., üst-ün-de 'on'). However, spatial expressions in sign languages differ from spoken languages since they have modality-specific aspects. They employ iconic mappings of spatial relations between entities in the signing space within systematic linguistic constraints, including complex morphology (Emmorey, 2002;Perniss, 2007;Perniss, Zwitserlood, & Özyürek, 2015). Speakers can also express visually motivated representations in spatial communication through iconic co-speech gestures (e.g., McNeill, 1992;Kita & Özyürek, 2003;Goldin-Meadow, 2004). However, we here focus on the influence of the main language modality, i.e., sign versus speech, on encoding spatial relations by children and adults. Thus, the use of co-speech gestures is beyond the scope of the current study.
The modality-specific differences between sign and spoken languages in the domain of spatial expressions raise interesting questions about the developmental trajectory of spatial language in general and the role of modality of expression in language development more specifically. While one can assume iconicity in mapping spaceto-space can boost spatial language development for signing children, it has been claimed that the morphological complexity of the language forms to encode spatial relations in sign languages (Newport, 1981;Supalla, 1982;Perniss, Pfau, & Steinbach, 2007;Zwitserlood, 2012) might actually hinder the acquisition of spatial language for signing children. The requirement to map spatial relations between two entities in signing space using simultaneous articulators as well as the complex linguistic constraints in such constructions have been claimed to be the main source of challenge for signing children, and their mastery may take until eleven to twelve years of age (Kantor, 1980;Supalla, 1982;Schick, 1990;Engberg-Pedersen, 2003;Slobin et al., 2003;de Beuzeville, 2006;Tang, Sze, & Lam, 2007). Thus, these findings might suggest a possible delay in the expression of space by signing children compared to speaking children. Previous research has argued that Turkish-speaking children produce locative expressions in their languages as young as two years of age. For example, in an earlier study by Johnston and Slobin (1979), Turkish children between ages 2;0 and 4;4 were reported to describe various spatial scenes, similar to those used in our study. However, this study lacks adult data. Thus we do not know for sure if these children used linguistic forms in adult-like ways in their descriptions. Furthermore, they did not report about the type of locative constructions produced (i.e., adposition versus general locative case marker). In another study, Aksu-Koç (1994) reports the development of linguistic forms by examining the narrations of Frog Story by Turkish speaking children (ages 3;5-9;0) and adults. However, due to the use of a picture story and the nature of the task, the elicited narrations mostly include motion event descriptions, thus focusing mainly on the expression of motion verbs, which may or may not be accompanied by locative modifications.
In this study, we address the question of whether the modality of the language being acquired (sign versus spoken) modulates the course of learning to express locative spatial relations by comparing signing and speaking children's and adults' productions of the same pictures featuring locative relations between two objects. Specifically, we focus on spatial relations such as containment, support/contact, and occlusion (also named as 'topological' relations; Levinson, 1996). The extensive cross-linguistic research on the acquisition of these spatial relations for spoken languages has shown that they are acquired earlier than other types of spatial relations, such as those requiring, for example, perspective taking (H. Clark, 1973;Johnston & Slobin, 1979). Locative relations are claimed to be cognitively most salient and basic for children (Casasola, Cohen, & Chiarello, 2003;Hespos & Baillargeon, 2008;Park & Casasola, 2015). Despite the substantial amount of knowledge on the acquisition of these spatial relations by speaking children, no research to date has compared the development of language forms for these relations specifically between signing and speaking children.
Most previous research has compared sign and spoken languages in learning to encode motion events, which are more complex than encoding locative spatial relations. Locative relations consist of Figure, Ground, and the relation between them, while motion events consist of more components such as Figure, Ground, Path, Manner, and Motion (Talmy, 1985). Thus, even though previous research has found development of motion event descriptions to be delayed in signing children compared to speaking children, when it comes to locative relations, one might either see the facilitative effect of iconicity in their acquisition or there might not be any differences between signing and speaking children in this domain due to the fact that locative relations are simpler than motion events.
Here we report for the first time a comparison of the development of locative spatial relations in deaf children acquiring TİD (natively from their deaf parents) and hearing children acquiring Turkish. A cross-modality comparison with less-studied sign and spoken languages is relevant for and can contribute to our understanding of which aspects of spatial language acquisition are universal and thus driven by general learning principles, and which ones are more of a result of modality-specific tuning.
Before moving into the specifics of the study, we will briefly review the variation in the linguistic encoding possibilities of locative relations in spoken and sign languages, while also providing more specific information for Turkish and TİD. Later, we discuss previous theoretical and empirical evidence for the constraining role of cognitive versus language-specific factors such as modality in the development of spatial language.
Encoding locative spatial relations in spoken languages Linguistic expressions of locative spatial relations require distinguishing between Figure objectsi.e., objects that are smaller compared to the other object(s) or that are in the focus of attentionand Ground objectsi.e., objects that are larger in comparison to the other object(s), or that are back-grounded in a spatial scene (Talmy, 1985). Linguistic devices for encoding a spatial relationship between Figure and Ground exhibit a great deal of variation across spoken languages, ranging from the use of elements from small inventories of closed-class forms (e.g., adpositions, case markers) to elements of large inventories of open-class forms (e.g., verbs, positionals) (Levinson & Wilkins, 2006). Many spoken languages use adpositions, rather than verbs, to encode the spatial relation between entities. Adpositions include prepositions (e.g., in, on, under in English) or morphological case markers, which mostly occur as postpositions attached to object nouns or spatial nouns (e.g., -de/da 'at' in Turkish). For example, to encode the spatial relation as depicted in Figure 1, English speakers can use the English preposition on, which indicates that the Figure (cup) is in a support/contact type of spatial relation with the Ground (table) (see example (1a)). In contrast, in Turkish a locative case marker (-de/ da 'at') can be attached to the noun expressing the Ground (as shown in (1b)) or to a spatial noun specifying a spatial aspect of the Ground noun (e.g., üst 'top'; alt 'under'), which conveys in a more semantically specific way the nature of the spatial relation between the Figure and the Ground (see (1c)). The spatial noun is further linked to the Ground noun with genitive and possessive case markers, which makes it morphologically more complex than the general locative case marker. Thus, Turkish is typologically different from many extensively studied European languages, especially in terms of the linguistic encoding of locative expressions, due to its morphological complexity (Johnston & Slobin, 1979;Aksu-Koç & Slobin, 1985). The developmental trajectory of learning different forms for locative expressions in Turkish has never been studied before, taking the full morphological structure as well as the optionality of expressions into account, and not in comparison to the acquisition of a sign language.
(1a) The cup is on the table.
(1b) Masa-da bardak var. Encoding locative spatial relations in sign languages In spite of fundamental similarities found in several domains of linguistic structure such as phonology, morphology, and syntax (Klima & Bellugi, 1979;Sandler & Lillo-Martin, 2006), modality differences give rise to a very different structuring of spatial expressions between sign and spoken languages. Sign languages use space to encode space in an iconic way, unlike spoken languages that employ arbitrary forms that label different types of spatial relations (Emmorey, 2002;Talmy, 2003;Emmorey, McColluogh, Mehta, Ponto, & Grabowski, 2013). The canonical structure of the locative constructions in sign languages requires signers to follow a relatively fixed order in which they introduce Ground first, which is followed by Figure, and these are mostly introduced by full noun phrases (e.g., lexical signs). The use of locative forms comes after the introduction of Ground and Figure (Boyes-Braem, Fournier, Rickli, Corazza, Franchi, & Volterra, 1990, for French Swiss Sign Language;Perniss, 2007, for German Sign Language [DGS]; Sümer, Zwitserlood, Perniss, & Özyürek, 2013, for TİD; but see also Boyes-Braem et al., 1990, for  In order to describe the spatial relation between Figure and Ground, signers mainly use linguistic forms that are analogue of the real spatial configuration. In these forms, called 'classifier constructions', the position and the movement of the hand(s) in the signing space communicate information about the location of the referent(s), and the classifiers are expressed by handshapes that classify entities by representing their salient characteristics, predominantly size and shape features (Supalla, 1982;Emmorey, 2002;Perniss, 2007;Özyürek, Zwitserlood, & Perniss, 2010;Zwitserlood, 2012;Perniss et al., 2015).
An example of a classifier construction used to describe a locative spatial relation is given in (2a) below for DGS (Perniss et al., 2015), where the signer expresses the spatial relation between the Figure (cup) and the Ground (table). The classifier on the signer's right hand represents the round shape of the cup; the classifier on her left hand represents the flat surface of the table. The signer localizes these classifiers in the signing space in a way that reflects the locations of the entities as she has seen them. Although not shown in the example below, before localizing Ground and Figure in the signing space with their classifiers, she introduces them by their lexical signs. Also note that the classifier constructions mostly require coordination of two hands simultaneously, as well as requiring the knowledge that certain handshapes should refer to referents in localizing them in space. These aspects, (handshapes and use of relative locations in space), albeit analogous to the spatial configuration to be encoded, can be considered as the complex morphological aspects of these linguistic expressions.

(2a)
'The cup is on the table.' Besides classifier constructions, another linguistic device for encoding spatial relations, called 'relational lexeme', has also been observed in some sign languages (Emmorey, 2002, for American Sign Language [ASL]; Perniss, 2007, for DGS;Özyürek et al., 2010;Arık, 2013, for TİD). In contrast to the use of classifier constructions, these are fixed forms with a specific meaning that categorically indicate the type of spatial relation, unlike the highly iconic forms, as described above, that use classifiers to map spatial relations to signing space. Handshapes in these forms do not denote the salient features of the entities involved. This makes relational lexemes morphologically less complex and also less iconic than classifier constructions. An example of a relational lexeme NEXT TO from DGS is given in (2b) below, in which a DGS signer moves his right hand sideways (Perniss et al., 2015).
(2b) 'next to' Note that, even though relational lexemes in sign languages do not include specific handshapes for entities, they are still more iconic about the spatial relations they depict than linguistic forms for spatial encoding in spoken languages. As in the example above, the DGS sign NEXT TO moves right from the side of the left hand, which represents the location of (possibly) the Ground object in the sign space. Relational lexemes are reported to be used by signers, however, more infrequently compared to the use of classifier constructions in many sign languages.
Factors shaping development of spatial language by signing and speaking children Cognitive development Within the domain of spatial language acquisition, two main guiding principles have been put forward in the literature. The first principle proposes that non-verbal conceptual development about space plays a major determining role in acquiring spatial relations. The evidence for this claim comes from studies conducted in a variety of spoken languages (Grimm, 1975, for German;Dromi, 1979, for Hebrew;Johnston & Slobin, 1979, for Italian, Serbo-Croatian, Turkish;Tomasello, 1987, for English;Vorster, 1984, for Afrikaans;E. Clark, 2004) showing that children's acquisition of locative expressions follows a similar sequence in which containment, contact/support, and occlusion (e.g., as expressed by in, on, and under, respectively, in English) appear earlier than other types of spatial relations such as left, right, front, and behind (H. Clark, 1973;Tomasello, 1987;Johnston, 1988;Bowerman & Choi, 2001;Loewenstein & Gentner, 2005). The finding that locative terms are apparently learned in a predictable order has been used to argue that general cognitive development is a strong driver that determines the similar process of acquisition of these terms in different languages.
It should be noted, however, that these studies have focused on the acquisition of the spatial terms (e.g., üst 'top', on) themselves, without taking the whole complexity of the morphological/syntactic structures that they occur in into account. It is not clear whether and how children can relate spatial terms to Figure and/or Ground in development. The few studies that have focused on the discourse of children have shown that children have difficulties in expressing spatial information in a larger discourse context, and cannot anchor the location of Figure with respect to Ground in adult-like ways until ten to twelve years of age (Ehrich, 1982;Lloyd, 1991;Hickmann, 2003).

Language specificity
Spatial situations can obviously be constructed in different ways in different languages, and these differences may have consequences for the acquisition of spatial encoding (Brown, 1994;Levinson, 1996). This brings us to the other hypothesis that claims the role of language-specific factors, rather than or in tandem with principles of cognitive development, for the acquisition of spatial language. In a series of studies, Bowerman (1996aBowerman ( , 1996b and Bowerman and Choi (2001) have argued that diversity in spatial semantic structuring imposes language-specific acquisition strategies, and that children's spatial semantic categories are language-specific from an early age onwards. For example, in their research, they found that English-and Korean-acquiring children tuned into specific patterns of their language as early as 14-16 months, despite the fact that these two languages highlight different aspects of spatial relations: locative relations of tight-containment (e.g., cork inserted into bottle) and loose-containment (e.g., ball in box) are grammaticalized in English with one linguistic form, i.e., in (thus ignoring the tight versus loose distinction), while Korean has separate forms, which are verbs, for these events, namely kkita and nehta, respectively. Evidence for early tuning to language-specific encodings was also found for children acquiring Tzotzil (Bowerman, de León, & Choi, 1995), Tzeltal and Hindi (Narasimhan & Brown, 2009), English, Danish, and Chinese (Sinha, Thorseng, Hayashi, & Plunkett, 1994). Based on these findings, language-specific factors have also been argued to modulate spatial language development in addition to the constraints of cognitive development.

Modality
Apart from general cognitive and language-specific encoding principles, previous research has questioned whether modality-specific aspects of sign language also modulate the development of spatial language. Findings of the earlier production studies conducted with children acquiring a sign language might be interpreted as posing a hindering effect of modality on the acquisition of spatial relations by signing children compared to speaking children despite the iconic correspondences between form and meaning in sign languages. These studies have shown that full mastery in learning locative devices that are mainly used to encode spatial relations, i.e., classifier constructions, takes beyond even seven to twelve years of age (Kantor, 1980;Supalla, 1982;Schick, 1990;Engberg-Pedersen, 2003;Slobin et al., 2003;de Beuzeville, 2006;Tang et al., 2007; but see Morgan, Herman, Barriere, & Woll, 2008, which reports the spatial language development in one British Sign Languageacquiring Deaf child). This delay has been mainly attributed to the challenges of representing Figure and Ground simultaneously (i.e., motoric difficulty in coordinating both hands) and choosing appropriate classifier handshapes in these forms (Newport & Supalla, 1980;Supalla, 1982;Newport & Meier, 1985;Engberg-Pedersen, 2003;Slobin et al., 2003;Tang et al., 2007).
In some studies, however, the effect of iconicity on the acquisition of classifier constructions has also been reconsidered. Despite the morphological complexity of the classifier constructions, some studies found that deaf children, as young as two years of age, were able to use classifiers to indicate the movement of their referents in signing spacealbeit with generic handshapes (Slobin et al., 2003). Although the production of the classifiers relative to all sign productions was low, these observations suggest a more prominent role of iconicity in the acquisition of classifier constructions than assumed in earlier studies. It is possible that the iconic form-meaning mappings in these forms might be encouraging deaf children to use these forms productively, but their mastery (i.e., using adult-like handshapes, simultaneity) may still take until ten to thirteen years of age.
It is important to note that morphological complexity has been discussed only in the domain of classifier constructions, and less is known with respect to the other, less complex and less iconic, linguistic forms such as relational lexemes, and their effects on the acquisition of spatial language. Additionally, most of the studies with signing children draw their conclusions primarily from indirect comparisons with English, where the linguistic forms of spatial expression are morphologically less complex than the classifier predicates in sign languages (Newport & Supalla, 1980;Supalla, 1982;Newport, 1988). The development of sign and spoken spatial expressions might be comparable when comparisons are made with the development of spatial expressions in spoken languages with more morphological complexity (e.g., Turkish).
Also, direct comparisons of development in spoken and sign language using the same tasks are necessary to see clearly whether and how modality might influence the language development process. Finally, previous studies examined the signing children's simultaneous expressions mainly in narrations or in spontaneous conversations (Kantor, 1980;Supalla, 1982;Schick, 1990;Engberg-Pedersen, 2003;Slobin et al., 2003;de Beuzeville, 2006;Tang et al., 2007;Morgan et al., 2008). There is thus no information on whether signing children have difficulty in expressing Figure and Ground simultaneously, together with the locative spatial relation in simpler descriptions.

The present study
The aim of this study is to compare the developmental trajectories in encoding locative spatial relationsnamely containment, support/contact, occlusionin a sign (i.e., TİD) and a spoken (i.e., Turkish) language.
TİD and Turkish were chosen for the following reasons: Turkish, as a morphologically rich language, differs from many Indo-European languages in the way in which locative spatial relations are expressed. In Indo-European languages, in relation to which the acquisition of sign languages has been mostly compared, speakers mainly use prepositions to encode spatial relations. However, as explained earlier, Turkish speakers use morphologically more complex forms, as in sign languages. Turkish uses general postpositional locative case marker (-de/da 'at'), which can be attached to the noun expressing the Ground (Masa + da 'table + LOC'). This linguistic form can also be added to a spatial noun (üst+ün + de 'top + POSS + LOC'), producing a morphologically more complex locative form. Thus, a domain-general as well as a semantically specific encoding are two options children need to learn at different stages of their spatial language development in Turkish. Similarly, in TİD, to encode spatial relations, children need to learn to use either morphologically complex iconic forms such as classifier constructions, or less iconic, and less morphologically complex forms, namely relational lexemes. These characteristics of locative forms in Turkish and TİD enabled an examination of factors such as morphological complexity (both Turkish and TİD) and iconicity (only TİD) in spatial language development. It is also important to note that historically TİD is not in a contact relation (genetically or geographically) with more widely studied (Western) sign languages, and also differs from Turkish (Zeshan, 2003). Thus, studying the acquisition of TİD and comparing it with Turkish extends our knowledge about the acquisition of sign languages both in general and in relation to typologically different spoken languages.
In this study, we asked children between ages 3;5 and 9;10 for TİD and 3;8 and 9;11 for Turkish to describe pictures that show two entities in either a containment (e.g., ball in box), or support/contact (e.g., pen on paper), or occlusion (e.g., pillow under bed) type of spatial relation to a deaf or hearing addressee depending on the language condition. We compared both groups of children to adults in their respective languages to examine possible differences and/or similarities in their developmental patterns. These age-groups were chosen because we wanted to make sure that describing spatial relations from pictures would be possible in a communicative task.
Although previous studies with signing children report the emergence of classifier handshapes at around two years of age, they mainly focused on the production of the classifier constructions without considering the whole linguistic structures that they are embedded in (e.g., in relation to Figure and Ground). Therefore, it is not clear whether these children were able to encode spatial relations by also mentioning Figure and/or Ground objects in their productions. Similarly, an extensive body of research on the acquisition of spatial terms in spoken languages (e.g., up, down) has also neglected the larger context that these terms appeared in. Previous research on Turkish has also not taken into account the acquisition of the optional forms in the language (i.e., general versus specific locative terms for which further development is possible). Thus, after some pilot studies with signing and speaking children, we determined that age 3;5 was the earliest suitable age for both signing and speaking children when they could express both Figure and Ground for us to be able to see if they could express the relations between the two.
For the comparison between Turkish and TİD, we asked first how often relations between Figure and Ground were encoded across age-groups within each language. More specifically, we investigated if signing children used classifier constructions versus relational lexemes as well as the simultaneous expressions as frequently as the signing adults. We did not however analyse further if the classifier handshapes were exactly like adults' as long as they were acceptable classifiers for TİD (Slobin et al., 2003). Finally, for speaking children, we asked if they used the general locative case marker versus a spatial noun as frequently as adults.

Predictions
Regarding the acquisition of locative spatial relations by signing and speaking children, we entertained three possible outcomes. One prediction is that children acquiring a sign language would become adult-like in their preference of locative forms earlier than children acquiring a spoken language, since the visual-spatial modality affords iconic form-meaning mappings in sign languages (facilitating effect). Recent research has shown that at the lexical level iconic signs are acquired earlier than non-iconic ones (Thompson, Vinson, Woll, & Vigliocco, 2012;Caselli & Pyers, 2017;Sümer, Grabitz, & Küntay, 2017). One might expect this effect to continue for locative spatial expressions as well. Thus, one might also expect signing children to use more iconic forms such as classifier constructions than relational lexemes compared to adults, for example.
The linguistic forms that encode the location and motion of the entities, most prominently classifier constructions, are morphologically complex language forms and require the simultaneous expression of referents using the two hands (Newport, 1981;Perniss et al., 2007;Zwitserlood, 2012), which might be hard to acquire for signing children (Supalla, 1982). Therefore, an alternative possibility is that signing children's preferences for the locative forms will become adult-like (in terms of frequency of use) later than speaking children's (hindering effect), as also found for motion event expressions.
Finally, both signing and speaking children might achieve adult-like forms at similar ages in terms of the options they have in their respective languages (neutral effect). Earlier studies have shown that the first signs of signing children and the first words of speaking children center around similar semantic concepts such as people, animal, and food (Anderson & Reilley, 2002), which then suggests that the acquisition of meanings is driven by more general cognitive processes and follows similar developmental paths (Rinaldi, Caselli, Renzo, Gulli, & Volterra, 2014). This might also be the case for spatial language development. This would then provide further evidence for general cognitive development to be a stronger driving force for the development of spatial language than the specific effects of the input, such as modality (Johnston, 1988;E. Clark, 2004).

Method
Participants Ten Turkish adult speakers and 10 deaf adult signers of TİD in addition to 20 younger and 20 older signing and speaking children, 10 in each age group, were recruited (see Table 1 for their demographic information). All deaf participants in this study have two deaf parents, and were thus exposed to TİD from birth.
In forming these age-groups, the age at which children start primary school in Turkey was taken as the decision criterion. During data collection, the starting age for primary school was seven years in Turkey, so children younger than seven were grouped as preschool-age children and the rest as school-age children for the current study. As will be explained further below, most of the deaf children in the second group attend a school for the deaf, where they are exposed to TİD more than the deaf children in the first group. Thus, by being a rich source of language exposure for deaf children, schooling is likely to have effects on their language development. By taking the starting age for school in our study as the criterion for forming our groups of children, we could take the effect of schooling experience on language development into account.
Seven deaf children in the older age-group attended a primary school for the deaf and three were in mainstream schools for the hearing. As for the younger age-group, three of them were full-time (five days a week) and four were part-time (two days a week) attenders at a preschool education program for the deaf. The rest stayed at home. It is important to note that TİD was not systematically taught at the schools for the deaf, and thus was not part of the curriculum at the time of data collection. Furthermore, due to the lack of access, deaf children learned very little Turkish at school. For the hearing children, all of them in the older age-group received formal education. Five of the younger hearing children attended a preschool education program five days a week while the rest did not.

Materials
All participants were asked to describe the same set of pictures in which a Figure object is situated in relation to a Ground object (e.g., ball in cup; pen on paper; pillow under bed). There were 7 picture sets for containment, 8 picture sets for support/contact, and 6 picture sets for occlusion, resulting in a total of 21 picture sets. In each set, in addition to the picture showing the target spatial relation indicated with a red frame, there were: Note that even though we had more sets to begin with we could not include them in our final analyses as children had difficulty in naming the objects. This resulted in a slightly unequal distribution of pictures displaying different types of spatial relations. None of these pictures shows people acting upon objects, but all present objects in a static situation. (see 'Appendix' for all picture sets).

Procedure
Signers/speakers were asked to sit opposite the addressee, who was a deaf or hearing confederate depending on the language condition. There was a laptop located on a table between them, and the table top was below the waist of the participants so that their hands could easily be seen. They were asked to describe the target picture with the red frame to the addressee who had the same picture set (but without any red frames) in a booklet. The task of the addressee was to find the picture described by the signer/speaker.

Data coding
All descriptions were coded for (a) the encoding of a spatial relation between  relational encodings when both Figure and Ground were mentioned and the spatial relation between them was expressed. In this way, we tried to be sure that lack of relational encoding is an indicator of difficulty in using a locative form, rather than the lack of necessary lexical knowledge to name the items in question. Thus, the cases where participants did not mention Figure and/or Ground or the spatial relation between them were not considered to be a relational encoding, and were excluded from the analysis (see Table 2). For each description, we took the signers'/speakers' first description in our coding and analysis since second and/or repeated descriptions upon being asked by the interlocutor would introduce uncontrolled variability. All TİD descriptions were annotated sign by sign and coded using ELAN (2012; Wittenburg, Brugman, Russel, Klassmann, & Sloetjes, 2006) by a hearing advanced signer of the language, and checked by a deaf native TİD signer, as well. The Turkish data were coded for each picture description by a Turkish-speaking research assistant.

TİD Coding
We first categorized the linguistic forms in sign language expressions with relational encoding into two. The first category is for classifier constructions that convey shape  and size information about Figure and Ground objects (e.g., (3a), third still, by an adult TİD signer). These constructions are considered morphologically complex forms due to having to use two hands simultaneously as well as regarding the use of handshapes to classify the objects. At the same time, they are highly iconic in terms of the entities expressed and the relative locations in space. The second category is for relational lexemes, which are morphologically simpler forms than classifier constructions as they do not convey shape and size information about the entities. They also carry less visual resemblance (i.e., are less iconic) to the real spatial configuration, compared to classifier constructions (e.g., relational lexemes for three spatial types of relations shown in (3b), (3c), and (3d) (third stills)). For some descriptions, signers used both a classifier construction and a relational lexeme. In these cases, we counted each linguistic form and included it in the analysis. In addition to these two locative forms, signers sometimes used other strategies for relational encoding (e.g., pointing towards the object locations), but they were few in number (consisting of less than 5% of the all picture descriptions), and were thus excluded from the analysis. Please note that we were mainly interested in whether children in both languages were similar to their adults in how frequently they preferred these forms in their spatial descriptions. Therefore, the cases where signing children's classifiers (i.e., handshapes) varied from those of signing adults (e.g., use of a generic rather than a specific handshape) were still included in the analysis as long as their classifier handshapes were acceptable in TİD, following the list by Kubuş (2008) and having asked for judgments by a native signer of TİD.  'There is a bed. There is a plane. The plane is under the bed.' We also coded relational encodings of TİD signers in terms of the simultaneity used in their classifier constructions. This has been suggested as one source of difficulty (i.e., morphological or articulatory) for signing children in learning to express spatial relations (Newport & Supalla, 1980;Supalla, 1982;Engberg-Pedersen, 2003;Slobin et al., 2003;Tang et al., 2007;Morgan et al., 2008), but has not been systematically analysed with direct comparisons to adults (except Engberg-Pedersen, 2003;Tang et al., 2007).
When they used classifier constructions, TİD signers represented Figure and Ground simultaneously or not (i.e., non-simultaneously). In the former, after introducing Ground and Figure, signers localized both of them by using two hands simultaneously with classifier constructions (4a). In the current study, this group includes simultaneous constructions in which two locative forms (one for the Figure and one for the Ground) were produced in the signing space at the same time (Simultaneous-Simultaneous), or signers localized the Ground first (e.g., in a classifier predicate or by a lexical sign) and, while holding the Ground in signing space, they localized the Figure in relation to it (Simultaneous-Consecutive) (see Perniss, 2007, for the distinction between these two types of simultaneity). For the current study, we collapsed these two forms into one category, since they are subtypes of simultaneity rather than constituting major categories. In the non-simultaneous type, the mention of Ground was followed by its localization in an analogue form in the signing space, but it was not held there. Then, Figure was mentioned and localized in another classifier, with respect to the previously indicated location of Ground (4b). Please note that both forms are considered to be conventional forms and frequently emerged in our data.

CL(long) LOC
'There is paper. The paper is here. There is a pen. The pen is on the paper.'

Turkish coding
Data was categorized into two morphologically different ways of encoding spatial language: (i) a morphologically less complex general locative case marker attached to the Ground noun (e.g., Masa-da bardak var 'There is a cup at the table'); and (ii) a morphologically more complex form where the locative case marker attached to the spatial noun (e.g., Masa-(n)ın üst-ün-de bardak var 'There is a cup on the table'). Note that (-ın) and (-ün) are additional morphological forms, genitive and possessive markers, which define the relations between a spatial noun (üst) and the Ground object noun (masa). This presents us with a convenient case to see whether the degree of morphological complexity in spatial language forms modulates the acquisition of spatial language. In addition, two options that differ in degree of morphological complexity can be considered as similar to TİD data where classifier constructions as well as relational lexemes are the two options children need to learn to use.

Results
To investigate the effects of language modality on learning to encode spatial relations, namely, containment, contact/support, and occlusion, in TİD and Turkish, we performed a mixed-effects logistic regression model (Jaeger, 2008). To this end, we used the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) in R (version 1.1.21; R Core team, 2019). This approach allowed us to take into account the random variability that is due to having different participants and items. For the different analyses reported below, a stepwise variable selection procedure was conducted for each model, and non-significant predictors were removed to obtain the most parsimonious model. In order to compare models, likelihood ratio tests were performed that compared the goodness of fit using the Anova function in the base package (R Core Team, 2019). In this way, the final model was selected by checking whether the p-value from the likelihood ratio test was significant. Thus, the reported models in our study are the most parsimonious model (smallest AIC with Anova comparisons) and includes the maximum random effect structure that converged. Initially, we added random slopes for age and language (where possible) by item and for spatial type by subject; however, we did not observe improvement in the models as a result of Anova comparisons.

Relational encodings
For the first analysis, we focused on the amount of relational encodings where participants mentioned both Figure and Ground. As seen in Table 3 We conducted a mixed-effects logistic regression model with the mention of a locative relation in picture descriptions as the binary dependent variable (0 = no, 1 = yes) by adding the step-wise fixed effect factors Age (Adult, School age, Preschool age), Language (TİD, Turkish), and Type of Spatial Relation (Containment, Support, Occlusion), to the baseline model (including subjects and items as random intercepts). Age and Type of Spatial Relations were coded with numeric contrasts.
The model revealed a main effect of Age, Language, and Type of Spatial Relation (Table 4). These findings showed that, in both languages, adults encoded a spatial relation more frequently than preschool-age children, but not more than school-age children. Preschoolage children encoded a spatial relation less frequently than school-age children.
Additionally, compared to TİD signers (M = .91), Turkish speakers (M = .97) encoded a spatial relation more frequently. Finally, containment (M = .91) and support (M = .95) type of spatial relations elicited similar amounts of spatial encoding while occlusion (M = .97) elicited more than both types.
Within these relational encodings, we further investigated how frequently different types of locative forms were used (i.e., classifier constructions versus relational lexemes for TİD, and locative case marker vs. spatial noun for Turkish). Since it is hard to equate these linguistic devices in both languages, we avoided conducting direct comparisons of age-groups across languages. Rather, comparisons of children to their adults were done within each language so that we could compare the general developmental patterns in encoding spatial relations.
Linguistic forms used to encode locative spatial relations in TİD To address whether TİD acquiring children have become adult-like in how frequently they used classifier constructions and relational lexemes for encoding locative spatial relations in their language, we carried out a mixed-effects logistic regression model, in which Age (Adult, School age, Preschool age) and Type of Spatial Relation (Containment, Support, Occlusion) were entered as fixed factors, as well as subjects and items as random intercepts. Age and Type of Spatial Relations were coded with numeric contrasts. Type of Spatial Relation turned out to predict the use of both language forms: participants used classifier constructions more for containment (M = .81, SE = .03) and support (M = .89, SE = .03) than for occlusion (M = .58, SE = .05).
There was a reverse pattern for the use of relational lexemes which signers mostly used for occlusion (M = .50, SE = .05) than containment (M = .23, SE = .04) and support (M = .15, SE = .03) ( Table 5). Age did not improve the goodness of fit of the model, and there were no significant interactions, showing that TİD-acquiring children were able to use both of these language forms as frequently as adult signers of TİD (Figure 3).

Different types of simultaneity in expressing Figure and Ground (TİD only)
Within the relational encodings with classifier constructions in TİD, we also analysed whether signing children achieved adult patterns in their simultaneity preferences in a mixed-effects logistic regression model, where we entered Age (Adult, School age, Preschool age) and Type of Spatial Relation (Containment, Support, Occlusion) as fixed effects, and subjects as the random intercept on binary values for mention of simultaneity in classifier constructions (0 = no, 1 = yes). Age and Type of Spatial Relations were coded with numeric contrasts. Our model showed no main effect for Age and Type of Spatial Relation. No interactions between fixed factors were observed (Table 6; Figure 4).

Linguistic devices used to encode locative spatial relations in Turkish
In a similar type of analysis to the one used for TİD, we tested the effects of Age (Adult, School age, Preschool age) and Type of Spatial Relation (Containment, Support, Occlusion) on using a spatial noun coded as binary values of mentioning the spatial relation (0 = no, 1 = yes) by Turkish speakers. Subjects were entered as the random intercepts in the model. Also, Age and Type of Spatial Relations were coded with numeric contrasts. Our model showed no main effect for Age and Type of Spatial Relation. No interactions between fixed factors were observed (Table 7). Thus, similar to TİD-acquiring children, Turkish-acquiring children in both age-groups were similar to adults, and used a spatial noun, rather than the locative case marker, to express the spatial relation between the entities ( Figure 5).

Discussion
This study investigated the possible effects of language modality (i.e., visual/spatial versus auditory/vocal) on the acquisition of containment, support/contact, and occlusion in a sign (TİD) and a spoken language (Turkish). We asked to what extent signing and speaking children are similar in learning to use the locative in their languages. This is the first study where encoding spatial relations is compared between children and adults within a sign and a spoken language, both of which offer morphologically complex language forms in this domain. In general, three hypotheses were entertained, namely, a facilitating effect, a hindering effect, or a neutral effect of the visual/spatial modality. TİD differs Table 5. Details for the model predicting the use of classifier constructions and relational lexemes to encode the locative spatial relation between Figure and Ground in TİD. For the fixed effects, estimates (β), standard errors (SE), z-values, and p-values are given. For the random effects, variance (var) and standard deviations (SD) are reported. radically from Turkish in using iconic forms to different degrees in expressing these spatial relations. Contrary to a number of reports from earlier studies showing a protracted developmental trajectory in learning to produce classifiers constructions, i.e., the most preferred linguistic form to encode spatial relations in sign languages, signing children in our study were not found to be late in their mastery of these forms compared to speaking children. Our findings suggest a neutral effect of language modality on the acquisition of locative spatial expressions. Signing and speaking children were comparable in terms of the frequency of encoding a relation, and in being adult-like in the different types of linguistic strategies they used. Signing children also used morphologically complex simultaneous expressions in adult-like ways.  These findings corroborate earlier studies that suggest the primacy of cognitive development in learning spatial language (H. Clark, 1973;Johnston & Slobin, 1979;Tomasello, 1987;Johnston, 1988;E. Clark, 2004;Loewenstein & Gentner, 2005). In this study, we show that the linguistic form preferences to describe spatial relations by signing and speaking children align with their adults as early as age 3;5. Despite the nature of spatial representation in sign languages, which allow iconic mapping of space onto signing space, deaf children proceed in a manner very similar to their hearing peers in learning to encode locative spatial relations, and are not delayed by the morphological complexity of the locative forms, namely, classifier constructions. This suggests a stronger role for spatial cognitive development for the acquisition of the linguistic representation of spatial relations in sign and spoken languages, thus underscores the relation between cognition and linguistic representations. Table 7. Details for the model predicting the use of spatial nouns to encode the locative spatial relation between Figure and Ground in Turkish. For the fixed effects, estimates (β), standard errors (SE), z-values, and p-values are given. For the random effects, variance (var) and standard deviations (SD) are reported.  The locative spatial relations as investigated here are the ones known to be acquired earliest (Johnston & Slobin, 1979), and are the most salient and basic for children from very early on (Casasola et al., 2003;Hespos & Baillargeon, 2008;Park & Casasola, 2015). It is possible that they are less susceptible to the effects of iconicity or morphological complexity than other spatial relation types. Especially the acquisition of containment and support has been considered to have primacy and centrality in representing objects and their spatial relations (e.g., H. Clark, 1973;Bowerman, 1996aBowerman, , 1996bCasasola, 2005Casasola, , 2008Johannes, Wilson, & Landau, 2016). As shown by these studies, containment and support appear at early stages of language development in speaking children, mainly at around two years of age with some variability depending on linguistic factors. Research with speaking children suggests that language development links linguistic forms to universal, pre-existing representations of spatial meanings (e.g., Hespos & Piccin, 2009), thus underlying the importance of linguistic experience. Johnston and Slobin (1979) exemplifies this process by a 'waiting room' metaphor, in which the entry door is opened with the underlying spatial concept as key, and the exit door with the appropriate linguistic form. Previous studies have suggested that classifier constructions are notoriously challenging to be acquired, and signing children need to spend more time 'in this room' compared to their speaking peersmaybe until around nine to ten years of age. However, our findings provided clear evidence that this was not the case. Signing children in our study still encoded spatial relations in adult-like ways at similar times with their speaking peers. This might be due to the primacy of these spatial concepts, which pushes the employment of these forms, despite their complexity, by signing children.
The neutral effect of language modality as suggested in this study is for the acquisition of a group of locative spatial relations, whose comprehension and production do not require taking a certain viewpoint. Our findings should not be generalized over all types of spatial relations. For example, in a previous work, TİD-acquiring children were found to become adult-like earlier in their preferences than Turkish-speaking children in encoding locative spatial relations that require a viewpoint (e.g., left, right in English) (Sümer, Perniss, Zwitserlood, & Özyürek, 2014;Sümer, 2015). It is possible that, for the acquisition of viewpoint-dependent spatial relations in TİD, language modality might be playing a stronger role: TİD relational lexemes that encode these types of spatial relations are body-anchored, in which signers touch their left arm while, for example, describing the location of Figure to the left of a Ground, or to their right arm for the location of a Figure to the right of a Ground. Thus, expressing viewpoint-dependent spatial relations might be facilitated through directly mapping these forms onto the coordinates of the body. Thus, signing children might be benefiting from iconicity more while encoding viewpoint-dependent spatial relations compared to encoding containment, support/ contact, and occlusion types. The findings of the current study compared with the previous findings on encoding of viewpoint-dependent relations also show that visual resemblance between language form and meaning in sign languages might have different effects for different aspects of language acquisition.
Simultaneous expression of Figure and Ground in classifier constructions has been suggested to be an area of challenge for signing children, especially due to the articulatory difficulties of the simultaneous combination of two classifiers (Supalla, 1982;Slobin et al., 2003). However, in this study, signing children acquiring TİD were able to express Figure and Ground simultaneously in their picture descriptions with classifier constructions as frequently as signing adults. The difference between the findings of the present study and those of previous studies might be related to the fact that previous studies analysed the simultaneous expression of Figure and Ground in motion event narrations (Supalla, 1982;Engberg-Pedersen, 2003;Slobin et al., 2003;Tang et al., 2007), unlike the present study which focused on short descriptions of locative spatial relations. Another source of difference between the findings of the current study and those of previous ones might be coming from how (non)-simultaneity was defined. In previous studies, non-simultaneity may have included those cases where the signing children did not mention Figure or Ground and/or the cases where they did mention them, but did not produce a classifier predicate (Supalla, 1982;Engberg-Pedersen, 2003;Slobin et al., 2003;Tang et al., 2007). The present study meticulously differentiates these cases by doing both a 'Figure-Ground omission' analysis and a simultaneity analysis with both Figure and Ground expressed. The findings of the latter analysis do not suggest a hindering effect of modality for signing children, which might then suggest that the previously reported challenge in the simultaneity of classifier constructions might be due to the omissions of the references and also the event type (i.e., motion events).
Another relevant finding of the current study is that adult and child signers in general used relational lexemes to describe spatial scenes to a higher degree than has been suggested earlier (Emmorey, 2002;Özyürek et al., 2010;Arık, 2013). However, in the current study, both adult and child signers used relational lexemes to provide the spatial information between Figure and Ground to a considerable extent (M = .26, SE = .03 for adults, M = .34, SE = .04 for school-age children, and M = .22, SE = .04 for preschool age children), but still less frequently than classifier constructions (M = .77, SE = .02 versus M = .28, SE = .02). One reason for the higher use of relational lexemes by TİD signers in the present study in comparison with the previous studies could have been caused by the existence of contrastive spatial configurations shown at the same time as the target spatial configuration. In this way, the signers in the present study might have wanted to emphasize such a contrastive nature of the task by using relational lexemes. This was not the case with the previous studies where the signers were presented with only one picture to describe. This might also be the reason why Turkish-acquiring children used a spatial noun, which is morphologically more complex than the general locative case marker, to a similar extent as did adult speakers of Turkish.
Another reason for the relatively high use of relational lexemes found in our data might be due to the occlusion type of spatial relations used in the current study. The Özyürek et al. (2010) study investigated only support and 'next to' types of spatial relations. Similarly, Arık and Wilbur (2008) and Arık (2009) included containment and support in addition to viewpoint-dependent spatial relations such as 'left', 'right', 'front', and 'behind' in their study. The relatively higher use of relational lexemes in our study might thus be due to the inclusion of expressions with occlusion type of relations that elicited relational lexemes more than other types (i.e., containment and support). In our study, signers used relational lexemes more frequently for occlusion (M = .50) than for containment (M = .23) and support (M = .15). On the other hand, they reserved classifier constructions more for containment (M = .81) and support (M = .89), and less for occlusion (M = .58). Interestingly, signing children in both age-groups were sensitive to this distinction. In addition, the occlusion type received a higher amount of relational encodings than the other two types of spatial relations in both languages, which might be related to the relatively higher use of relational lexemes. One speculation about the frequent occurrence of the relational lexeme for occlusion might be that the concept of occlusion might have hindered signers from using a classifier predicate for the location of the Figuresalthough the Figures were always visible in the stimulus pictures. It is also possible that the Figures might have seemed less visible to the signers because there was an occlusion type of spatial relation, and they might have preferred a relational lexeme as a device, which does not provide more specific (e.g., shape) information about the Figure object.
We would also like to point out some limitations of our study, such as the small number of subjects in each group. Deaf native signers comprise 10% of any deaf population (Mitchell & Karchmer, 2004) and are thus hard to reach in the deaf population. This is the main reason for the limited data collected from signers. To make our signing sample comparable to the speaking sample, we limited our signing groups to those with deaf parents (i.e., native signers), unlike many studies which gathered data from a heterogeneous group of signers containing deaf children with deaf parents and deaf children with hearing parents (so called 'non-native signers'). Further research with more subjects and from different sign and spoken languages is needed to generalize our claim on modality effects. Furthermore, in ideal circumstances, it would have been much more preferable to collect longitudinal instead of cross-sectional data and compare the same children at different time intervals to be able to claim findings over language development (Unsworth & Blom, 2010).

Conclusion
This study provides a first account of the developmental patterns in the encoding of locative spatial relations in children acquiring TİD. It is also the first study comparing the developmental patterns in a sign and a spoken language considering both child and adult patterns. It went beyond previous research by focusing on a subset of spatial relations (i.e., locative), rather than generalizing from a mixed set of spatial events (i.e., location and motion). In doing so, it provided extensive qualitative and quantitative analyses of linguistic devices (e.g., classifier constructions, relational lexemes, spatial nouns, etc.) available in both languages. Moreover, the present study took varying levels of the morphological complexity of linguistic devices in TİD and Turkish, and different levels of iconicity for TİD into account. The results show many similarities in the developmental patterns of signing and speaking children's acquisition of static locative expressions in spite of the differences in modality and structure of Turkish and TİD. It is noteworthy that, despite the existence of morphologically less complex forms in their language, children, regardless of the language modality being acquired, did not prefer them in their spatial descriptions, but, similar to adults, used morphologically complex ones in their spatial descriptions.
As a result of these findings, we conclude that the modality of the language being acquired does not seem to have a clearly hindering or facilitating effect, especially through iconicity, on learning to encode locative spatial relations, since similar developmental trends were found for age-matched signing and speaking children. The results of this study also question the generalization of the claim that spatial language poses difficulties for signing children, at least for the acquisition of locative spatial relations. Therefore, morphological complexity and modality-specific aspects inherent in sign languages (i.e., iconicity, simultaneity) do not modulate the development of locative expressions differently for signing children as compared to speaking children. These findings corroborate earlier studies that suggest the primacy of cognitive development in the acquisition of locative expressions for containment, support, and occlusion as they are considered to be salient for children, and have been found to show similar developmental patterns across different spoken languages (H. Clark, 1973;Johnston & Slobin, 1979;Tomasello, 1987;Johnston, 1988;Bowerman & Choi, 2001;Loewenstein & Gentner, 2005). In this sense, it may not be surprising that language modality does not modulate this aspect of spatial language development, unlike what has been reported for the expressions of other types of spatial events (e.g., motion).