The emergence of subjects in Lebanese two-year-olds

Published online by Cambridge University Press: 24 October 2022

UMR 1253, Imagery and Brain (iBrain), Université de Tours, Inserm, Tours, France.
American University of Beirut, Lebanon.
Private clinic, Lebanon.
Laurice TULLER
UMR 1253, Imagery and Brain (iBrain), Université de Tours, Inserm, Tours, France.
*Corresponding author: UMR 1253, Imagery and Brain (iBrain), Université de Tours, Inserm, Tours, France. E-mail:
In Lebanese Arabic, lexical subjects may occur before or after verbs, but only before non-verbal predicates. Analysis of spontaneous language samples from 19 two-year-old children shows that postverbal (VS) and preverbal (SV) subjects emerge simultaneously. The youngest children displayed no VS-SV difference in frequency. A slight preference for SV is observed in older children. No preference for SV subjects was found in the speech of the mothers of the younger or older children. Lexical subjects systematically appeared before non-verbal predicates. We interpret these results as evidence for early knowledge of syntactic movement, consistent with Wexler’s (1998) Very Early Parameter Setting.

Brief Research Report
We report here on a study of the emergence of utterances with subject-predicate relations in the early stages of multiword utterances in Arabic. In Lebanese Arabic (LA), lexical subjects, including demonstratives and strong pronouns, may be preverbal (SV) or postverbal (VS), a property which has typically been analyzed as entailing a difference in syntactic movement (see footnote 1).

It has been argued that syntactic movement is costly and that young children acquire structures requiring no movement before those that require movement (van Kampen, 1997). In addition, when acquiring a construction which allows an option with fewer movements alongside an option with more movements, young children show a preference for the more economical one, avoiding the option with more movements (Hamann, 2006; Hulk & Zuckerman, 2000; Zuckerman, 2001). This observation was formalized in work by Jakubowicz as the Derivational Complexity Hypothesis (DCH). Jakubowicz (2011) showed that less complex derivations (those involving fewer movement operations) were acquired earlier than more complex ones (those involving more movement operations), in both Typically Developing children and children with Developmental Language Disorder, and that children with typical or atypical development are sensitive to movement. When given alternative ways to express the same content, they more frequently produce the less complex alternative (Jakubowicz, 2011; Prévost, Tuller, Galloux, & Barthez, 2017).

Wexler (1998) argued for the Very Early Parameter-Setting (VEPS) hypothesis, which can seem at odds with the idea that movement is costly. This hypothesis suggests that basic syntactic parameters are set correctly as soon as the child enters the two-word stage. Wexler includes, among the basic syntactic parameters, word order parameters such as Verb-Object versus Object-Verb orders in Swedish and in German, and Verb-to-Inflection (V-to-T) movement in French versus English. In line with the predictions of VEPS, Friedmann and Lavi (2006) found that Hebrew-speaking children as young as 2;3 years were capable of syntactic movement, as they correctly repeated sentences entailing movement to a non-argument position (Spec-CP). Interestingly, children were capable of repeating sentences entailing movement to a non-argument position only if they were able to repeat sentences entailing movement to an argument position (e.g., object-to-subject movement). So, perhaps not all types of movement are acquired early; some types might be more costly than others, resulting in a hierarchy in the acquisition of movement types (Friedmann & Reznick, 2021).

We follow Ouhalla (2013) in assuming that the availability of SV and VS word orders in Arabic, illustrated in (1), is linked to a parametrization of the Extended Projection Principle (EPP) (see footnote 2).

In Arabic, the subject and the object are generated inside the Verb Phrase (VP) and the verb raises to Tense (T) (see, e.g., Aoun, Benmamoun, & Choueiri, 2010; Ouhalla, 2013). The EPP feature generated in T, which is optional in Arabic, entails movement of the subject to the Specifier of TP, resulting in SVO order as shown in (2).

In sentences with VSO order, verb raising to T leaves the subject in situ, inside VP (Aoun et al., 2010; Benmamoun, 2000; Fassi Fehri, 1993; Mohammad, 2000; Ouhalla, 2013; Thompson & Werfelli, 2012). This is illustrated in (3).

According to this analysis, SV order requires the child to execute an additional syntactic movement, that of the subject to the higher subject position, compared to VS order, which does not entail subject movement. Support for the analysis of SV as being more costly than VS comes from a study on adult processing reporting longer reading times for SV sentences compared to VS sentences (Thompson & Werfelli, 2012), indicating that SV may entail greater processing. The DCH would then predict that children acquiring Arabic might favor, at earlier stages, VS order over SV order. The VEPS, however, makes a different prediction – namely, that both SV and VS will emerge at the same time and be used with equal frequency, and that movement is indeed acquired very early.

Most studies that have looked at child acquisition of VS compared to SV in Arabic have been based on production data, as summarized in Table 1 (see footnote 3). Studies based on sentence repetition tasks (SR), in which children had to repeat sentences that they heard, were carried out on Palestinian Arabic (PA) (Friedmann & Costa, 2011; Khamis-Dakwar, 2011), and Qasem (2020) reported on a longitudinal study of spontaneous language corpora of two children acquiring Yemeni Ibbi Arabic (YIA).

Table 1. Production of SV versus VS word order by age

It emerges that children younger than 2;6 repeated sentences with VS order with greater accuracy than sentences with SV order (Friedmann & Costa, 2011; Khamis-Dakwar, 2011), though the difference between these word orders was no longer significant in children aged 2;3 to 2;5 in Khamis-Dakwar (2011). At age 2;6, children performed equally well for VS and SV orders in Friedmann and Costa (2011), while the children in Khamis-Dakwar (2011) started at this age performing better for SV compared to VS order. This preference for SV order was observed in Friedmann and Costa (2011) beginning at age three. The two children whose language samples and “informal elicited production” were analyzed by Qasem (2020) produced more SV utterances, both when they were younger and older than 2;6.

In Arabic, subjects in verbless sentences can occur only before the non-verbal predicate (Al-Balushi, 2019). As illustrated in (4), such verbless clauses consist of a lexical subject and a Determiner Phrase (DP) (4a), a Prepositional Phrase (PP) (4b), an Adjective Phrase (AdjP) (4c) or an Adverb Phrase (AdvP) (4d) as predicate. To our knowledge, these structures have not been examined in studies of child acquisition of Arabic.

Aoun et al. (2010) argued that the syntax of verbless clauses in Arabic does not involve an empty verbal copula. Rather, sentences such as those in (4) are TPs in which T projects a DP/AP/PP complement. The subject must therefore be base-generated in Spec-TP and is expected to occur only before the predicate (5) (Al-Balushi, 2019). Consequently, subjects in verbless clauses do not display the word order options found in verbal clauses.

The only way a lexical subject in verbless clauses can be found after the predicate is when the predicate undergoes wh-movement/topicalization or when the subject is extraposed. For example, in the structure for (6a), given in (6b), the predicate ween ‘where’ undergoes wh-movement to CP, and thus, the subject Sara follows the non-verbal predicate.

We report results for LA-speaking children on the emergence of subjects in utterances that express subject-predicate relations. Arabic is interesting, we have seen, because it offers several syntactic alternatives to children in this regard. Clauses may be verbal or verbless, and whereas in verbal clauses lexical subjects may appear before or after the verb, lexical subjects in verbless clauses normally appear only before the predicate. We therefore sought to determine how lexical subjects emerge in very early multiword utterances in the acquisition of LA, and, in particular, where they occur in the earliest clauses. This study does not consider utterances involving null subjects, including those appearing in verbless clauses. Since non-verbal predicates don’t agree with their subjects, it is not possible to distinguish between a non-verbal predicate with a null subject and an isolated NP/PP/AP. In that sense, subject drop in verbal clauses is different from subject drop in verbless ones. Focusing on unambiguous subject-predicate relations, we analyzed the following clause types: verbal and verbless clauses with lexical subjects. The latter include pronominal subjects as well. Our study sought to answer the two specific questions listed in (7).

Understanding how lexical subjects emerge in the acquisition of LA in children with typical development, we hope, will contribute to the constitution of a baseline for determining what typical acquisition of LA looks like in children, thereby making it possible to identify children who might have a language disorder, and who could benefit from speech-language therapy.



Nineteen Lebanese children aged 2;0–2;11 were involved in the study (see Table 2), nine girls and ten boys (see footnote 4). They had all been exposed to LA since birth, had typical language development and no hearing impairment. Participants were recruited from different regions in Lebanon and were representative of the Lebanese population in terms of socioeconomic status, regional and community origin (see footnote 5). LA is spoken by more than 90% of the population of Lebanon, but it coexists notably with French and English (Leclerc, 2015), the two main languages used for instruction at all levels, including in preschools. Most children growing up in Lebanon are exposed to more than one language, from a very early age, both inside and outside the home. The frequency of exposure to languages other than LA varies considerably. The mothers of all the participants in this study reported (in response to a parental intake interview) using LA with their child at least 50% of the time (mean 80%, with a range from 50% to 100%) and many mothers (17 out of 19) reported also using either English or French with their child.

Table 2. Participants: N, Gender, Age, Region and percentage of Lebanese Arabic spoken by the mother to the child


To address the research questions of the present study, we analyzed spontaneous interactions between the child and his/her mother. Spontaneous language corpora have proven to be useful for examining the course of language acquisition over time (Demuth, 2008).

Prior to collecting language data from each child, parental consent was obtained. An intake form and a language background questionnaire were completed. This included information regarding the child’s medical condition, specifically her/his hearing history, the parents’ level of education, occupation, and the frequency of use of each language spoken at home. Each child was audio- and video-recorded while engaging in a 30-minute play session with his/her mother. Apart from playing naturally, no additional instructions were given to the parent.

Language samples were transcribed and coded in the CHAT format ‘Codes for Human Analysis of Transcripts’, then analyzed via CLAN ‘Computerized Language Analysis system’, one of the CHILDES ‘Child Language Data Exchange System’ programs (MacWhinney, 2000). For each child, approximately 100 consecutive child utterances (along with all of the mother’s utterances in this passage) were analyzed, starting five minutes after the beginning of the recording (unless obtaining 100 utterances required starting earlier). As is customary in studies based on spontaneous language analysis, the following were excluded: unintelligible expressions, utterances consisting solely of onomatopoeia or yes/no elliptical responses, repetitions of the previous word or set of words, counting and other sequences of enumeration, and social responses.
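The sampling rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CHAT/CLAN procedure used in the study: the `Utterance` record, the `kind` labels and the fallback rule for "starting earlier" (taking the last 100 analyzable child utterances of the session) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical exclusion labels mirroring the categories listed in the text;
# the study's actual transcript coding is not reproduced here.
EXCLUDED_KINDS = {
    "unintelligible", "onomatopoeia", "yes_no",
    "repetition", "enumeration", "social",
}


@dataclass
class Utterance:
    speaker: str      # "CHI" for the child, "MOT" for the mother
    start_sec: float  # onset time within the recording, in seconds
    kind: str         # "ordinary" or one of EXCLUDED_KINDS
    text: str


def select_child_sample(utterances, start_after_sec=300.0, target=100):
    """Return up to `target` analyzable child utterances.

    The window starts at `start_after_sec` (five minutes by default).
    If that leaves fewer than `target` utterances, the window is moved
    earlier by taking the last `target` analyzable utterances instead
    (an assumed reading of "starting earlier").
    """
    eligible = sorted(
        (u for u in utterances
         if u.speaker == "CHI" and u.kind not in EXCLUDED_KINDS),
        key=lambda u: u.start_sec,
    )
    window = [u for u in eligible if u.start_sec >= start_after_sec]
    if len(window) >= target:
        return window[:target]
    return eligible[-target:]
```

The mother's utterances falling inside the same time span could be collected afterwards by filtering on the first and last `start_sec` of the returned sample.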

Child production was segmented into utterances following the criteria outlined by Rondal (1999), whereby both intonation and syntactic criteria were considered for segmentation. Clauses were coded as verbal if they contained at least one verb or pseudo-verb (see footnote 6). As our aim in this study was to look at the position of the lexical subject, we coded preverbal subjects as “SV” and postverbal subjects as “VS”.
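As a rough illustration of this coding step, the sketch below assigns "SV" or "VS" to a clause represented as a list of role tags in surface order. The tag names ("S" for lexical subject, "V" for verb, "PV" for pseudo-verb, and so on) are hypothetical and were chosen for the example; they are not the codes used in the transcripts, and the function assumes simple clauses with a single lexical subject.

```python
from collections import Counter

# Hypothetical role tags: "V" = verb, "PV" = pseudo-verb.
VERBAL_TAGS = {"V", "PV"}


def code_subject_position(roles):
    """Return "SV", "VS", or None for a clause given as a list of role tags.

    None is returned when the clause has no verb or pseudo-verb (verbless
    clause) or no lexical subject (e.g. a null-subject clause).
    """
    verb_positions = [i for i, role in enumerate(roles) if role in VERBAL_TAGS]
    if not verb_positions or "S" not in roles:
        return None
    return "SV" if roles.index("S") < verb_positions[0] else "VS"


def tally_subject_positions(clauses):
    """Count SV and VS codes over a list of clauses, skipping uncoded ones."""
    codes = (code_subject_position(roles) for roles in clauses)
    return Counter(code for code in codes if code is not None)
```

For instance, `tally_subject_positions([["S", "V", "O"], ["V", "S"], ["V", "O"]])` counts one SV clause and one VS clause and ignores the null-subject clause.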

Statistical tests were carried out in JASP (JASP Team, 2020). Shapiro-Wilk normality tests showed that the distribution was not normal for most measures; the Mann-Whitney U, the Wilcoxon signed-rank and the Spearman’s Rho tests were thus used throughout.
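For readers less familiar with rank-based tests, the following sketch shows how the Mann-Whitney U statistic for two independent groups can be computed by direct pairwise comparison. It is a plain-Python illustration of the statistic only (no p-value), not a reproduction of the JASP analysis, and the values in the usage note are invented.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b, with ties counted
    as one half; U_b is the complementary count, so that
    U_a + U_b == len(group_a) * len(group_b).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b
```

With two invented groups that do not overlap, such as `mann_whitney_u([1.4, 1.6], [2.0, 2.2])`, the first statistic is 0 and the second equals the number of pairs (4). In practice a statistics package would also supply the p-value, with an exact or tie-corrected method as appropriate.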


A total of nearly 2,000 child utterances were analyzed. Given that previous studies have shown that Arabic-speaking children displayed a change in preference from VS to SV at around age 2;6, the two-year-old participants in this study were divided into two groups according to this cut-off (see Table 3). Independent support for dividing participants into two groups came from calculation of Mean Length of Utterance (MLU), a reliable index of syntactic maturity in children. MLU was calculated by dividing the total number of words, including clitics, produced by each child by the number of utterances produced by that child. We see, in Table 3, that MLU increased with age, with the older two-year-olds having significantly higher MLUs than the younger two-year-olds (U(19) = 15.000; p = 0.016). This result suggested that other syntactic measures might also differentiate these two groups.
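The MLU computation described above amounts to a single division. The sketch below assumes, for illustration only, that each utterance has already been tokenized into a list of words with clitics split off as separate tokens; how clitics were segmented in the actual transcripts is not specified here.

```python
def mean_length_of_utterance(utterances):
    """Return MLU in words: total word tokens divided by number of utterances.

    `utterances` is a list of token lists, one per utterance, with clitics
    assumed to be separate tokens (an assumption made for this sketch).
    """
    if not utterances:
        raise ValueError("MLU is undefined for an empty sample")
    total_words = sum(len(tokens) for tokens in utterances)
    return total_words / len(utterances)


# Toy sample built from words cited in this report (not real child data):
sample = [["hajdee", "bat't'aa"], ["ween", "Sara"], ["hoon"]]
mlu = mean_length_of_utterance(sample)  # 5 words over 3 utterances
```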

Table 3. Utterances analyzed in each participant group: Total Number of Utterances, Mean Number of Utterances per child (SD), Number of Utterances Range, Mean Length of Utterance (MLU) (SD) and MLU range in younger and older 2-year-olds

In the younger and older two-year-olds the number of words per utterance varied. Children produced one-word utterances (8a), two-word utterances (8b) but also longer utterances (8c–d).

The utterances produced by these two-year-old children included both verbal and verbless clauses, as illustrated in (9a), with a verbal predicate and a lexical subject (total n = 81), and as in (9b), with a non-verbal (AdjP) predicate and a lexical subject (total n = 210) (see Figure 1).

Figure 1. Comparison of lexical subjects with verbal predicates and lexical subjects with verbless predicates

The majority of the children produced lexical subjects in both verbal and verbless clauses. Mean rates of the latter (calculated over proportions used by each child) appear to be higher, as is shown in Figure 1, but this difference was not significant in either group of children (younger: Z(10) = 7.000; p = 0.074; older: Z(9) = 6.000; p = 0.107). Likewise, the older two-year-olds did not produce a higher proportion of lexical subjects with verbal predicates compared to the younger two-year-olds (U(19) = 35.000; p = 0.659).

Turning to the emergence of the syntactic position of lexical subjects with verbal predicates, we observed that out of a total of 870 verbal clauses produced by these two-year-olds, 81 had an overt lexical subject (a category which includes demonstratives and strong pronouns); as noted earlier, LA is also a null subject language. For all children who produced more than two verbal clauses with a lexical subject, these included both preverbal (10a) and postverbal (10b) subject positions. Likewise, most of the Lebanese mothers produced lexical subjects in both postverbal and preverbal position.

Furthermore, no preference was found for SV or VS in the younger two-year-olds (Z(10) = 24.000; p = 0.105). The older two-year-olds displayed a slight preference for SV over VS (Z(9) = 26.500; p = 0.042) (Figure 2). However, the older children did not have a significantly higher proportion of SV clauses compared to the younger children (U(19) = 29.500; p = 0.208).

Figure 2. Comparison of SV and VS orders in verbal clauses

In the mothers’ speech, no significant difference was found between SV and VS (Z(19) = 101.500; p = 0.08). Furthermore, there was no correlation between the rate of SV order in the mothers’ speech and in their children’s speech (rs = 0.336; p = 0.187).

In order to examine the position of subjects in verbless clauses, we analyzed all utterances containing a non-verbal predicate with a lexical subject (N = 210). In the majority of these utterances (see Figure 3), for both younger (67.7%) and older (62.1%) two-year-olds, the subject and the non-verbal predicate appeared in their canonical positions, as illustrated in (11), where the subject hajdee ‘this’ directly precedes the predicate bat’t’aa ‘duck’.

Figure 3. Lexical Subjects in Verbless Clauses

In other cases, the predicate was moved to the left periphery due to wh-movement (12a) or topicalization (12b) (the original position is indicated by the same word in angle brackets) and hence preceded the subject.

In the remaining cases in which the subject was not in clause-initial position, it appears to be dislocated to the right. This is illustrated in (13), which involves the dislocation of the subject hajda ‘this’, coreferential with the pronominal clitic uu ‘him’ on aħuu, to the right of the adverbial predicate hoon ‘here’.

Discussion and conclusion

This study investigated the emergence of subjects in LA, based on the analysis of spontaneous language samples from 19 Lebanese children aged 2;0–2;11, and examined the position of lexical subjects in verbal and verbless clauses.

As Friedmann and Costa (2011) and Khamis-Dakwar (2011) reported that children displayed an early preference for VS order which disappeared at age 2;6, we divided these children into two groups, younger two-year-olds and older two-year-olds, according to this cut-off. The older two-year-olds produced significantly longer utterances than the younger two-year-olds, as measured by MLU. Moreover, the MLU range displayed by the Lebanese two-year-olds in our study, 1.3–2.7, was quite similar to that found by Abdalla and Crago (2008) in a study of 10 Urban Hijazi Arabic-speaking children in the same age range (2;0 to 3;0, MLU range 1.4–2.4).

Regarding the emergence of subjects with verbal predicates, we observed that young Lebanese children in the earliest stages of multi-word production, including those who produced very few verbal clauses with lexical subjects, produced subjects in both preverbal and postverbal positions. We interpret these results as indicating that both VS and SV orders seem to emerge at the same time; our results therefore support the hypothesis of Very Early Parameter Setting (VEPS), at least in relation to movement of the subject to Spec-TP. Our analysis of the mothers’ speech showed that these children were receiving input containing evidence for both subject positions, since the mothers produced both VS and SV clauses. This includes, notably, the mothers who reported addressing their children in Arabic only 50–75% of the time and otherwise in French or English, both SV languages.

In contrast to what others have found in results based on SR tasks (Friedmann & Costa, 2011; Khamis-Dakwar, 2011), VS order was not significantly more frequent than SV order below age 2;6. However, after age 2;6 there was a very slight preference for SV order over VS order. This would seem to go in the same direction as the results reported by Qasem (2020) in their longitudinal study of two YIA-speaking children and Khamis-Dakwar (2011) on SR tasks, but also by Friedmann and Costa (2011) for PA-speaking children over age three. Although the VS > SV preference found in the SR studies for children under 2;6 was not found in our spontaneous samples, it may be that there is a change taking place around age 2;6, since the children younger than 2;6 displayed no difference between SV and VS whereas children older than 2;6 displayed a slight preference for SV over VS. The preference for SV follows neither from the VEPS nor from the DCH. To our knowledge no proposal has been made to explain SV preference. It has been asserted that SV is more frequent than VS in adult PA (Friedmann & Costa, 2011); we observed no significant SV > VS frequency difference in the mothers’ speech in our language samples.

The difference between our child results, in particular the absence of any early VS over SV preference, and those found in previous studies could be related to the considerable individual variability observed in the younger group of our study, with some children producing very few or no lexical subjects with verbal predicates. Another possible explanation could be that the two methods, SR and language sample analysis, might not evaluate the same competence. While SR tasks test the ability to decode, process and store in memory in order to repeat a specific syntactic structure, spontaneous language production has fewer constraints. In other words, SR tasks might require more mature cognitive skills, which could entail that children might have more difficulty with more complex constructions in this mode. Given that structures with SV order involve extra movement, younger children might avoid this additional cost during an SR task and thus perform better at repeating sentences with VS. Since spontaneous language does not involve these extra constraints, it follows that both VS and SV emerge at the same time and show no difference in frequency, if word order parameters are set very early on.

Turning to verbless clauses, two-year-old LA-speaking children always place lexical subjects in initial position before non-verbal predicates, unless the latter undergo wh-movement or topicalization. These results show that Lebanese two-year-olds adhere strictly to the adult syntax for verbless clauses at a young age, in conformity with the VEPS hypothesis as well as with the DCH (since S-predicate order does not entail derivational complexity in this case). It is interesting that in verbless sentences the subject-predicate order is not the result of movement, whereas in verbal sentences, SV results from movement. In other words, these two identical surface word orders are the result of two different derivations. Our findings indicate that young children are sensitive to this difference: they don’t seem to make errors such as placing the subject after the non-verbal predicate by analogy with verbal predicates. They thus seem to know that verbless clauses are indeed verbless (i.e., present tense T may have a DP/AP/PP complement) and therefore that there is only one position in which the subject can occur – namely, Spec-TP.

In sum, our study offers quantitative and qualitative data on the emergence of subjects in Lebanese two-year-olds based on the analysis of spontaneous language corpora. Young Lebanese children produced lexical subjects in verbal clauses in both available positions (VS and SV) and at the earliest stage. The fact that even children producing very few verbal clauses did so supports the idea that both preverbal and postverbal subjects emerge at the same time. The children systematically produced lexical subjects before non-verbal predicates, the only position available to them in the absence of wh-movement or topicalization of the predicate. Notably, no evidence was found for avoidance of movement, especially of the subject in verbal clauses (SV order) among the youngest children. Moreover, there was evidence for early production of several different kinds of displacement (subject movement to Spec-TP, wh-movement, topicalization and dislocation), both in utterances containing verbless clauses and those with verbal clauses.


We would like to thank the families who participated in this study and who welcomed us to their homes, and Ghada Khattab, Newcastle University UK, for making her recordings available to us for analysis. We also are grateful for comments from Philippe Prévost and Racha Zebib, University of Tours. Finally, we would like to acknowledge the Higher Institute of Speech and Language Therapy of Saint Joseph University in Beirut (directors Camille Moitel Messara and Edith Kouba Hreich) for facilitating work on this corpus.


1 We note that LA is also a null subject language: verbs, which carry rich subject agreement, may occur with a null subject.

2 In LA, VOS order is also possible. However it is not a neutral word order, unlike SVO and VSO, which are.

3 To our knowledge, there is only one study on the acquisition of SV/VS word order based on comprehension data: Aljenaie & Farghal (2009).

4 The majority of the children of this study were recruited as part of a research project “Baseline Data for Arabic Acquisition with Clinical Applications” (2009–2012) which was supervised by Ghada Khattab, Newcastle University UK and funded by the Qatar National Research Fund.

5 All varieties of LA include the structures under study here.

6 In Arabic linguistics, the term “pseudo-verb” designates words which function as verbs but which do not have ordinary verbal inflection; notably, their logical subject corresponds to a clitic object pronoun (Comrie, 2008).

7 Missing (X) or substituted (=X) elements are presented in parentheses. For each utterance example, the code and age of the child is provided after the utterance in parentheses.


