Determining who did what to whom is one of the most fundamental tasks when producing or comprehending sentences. Although fundamental, it is not always easy or error-free, especially when the sentence does not adhere to the most frequent and basic word order in the language. For example, the most frequent and basic (canonical) word order in English transitive sentences is (notional) Subject-Verb-(notional) Object (SVO/actives: The girl kissed the boy). This canonical word order does not involve any constituent movement, and the agent (“the girl”) and patient (“the boy”) map onto the pre- and post-verbal positions, respectively. In contrast, an example of a non-canonical structure would be the passive (e.g., The boy was kissed by the girl), where the patient occupies the grammatical subject position after being displaced from the notional object position, and the agent is placed within the by-phrase in a postverbal position. This word order is considered non-canonical because it is less frequent and is formed by moving the patient/notional object to the preverbal position (Chomsky, 1993), whereas the agent now follows the verb. Indeed, passives have been shown to cause difficulties in both production and comprehension relative to actives, for a wide range of language users and learners, including healthy adults (e.g., Ferreira, 2003) and typically developing (TD) children (e.g., Abbot-Smith et al., 2017).
Compared with same-age or sometimes younger TD children, children with Developmental Language Disorder (DLD, also referred to in the literature as Specific Language Impairment, e.g., Leonard, 2014) have been shown to have pronounced difficulties with non-canonical structures in production (Leonard et al., 2006) and in comprehension (Marshall & van der Lely, 2006; van der Lely, 1996). DLD children are characterized as having language impairment across linguistic (sub-)domains, with syntax being particularly affected (see Leonard, 2014, for a summary). This is in the absence of hearing impairment, intellectual disability, socio-behavioral and neurological disorders, and other medical and neurological conditions (Bishop et al., 2017; Schwartz, 2017). Their difficulties with non-canonical structures have been reported across different sentence types and different languages. However, what makes non-canonical structures particularly vulnerable in DLD children remains unresolved. Researchers have proposed different accounts, attributing DLD children’s difficulties with non-canonical structures to impaired grammatical knowledge (e.g., van der Lely & Stollwerck, 1997), to an impairment in grammatical processing (e.g., Presson & MacWhinney, 2011), or to other domain-general (non-grammar-specific) processing deficits (e.g., in general processing speed and working memory; Leonard et al., 1997).
Non-canonical word order and DLD theories
Grammatical representation accounts such as the Representational Deficit for Dependent Relations Account (van der Lely & Stollwerck, 1997) and its successor, the Computational Grammatical Complexity Hypothesis (van der Lely, 2004), attribute DLD children’s linguistic deficits to impaired grammatical knowledge. Embedded within the generative framework of language (Chomsky, 1993), these theories argue that the syntactic operation of movement that gives rise to non-canonical structures is “automatic” and “compulsory” in TD children but “optional” in DLD children, and that DLD children have difficulties in establishing syntactic dependencies between the moved constituent and its trace. Therefore, according to these theories, DLD children may produce sentences in which the compulsorily moved linguistic element remains in situ, for example, by the girl (was) kissed the boy instead of the boy was kissed by the girl (Leonard et al., 2006). In terms of comprehension, DLD children may inconsistently assign the agent role to the first Noun Phrase (NP1) they encounter when comprehending a passive structure (Marshall et al., 2007). This inconsistent thematic role assignment may lead to chance performance in offline comprehension and production.
Grammatical processing accounts, such as the Unified Competition Model (MacWhinney, 2018), focus on how different populations utilize linguistic cues. Presson and MacWhinney (2011) proposed that cue cost plays a vital role in predicting the effects of disorders in language. According to this account, DLD children are more likely to rely on a cue that requires less effort during comprehension and production. Typically, cues that are more frequent in the input are more readily available, and hence have a lower processing cost. For example, in English, the first NP that the listener encounters in a sentence offers a strong cue for denoting the agent. This cue is highly available and may elicit a lower processing cost. As such, while English-speaking TD children can make use of the passive-related morphosyntactic cues (i.e., -ed and the by-phrase) to assign appropriate thematic roles to the NPs and interpret NP1 in a passive sentence as the patient, DLD children are more likely to rely on canonical word order cues, interpreting NP1 as the agent. Evidence for this model has been found in the offline comprehension of passives in Persian-speaking monolingual children with and without DLD (Mohamadi et al., 2015): using canonical word order cues, DLD children and language-matched TD children were more likely than age-matched TD children to interpret passives as actives (see also Dick et al., 2004, for English-speaking children with DLD). In production, DLD children’s reliance on the canonical word order cue would lead to an overproduction of canonical word order or other agent-first structures, a hypothesis the current study will test.
Other accounts combine linguistic properties with general processing limitations to account for DLD children’s pronounced difficulties with non-canonical structures. Among others, the Surface Account (Leonard et al., 1992) and the Linguistic Processing Limitation Account (Deevy & Leonard, 2004) are of particular relevance. The Surface Account assumes that, given limited processing resources, DLD children are incapable of retaining grammatical morphemes (e.g., -ed and by in English passives) in auditory memory long enough to assign specific grammatical functions to them, leading to the failure to use them efficiently in sentence interpretation (see Bedore & Leonard, 2005; Hansson et al., 2003). Assuming that displaced linguistic elements are interpreted at their trace position (which differs from their surface position, as in the boy t was kissed t by the girl), the Linguistic Processing Limitation Account suggests that DLD children have difficulties interpreting the moved NP (the boy) in its trace position due to their limited working memory. This account also hypothesizes that the longer the distance between the displaced element and its trace, the more difficult the structure is for DLD children.
In sum, current theories on the acquisition and/or processing of non-canonical structures in children with DLD attribute their difficulties to different structural properties: (i) movement (the Representational Deficit for Dependent Relations Account/Computational Grammatical Complexity Hypothesis), (ii) word order (the Unified Competition Model), (iii) the presence or absence of morphosyntactic cues (the Surface Account and the Unified Competition Model), and (iv) the distance between the displaced element and its trace (the Linguistic Processing Limitation Account). However, the empirical premises for these accounts are based primarily on languages (e.g., Indo-European languages) in which constituent displacement and non-canonical word order are usually conflated. In contrast, Mandarin Chinese provides an optimal testing ground to tease these theories apart because the displacement of constituents is possible in sentences with different word orders, including non-canonical word orders where the displaced NP1 is still the agent. Moreover, these non-canonical structures differ in the structural distance between the moved element and its trace, and the displacement is either signaled by morphosyntactic cues (e.g., agent-first with ba, “the BA-construction,” and patient-first with bei, “the BEI-construction”) or not signaled by morphosyntactic cues (e.g., patient-first in the OSV-construction).
In addition, most existing studies have adopted offline methods that measure comprehension after the end of the sentence, requiring relatively high attention load, working memory, and other nonlinguistic processes (Marinis, 2010). Such methods not only disproportionately affect DLD children but also make it difficult to disentangle linguistic processes from nonlinguistic processes. Online processing tasks, by contrast, measure how participants parse different linguistic information in real time, such as morphosyntactic cues (e.g., -ed and by), and how they build up syntactic representations and assign thematic roles; performance on such tasks is less subject to the other processes involved in offline comprehension. Indeed, existing online processing studies seem to suggest that DLD children do make use of morphosyntactic cues while processing non-canonical structures, contrary to what the Surface Account would predict (e.g., Marinis & Saddy, 2013, but see Blom et al., 2014; Chondrogianni et al., 2015). Moreover, there continues to be a dearth of research testing the same group(s) of children using all three testing modalities, that is, production, online processing, and offline comprehension.
This study attempts to unravel how Mandarin-speaking DLD children produce and comprehend non-canonical structures as the sentence unfolds (online processing) and after hearing the entire sentence (offline comprehension) to tap into DLD children’s linguistic representations and use of non-canonical structures. To examine production, we adopted the structural priming paradigm. Structural priming is the tendency for prior exposure to a particular structure to facilitate its subsequent use. This is particularly relevant when the structure being primed is a low-frequency structure, such as the passive. Importantly, priming taps into the representational level: it surfaces only when the abstract representation is available (Branigan & Pickering, 2017). Additionally, when children do not produce the primed structure, their production of alternative structures (whether erroneous or not) may reveal qualitative differences between the neurotypical and neurodivergent groups in terms of linguistic representations. To unravel whether and how DLD children can implement these representations in real time as the sentence unfolds, as well as make interpretations after all linguistic information is available, we adopted a self-paced listening task with picture verification (for more information, see the Methodology section). This task has been shown to successfully capture both real-time processing and offline comprehension in children with and without DLD (e.g., Marinis & Saddy, 2013).
Mandarin non-canonical structures in Mandarin-speaking children
The current study examines three Mandarin non-canonical structures, the BA-construction (1), the BEI-construction (2), and the OSV-construction (3). All examples depict the same event where the agent Zhangsan performed a kicking action on the patient Lisi, which could also be expressed by the canonical SVO word order (the SVO-construction), as in (4). Contrary to the SVO-construction, the three non-canonical structures tested in the current study all involve syntactic movement with the notional object at a preverbal position.

Specifically, the BA-construction involves A-movement of the object from the postverbal position to the preverbal position, where the argument receives Case from the morphosyntactic cue ba (C.-T. J. Huang et al., 2009), as in (5) (see Shu, 2018, and Zhao, 2021, for an alternative analysis). In contrast, the BEI- and OSV-constructions tested in the current study are argued to involve A’-movement (C.-T. J. Huang et al., 2009). For the BEI-construction (sometimes referred to as the Mandarin passive), the morphosyntactic cue bei has been treated as taking a clausal complement (Inflectional Phrase, IP) where the object is a null operator and is moved to its specifier position, where predication with the matrix subject is formed, as in (6) (see Pan & Hu, 2021, for an alternative analysis). In the OSV-construction, the object moves to the specifier of a Topic Phrase (TopP), as in (7). Compared with the BEI-construction, the OSV-construction does not have a morphosyntactic cue in between the two NPs. Although we separated the two NPs in the OSV-construction example, a pause after NP1 in production is not obligatory.

As such, the three non-canonical structures share certain properties and differ from each other at the same time. First, the BA-construction, like the SVO-construction, has an agent-patient ordering, while the BEI- and OSV-constructions have a patient-agent ordering. Second, different from the BA- and BEI-constructions that have morphosyntactic cues ba and bei in-between the NPs, the OSV-construction lacks an overt morphosyntactic cue. In terms of the distance between the displaced object and its trace, it is the shortest for the BA-construction (intervened by a verb) relative to the BEI- and OSV-constructions, both of which have an NP and a verb intervening between the displaced object and its trace. It is worth noting here that although the distance between the displaced object and its trace is the same for the BEI- and OSV-constructions, the interpretation of the object in the BEI-construction requires an additional operation relative to the OSV-construction, that is, predication, which has not been proposed to be optional or problematic for DLD children.
Concerning acquisition, a small number of empirical studies with TD children have shown that children as young as two years of age (Deng et al., 2018) spontaneously produce BA- and BEI-constructions. These structures can be primed from the age of three years (Hsu, 2014a,b; Ji et al., 2023). TD children also show qualitatively adult-like online processing, offline comprehension, and production of these structures under experimental conditions, from age three to five (Hao et al., 2024b; Hao & Chondrogianni, 2023; Hsu, 2014b, 2018; Y. T. Huang et al., 2013; Zhou & Ma, 2018). In terms of online processing, Hao et al. (2024) showed that, like adults, 5- to 9-year-old TD children made use of relevant cues to revise initial misinterpretations. More specifically, in comprehending the BA- and BEI-constructions, they used the morphosyntactic cues ba and bei, respectively. In comprehending the OSV-construction, they used the second NP (NP2) cue: when two NPs appear consecutively without a morphosyntactic cue in between, NP1 functions as the patient and NP2 as the agent. However, it remains unclear whether the three structures develop symmetrically. Zhou & Ma (2018) reported that three-year-old TD children comprehended the BA- and BEI-constructions equally well (see also Hsu, 2018). In contrast, Y. T. Huang et al. (2013) found that the BEI-construction was more prone to comprehension errors than the BA-construction for five-year-olds.
Interestingly, the reversed pattern was observed in naturalistic production, with the BEI-construction produced two months earlier than the BA-construction (Deng et al., 2018), although this is based on a corpus and diary analysis of one single child. Ji et al. (2023) showed that, for children aged 3 to 6, priming (with lexical overlap) was more prominent for BA-constructions than for BEI-constructions. However, lexical overlap in the study might have favored the production of agent-first constructions. Turning to the OSV-construction, Hao & Chondrogianni (2023) found a lower performance in offline comprehension and production with the OSV-construction as opposed to the BA- and BEI-constructions in Mandarin-speaking children aged five to nine.
As for DLD children, to our knowledge, two studies have tested DLD children’s production of the BEI-construction (Du, He, et al., 2024) and its Cantonese equivalent (Leonard et al., 2006). Meanwhile, one study has tested DLD children’s offline comprehension of the BEI-construction (Du, He, et al., 2024). These studies showed that DLD children (4- to 7-year-olds) have more difficulties producing and comprehending the BEI-construction than their age-matched TD peers. However, it remains unclear whether Mandarin-speaking DLD children have more difficulties producing and comprehending the BA- and OSV-constructions relative to age-matched TD peers, and whether such difficulties also manifest in the real-time processing of these structures. Moreover, studies on the potentially differential development and use of these structures in DLD children are lacking.
The present study
The present study compared 5- to 9-year-old Mandarin-speaking DLD children to age-matched TD children on their production, offline comprehension, and online processing of three Mandarin non-canonical structures. Specifically, we address the following questions:
1. Do Mandarin non-canonical structures pose more difficulties for DLD children to produce compared with age-matched TD children (in both priming and production patterns)?
2. Do Mandarin non-canonical structures pose more difficulties for DLD children to process and interpret relative to age-matched TD children?
3. If DLD children experience difficulties with non-canonical structures, do all three non-canonical structures pose similar levels of difficulty? More specifically, how do structural properties, that is, word order, the presence or absence of morphosyntactic cues, and the distance between the displaced element and its trace, modulate production and comprehension performance in DLD children?
Predictions
Table 1 summarizes the predictions each theoretical account makes for each research question.
Table 1. Predictions for DLD children’s performance (relative to TD children) for each research question according to different theoretical accounts

Note. RDDR = Representational Deficit for Dependent Relations Account; CGCH = Computational Grammatical Complexity Hypothesis.
Starting with RQ1, based on previous cross-linguistic research and studies on Mandarin, we predict that DLD children would have more difficulties relative to age-matched TD children in the production of non-canonical structures (e.g., Du, Durrleman, et al., 2024; Leonard et al., 2006; van der Lely & Battell, 2003). More specifically, we expect DLD children to show a smaller priming magnitude than TD children, which would suggest that DLD children have less robust representations than their TD peers (Garraffa & Smith, 2022), and to produce more errors. In terms of production errors, previous findings have been inconclusive, potentially because of the different coding schemes (e.g., Du, Durrleman, et al., 2024, but not Leonard et al., 2006, considered pro-drop as errors). Here, we make predictions based on different DLD theories. According to the Representational Deficit for Dependent Relations Account or the Computational Grammatical Complexity Hypothesis, DLD children may produce more instances of optional movement, for example, (1) cat chase (bei/ba) dog; (2) chase (bei/ba) dog; (3) cat (bei/ba) dog chase it instead of cat (bei/ba) dog chase. Morphosyntactic cue omissions would be expected by the Surface Account. The Unified Competition Model would predict a more frequent production of the SVO-construction or the BA-construction as they are both agent-first structures. The Linguistic Processing Limitation Account, however, does not make direct predictions about production errors.
In terms of RQ2, we expect DLD children to have lower offline comprehension accuracy compared to age-matched TD children (e.g., Du, He, et al., 2024; C. Marshall et al., 2007; Montgomery et al., 2017). However, we predict that DLD children will adopt qualitatively similar processing strategies, for example, using relevant linguistic cues (ba, bei, and NP2 cue) to revise misinterpretations compared to TD children (following Marinis & Saddy, 2013). It is worth noting here that DLD children’s use of these cues does not contradict the Unified Competition Model, as it does not preclude the possibility that DLD children rely on the agent-first cue for final interpretation, that is, they commit to an interpretation that is consistent with the agent-first cue.
Although no previous study, to our knowledge, has examined the potentially differential performance across structures (RQ3), differential performance across structures (structural effect) is predicted by some but not all DLD theories. We start with theories that predict no structural effect. According to the Representational Deficit for Dependent Relations Account and the Computational Grammatical Complexity Hypothesis, all three structures are derived via syntactic movement and should therefore be equally difficult for DLD children, who should, importantly, perform at chance. While it is theoretically possible that different types of movement (i.e., A-movement vs. A’-movement) could lead to differential performance, such distinctions are not predicted by either the Representational Deficit for Dependent Relations Account or the Computational Grammatical Complexity Hypothesis. Instead, both accounts treat movement as uniformly challenging for DLD children, who have difficulties establishing syntactic dependencies as a result of optional movement. Empirically, this is supported by findings that both A-movement constructions (e.g., English passives; van der Lely, 1996) and A’-movement constructions (e.g., wh-questions; van der Lely & Battell, 2003) are vulnerable domains for DLD children. The Unified Competition Model predicts that the BA-construction should induce better performance than the BEI- and OSV-constructions because it shares the same agent-patient order with the SVO-construction. A better performance of the BA-construction over the BEI- and OSV-constructions is also predicted by the Linguistic Processing Limitation Account, as the distance between the displaced object and its trace is the shortest in the BA-construction. However, the two accounts may give different predictions concerning the relative performance between the BEI- and OSV-constructions.
More specifically, although DLD children may rely on the agent-first cue as predicted by the Unified Competition Model, if they can use the bei cue and the NP2 cue, the BEI-construction would induce better performance relative to the OSV-construction, as the bei cue has higher availability and lower cue cost compared to the NP2 cue (Li et al., 1992). It is also worth noting here that the ba cue has higher availability than the bei and the NP2 cues. On the contrary, better performance of the OSV-construction compared to the BEI-construction may be predicted by the Linguistic Processing Limitation Account, on the assumption that the predication relationship between the syntactic subject and the null operator moved from the object position increases the distance between where the notional object is realized in the surface structure and where it originates. Last, the Surface Account predicts the OSV-construction to be less difficult than the BA- and BEI-constructions, as it does not contain morphosyntactic forms, whereas in the BA- and BEI-constructions morphemes carrying grammatical functions need to be perceived and maintained for successful interpretation.
Methodology
Participants
A total of 59 five- to nine-year-old Mandarin-speaking children living in China participated in the study. Of these, 37 were TD children with no reported history of hearing, speech, language, socio-emotional, or developmental disorders (henceforth the TD group). They were recruited through primary schools and personal contacts, and their parents and teachers had not expressed concerns about their (language) development. The remaining 22 children were diagnosed with language impairment by specialized doctors during one or several visits to specialized children’s hospitals in Xi’an (northern China) and Shanghai (southern China). Of these 22 children, 20 were included as children with DLD because they did not have a diagnosis of other (neuro-)developmental disorders; the other two were not included (one with hearing loss and one with intellectual disability). Specifically, the formal diagnosis of DLD started with a pediatrician, who referred children for further diagnosis if they scored 1.5 standard deviations below the mean on the language component of the Wechsler Intelligence Scale for Children, Chinese fourth edition (WISC-CIV) (Zhang, 2009), and had no physiological disorders. During further diagnosis, standardized tests for other developmental disorders (e.g., autism spectrum disorder, intellectual disabilities, etc.) and tests for language development were administered. The diagnosis of Ertong Yuyanchihuan “Child Language Impairment” was given if the child performed 1.5 standard deviations below the mean on at least two standardized Mandarin language tests. As such, these children met DLD diagnosis criteria (henceforth DLD).
For the analyses, five TD participants (two for data loss, two for non-completion, and one for below-chance filler comprehension) and two DLD participants (below-chance filler comprehension) were excluded. The final sample included 32 TD children (17 female) and 18 DLD children (9 female). The two groups were matched on nonverbal intelligence measured by Raven’s Colored Progressive Matrices (Raven, 2003) (TD: Mean = 109.06, SD = 12.79; DLD: Mean = 102.22, SD = 15.17; t(48) = −1.69, p = .10), age (in months; TD: Mean = 84.59, SD = 17.07; DLD: Mean = 91.00, SD = 17.42; t(48) = 0.42, p = .68), and socioeconomic status measured by maternal education level (TD: Mean = 16.81, SD = 1.13; DLD: Mean = 16.78, SD = 1.01; t(48) = −0.11, p = .91).
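As a quick consistency check, the participant counts reported above can be tallied in a few lines. This is only an illustrative sketch: the variable names are ours, the numbers are taken directly from the text, and the degrees of freedom assume that the group comparisons were pooled-variance two-sample t-tests.

```python
# Participant flow as reported in the text (variable names are illustrative).
recruited = 59
td_recruited = 37
diagnosed = recruited - td_recruited   # children with a language-impairment diagnosis
dld_included = diagnosed - 2           # two had other diagnoses (hearing loss; intellectual disability)

td_excluded = 2 + 2 + 1                # data loss, non-completion, below-chance fillers
dld_excluded = 2                       # below-chance fillers

td_final = td_recruited - td_excluded
dld_final = dld_included - dld_excluded

# Degrees of freedom for a pooled-variance two-sample t-test (assumed test type).
df = td_final + dld_final - 2

print(td_final, dld_final, df)  # 32 18 48
```

Under that assumption, the tally agrees with the reported final sample sizes and with the t(48) statistics.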
Production task
The production task was a comprehension-to-production priming task. Participants first heard a pre-recorded sentence describing a picture shown on a computer screen (prime), after which, participants were asked to describe a new picture on the screen (target). To construct the prime sentences, we selected five transitive verbs (tui “push,” yao “bite,” ti “kick,” qin “kiss,” and ju “raise”). For the targets, three new verbs (zhui “chase,” xi “clean,” and wei “feed”) were selected in addition to tui “push” and ti “kick,” which have been used in the primes too. All primes shared the same structure: Noun Phrase (NP) + morphosyntactic cue ba, bei, or null + NP + Adverb + Verb Phrase (VP). Furthermore, the NPs in all sentences were disyllabic, while all verbs were monosyllabic, followed by an aspectual marker and an adverb (either marking frequency, that is, yi-xia, “once” or action result). As for the adverbs, we included qingqingde “gently,” xiaoxinde “carefully,” and manmande “slowly,” immediately after the second NPs and before the VPs. Each adverb was used three times across verbs in the sentences.
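The prime template described above (NP + ba, bei, or null + NP + Adverb + VP) can be sketched as a small helper. This is a minimal illustration, not the authors' stimulus-generation procedure: the function name `build_prime` is hypothetical, the NPs Zhangsan and Lisi are borrowed from the paper's earlier kicking example rather than from the actual animal stimuli, and the verb phrase `ti-le yi-xia` assumes the perfective marker le, which the text does not name.

```python
def build_prime(structure, agent, patient, adverb, verb_phrase):
    """Assemble a romanized prime: NP + (ba | bei | null) + NP + Adverb + VP."""
    if structure == "BA":       # agent-first, morphosyntactic cue ba
        words = [agent, "ba", patient]
    elif structure == "BEI":    # patient-first, morphosyntactic cue bei
        words = [patient, "bei", agent]
    elif structure == "OSV":    # patient-first, no morphosyntactic cue
        words = [patient, agent]
    else:
        raise ValueError(f"unknown structure: {structure}")
    return " ".join(words + [adverb, verb_phrase])


# Hypothetical usage with placeholder NPs and an assumed verb phrase.
for s in ("BA", "BEI", "OSV"):
    print(build_prime(s, "Zhangsan", "Lisi", "qingqingde", "ti-le yi-xia"))
```

The three outputs differ only in NP order and in the presence of ba or bei, which is the contrast the priming manipulation relies on.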
In the primes, each verb appeared nine times and was distributed evenly across conditions so that there were 15 trials of each Prime Type (BA-, BEI-, and OSV-primes), making a total of 45 primes. To reduce participants’ reliance on non-syntactic cues and to level out item-based priming effects (Tomasello, 2000), there was no lexical overlap in any prime-target pair. To avoid any repetition or order effects, we constructed three separate lists such that each prime picture was depicted with all three structures, each of which appeared in only one list. For instance, (8), (9), and (10) are the BA-, BEI-, and OSV-primes for the prime picture (Fig. 1), and they were arranged into lists A, B, and C, respectively (see online supplementary materials on OSF for the lists). Participants were randomly assigned to different lists. In each experimental list, all trials were arranged in a pseudorandom order in which trials from the same condition did not appear consecutively.
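The list construction and ordering constraint described above can be sketched as follows. This is one possible implementation under stated assumptions, not the authors' actual procedure: the rotation scheme, the function names `build_lists` and `pseudorandomize`, the grouping of items by verb, and the retry-based ordering are all ours, and filler trials are left out for brevity.

```python
import random
from collections import Counter

STRUCTURES = ["BA", "BEI", "OSV"]
LIST_NAMES = ["A", "B", "C"]
PRIME_VERBS = ["tui", "yao", "ti", "qin", "ju"]  # prime verbs named in the text
ITEMS_PER_VERB = 9                               # 5 verbs x 9 = 45 prime pictures


def build_lists():
    """Latin-square rotation: each picture takes a different structure in each list."""
    lists = {name: [] for name in LIST_NAMES}
    for item in range(len(PRIME_VERBS) * ITEMS_PER_VERB):
        verb = PRIME_VERBS[item // ITEMS_PER_VERB]
        for offset, name in enumerate(LIST_NAMES):
            structure = STRUCTURES[(item + offset) % len(STRUCTURES)]
            lists[name].append({"item": item, "verb": verb, "structure": structure})
    return lists


def pseudorandomize(trials, rng, max_tries=1000):
    """Randomly order trials so that no two neighbours share a structure."""
    for _ in range(max_tries):
        pool = list(trials)
        order = []
        while pool:
            allowed = [t for t in pool
                       if not order or t["structure"] != order[-1]["structure"]]
            if not allowed:      # dead end: abandon this attempt and start over
                break
            pick = rng.choice(allowed)
            pool.remove(pick)
            order.append(pick)
        if not pool:
            return order
    raise RuntimeError("no valid order found; increase max_tries")


lists = build_lists()
order_a = pseudorandomize(lists["A"], random.Random(1))
print(Counter(t["structure"] for t in lists["A"]))
```

With this rotation, every list contains 15 primes per structure and three primes per verb in each structure, and the first picture is a BA-prime in list A, a BEI-prime in list B, and an OSV-prime in list C, mirroring the example in the text.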


Figure 1. Example of pictures for experimental trials for both production and comprehension tasks.
The task also included 20 filler trials, such as yi-zhi zhizhu kanshu, yi-zhi zhizhu xiexin, “A spider is reading; a spider is writing.” Each filler trial consisted of a picture with two animals performing an intransitive action (e.g., yuedu “reading,” shuxie “writing,” paobu “running,” kaixin “being happy,” and tiaoyue “jumping”). In addition, all filler primes were in the coordinated canonical SVO word order (i.e., SVO and SVO by juxtaposing two simple clauses).
Comprehension: online processing and offline interpretation
A self-paced listening task with picture verification was designed to examine real-time comprehension as the sentence unfolds, as well as offline comprehension that involved offline interpretation after participants had heard the entire sentence (e.g., see Marinis & Saddy, 2013). In each trial, participants first viewed a picture for three seconds to form an interpretation of the depicted event. While the picture remained on screen, they then listened to a sentence segment-by-segment by pressing the spacebar. After hearing the full sentence, participants judged whether it matched the picture by pressing the Z key for “yes” or the M key for “no.” Sentences either matched the picture, with NPs carrying the correct thematic roles, or mismatched it, with reversed thematic roles assigned to referents. This manipulation was intended to induce processing difficulties during reanalysis in mismatching conditions, resulting in longer reaction times (RTs) during sentence processing and lower accuracy in the sentence-picture matching part of the task. After crossing Structure and Matching, six experimental conditions (BA-match, BA-mismatch, BEI-match, BEI-mismatch, OSV-match, and OSV-mismatch) were tested in a within-subject design (see Table 2 for the conditions paired with Figure 1).
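The crossing of Structure and Matching and the scoring of the Z/M verification response can be sketched as below. This is a hedged illustration only: the trial dictionary layout and the function names `is_correct` and `accuracy_by_condition` are hypothetical, and the example responses are made up for demonstration rather than taken from the study.

```python
from itertools import product

STRUCTURES = ["BA", "BEI", "OSV"]
MATCHING = ["match", "mismatch"]

# Crossing Structure and Matching yields the six within-subject conditions.
CONDITIONS = [f"{s}-{m}" for s, m in product(STRUCTURES, MATCHING)]


def is_correct(matching, key):
    """Z means 'yes, the sentence matches the picture'; M means 'no'."""
    expected = "z" if matching == "match" else "m"
    return key.lower() == expected


def accuracy_by_condition(trials):
    """trials: iterable of dicts with 'structure', 'matching', and 'key' fields."""
    totals, hits = {}, {}
    for t in trials:
        cond = f"{t['structure']}-{t['matching']}"
        totals[cond] = totals.get(cond, 0) + 1
        hits[cond] = hits.get(cond, 0) + int(is_correct(t["matching"], t["key"]))
    return {c: hits[c] / totals[c] for c in totals}


# Made-up responses, for illustration only.
demo = [
    {"structure": "BEI", "matching": "match", "key": "z"},
    {"structure": "BEI", "matching": "mismatch", "key": "z"},
    {"structure": "BEI", "matching": "mismatch", "key": "m"},
]
print(CONDITIONS)
print(accuracy_by_condition(demo))
```

Segment-level reaction times would be stored alongside each trial in the same way, keyed by segment position, but are omitted here to keep the sketch short.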
Table 2. Experimental conditions for the comprehension task, paired with Figure 1

A total of 48 experimental sentences were constructed using the same eight verbs used in the production task (i.e., tui “push,” zhui “chase,” yao “bite,” ti “kick,” qin “kiss,” xi “clean,” ju “raise,” and wei “feed”), each of which was used six times. To ensure consistency across sentences, we included only disyllabic NPs and monosyllabic verbs followed by a perfective aspectual marker and an adverb marking either the frequency of the event (i.e., yi-xia, “once”) or the result of the action. The adverbs qingqingde “gently,” xiaoxinde “carefully,” kaixinde “happily,” and manmande “slowly” were each used twice across verbs. Apart from controlling for any animacy effect by making reversible BA-, BEI-, and OSV-sentences, we also ensured that the two animals mentioned in each sentence were of comparable size (both in the real world and in the pictures) to avoid world knowledge bias or visual cues. Twenty filler trials, each depicting two animals performing an intransitive action, were included. The fillers showed either two animals of the same type performing different actions or two different animals performing the same action. The filler trials were also broken into five segments, as indicated by slashes, for example, Wo kanjian/ yi-zhi zhizhu/ kanshu/, yi-zhi zhizhu/ xiexin, “I saw that a spider is reading; a spider is writing.”
Similarly, we constructed six separate lists so that any given condition of the same item appeared once in any given list, and across all lists, all conditions of all items were represented (see the online supplementary material on OSF). The relative position of the agent/patient in the pictures was also counterbalanced (i.e., agents on the left in half of the trials and on the right in the other half). In each experimental list, all sentences were arranged in a pseudorandom order so that trials from the same condition did not appear consecutively. The trial order was the same for each participant.
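The pseudorandomization constraint (no two consecutive trials from the same condition) can be illustrated with a simple greedy procedure that restarts on dead ends. This is only a sketch under assumptions: the function name `pseudorandomize`, the fixed seed, and the restart strategy are hypothetical and are not taken from the authors' materials.

```python
import random


def pseudorandomize(trials, seed=0, max_restarts=1000):
    """Order (item, condition) trials so that neighbours differ in condition.

    Trials are drawn at random from those whose condition differs from the
    previous trial; if no such trial is left, the attempt is restarted.
    """
    rng = random.Random(seed)  # fixed seed: same order for every participant
    for _ in range(max_restarts):
        remaining = list(trials)
        order = []
        while remaining:
            last = order[-1][1] if order else None
            eligible = [t for t in remaining if t[1] != last]
            if not eligible:
                break  # dead end: only same-condition trials remain
            choice = rng.choice(eligible)
            order.append(choice)
            remaining.remove(choice)
        if not remaining:
            return order
    raise RuntimeError("no valid order found; increase max_restarts")


conditions = ["BA-match", "BA-mismatch", "BEI-match",
              "BEI-mismatch", "OSV-match", "OSV-mismatch"]
# Eight hypothetical items per condition, giving 48 experimental trials.
trials = [(item, cond) for cond in conditions for item in range(8)]
order = pseudorandomize(trials)
```

Fixing the seed reproduces one order for all participants, which is consistent with the statement that the trial order was the same for each participant; how fillers were interleaved is not specified in the text and is left out of the sketch.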
The experimental sentences were pre-recorded by a male monolingual speaker of standardized Mandarin at a normal rate. In segmenting the recorded sentences, we ensured that each segment sounded as natural as possible, and segment boundaries were also word boundaries.
Procedure
All participants took part in the study online from their own homes using laptops or desktops. The whole experiment lasted approximately 60–100 min in total (each task lasted 30–50 min), depending on the child’s age and language impairment status. The experimental tasks were implemented in jsPsych (de Leeuw, Reference de Leeuw2015), a JavaScript framework for creating behavioral experiments. As such, all participants received the same instructions and prompts. To control for potential carry-over effects between experimental tasks, the presentation of tasks was counterbalanced so that half of the participants completed the production task first, while the other half completed the comprehension task first. The whole process of the experimental tasks for each participant was audio-recorded. The Raven’s test was presented via Qualtrics. In addition, all participants and their parents were informed of their ethical rights of participation, and this information was given verbally and in writing prior to the experiment. Before any tasks, participants (and their parents) were asked to indicate their consent to participating in the study by clicking on the relevant button on the screen. The study was approved by the University’s research ethics committee.
Coding and scoring
In the production task, recordings were separately transcribed by a machine (via Xunfei Tingjian) and the first author. The agreement between the machine and the human transcriber reached 99%. Disagreements (21 out of 2115 utterances) were typically caused by the machine’s lesser flexibility in transcribing repeated words and filler sounds (e.g., em, ah, etc.), which might change the phonological decoding of a previous or subsequent word/character. To settle the disagreements, the first author re-listened to the recordings and checked them against their first-pass transcription. Then, the first author coded the transcribed sentences as “BA,” “BEI,” “OSV,” and “SVO” if the responses were complete utterances with correct thematic roles. Other responses were coded as “Reversed” if they depicted correct actions but with reversed thematic roles, or as “Others” if the utterances failed to establish who did what to whom (e.g., yi-zhi laoshu he yi-zhi laohu zhanzai yiqi, “A mouse and a tiger are standing together”), included NP omissions without any morphosyntactic cue (see example 11), or were unidentifiable or incomplete. Additionally, instances indexing optional movement, namely (1) cat chase (bei/ba) dog; (2) chase (bei/ba) dog; or (3) cat (bei/ba) dog chase it, were coded as “Optional.”

NP omissions without a morphosyntactic cue (example 11) were coded as “Others” because in Mandarin, both topics in the OSV-construction and objects in the SVO-construction can be omitted. As such, without the presence of a morphosyntactic cue, the realized NP could be interpreted as the subject in the SVO- or OSV-construction as well as the object in the OSV-construction.
In the comprehension task, each trial ended with the question of whether the sentence matched the picture. A correct response to this question was scored as “1,” and any other response was scored as “0.” This provided an offline measure of comprehension accuracy. We then included only trials with correct responses in the RT data analysis to understand how different cues were used in real time when participants could make correct interpretations of the sentences. In analyzing the RT data, we first checked the distribution of the data and excluded extreme values below 500 ms or above 5000 ms, as well as outliers more than 2 standard deviations below or above the mean calculated for each structure per participant and per condition. Then, we converted raw RTs to residual RTs, defined as the differences between raw RTs and predicted RTs calculated for each participant and trial based on the duration of each segment, using a linear model. This allowed us to control for differences in duration across trials and segments and for individual differences in responding to different items and conditions.
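The trimming and residualization steps can be sketched as follows. This is an illustrative approximation only: it assumes a simple per-participant least-squares regression of raw RT on segment duration, whereas the exact specification of the linear model is not given in the text. The function names `trim` and `residual_rts` and the data layout are hypothetical.

```python
from statistics import mean, stdev


def trim(records, low=500.0, high=5000.0, k=2.0):
    """Trim (segment_duration_ms, raw_rt_ms) pairs from one design cell.

    First drop RTs below `low` or above `high`, then drop RTs more than
    `k` standard deviations away from the cell mean.
    """
    kept = [(d, rt) for d, rt in records if low <= rt <= high]
    if len(kept) < 2:
        return kept
    m = mean(rt for _, rt in kept)
    s = stdev(rt for _, rt in kept)
    return [(d, rt) for d, rt in kept if abs(rt - m) <= k * s]


def residual_rts(records):
    """Residual RT = raw RT minus the RT predicted from segment duration.

    `records` holds the trimmed (duration, rt) pairs of one participant;
    the prediction comes from an ordinary least-squares line (assumption).
    """
    mean_d = mean(d for d, _ in records)
    mean_rt = mean(rt for _, rt in records)
    sxx = sum((d - mean_d) ** 2 for d, _ in records)
    sxy = sum((d - mean_d) * (rt - mean_rt) for d, rt in records)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_rt - slope * mean_d
    return [rt - (intercept + slope * d) for d, rt in records]
```

A positive residual then means that the participant responded more slowly than the segment's duration alone would predict, which is the quantity compared across matching and mismatching conditions.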
Results
Statistical analyses were carried out with the lme4 package and the mlogit package in R (R Core Team, 2018). Multinomial logistic regressions, binomial logistic regressions, and generalized linear mixed-effect regressions were adopted to respectively analyze the production data, comprehension accuracy data, and RT data, as they afford more dynamic data modeling and account for random effects (Barr et al., Reference Barr, Levy, Scheepers and Tily2013). We included the maximal random effects justified by the design where possible—when models converged (Barr et al., Reference Barr, Levy, Scheepers and Tily2013). To identify the optimal model, we adopted the stepwise backward selection approach (unless stated otherwise) starting from the maximal model using likelihood ratio tests. For the post hoc analyses, Bonferroni pairwise comparisons were conducted for binomial logistic and generalized linear mixed-effect regressions. For multinomial logistic regressions, we exhausted all possible combinations of reference levels for all variables and conducted analyses with reduced models when significant interactions were attested. Note that all categorical variables are treatment coded, and that, unless specified, when reporting statistical models below, bold-faced levels are chosen as the reference level for the variable.
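For readers unfamiliar with the Bonferroni procedure used in the post hoc comparisons, the adjustment itself is simple arithmetic: each p-value is multiplied by the number of comparisons and capped at 1. The sketch below illustrates only this arithmetic in Python; it is not the authors' R code, the function name `bonferroni` is hypothetical, and the example p-values are made up for illustration.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values for a family of comparisons."""
    m = len(p_values)  # number of comparisons in the family
    return [min(1.0, p * m) for p in p_values]


# Three hypothetical pairwise comparisons (e.g., BA vs. BEI, BA vs. OSV,
# BEI vs. OSV); the values are illustrative, not results from the study.
adjusted = bonferroni([0.01, 0.04, 0.5])
```

With three comparisons, an unadjusted p of .04 would no longer fall below the .05 threshold after adjustment, which is why the correction is considered conservative.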
Production task
Figure 2 shows the distribution and proportion of response types primed by different structures. The TD group produced 1305 responses (228; 17% “Others” and 20; 2% “Reversed”). As for the DLD group, 810 responses were collected, and they had the highest proportion of “Others” (228; 28%) and of “Reversed” (86; 11%). Neither group produced instances of optional movement. Table 3 presents the optimal multinomial model that included Group (TD and DLD; RQ 1) and Prime Type (BA, BEI, and OSV; RQ 3) as fixed effects when analyzed against the five-level dependent variable—Response Type (BA, BEI, OSV, SVO, and Reversed). The interaction between Group and Prime Type was significant.

Figure 2. The distribution of response types primed by different structures and their proportions (numbers).
Table 3. Optimal model with group and prime type as fixed effects for all valid responses in the production task

Notes. *p < .05, **p < .01, ***p < .001. R syntax: mlogit(Response_Type ~ 1 | Group*Prime, data = MS_wide, reflevel = "BEI").
Priming. First, there is a main effect of Prime Type—for both groups, such that the production of one structure was more likely after primes of the same structure (e.g., as shown in the optimal model, the production of the BEI-construction was more likely after BEI-primes for both groups). Second, there is a main effect of Group, such that the TD group was more likely to be primed across structures than the DLD group (e.g., as shown in the optimal model, BEI-primes induced a larger priming magnitude for the TD group than for the DLD group). Last, there is an interaction between Prime Type and Group, which was driven by the fact that the extent to which participants were primed was similar across prime types for the TD group, but the DLD group was primed more by BA-primes relative to the priming by BEI-primes or OSV-primes. See the R script for the optimal models for BA- and OSV-primes.
Production patterns. Given the nature of multinomial logistic regression and our RQ, we will directly unpack the interaction term first (and the interpretation of the effects of Prime Type and Group will also be self-evident). First, after BA-primes, the DLD group produced more reversal errors than the TD group (Estimate = 2.11, SE = 0.57, z = 3.71, p < .001). Second, after BEI-primes, compared to the TD group, the DLD group produced more BA-constructions, SVO-constructions, and reversal errors (see the optimal model), and a similar pattern was found after OSV-primes (BA: Estimate = 1.24, SE = 0.25, z = 5.06, p < .001; SVO: Estimate = 1.42, SE = 0.29, z = 4.83, p < .001; reversal: Estimate = 3.50, SE = 0.56, z = 6.31, p < .001). However, after OSV-primes, the TD group produced more BEI-constructions relative to the DLD group. In addition, we found that the probability of producing the SVO-construction after BA-primes did not differ significantly between the two child groups (Estimate = 0.38, SE = 0.20, z = 1.95, p = 0.05), but the DLD group was more likely to produce the SVO-construction after BEI- (see the optimal model) and OSV-primes (Estimate = 1.42, SE = 0.29, z = 4.83, p < .001) compared with the TD group.
We then qualitatively analyzed the responses, especially those coded as “Others” and “Reversed.” First, within the responses coded as “Others,” only 29 out of the 228, and 22 out of the 228 were instances of NP omission without morphosyntactic cues in the TD and DLD groups, respectively. Second, when participants made reversal errors, the responses were also syntactically primed. In addition, after BA- and BEI-primes, responses without morphosyntactic cues mostly carried correct thematic roles (coded as “OSV”), with five for the TD group, and seven for the DLD group carrying reversed thematic roles (coded as “Reversed”).
Comprehension task
Accuracy data. Figure 3 shows the offline comprehension accuracy across groups and structures. We ran generalized logistic linear mixed-effect regressions with Group (TD and DLD; RQ 2), Structure (BA, BEI, and OSV; RQ 3), and Matching Condition (Match and Mismatch) as fixed effects. The optimal model (Table 4) included a two-way interaction between Group and Structure.

Figure 3. Offline comprehension accuracy across Matching conditions and Structures in the TD and DLD groups.
Notes. The purple line indexes an accuracy rate of 87% (ceiling) while the red line chance level (50%); numbers are the mean accuracy.
Table 4. Optimal model with group, structure, and matching as fixed effects for the offline comprehension accuracy

Notes. *p < .05, **p < .01, ***p < .001. R syntax: glmer(Accuracy ~ Group*Condition*Structure + (1 + Condition | Participant) + (1 | Item), family = binomial, control = glmerControl(optimizer = "bobyqa")).
As is also evident in Figure 3, statistical analyses showed that (1) the TD group had significantly higher accuracy than the DLD group across structures and conditions (Estimate = 1.46, SE = 0.2, z = 7.32, p < .001); (2) for both groups, the Mismatch condition caused more errors than the Match condition across structures (TD: Estimate = −1.04, SE = 0.19, z = −5.22, p < .001; DLD: Estimate = −1.41, SE = 0.20, z = −6.93, p < .001); and (3) for both groups, BA- and BEI-conditions received similar accuracy rates across conditions (all ps > 0.5). In addition, although OSV-conditions elicited more errors compared with both BA- and BEI-conditions for the TD group in both match (BA: Estimate = 1.51, SE = 0.22, z = 4.74, p < .001; BEI: Estimate = 1.05, SE = 0.28, z = 3.70, p = 0.01) and mismatch (BA: Estimate = 1.21, SE = 0.23, z = 5.30, p < .001; BEI: Estimate = 0.88, SE = 0.22, z = 4.12, p = 0.002) conditions, such an OSV-disadvantage was not observed in the DLD group (all ps > 0.5).
Reaction times. For RTs, we included only the trials where participants gave correct offline comprehension responses and excluded extreme values and outliers. This resulted in the exclusion of 338 (5.68%) data points in the TD group and 99 (5.14%) in the DLD group.
Figure 4 illustrates RTs for the TD and the DLD groups across segments, conditions, and structures. For the TD group, a matching effect was found in the critical segment (Segment 3), where the morphosyntactic and/or word order cue became available (mismatch conditions showed slower RTs than matching conditions). For the DLD group, however, such a matching effect was only found in BEI- and OSV-conditions but not in BA-conditions. Nonetheless, for the DLD group, a matching effect surfaced in BA-conditions later, at the last segment, where the verbal information was available.

Figure 4. Residual RTs across the TD and the DLD groups crossed with matching condition or structure type.
To statistically evaluate the RT data, we ran generalized linear mixed-effect regressions separately for each segment. Only the analysis for Segment 3 revealed significant effect(s), while the optimal models for Segments 1, 2, 4, and 5 failed to show effect(s) of Group (TD and DLD; RQ 2), Structure (BA, BEI, and OSV; RQ 3), or Condition (Match and Mismatch). For Segment 3, the optimal model (Table 5) included the three-way interaction among the three independent variables.
Table 5. Optimal model with group, structure, and condition as fixed effects for the RTs in Segment 3 with significant simple effects included only

Notes. *p < .05, **p < .01, ***p < .001. R syntax: lmer(scale(ResidualRT) ~ Group*Structure*Condition + (1 + Structure + Condition | Participant) + (1 + Group + Structure + Condition | Item)).
Post hoc analysis showed that the TD group exhibited a matching effect across structures (main effect of Condition: Estimate = 0.69, SE = 0.05, t = 13.09, p < .001). For the DLD group, on the other hand, the matching effect was significant only in the BEI-conditions (Estimate = 0.93, SE = 0.17, t = 5.59, p < .001) and marginal in the OSV-conditions (Estimate = 0.50, SE = 0.18, t = 2.63, p = .08), and it was absent in the BA-conditions (Estimate = 0.02, SE = 0.16, t = 0.13, p = .97). In addition, because visual inspection of Figure 4 suggested a delayed matching effect at the last segment only in BA-conditions and only for the DLD group, we ran an additional analysis for each group to test whether this effect was statistically reliable but had not surfaced in the previous analysis, given the close resemblance between the two groups in the other structures and conditions. Indeed, while there were no significant effects for the TD group, a matching effect was found in BA-conditions for the DLD group (Estimate = 0.61, SE = 0.28, t = 2.13, p = .03, for the interaction term).
Discussion
The present study adopted a priming production task and a self-paced listening task with picture verification to investigate the production, offline comprehension, and online processing of three Mandarin non-canonical structures in children with and without DLD. These structures involve syntactic movement but differ in word order, the presence or absence of morphosyntactic cues, and the distance between the displaced element and its trace. First, we asked if Mandarin-speaking DLD children had more difficulties with non-canonical structures compared to their age-matched TD peers in production (RQ1) and online processing and offline comprehension (RQ 2). Then we asked if DLD children’s difficulties with these structures, if any, were modulated by structural differences (RQ 3). In the remainder of the section, we first discuss how Mandarin-speaking DLD children represent, produce, and comprehend non-canonical structures. Then we discuss the implications the current findings have on DLD theories.
Starting with RQ 1, the present study showed that relative to the age-matched TD group, the DLD group had a smaller priming magnitude, suggesting weaker structural representations, and that they committed more production errors across the three non-canonical structures, indicating more production difficulties. This is what we predicted, and it is in line with previous research (e.g., Du, Durrleman, et al., Reference Du, Durrleman, He and Yu2024; Leonard et al., Reference Leonard, Wong, Deevy, Stokes and Fletcher2006; van der Lely & Battell, Reference van der Lely and Battell2003). Evidence for the DLD group’s greater difficulty in comprehending non-canonical structures (RQ 2) was found in offline comprehension, such that the DLD group showed lower offline comprehension accuracy across structures compared to the TD group (Du, He, et al., Reference Du, He and Yu2024; Marshall et al., Reference Marshall, Marinis and van der Lely2007). In terms of online processing (RQ 2), we found that the two groups adopted the same strategies and used the bei cue and the NP2 cue to revise misinterpretations when processing BEI- and OSV-constructions (in line with Marinis & Saddy, Reference Marinis and Saddy2013). However, contrary to our prediction and previous research, while the TD group made use of the ba cue as soon as it was available, the DLD group did not do so and revised their misinterpretation at a later stage, which we will discuss with reference to RQ 3.
In the context of RQ 3, we observed differential performance across structures and task modalities. The results from the production task suggest that while the TD group was primed to the same degree across structures, the DLD group was primed more by the BA-construction relative to the BEI- and OSV-constructions. However, this BA-advantage for the DLD group was not attested in offline comprehension nor in online processing patterns. In contrast, the BA-construction induced similar offline accuracy compared to the BEI-construction. This is perhaps not completely surprising given that the DLD group did not make use of the ba cue immediately when it was available and revised their misinterpretation at a later stage. However, why did the DLD group fail to use the ba cue locally, and why did they show no BA-advantage in offline comprehension?
We postulate that this might be caused by DLD children’s limited processing resources, coupled with the additional processing load induced by the experimental setup. We argue that, given their limited processing resources, the DLD group relied on the most valid cue, given the specific structure and the specific context, when different cues are available, and could only integrate the information of other (less valid) cues later. For example, in the case of the BEI-construction, the only (morpho)syntactic cue leading to the correct interpretation is the bei cue, such that it was used immediately when available. The same could be argued for the OSV-constructions. In addition, as we unpack later, the experimental setup may have reinforced the validity of the bei cue and the NP2 cue. In contrast, in BA-constructions, as both the agent-first and the ba cue give rise to correct interpretations, the ba cue may be redundant. Moreover, the experimental setup may have reduced the validity of the ba cue in the specific context.
More specifically, in the comprehension task, the non-canonical structures were embedded into a main clause (“I saw”). As a result, in BEI- and OSV-constructions, the NP1s following the verb “see” can be interpreted as its patient, as well as the patient of the verb in the subordinate clause (the non-canonical structures). In this sense, there is a match between the potential thematic roles of the NP1 in relation to the main and the embedded verb, reinforcing the validity of the bei cue and the NP2 cue. In contrast, in BA-constructions, NP1s have contrasting roles, namely, they are the patient of the main verb but the agent of the embedded verb of the BA-construction. This suggests that as participants heard the sentence, they had to reinterpret NP1s in BA-constructions as the agent of the embedded clause, and not the patient of the main verb. This process may have been automatic in the TD children but may have led the DLD children down a garden path (see also Chondrogianni et al., Reference Chondrogianni, Marinis, Edwards and Blom2015; Marinis & Saddy, Reference Marinis and Saddy2013), requiring more processing resources for DLD children. Meanwhile, as NP1s are the patient of the main clause and the ba cue leads to the opposite interpretation, the validity of the ba cue is reduced in the current experimental context, contrary to a non-embedding context where the ba cue typically agrees with the canonical word order cue. As such, the processing cost of using ba in an embedding context as in the current study is high. As a result, DLD children failed to actively use the ba cue immediately when available, masking the potential BA-advantage. This could explain the lack of a matching effect at the segment where ba was available (not using the cue immediately) and the delayed matching effect at the last segment found only for BA-conditions (later cue integration), as mentioned above.
Future research could substitute the “I saw” with “Yesterday,” for example, and have a balanced item number for agent-first vs. patient-first experimental items (and fillers).
Before turning to the theoretical implications of the current study, the DLD group’s increased production of reversal errors across structures, relative to the TD group, warrants further discussion. Previous studies have typically attributed reversal errors to optional movement and/or an agent-first preference in DLD children (e.g., Jensen De López et al., Reference Jensen De López, Sundahl Olsen and Chondrogianni2014; van der Lely, Reference van der Lely1996, Reference van der Lely1998). However, we observed reversal errors that still follow non-canonical sentence structures. These cannot be explained by optional movement because both NPs appear preverbally, in contrast to SVO sentences, where movement is not involved. On the other hand, DLD children’s overall preference for agent-patient ordering cannot fully account for the current findings. Notably, the DLD group produced more reversal errors in patient-first structures (BEI- and OSV-constructions) and, to a lesser extent, in the agent-first BA-construction. The DLD group still produced more reversal errors compared to the TD group in the production of BA-constructions, indicating that agent-first preference cannot be the only factor leading to reversal errors. We suggest that the DLD group’s less developed inhibitory control ability may also play a role. More specifically, it could be the case that DLD children’s BA reversal errors reflect perseveration following exposure to patient-first structures (e.g., BEI- and OSV-constructions) in the course of the experiment. Such perseveration can be successfully inhibited by the TD group but not the DLD group. Future studies can directly explore the relationship between inhibition and reversal errors in priming and/or examine the production of these structures in naturalistic contexts.
Implications for DLD theories
Turning to the DLD theories introduced above, the findings from the present study are at odds with the Representational Deficit for Dependent Relations Account and the Computational Grammatical Complexity Hypothesis, according to which syntactic movement is optional in DLD children, leading to instances of optional movement, at-chance performance, and difficulty with structures requiring the establishment of syntactic dependencies between a displaced constituent and its trace. None of these patterns were observed in our data. While the DLD group did perform worse than the TD group overall in both production (smaller priming effects) and offline comprehension (lower accuracy), their performance was above chance in comprehension. Moreover, the smaller priming magnitude in children with DLD compared to their TD peers indicates that while children with DLD have the relevant grammatical representations, these may not be as readily accessible as in their TD peers.
The Unified Competition Model argues that DLD children rely more on cues that have high availability and low processing cost, for example, the canonical word order cue. As such, the findings that DLD children showed a BA-advantage and produced more SVO and BA-constructions are in line with the model’s prediction, as SVO and BA-constructions both have agentive NP1s. It is worth noting here that DLD children’s overreliance on the canonical word order cue does not preclude any potential use of morphosyntactic cues. As such, local and non-local use of morphosyntactic cues and the NP2 cue online are not in contradiction to this account. Although the finding that the bei cue and the NP2 cue, but not the ba cue, were used locally cannot be accounted for by the model, it does not necessarily refute the model. This is especially true because the ba cue has higher availability than the other two cues, suggesting that cue availability overall cannot be the only factor dictating DLD children’s use of linguistic cues. Instead, we postulate that the cost of different cues given specific structure and context (Presson & MacWhinney, Reference Presson and MacWhinney2011), and DLD children’s domain-general processing limitations, especially inhibitory control, may also play a role. Here, we further propose that in determining cue cost, the specific linguistic properties and contexts should be considered. More specifically, because the canonical word order cue and the ba cue lead to similar interpretations, DLD children might adopt compensatory strategies that reduce reliance on redundant cues to allocate more processing resources elsewhere. Future research can benefit from examining cue redundancy in DLD children.
The Surface Account predicts that DLD children have difficulties retaining morphosyntactic cues long enough to assign grammatical functions, which could explain the finding that the DLD group did not use the ba cue locally (immediately when it was available). However, the DLD group ultimately used the ba cue later in processing, contradicting this prediction. Moreover, the BA-advantage and the absence of cue omission in production are inconsistent with this account.
Lastly, the Linguistic Processing Limitation Account attributes DLD children’s difficulties to limited processing capacity, predicting that shorter dependencies should be easier to compute. The BA-advantage is, thus, expected, as the distance between the displaced element and its trace is the shortest in the BA-construction among the three non-canonical structures. However, this account does not incorporate mechanisms for how linguistic cues are used or integrated during online processing. While the BA-advantage in offline measures can be attributed to the relatively shorter linguistic dependency, the finding that DLD children showed delayed rather than absent use of the ba cue is not directly addressed by this account. Importantly, while this account focuses on processing load during syntactic computation, it remains largely silent on the temporal dynamics of cue activation and integration over the course of sentence processing.
Overall, by examining Mandarin structures that allow us to disentangle structural complexity from linear distance, the findings in our study point towards a complex interaction of representational weakness, indicated as weaker priming, and domain-general processing constraints, as evidenced by the use of word order over morphosyntactic cues and cue weighting in children with DLD.
Limitations and future directions
As discussed above, certain results could potentially be attributed to aspects of the experimental tasks and stimuli. For example, the priming paradigm might have promoted the production of morphosyntactic cues given their lexical nature. In the comprehension task, non-canonical structures were embedded into a main clause such that an additional syntactic reanalysis is required, making the task more taxing for DLD children, especially when it comes to the BA-construction. However, based on the overall results across tasks, we were able to draw an overall picture of DLD children’s difficulties with non-canonical structures despite these limitations. This highlights the importance, for future research, of increasing methodological granularity and including various tasks within the same population in DLD psycholinguistic research. That being said, studies (whether experimental or not) avoiding these limitations are also very welcome.
Given the current sample, analysis, and research questions, there are several important aspects that the current study could not investigate. It remains unclear if and how DLD children would differ from a group of language-matched TD children and/or children with other neurodevelopmental disorders such as autism spectrum disorder. In addition, going beyond the aggregated group-level analysis to adopt an individual difference approach offers insights into how and why language develops the way it does (Chondrogianni, Reference Chondrogianni2023; Paradis, Reference Paradis2023). This also has important implications for DLD theories. For example, fluid intelligence, controlled attention, working memory, and long-term memory have been proposed to modulate DLD children’s sentence comprehension ability (Montgomery et al., Reference Montgomery, Gillam and Evans2021). However, given the current sample size and available data, we cannot adopt the individual difference approach and understand the effects of these particular factors on Mandarin-speaking DLD children’s comprehension (and production) of non-canonical structures. We leave these questions for future research and caution readers when generalizing the current results.
Conclusions and clinical implications
The present study highlights the value of cross-linguistic research beyond Indo-European languages and underscores the importance of methodological granularity in advancing our understanding of DLD. By examining three movement-based structures that differ in word order and the presence or absence of morphosyntactic cues across distinct testing modalities, we were able to critically evaluate competing theoretical accounts of DLD. This fine-grained approach enabled us to disentangle the contributions of structural and cue-related factors (typically conflated in English and other Indo-European languages) to comprehension and production difficulties in DLD, offering nuanced insights into the nature of the disorder. Specifically, our current findings do not suggest that DLD children have deficits in their grammar (optional movement) or in using less valid linguistic cues (e.g., bei cue in the BEI-construction). Rather, DLD children have knowledge of different cues, but they have difficulty in using more than one cue at the same time during online processing, which we argue is a result of their limited processing resources coupled with their delayed development.
The current study also bears clinical implications for both the diagnosis of (Mandarin) DLD and intervention for DLD across languages. Mandarin non-canonical structures might constitute clinical markers for DLD in a wide age range. Among different structures, those without a reliable free-standing morphosyntactic cue can be an especially robust clinical marker. As observed across tasks, the OSV-construction induced more difficulties for Mandarin-speaking DLD children relative to the other two structures that are also non-canonical. Meanwhile, this study also suggests that structural priming can be a clinical marker for DLD (manifesting as a less prominent priming effect) while exploring the nature of the linguistic impairment. In the case of the current study, DLD children’s difficulties cannot be attributed to the lack of abstract knowledge of non-canonical structures. Indeed, priming targeting structures without a reliable free-standing morphosyntactic cue can constitute a valid assessment method, especially because there is no lexical overlap (of the morphosyntactic cues) in such structures. In addition, because of DLD children’s reliance on the most valid cue in the specific structure, explicit training in cue validity and coordinating more than one cue would bring benefits to DLD children’s production and comprehension of non-canonical structures.
Replication package
Data, analysis scripts, and experimental lists can be found online at https://osf.io/nt2fc/.
Data availability statement
Supplementary data analyses and visualizations are available in the Supplementary Material.
Competing interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.




