Functional priority of syntax over semantics in Chinese ‘ba’ construction: evidence from eye-tracking during natural reading

Yanjun Wei; Yingjuan Tang; Adam John Privitera

doi:10.1017/langcog.2023.42


Published online by Cambridge University Press: 13 September 2023


Yanjun Wei*: Affiliation:
Key Laboratory of the Cognitive Science of Language (Beijing Language and Culture University), Ministry of Education, Beijing, China Center for the Cognitive Science of Language, Beijing Language and Culture University, Beijing, China
Yingjuan Tang: Affiliation:
Center for the Cognitive Science of Language, Beijing Language and Culture University, Beijing, China
Adam John Privitera: Affiliation:
Centre for Research and Development in Learning, Nanyang Technological University, Singapore
*: Corresponding author: Yanjun Wei; Email: yanjun.wei@blcu.edu.cn


Abstract

Studies on sentence processing in inflectional languages support that syntactic structure building functionally precedes semantic processing. Conversely, most EEG studies of Chinese sentence processing do not support the priority of syntax. One possible explanation is that the Chinese language lacks morphological inflections. Another explanation may be that the presentation of separate sentence components on individual screens in EEG studies disrupts syntactic framework construction during sentence reading. The present study investigated this explanation using a self-paced reading experiment mimicking rapid serial visual presentation in EEG studies and an eye-tracking experiment reflecting natural reading. In both experiments, Chinese ‘ba’ sentences were presented to Chinese young adults in four conditions that differed across the dimensions of syntactic and semantic congruency. Evidence supporting the functional priority of syntax over semantics was limited to only the natural reading context, in which syntactic violations blocked the processing of semantics. Additionally, we observed a later stage of integrating plausible semantics with a failed syntax. Together, our findings extend the functional priority of syntax to the Chinese language and highlight the importance of adopting more ecologically valid methods when investigating sentence reading.

Keywords

syntactic processing; semantic processing; eye-tracking; self-paced reading; natural reading

Type: Article
Information: Language and Cognition, Volume 16, Issue 2, June 2024, pp. 380–400

DOI: https://doi.org/10.1017/langcog.2023.42
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2023. Published by Cambridge University Press

1. Introduction

Generative grammar (Chomsky, 1980) and other related theories (Fodor, 1983; Pinker, 1991) propose that syntax serves as an independent module from semantics. A key question in sentence comprehension is the relative timecourse of syntactic and semantic processing. According to Friederici’s three-phase model (2002, 2011), a syntactic framework is initially constructed based on word category information during the first phase. This is followed by a second phase for semantic and thematic role assignment. Finally, syntactic and semantic information is integrated in the last stage, especially when there is conflict present. This model proposes that the building of syntactic structure precedes semantic processing (hereafter, the syntax-first model) and that syntax and semantics interact at a later processing stage.

Empirical support for Friederici’s syntax-first model (Friederici, 1995, 2011, 2017) comes from previous studies predominantly using electroencephalography (EEG) in samples of Indo-European language speakers in both the visual (Friederici et al., 1999) and auditory domains (Hahne & Friederici, 2002; Isel et al., 2007). These studies typically adopted a violation paradigm in which the critical word of a sentence would violate syntactic (SYN) or semantic (SEM) constraints. Thus, stimuli could be classified into four different conditions: both syntactically and semantically violated (SYN-SEM-, e.g., Das Türschloß wurde im gegessen, ‘The door lock was in-the eaten’ in English), syntactically violated but semantically correct (SYN-SEM+, e.g., Das Eis wurde im gegessen, ‘The ice cream was in-the eaten’), syntactically correct but semantically violated (SYN + SEM-, e.g., Der Vulkan wurde gegessen, ‘The volcano was eaten’), or syntactically and semantically correct (SYN + SEM+, e.g., Das Brot wurde gegessen, ‘The bread was eaten’). Using the SYN + SEM+ condition as a baseline, the above studies found that syntactic violations generated an early event-related potential (ERP) associated with grammatical error processing (i.e., early left anterior negativity; ELAN) coinciding with the time-window for initial structure building based on syntactic category. Furthermore, the N400 component, which was present in the SYN + SEM- condition, was absent in the SYN-SEM- condition, suggesting that syntactic violations blocked semantic processing. Finally, the SYN-SEM- condition of the above studies also resulted in the generation of a P600 component, indicating a process of reanalysis for syntactic and semantic integration.
Taken together, these findings support that the building of syntactic structure precedes semantic processing, and if semantic information fails to be mapped onto syntactic structure, semantics will interact with syntax at a later stage.

Evidence of an initial process of syntactic parsing in Indo-European languages has also been observed in many reading studies conducted using self-paced reading (Deniz, 2022; Kennison, 2009) and eye-tracking paradigms that differ methodologically from previous EEG studies (Brothers & Traxler, 2016; Deutsch & Bentin, 2001; Kennison, 2009; Mancini et al., 2014; Veldre & Andrews, 2018). These eye-tracking studies usually used the boundary change paradigm, which makes it possible to manipulate different critical words during parafoveal processing. In this paradigm, an invisible boundary is inserted before the critical word, and when readers’ eyes cross the boundary, the critical word changes from a preview to a target word. Using this paradigm, Veldre and Andrews (2018) manipulated semantically congruent and incongruent previews that either matched or violated syntactic rules. Longer fixation times for the SEM + SYN- condition were observed for previews in terms of first fixation duration (FFD), first-pass reading time (FPRT), and go-past reading time (GPRT) over the SEM + SYN+ condition, and SEM + SYN- previews elicited more regressions from the target word to earlier words. These results suggest that parafoveal syntactic information had been processed before readers’ eyes crossed the boundary to read syntactically violated words and that this effect can be observed in eye-tracking measures of early-phase processing. However, this study reported longer fixation times for previews in FPRT and GPRT for the SEM-SYN+ condition over the SEM + SYN- condition and a higher likelihood of target word refixation during first-pass reading in the SEM-SYN+ condition compared with the SEM + SYN- condition.
Differences in these relatively late-phase measures suggest a larger processing cost for semantically violated sentences compared with syntactically violated sentences. Furthermore, Deutsch and Bentin (2001) reported additional evidence for early-phase syntactic parsing in FPRT but not in later second-pass reading time (SPRT). Taken together, the eye-tracking results also support an early process of syntactic parsing followed by semantic integration difficulties during sentence reading.

Compared to research on Indo-European languages, most evidence from reading studies on the Chinese language does not support the syntax-first model. Evidence against the syntax-first model in Chinese has been reported across a range of experimental paradigms and linguistic structures including the self-paced reading of ‘ba’ construction and ‘bei’ passive construction (Chen, 1999), eye-tracking on ‘Subject + Verb + Object’ (SVO) construction (Yang et al., 2009), and EEG studies on ‘ba’ construction (Wang et al., 2013; Zhang et al., 2010), ‘bei’ construction (Wang et al., 2013; Yang et al., 2015; Zeng et al., 2020), idioms (Liu et al., 2010), complex sentences (Chen et al., 2023), and SVO construction (Sun et al., 2019; Wang et al., 2015; Yang et al., 2021; Yu & Zhang, 2008; Zeng et al., 2016; Zhang et al., 2010, 2013). Similar to previous Indo-European language studies, the above studies generally examined whether violation conditions induced syntactic ELAN, semantic N400, or reanalysis P600 components compared with the correct condition. Unlike Indo-European language studies, none of these studies reported the syntactic ELAN or LAN for the SYN- condition in Chinese sentence reading. Additionally, these studies mainly focused on the examination of N400 differences between SYN + SEM- and SYN-SEM- conditions.
The results showed that the SYN-SEM- condition was no different (e.g., Wang et al., 2015) or had a larger N400 (e.g., Chen et al., 2023; Yang et al., 2015; Zhang et al., 2010) compared with the SYN + SEM- condition. The presence of the N400 in the SYN-SEM- condition was taken as evidence that the failure of syntactic framework building did not block semantic integration. In other words, syntactic framework building did not necessarily precede semantic access in Chinese sentence processing.

Only one eye-tracking study has examined the syntax-first model in Chinese sentence reading. Using SVO construction, Yang et al. (2009) showed that the SYN-SEM- condition had longer FPRT and GPRT at the target region than the SYN + SEM- condition, indicating that a semantic violation could be immediately detected even if the syntax was incorrect and suggesting that syntactic and semantic processing occurred in the same time-window. Therefore, their eye-tracking study also did not support the syntax-first model during Chinese sentence reading. The authors argued that this is because, unlike Indo-European languages, the Chinese language is impoverished in morphological inflections such as gender, number, and case. On this account, the lack of explicit syntactic markers means that syntactic construction no longer occurs before semantic access, so Chinese sentence comprehension relies mainly on semantic and contextual information (Li & Thompson, 1989). Considering that the ‘ba’ construction has a more explicit syntactic marker than the SVO sentences, future eye-tracking studies using the ‘ba’ construction are needed to better investigate the syntax-first model in Chinese.

It is worth noting that although the syntax-first model has not been fully substantiated, the independence of syntactic and semantic processing has been demonstrated in many previous studies on Chinese sentence reading (Chen et al., 2020; Huang et al., 2016; Yang et al., 2009). Moreover, only three of the above studies found a larger P600 for the SYN-SEM- condition compared with the SYN + SEM- condition (Wang et al., 2013, 2015; Zhang et al., 2013), suggesting an interactive process of syntax and semantics at a later time-window.

In stark contrast with findings from Indo-European language research, previous studies supporting the syntax-first model in Chinese are scarce and provide mixed results. A more recent intracranial high-density electrocorticography (ECoG) study of SVO sentence reading (Zhu et al., 2022) showed that local syntactic phrase violations elicited earlier neural activity than semantic violations, whereas syntactic category violations did not. The lack of the syntax-first effect in the latter case may be due to the word-by-word presentation mode during sentence reading. Another study by Ye et al. (2006) auditorily presented ‘ba’ sentences in Chinese (e.g., 把布料裁了, /ba3 bu4liao4 cai2 le0/, literally ‘ba cloth cut’) and found that both the SYN-SEM- and SYN-SEM+ conditions induced an early anterior negativity ERP compared with the SYN + SEM+ condition, supporting the syntax-first model in Chinese. However, different from Indo-European language studies, the N400 component was present in the SYN-SEM- condition, indicating that semantic processing still occurs even in the presence of syntactic violations. This finding can be taken as evidence against the syntax-first model.

There are two limitations present in these previous studies on Chinese sentence processing, regardless of whether their findings supported the syntax-first model or not. First, previous EEG/ECoG and eye-tracking studies generally compared violation conditions with the correct condition, or compared the SYN-SEM- condition with the SYN + SEM- condition. However, no studies tapped into the interaction between syntax and semantics while investigating the syntax-first effect, in many cases due to the lack of a SYN-SEM+ condition (e.g., Wang et al., 2013, 2015; Zhang et al., 2013). Investigation of alternative patterns for the effect of semantic violation under the SYN- condition as opposed to the SYN+ condition can shed light on whether semantic processing still proceeds when the building of a syntactic framework fails.

Second, EEG studies usually adopted a rapid serial visual presentation (RSVP) reading mode in which sentences were segmented into many individual words or phrases appearing sequentially within a constant duration on different screens (e.g., Zhang et al., 2019). These unnatural conditions may impair the building of a syntactic framework. Furthermore, there is ample evidence that syntactic and semantic processing occurs in parafoveal vision during sentence reading (Antunez et al., 2022; Brothers & Traxler, 2016; Li et al., 2023; Veldre & Andrews, 2018). This parafoveal-to-foveal effect cannot be observed in the RSVP reading mode because participants cannot see critical words while reading the words that precede them. In addition, data captured by eye-tracking signals are fundamentally different from EEG signals, with eye-tracking measures more reflective of endpoint processing. Nevertheless, measures derived from eye-tracking signals can still identify the relative timecourse of syntactic and semantic processing. For example, FFD or FPRT indicates initial-stage processing, whereas SPRT indicates later-stage processing. If syntactic violation effects were only present in FFD or FPRT while semantic violation effects were only present in SPRT, this would support the syntax-first model. Consequently, eye-tracking techniques might be an optimal alternative to explore whether the syntax-first model in Chinese can be observed in more ecologically valid paradigms that mimic natural reading.
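To make the early versus late distinction concrete, the following minimal sketch derives FFD, FPRT, GPRT, and SPRT for one interest area from an ordered list of fixations. It is an illustration only, not the authors' analysis pipeline: the `Fixation` class, the `reading_measures` function, and the example fixation sequence are hypothetical, and the measure definitions are the conventional ones (the text above does not spell them out). Conventions for skipped regions vary across studies; here a region skipped during first pass simply receives zero first-pass time.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    region: int    # index of the interest area, numbered left to right
    duration: int  # fixation duration in ms


def reading_measures(fixations, target):
    """Return FFD, FPRT, GPRT and SPRT (ms) for one interest area."""
    ffd = fprt = gprt = sprt = 0
    # States: "before" (target not yet reached), "first_pass" (first visit),
    # "go_past" (regressed left after first pass), "done" (moved past target).
    state = "before"
    for fix in fixations:
        if state == "before":
            if fix.region == target:
                state = "first_pass"
                ffd = fprt = gprt = fix.duration
            elif fix.region > target:
                state = "done"  # target skipped during first pass
        elif state == "first_pass":
            if fix.region == target:
                fprt += fix.duration
                gprt += fix.duration
            elif fix.region < target:
                state = "go_past"  # regression: still counts toward GPRT
                gprt += fix.duration
            else:
                state = "done"
        elif state == "go_past":
            if fix.region > target:
                state = "done"
            else:
                gprt += fix.duration
                if fix.region == target:
                    sprt += fix.duration  # re-reading of the target
        elif fix.region == target:
            sprt += fix.duration  # later re-reading after moving past
    return {"FFD": ffd, "FPRT": fprt, "GPRT": gprt, "SPRT": sprt}


# Hypothetical fixation sequence; region 2 plays the role of the critical word.
example = [
    Fixation(0, 200), Fixation(1, 180), Fixation(2, 220), Fixation(2, 150),
    Fixation(1, 190), Fixation(2, 210), Fixation(3, 230), Fixation(2, 160),
    Fixation(4, 200),
]
measures = reading_measures(example, target=2)
```

In this sketch FFD and FPRT depend only on the first visit to the region, whereas SPRT accumulates re-reading, which is why effects confined to FFD or FPRT are read as early and effects confined to SPRT as late.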

Two experiments, utilizing two different methods of text presentation, were conducted to investigate whether there is a functional priority of syntax over semantics in Chinese. In Experiment 1, a self-paced reading task similar to the RSVP used in previous EEG studies was adopted. Experiment 2 was an eye-tracking study in which a complete sentence was presented across a single line on the screen to mimic natural reading conditions. Both experiments used the Chinese ‘ba’ construction in which the function word ‘ba’ is an explicit syntactic marker. After reading the word ‘ba’ located toward the beginning of a sentence, readers can make strong syntactic predictions about the upcoming transitive verb at the critical word position. Therefore, once a noun is introduced at this position, syntactic violations can be detected immediately. Worse performance under the SYN- condition compared with the SYN+ condition was considered a syntactic violation effect, while worse performance for the SEM- condition compared with the SEM+ condition was a semantic violation effect. If there is an interaction between SYN and SEM, that is, a semantic violation effect (SEM+ vs SEM-) in the SYN+ condition but not in the SYN- condition, it indicates that syntactic violations hinder semantic processing and thus support the syntax-first model. If this interaction is not present, it can be concluded that no evidence has been found to support the syntax-first model.

2. Experiment 1: self-paced reading experiment

2.1. Methods

2.1.1. Participants

Twenty-nine native Chinese speakers (14 males and 15 females) participated in this experiment. All participants were college students ranging in age from 18 to 27 years (M_age = 25) reporting normal or corrected-to-normal vision. None of the participants reported reading-related or neurological disorders. This study was approved by the Ethics Committee of the Cognitive Neuroscience of Language Lab at the Beijing Language and Culture University. Participants signed a consent form before the experiment and were compensated 40 RMB (around $5.50 USD) for their participation.

2.1.2. Materials

Experimental materials consisted of 80 sets of sentences, each containing four separate conditions that differed in syntactic or semantic congruence (see Table 1). We used a subtype of the ‘ba’ construction of Chinese as the sentence frame, that is, Subject + ‘ba’ + modifier + noun + adverb + verb + ‘le’ + complement, such as the correct sentence 王强把手里的咖啡轻轻地搅拌了几下 (/Wang2qiang2 ba3 shou3li0de0 ka1fei1 qing1qing1de0 jiao3ban4 le0 ji3xia4/, literally ‘Wangqiang preposition in hand coffee slowly stir aspect marker a few times’). In the ‘ba’ construction, the noun (咖啡, /ka1fei1/, ‘coffee’) is placed before the verb (搅拌, /jiao3ban4/, ‘stir’). The purpose of the construction is to emphasize one’s active intervention in something (e.g., the action of stirring changed the state of the coffee). In addition, ‘le’ is an aspect marker, which indicates the completion of an action, and the complementary constituents at the end of the sentence indicate the number of times the action was performed. All materials can be found on the Open Science Framework at https://osf.io/5vfwr/.

Table 1. Example sentences following the ‘ba’ construction for the four experimental conditions

Note: Critical words are in bold. Sentences are segmented to reflect how they were presented during the self-paced reading task (Experiment 1) or to show interest areas during the eye-tracking experiment (Experiment 2). Following the conventions of written Chinese, no spaces were inserted during sentence presentation in the eye-tracking experiment. Additionally, post-CW1 and post-CW2 were combined into a single segment in Experiment 2. Spellings and numbers (tone type) indicate the pronunciation in the Chinese transcription system.

Abbreviations: asp., aspect marker in Chinese indicating the completion of an action; CW, critical word; Mod, modifier; Prep, preposition; SYN + SEM+, both syntactically and semantically correct; SYN + SEM-, syntactically correct but semantically violated; SYN-SEM+, syntactically violated but semantically correct; SYN-SEM-, both syntactically and semantically violated.

The word after the adverb in the sentence is regarded as the critical word (hereafter, CW). The four sentences within a set differed only in the CWs. The CW position is highly expected to be a transitive verb according to the syntactic framework of the Chinese ‘ba’ construction (Yang et al., 2015; Zhang et al., 2010). Therefore, we used semantically plausible transitive verbs for CWs under the SYN + SEM+ condition. Drawing on a study by Zhang et al. (2010), violated sentences were generated by replacing the CW of the correct sentence with another word that is semantically or syntactically incongruent. Under the SYN + SEM- condition, the transitive verb was replaced by a verb that conforms to the syntactic framework of the ‘ba’ construction but is semantically incongruent (e.g., 装订, /zhuang1ding4/, ‘bind’). With regard to syntactically violated sentences, the transitive verb was substituted with a noun, leading to a violation of the syntactic category. The aspect marker immediately following the noun also violated the Chinese syntactic rules because the noun cannot convey the completion of an action that the aspect marker indicates. Under the SYN-SEM+ condition, nouns generally act as tools or approaches to perform actions (e.g., the CW 勺子, /shao2zi0/, ‘spoon’ is a tool to stir the coffee). In this case, it is relatively easy for readers to semantically integrate this ‘tool’ noun into the preceding sentence context. Under the SYN-SEM- condition, the verb was substituted with a noun that was completely unrelated to the sentence context (e.g., 奖状, /jiang3zhuang4/, ‘certificate’), representing both a syntactic and semantic violation. All sentences were 11 words long. The CW was located in the middle position of each sentence instead of the end to avoid the wrap-up effect during reading (e.g., Zhang et al., 2013).

Stroke number and word frequency for the CWs were matched across the four conditions (ps > .05 for one-way ANOVA, see Table 2). In the cloze probability test for CWs, students who did not take part in this study (n = 20) were asked to provide the first meaningful word that came to mind in order to complete a sentence fragment that was missing the CW. The results showed that CWs had a cloze probability of 10% under the SYN + SEM+ condition and zero under the other three conditions, supporting that CWs had low predictability. This also ensures that CWs are unlikely to be skipped during sentence reading, allowing for fixations and saccades to be recorded in Experiment 2. Finally, to evaluate the degree of violation, a different group of 20 students was asked to judge the comprehensibility of experimental sentences on a 5-point scale (1 = completely incomprehensible, 5 = completely comprehensible). One-way ANOVA revealed a significant main effect of four conditions, F(3, 316) = 1605.565, p < .001, η_p² = .938. Post-hoc pairwise comparisons with Bonferroni’s corrections showed that semantically correct sentences (SEM+) were significantly more comprehensible than semantically violated sentences (SEM-) under both the SYN+ (p < .001, d = 12.222) and SYN- (p < .001, d = 4.981) conditions.

Table 2. Mean number of strokes, mean word frequency, and cloze probability of the critical words and sentence comprehensibility scores across the four different conditions

Abbreviations: SYN-, syntactically violated; SYN+, syntactically correct; SEM-, semantically violated; SEM+, semantically correct.

The 80 sets of sentences were divided into four lists in a Latin square design in which sentences from the same set were not used in the same list to avoid interference from reading the same words multiple times. Each participant was assigned only one of the four lists, which included 20 sentences for each of the four conditions. To balance the number of correct and violated sentences, another 60 correct ‘ba’ sentences were used as fillers. In addition, 20 correct and 20 violated sentences following the SVO structure were used for each list to offset the ‘ba’ structure stimuli. In total, each participant was presented with 180 sentences during the experiment.
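The list construction described above can be sketched as follows. This is a hedged illustration of a standard Latin square rotation, assuming sets are simply rotated through the conditions in order; the function name `latin_square_lists`, the condition labels, and the rotation rule are assumptions, since the text does not state how sets were assigned to lists.

```python
from collections import Counter

CONDITIONS = ["SYN+SEM+", "SYN+SEM-", "SYN-SEM+", "SYN-SEM-"]


def latin_square_lists(n_sets=80, conditions=CONDITIONS):
    """Build one presentation list per condition by rotating sets through conditions.

    Each list holds every sentence set exactly once, and each set appears
    in a different condition on every list.
    """
    n = len(conditions)
    lists = [[] for _ in range(n)]
    for set_id in range(n_sets):
        for list_id in range(n):
            # Shift the condition by the list index (assumed rotation rule).
            condition = conditions[(set_id + list_id) % n]
            lists[list_id].append((set_id, condition))
    return lists


lists = latin_square_lists()
per_condition = [Counter(cond for _, cond in lst) for lst in lists]

# Trial count per participant as described in the text:
# 80 experimental + 60 filler 'ba' + 20 correct SVO + 20 violated SVO sentences.
total_trials = 80 + 60 + 20 + 20
```

With 80 sets and four conditions, each list ends up with 20 sentences per condition, and no participant sees two versions of the same set.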

2.1.3. Procedure

Self-paced reading tasks were administered using E-Prime 2.0 (https://pstnet.com/products/e-prime/) on a laptop in a quiet room. During each trial, a fixation cross appeared in the center of the screen and participants were instructed to press the space bar when ready to begin the presentation of sentence segments. Sentences were presented as individual segments in sequential order with each segment displayed separately. Participants were instructed to press the space bar at their own comfortable speed in order to immediately advance the displayed segments with no time limit. Before the formal experiment, participants were told that a random half of the sentences were followed by a ‘yes’ or ‘no’ comprehension question based on the content of the sentence they had just read. This was to ensure that participants read the sentences carefully. Reading times (RTs) for each sentence segment and the accuracy of comprehension questions were recorded. Sentence order was randomized for each list across participants. The experiment took approximately 30 minutes to complete.

2.1.4. Statistical analysis

Sentences with a correct response to the comprehension question (92.8% average accuracy, 86%–100% range) were included for analysis. RTs were analyzed in terms of the whole sentence and five segments of interest including the noun, adverb (hereafter, pre-CW), CW, aspect marker (post-CW1), and complement (post-CW2) of the sentence. RTs beyond three SDs of the mean for each segment were considered outliers and were removed. To achieve normality, RTs were natural log-transformed before analysis.
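The trimming and transformation steps above can be sketched as below. This is a minimal illustration using only the Python standard library rather than the authors' R code; the function name `trim_and_log` and the example values are hypothetical, and it assumes the mean and SD are computed over all RTs for a given segment (the text does not say whether trimming was done per participant or per condition).

```python
import math
import statistics


def trim_and_log(rts, n_sd=3.0):
    """Drop RTs more than n_sd sample SDs from the mean, then take natural logs."""
    mean = statistics.mean(rts)
    sd = statistics.stdev(rts)
    kept = [rt for rt in rts if abs(rt - mean) <= n_sd * sd]
    return [math.log(rt) for rt in kept]


# Hypothetical RTs (ms) for one segment: 19 typical values and one extreme value.
segment_rts = [380 + 2 * i for i in range(19)] + [5000]
log_rts = trim_and_log(segment_rts)
```

Note that the cutoff is computed before the outlier is removed, so a single extreme value inflates the SD it is judged against; with very small samples a 3-SD criterion may therefore never exclude anything.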

Data were analyzed using linear mixed-effects models in R (R Core Team, 2022) via the lmer function from the lme4 package version 1.1-31 (Bates et al., 2015). Fixed-effect structures were identical across models and included main effects for both syntax and semantics (sum coded) and their interaction. Fitting of each model’s random-effects structure began with a maximal model (Barr et al., 2013), which included random intercepts for participants and items (i.e., sentences) and by-participant random slopes for syntax and semantics. During initial fitting, the random-effects correlation parameters were not included to prevent issues with model convergence. The variance accounted for by each of the random factors was determined using principal component analysis (PCA) and was used to inform the reduction in each model’s random-effects structure. After the removal of each random factor, resultant models were compared to a model containing that random factor using likelihood ratio tests (LRTs). Random factors were only removed from a model if the fit of the more parsimonious model was not significantly different or if the inclusion of a random factor resulted in convergence issues or singular fit. When the most parsimonious random-effects structure was identified, it was compared to an identical model that contained random-effects correlation parameters using LRT. If the inclusion of correlation parameters significantly improved model fit, they were retained. Finally, post-fitting model criticism was performed by trimming absolute standardized residuals exceeding 2.5 standard deviations (Baayen & Milin, 2010). Estimated p-values were generated using Satterthwaite’s method (Satterthwaite, 1941). A significance threshold of p < .05 was selected for all analyses.
For significant interactions, planned pairwise comparisons were conducted to test the simple effects of SEM for both levels of SYN. The degrees of freedom for pairwise comparisons were calculated using Satterthwaite’s method.

2.2. Results and discussion

A complete descriptive summary of RTs for all conditions and units of analysis is presented in Table 3. For all models, the inclusion of by-participant random slopes for syntax and semantics resulted in either a singular fit or model convergence issues. The removal of random slopes did not significantly influence model fit, resulting in all final models containing only random intercepts for participants and items. Interactions between syntax and semantics did not reach the threshold for significance in any model. Additionally, the main effects of syntax and/or semantics were not observed for whole sentence, noun, and pre-CW models, with the latter two null results supporting that the reading difficulty of the segments before the CW was well balanced across the four conditions. A significant main effect of syntax was observed in models for CW, t(311) = 2.44, p = .015, and post-CW1 segments, t(322) = 3.86, p < .001, with syntactic violations associated with longer RTs. In contrast, a significant main effect of semantics was observed in the post-CW2 model with semantic violations associated with longer RTs, t(307) = 2.06, p = .041. Taken together, these results support that the influence of syntactic violations is observed earlier than those of semantic violations (also see Fig. 1). Complete results for each final model can be found on the Open Science Framework at https://osf.io/5vfwr/.

Table 3. Mean reading times (SD) for each unit of analysis across the four conditions

Abbreviations: CW, critical word; SYN-, syntactically violated; SYN+, syntactically correct; SEM-, semantically violated; SEM+, semantically correct.

Figure 1. Reading times (RTs) for segments of interest in experimental sentences in Experiment 1. Due to the lack of interaction between syntax and semantics, data presented are collapsed across semantic (left panel) or syntactic conditions (right panel). CW, critical word; SYN-, syntactically violated; SYN+, syntactically correct; SEM-, semantically violated; SEM+, semantically correct. ***, p < .001; *, p < .05.

Due to the lack of a syntax–semantics interaction, findings from the self-paced reading experiment demonstrated weak evidence for the syntax-first model. According to Chen (1999), this might be due to the word-by-word presentation mode of sentences, which reduced the possibility for readers to build a syntactic framework before semantic integration. Additionally, RT itself cannot distinguish sentence processing at an early stage from a late stage. Presentation differences in word-by-word or whole-sentence paradigms might lead to contradictory findings (Antunez et al., 2022). To overcome the methodological limits of self-paced reading paradigms, Experiment 2 employed an eye-tracking paradigm conducted in a more ecologically valid context.

3. Experiment 2: eye-tracking experiment

3.1. Methods

3.1.1. Participants

Thirty-four native Chinese speakers (17 males and 17 females) who did not participate in Experiment 1 participated in this experiment. All participants were college students ranging in age from 18 to 27 years (M_age = 25), reporting normal or corrected-to-normal vision. None of the participants reported reading-related or neurological disorders.

3.1.2. Materials

Materials used in Experiment 2 were identical to those used in Experiment 1.

3.1.3. Procedure

Eye-tracking data were collected using an EyeLink 1000 Plus System (https://www.sr-research.com) with a desktop-mounted eye tracker in a sound-attenuating eye-tracking laboratory at a sampling rate of 1000 Hz. Stimuli were presented using Experiment Builder (v2.3.38) on a 21-inch CRT monitor with a refresh rate of 150 Hz and a resolution of 1024 × 768 pixels. Participants’ heads were stabilized on a chin rest and a forehead rest 58 cm away from the monitor. The visual angle was approximately 0.7°. Participants read sentences binocularly, but only data from the right eye were recorded. A three-point calibration was performed at the beginning of the experiment and after each break. Each trial started with a drift check in which a black dot appeared on the screen at the location of the first character of the upcoming sentence. The maximal gaze-position error at validation had to be less than 1° of visual angle; otherwise, calibration was repeated. Once the dot was fixated, the whole sentence was displayed on a single line in 25-point black Song Ti font in the middle of a gray screen. Participants were instructed to read each sentence carefully at their own speed and to press the space bar to advance to the next trial. As in Experiment 1, half of the sentences were followed by a ‘yes’ or ‘no’ comprehension question. The order of sentences was randomized in each list across participants. The experiment took approximately 30 minutes to complete.
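As a rough illustration of the geometry behind the reported visual angles, the following sketch converts between physical size on the screen and visual angle at the 58 cm viewing distance. The function names are hypothetical, and treating the 0.7° value as the width of a single character is an assumption made here for illustration only; the Procedure does not state which stimulus dimension it refers to.

```python
import math

VIEWING_DISTANCE_CM = 58.0  # viewing distance reported in the Procedure


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical size (in cm) that subtends angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Assumption: 0.7 degrees is read here as the extent of one character.
char_width_cm = size_for_angle_cm(0.7, VIEWING_DISTANCE_CM)

# Extent on the screen corresponding to the 1 degree validation threshold.
threshold_cm = size_for_angle_cm(1.0, VIEWING_DISTANCE_CM)

print(f"0.7 deg at 58 cm is about {char_width_cm:.2f} cm")
print(f"1.0 deg at 58 cm is about {threshold_cm:.2f} cm")
```

Under these assumptions, 0.7° corresponds to roughly 0.7 cm on the screen, and the 1° validation threshold to roughly 1 cm.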

3.1.4. Statistical analysis

Eye-tracking measures comprise fixations and saccades. First fixation duration (FFD) is the duration of the first fixation in an interest area. First-pass reading time (FPRT) is the summed duration of all fixations in the first run of fixations within an interest area, and second-pass reading time (SPRT) is the corresponding sum for the second run of fixations. Go-past reading time (GPRT) is the summed fixation duration from when the current interest area is first fixated until the eyes enter a later interest area. Total reading time (TRT) is the total amount of time spent fixating on a given interest area. Regression in (RegIn) indicates the probability of the current interest area receiving at least one regression from later interest areas. Skipping probability (SP) indicates the probability of an interest area receiving no fixation during first-pass reading. Data for all of the above measures were extracted using DataViewer (version 4.1.477).
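The definitions above can be made concrete with a minimal sketch that derives each measure for one interest area from a single trial's fixation sequence. This is not the DataViewer implementation: the input format (a list of interest-area index and duration pairs), the function name, and the example trial are all hypothetical, and edge cases such as first-pass skipping are handled in a simplified way.

```python
from itertools import groupby


def reading_measures(fixations, area):
    """Compute per-trial measures for one interest area.

    fixations: list of (interest_area_index, duration_ms) in temporal order,
               with interest areas numbered left to right (hypothetical format).
    area:      index of the interest area being scored.
    """
    # Collapse the sequence into runs of consecutive fixations on the same area.
    runs = [(a, [d for _, d in grp])
            for a, grp in groupby(fixations, key=lambda f: f[0])]
    area_runs = [durs for a, durs in runs if a == area]

    # Position of the first fixation that lands in the area, if any.
    first = next((i for i, (a, _) in enumerate(fixations) if a == area), None)

    # Skipped: never fixated, or a later area was fixated before the first entry.
    # (Simplification: FFD and FPRT below are not set to None in the latter case.)
    skipped = first is None or any(a > area for a, _ in fixations[:first])

    # Go-past time: everything from first entry until a later area is entered,
    # including fixations on earlier areas after a regression.
    gprt = None
    if first is not None:
        gprt = 0
        for a, d in fixations[first:]:
            if a > area:
                break
            gprt += d

    # Regression in: the area is entered directly from a later area.
    reg_in = any(prev > area and cur == area
                 for (prev, _), (cur, _) in zip(fixations, fixations[1:]))

    return {
        "FFD": area_runs[0][0] if area_runs else None,
        "FPRT": sum(area_runs[0]) if area_runs else None,
        "SPRT": sum(area_runs[1]) if len(area_runs) > 1 else None,
        "GPRT": gprt,
        "TRT": sum(sum(r) for r in area_runs),
        "RegIn": reg_in,
        "SP": skipped,
    }


# Made-up trial: 0 = noun, 1 = pre-CW, 2 = CW, 3 = post-CW.
trial = [(0, 210), (1, 230), (2, 250), (2, 180), (1, 200), (2, 220), (3, 240)]
cw = reading_measures(trial, area=2)
pre_cw = reading_measures(trial, area=1)
print(cw)
print(pre_cw)
```

In this made-up trial, the reader fixates the CW twice, regresses to the pre-CW, and returns to the CW before moving on. The regression therefore counts toward GPRT for the CW and toward RegIn and SPRT for the pre-CW. RegIn and SP are binary per trial; the probabilities analyzed in the generalized models correspond to their rates across trials.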

Sentences with a correct response to the comprehension question (95.1% average accuracy, 80%–100% range) were included for analysis. Fixations and saccades were analyzed for the whole sentence and for four interest areas: the noun, pre-CW, CW, and post-CW. Unlike the segmentation in Experiment 1, the post-CW in Experiment 2 combined the aspect marker (i.e., post-CW1 in Experiment 1) and the complementary constituent (i.e., post-CW2 in Experiment 1) into one interest area, because data were missing for the large majority of aspect markers (which consisted only of the very simple Chinese character ‘了’). Single fixations shorter than 60 ms or longer than 1000 ms were removed. Fixation durations beyond three SDs of the mean for each interest area were considered outliers and were removed. To achieve normality, fixation durations were natural-log-transformed. Fixation and saccade data were analyzed using linear and generalized mixed-effects models, respectively. Generalized models for RegIn and SP data were fit using a binomial distribution. Fixed- and random-effects structures and the fitting and model criticism procedures were identical to those described in Experiment 1, with the exception of models for RegIn and SP data, which did not undergo model criticism as they were fit using binary data. As in Experiment 1, significant interactions between syntax and semantics were probed using planned pairwise comparisons to test the simple effects of semantics at both levels of syntax. The degrees of freedom for pairwise comparisons were calculated using Satterthwaite’s method.
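The trimming and transformation steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the example values are hypothetical, and applying the absolute cutoffs before the three-SD criterion, with the SD computed on raw (untransformed) durations, is an assumed ordering that the text does not specify.

```python
import math
from statistics import mean, stdev


def clean_durations(durations_ms, lo=60, hi=1000, n_sd=3.0):
    """Trim fixation durations for one interest area and log-transform them.

    Step 1: drop durations shorter than lo or longer than hi (in ms).
    Step 2: drop durations more than n_sd SDs from the mean of what remains
            (assumed order; SD computed on raw durations).
    Step 3: return natural logs of the retained durations.
    """
    kept = [d for d in durations_ms if lo <= d <= hi]
    if len(kept) >= 2:
        m, s = mean(kept), stdev(kept)
        kept = [d for d in kept if abs(d - m) <= n_sd * s]
    return [math.log(d) for d in kept]


# Made-up durations: one too short, one too long, and one outlier at 900 ms.
raw = [45] + [200] * 11 + [900] + [1500]
logged = clean_durations(raw)
print(len(raw), "->", len(logged), "observations retained")
```

In this made-up example, the 45 ms and 1500 ms fixations are removed by the absolute cutoffs, and the 900 ms fixation is then removed because it lies more than three SDs from the mean of the remaining values.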

3.2. Results and discussion

A complete descriptive summary of fixations and saccades for all conditions and units of analysis is presented in Table 4. For all models, the inclusion of by-participant random slopes for syntax and semantics resulted in either a singular fit or model convergence issues. The removal of random slopes did not significantly influence model fit, resulting in all final models containing only random intercepts for participants and items. For clarity, results are presented separately for each unit of analysis beginning with full sentence results, followed by each interest area of analysis in order of their position in the sentence. Based on the research questions of the present study, interactions will be discussed first, followed by main effects. Full model results for each unit of analysis and measure of interest can be found on the Open Science Framework at https://osf.io/5vfwr/.

Table 4. Means (SD) of each eye-tracking measure for each area of interest and whole sentences across the four conditions

Abbreviations: CW, critical word; FFD, first fixation duration; FPRT, first-pass reading time; GPRT, go-past reading time; RegIn, regression in; SP, skipping probability; SPRT, second-pass reading time; SYN-, syntactically violated; SYN+, syntactically correct; SEM-, semantically violated; SEM+, semantically correct; TRT, total reading time.

Whole Sentence. A significant interaction between syntax and semantics was observed in the TRT model, t(321.32) = −2.34, p = .020. Pairwise comparisons revealed a significant simple effect of semantics in the SYN+ condition, with longer fixations for semantic violations only in the absence of a syntactic violation, t(320) = 2.77, p = .006. This result indicates that syntactic violations might hinder semantic processing. Results for the smaller units of analysis are reported below.

Noun. No significant interactions or main effects were observed in any model.

Pre-CW. A significant main effect of semantics was observed in the FFD model, but this was driven by a marginally significant interaction between syntax and semantics, t(295.77) = 1.91, p = .057. Pairwise comparisons revealed a significant simple effect of semantics in the SYN+ condition, with longer fixations for semantic violations only in the absence of a syntactic violation, t(296) = −2.87, p = .004. This might reflect a parafoveal-to-foveal effect while fixating the pre-CW: a semantic violation of the CW in parafoveal vision affected the processing time of the foveal pre-CW, and did so only when the syntax was correct. A significant interaction between syntax and semantics was also observed in the RegIn model, z = −2.16, p = .031. Pairwise comparisons revealed a significant simple effect of semantics in the SYN+ condition, z = 2.97, p = .003, indicating that readers tended to regress into the pre-CW from later areas more for semantically violated than for semantically correct sentences, but only when the syntax was correct. In other words, when syntactic framework building failed, readers did not return to the preceding portion of the sentence to work out its meaning. A significant main effect of syntax was also observed in the SPRT model, but this was driven by a marginally significant interaction between syntax and semantics, t(272.28) = 1.71, p = .089. Pairwise comparisons revealed a significant simple effect of semantics in the SYN- condition, t(252) = 2.07, p = .040. It is worth noting that, unlike FFD and RegIn, which reflect early-stage processing, the semantic violation effect on the late-stage SPRT was present only when the syntax was violated. A significant main effect of syntax was observed in the TRT model, with syntactic violations associated with longer fixations, t(307) = 2.57, p = .011. No significant interactions or main effects were observed in any other models.

CW. Significant interactions between syntax and semantics were observed in the FPRT, t(302.23) = −2.04, p = .042, GPRT, t(311.89) = −2.73, p = .007, and TRT models, t(303.48) = −4.29, p < .001. In all three models, pairwise comparisons revealed significant simple effects of semantics in the SYN+ condition, but no effects of semantics in the SYN- condition. Semantic violations resulted in longer fixations in FPRT, t(302) = 2.68, p = .008, GPRT, t(312) = 2.94, p = .004, and TRT, t(304) = 5.88, p < .001, only when no syntactic violation was present. A significant main effect of semantics was also observed in the SPRT model, with semantic violations associated with longer fixations, t(244) = 1.98, p = .049. Taken together, these findings indicate that failed syntactic processing blocked semantic processing, whereas semantic processing proceeded when the sentence was syntactically correct. No significant interactions or main effects were observed in any other models.

Post-CW. No significant interactions between syntax and semantics were observed across all measures at the post-CW segment. Significant main effects of both syntax, t(266) = 2.38, p = .018, and semantics, t(265) = 2.66, p = .008, were observed in the TRT model with both violations associated with longer fixations. A significant main effect of semantics was also observed in GPRT, with semantic violations associated with longer fixations, t(266) = 2.48, p = .014. Models for FFD and SPRT data both resulted in a singular fit. No significant interactions or main effects were observed in any other models.

Across all interest areas, the null effects in SP revealed that readers fixated all words to the same degree, providing a valid baseline for examining the other eye-tracking measures. Similarly, the null effects in FPRT and GPRT for the noun and pre-CW models showed that the difficulty of processing words before reading the CW was equivalent across the four conditions.

As mentioned in the Introduction, the interaction between syntax and semantics can provide evidence for the functional priority of syntax. This interaction was observed in the TRT of the whole sentence, the FFD, RegIn, and SPRT of the pre-CW, and the FPRT, GPRT, and TRT of the CW (see Fig. 2). In particular, the semantic violation effect occurred only when sentences were syntactically correct, suggesting that syntactic violations hindered semantic processing. Owing to a parafoveal-to-foveal effect, syntactic processing during sentence reading can be traced back to a very early stage, as indicated by FFD, which reflects initial reading processes, on the words preceding the CWs that contain the violations. This effect cannot be detected using reading paradigms in which sentence segments are presented sequentially. Evidence supporting the functional priority of syntax was also observed in FPRT for CWs, reflecting an early stage of processing, and spilled over to subsequent processing as measured by RegIn for pre-CWs and GPRT for CWs. That is, for semantically violated sentences, readers regressed more into the pre-CW from later areas only when the syntax was correct, and these regressions occurred before readers advanced to the post-CWs. These findings support the initiation of a syntactic-building process at the very beginning of sentence reading.

Figure 2. Fixation durations and regressions for each area of interest across the four conditions. CW, critical word; FFD, first fixation duration; FPRT, first-pass reading time; GPRT, go-past reading time; SPRT, second-pass reading time; SYN-, syntactically violated; SYN+, syntactically correct; SEM-, semantically violated; SEM+, semantically correct; RegIn, regression in; TRT, total reading time. Only significant simple effects for significant interactions between syntax and semantics are shown. The color of each asterisk corresponds to that of the lines. ***, p < .001; **, p < .01; *, p < .05.

In contrast, a semantic violation effect in SPRT was only present when syntax failed, occurring when readers read back to the pre-CWs during second-pass reading. This semantic violation effect for SYN- was also present in SPRT at the post-CWs, although this finding should be considered cautiously as the model fit was singular. Temporally, SPRT reflects a very late stage of processing, suggesting that readers were attempting to integrate the failed syntax into sentence semantics at a late time-window. Finally, a main effect of semantics was observed in SPRT at the CW position, and in GPRT at the post-CW position, both of which reflect a later time-window for semantic processing after syntactic processing was completed.

4. General discussion

Two experiments were conducted to investigate whether syntax has functional priority over semantics during Chinese sentence reading. The self-paced reading experiment did not provide strong evidence for the functional priority of syntax. In contrast, findings from the eye-tracking experiment provided multi-faceted evidence for this priority, mainly reflected in the absence of a semantic violation effect in FFD and RegIn at the pre-CW position and in FPRT and GPRT at the CW position when syntax was violated. Additionally, our study revealed an interactive process between syntax and semantics at a later stage, evidenced by the presence of a semantic violation effect in SPRT at the pre-CW and post-CW positions when the syntax was violated. Finally, a main effect of semantic violation was also observed, presenting as longer SPRT for CWs and longer GPRT for post-CWs, occurring after the time-window of syntactic processing as indicated by FFD and RegIn for the pre-CW and FPRT and GPRT for the CW.

The primary contribution of the present study is its support for the functional priority of syntax over semantics in the natural reading of Chinese sentences. This finding differs from most neurophysiological findings (e.g., Wang et al., 2013; Yang et al., 2015; Zhang et al., 2010, 2013). First, previous EEG studies mainly focused on the comparison between the SYN-SEM- and SYN+SEM- conditions and explored whether the SYN-SEM- condition had a similar N400 amplitude to SYN+SEM-. We expected that readers would always encounter semantic anomalies in SYN-SEM- sentences even if the syntax was violated. Thus, the presence of a semantic N400 for the CWs under the SYN-SEM- condition cannot be taken as valid evidence that syntactic processing failures did not hinder semantic processing. Instead, our study mainly focused on the interaction between syntax and semantics, in which a semantic violation effect under SYN+ but not SYN- would provide strong evidence that failed syntactic processing blocks semantic processing. This functional priority of syntax was extensively detected in our results across various eye-tracking measures at different interest areas during sentence reading.
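The logic of this interaction can be illustrated with a small numerical sketch of a 2 × 2 design. The cell means below are hypothetical values chosen for illustration; they are not taken from Table 4, and the function name is likewise an assumption. The inferential tests reported in this study were based on mixed-effects models, not on raw differences between cell means.

```python
def semantic_effects(cell_means):
    """Simple effects of semantics at each level of syntax, and their difference.

    cell_means maps (syntax, semantics) labels to a mean fixation time in ms.
    """
    effect_syn_ok = cell_means[("SYN+", "SEM-")] - cell_means[("SYN+", "SEM+")]
    effect_syn_bad = cell_means[("SYN-", "SEM-")] - cell_means[("SYN-", "SEM+")]
    return {
        "semantic effect under SYN+": effect_syn_ok,
        "semantic effect under SYN-": effect_syn_bad,
        # The interaction contrast is the difference between the simple effects.
        "interaction contrast": effect_syn_ok - effect_syn_bad,
    }


# Hypothetical cell means (ms), NOT the values reported in Table 4.
hypothetical_means = {
    ("SYN+", "SEM+"): 300,
    ("SYN+", "SEM-"): 380,
    ("SYN-", "SEM+"): 400,
    ("SYN-", "SEM-"): 405,
}

effects = semantic_effects(hypothetical_means)
for name, value in effects.items():
    print(f"{name}: {value} ms")
```

A pattern of this kind, with a sizeable semantic violation effect when syntax is intact and a negligible one when syntax is violated, is what the syntax-first account predicts. A comparison restricted to SYN-SEM- versus SYN+SEM- would not isolate it.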

Second, the lack of support for the syntax-first model in previous EEG studies may be due to the RSVP reading mode. The sequential presentation of sentence segments on different screens may impair syntactic framework building (Chen, 1999; Li et al., 2023). This was also evidenced in our findings from Experiment 1, in which only a weak syntax-first effect was observed in self-paced reading mimicking RSVP. In contrast, Experiment 2 adopted an eye-tracking technique in which whole sentences were presented on a single screen. This method of presentation more closely mirrors natural reading and is therefore a more ecologically valid way to investigate sentence reading. Our findings revealed that the functional priority of syntax was detected under natural reading contexts. This priority has also been reported by previous eye-tracking studies on Indo-European languages (Brothers & Traxler, 2016; Deutsch & Bentin, 2001; Kennison, 2009; Mancini et al., 2014; Veldre & Andrews, 2018). Taken together, the natural reading mode in eye-tracking experiments might be a more ecologically valid approach for examining the syntax-first model.

Additionally, the N400 in EEG studies often occurs at around 300–500 ms. In contrast, the syntactic violation effect in eye-tracking measures was detected in FFD, occurring at around 220 ms. Moreover, this effect was located at the pre-CW position, which preceded the CWs where the violation occurred. This effect detected in parafoveal vision is consistent with the findings of Brothers and Traxler (2016), Li et al. (2023), and Veldre and Andrews (2018), demonstrating that the construction of a syntactic framework had been initiated before target words were foveated. However, this parafoveal-to-foveal effect on pre-CWs cannot be detected in EEG studies given the constraints of the presentation mode. Furthermore, as noted in the Introduction, the timing signals of eye-tracking techniques should be interpreted with caution. Therefore, the syntax-first model in our study emphasizes the functional priority of syntax over semantics more than the specific timecourses of syntactic and semantic processing.

Our results were inconsistent with those reported in a similar eye-tracking study by Yang et al. (2009). One explanation for these differences may be the use of different linguistic constructions. In the SVO sentences used by Yang et al. (2009), the word at the object position could be a noun, a verb, an adjective, or a function word in Chinese. Therefore, a violation of the syntactic category for the object was not detected immediately. In contrast, due to the constraints of the ‘ba’ construction used in the present study, a transitive verb is obligatorily placed after the adverb position. During natural reading, ‘ba’ sentences possibly yield a frame-and-slot pattern. In particular, a syntactic frame such as Subject + ‘ba’ + modifier + noun + adverb + verb + ‘le’ + complement is initially built, and appropriate words are then inserted into the corresponding slots. Readers fully expect a verb in the CW position. If, however, a noun is introduced in the verb position, a syntactic category violation would be immediately detected. Therefore, the function word ‘ba’ resembles inflectional morphemes in Indo-European languages and is more likely to lead to the construction of a syntactic framework. This linguistic property conforms to the syntax-first model in that a syntactic framework is initiated at an early stage and slots are filled later based on semantic information. This also explains why no syntax-first effect was observed for syntactic category violations of SVO sentences in Zhu et al. (2022).

It is worth noting that a few previous studies (e.g., Yang et al., 2009; Zhang et al., 2013) confounded the syntactic categories of the CWs when comparing the SYN-SEM- condition (i.e., a noun) with the SYN+SEM- condition (i.e., a verb). Since we focused on the comparison between SEM- and SEM+ conditions, the CWs for both SEM- and SEM+ were always verbs for SYN+ and nouns for SYN-. Therefore, syntactic categories were controlled, supporting comparisons between SEM- and SEM+ conditions.

Our eye-tracking findings are consistent with neurophysiological findings (Friederici, 2002; Friederici et al., 1999; Isel et al., 2007) and eye-tracking findings on Indo-European languages (Brothers & Traxler, 2016; Deutsch & Bentin, 2001; Kennison, 2009; Mancini et al., 2014; Veldre & Andrews, 2018). Our study’s primary innovation lies in borrowing the design and paradigm of neurophysiological studies and extending them into a natural reading context by using eye-tracking. Unlike the confounded contrasts in some previous studies (Veldre & Andrews, 2018; Yang et al., 2009), the present study is the first to explore the interaction between syntactic and semantic violation effects. Taken together, the primacy of syntax over semantics proposed by Friederici’s three-phase model is extended to the Chinese language in a natural reading context.

Our eye-tracking results also revealed a late time-window for syntactic and semantic integration. When both syntactic and semantic processing failed, readers tended to integrate the incongruent semantics of the CWs with the surrounding portion of the sentence, reflected in longer SPRT at the pre-CWs and post-CWs for the SYN-SEM- versus SYN-SEM+ condition. These findings provide additional evidence for an interactive process of syntax and semantics. However, this process was not reported by previous EEG studies due to the lack of a P600 (Yang et al., 2015; Zeng et al., 2020). At an even later stage, the effect of syntactic violation disappeared completely and only the semantic violation effect was still detected, reflected in longer SPRT for CWs and longer GPRT for post-CWs for SEM- versus SEM+ conditions. At this point, participants attempted to resolve previously encountered semantic conflicts, a process lasting until the end of sentence reading. Taken together, semantic processing was found to occur relatively late in sentence reading.

Previous EEG studies were typically limited to recording the CWs, but violation effects often extend beyond this interest area (i.e., across the pre-CW, CW, and post-CW). Using eye-tracking techniques, multiple interest areas were recorded in our study to illustrate the timecourse of syntactic and semantic processing during sentence reading. This was not achieved even in previous eye-tracking studies on Indo-European languages due to the use of different paradigms (Brothers & Traxler, 2016; Veldre & Andrews, 2018). In contrast, our study revealed the very late time-window in which syntax and semantics interact.

Findings from the present study should be considered in light of some limitations. First, the sentence comprehensibility scores for SYN-SEM+ were lower than those for SYN+SEM+ (see Table 2). It is difficult to create sentences that are syntactically incorrect but semantically correct. For SYN-SEM+, we used the same method as Zhang et al. (2010) and Yang et al. (2015), which has been validated as an effective method. The present study mainly focused on the semantic violation effect (i.e., SEM+ vs SEM-) under SYN- and SYN+ separately. Therefore, no direct comparison was made in the analyses between the SYN-SEM+ and SYN+SEM+ conditions due to unmatched sentence comprehensibility. Second, previous EEG studies typically focused on comparisons between SYN+SEM- and SYN-SEM- conditions, with any differences between them (i.e., in the N400) attributed to the effects of the syntactic violation. The N400 can indicate the occurrence of semantic processes, so the absence of a semantic N400 under SYN-SEM- can be attributed to the syntactic effect. Based on our current findings, future studies should consider the co-registration of eye-tracking and EEG techniques, which can record fixation-related potentials in whole-sentence presentation paradigms (Antunez et al., 2022; Loberg et al., 2018). In this way, the neurophysiological signals under SYN+SEM- vs SYN-SEM- during natural reading can be directly compared with those of previous EEG studies using a word-by-word presentation mode, to examine whether LAN, N400, or P600 patterns are consistent across presentation modes.

Additionally, the cloze probability of the CWs for sentences in the SYN+SEM+ condition was significantly higher than in the other three conditions, indicating that the CWs were predicted more accurately based on the prior sentence context. However, the cloze probability for correct sentences was kept at a minimum level in our study. Moreover, it has been demonstrated that the cloze probability of CWs did not show a difference in the violation effect (Wang et al., 2013; Yang et al., 2015; Zhang et al., 2010), so the confounding of predictability can be ruled out. Finally, violated sentences may induce a domain-general process of executive control that is not related to syntactic or semantic processing. Future studies should consider using congruent phrases or sentences (Pylkkanen, 2019; Zhang et al., 2022) to examine the potential syntax-first model.

In conclusion, our findings provide converging evidence that the building of a syntactic framework takes precedence over semantic processing for the Chinese ‘ba’ construction in natural reading. At a late stage, semantic processing interacts with syntactic processing to identify sentence-level meaning. The functional priority of syntax over semantics in Chinese supports the syntax-first model proposed by Friederici on the basis of Indo-European languages. Moreover, the syntax-first effect observed in Chinese, an analytic language, reinforces the universal view of syntactic processing mechanisms underlying human language. Finally, these findings were mainly derived from data collected in a natural reading context using eye-tracking, as opposed to the commonly used RSVP reading mode. This indicates that psycholinguistic phenomena can be observed under more ecologically valid conditions. Considerable additional work is needed to further understand the conditions under which the effects described in this study can be observed and replicated.

Acknowledgments

The authors thank Jianqin Wang, Rong Jiang, Otto Loberg, and Michael Ullman for their insightful comments and Runbing Xie for the help with data cleaning.

Data availability statement

The data from the study, as well as the code used to produce the analyses, can be found on the Open Science Framework at https://osf.io/5vfwr/.

Author contribution

Y.W. conceptualized the study; designed the methodology; provided software; involved in formal analysis; curated the data; wrote the original draft; wrote, reviewed, and edited the manuscript; visualized the data; supervised the study; administered the project; and acquired funding. Y.T. designed the methodology, provided software, investigated the study, involved in formal analysis, and wrote the original draft. A.J.P. provided software; validated the study; involved in formal analysis; curated the data; wrote the original draft; wrote, reviewed, and edited the manuscript; and visualized the data.

Financial disclosure

This research was supported by the Science Foundation of Beijing Language and Culture University (the Fundamental Research Funds for the Central Universities) (19YJ130006, 21YJ190002, and 22YJ220007); Humanities and Social Sciences Youth Foundation, Ministry of Education of the People’s Republic of China (22YJC740082); and Youth Talent Development Program at the Beijing Language and Culture University.

Competing interest

The authors declare none.

References

Antunez, M., Milligan, S., Hernandez-Cabrera, J. A., Barber, H. A., & Schotter, E. R. (2022). Semantic parafoveal processing in natural reading: Insight from fixation-related potentials & eye movements. Psychophysiology, 59(4), e13986. https://doi.org/10.1111/psyp.13986CrossRef Google Scholar PubMed

Baayen, R. H., & Milin, P. (2010). Analyzing reaction times. International Journal of Psychological Research, 3(2), 12–28.CrossRef Google Scholar

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001CrossRef Google Scholar PubMed

Bates, D., Machler, M., Bolker, B. M., & Walker, S. C. (2015). Fitting linear mixed-effects models ssing lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01CrossRef Google Scholar

Brothers, T., & Traxler, M. J. (2016). Anticipating syntax during reading: Evidence from the boundary change paradigm. Journal of Experimental Psychology: Learning Memory and Cognition, 42(12), 1894–1906. https://doi.org/10.1037/xlm0000257Google Scholar PubMed

Chen, H.-C. (1999). How do readers of Chinese process words during reading for comprehension? In Wang, J., Inhoff, A. W., & Chen, H.-C. (Eds.), Reading Chinese script: A cognitive analysis (pp. 257–278). Erlbaum.Google Scholar

Chen, L., Yang, M., Gao, F., Fang, Z., Wang, P., & Feng, L. (2023). Mandarin Chinese L1 and L2 complex sentence reading reveals a consistent electrophysiological pattern of highly interactive syntactic and semantic processing: An ERP study. Frontier in Psychology, 14, 1143062. https://doi.org/10.3389/fpsyg.2023.1143062CrossRef Google Scholar PubMed

Chen, X. M., Branigan, H. P., Wang, S. P., Huang, J., & Pickering, M. J. (2020). Syntactic representation is independent of semantics in Mandarin: Evidence from syntactic priming. Language Cognition and Neuroscience, 35(2), 211–220. https://doi.org/10.1080/23273798.2019.1644355CrossRef Google Scholar

Chomsky, N. (1980). Rules and representation. Columbia University Press.CrossRef Google Scholar

Deniz, N. D. (2022). Processing syntactic and semantic information in the L2: Evidence for differential cue-weighting in the L1 and L2. Bilingualism: Language and Cognition, 25(5), 713–725. https://doi.org/10.1017/s1366728921001140CrossRef Google Scholar

Deutsch, A., & Bentin, S. (2001). Syntactic and semantic factors in processing gender agreement in Hebrew: Evidence from ERPs and eye movements. Journal of Memory and Language, 45(2), 200–224. https://doi.org/10.1006/jmla.2000.2768CrossRef Google Scholar

Fodor, J. (1983). Modularity of mind. MIT Press.CrossRef Google Scholar

Friederici, A. D. (1995). The time course of syntactic activation during language processing: A model based on neuropsychological and neurophysiological data. Brain and Language, 50(3), 259–281. https://doi.org/10.1006/brln.1995.1048CrossRef Google Scholar

Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6(2), 78–84. https://doi.org/10.1016/s1364-6613(00)01839-8CrossRef Google Scholar PubMed

Friederici, A. D. (2011). The brain basis of language processing: From structure to function. Physiological Reviews, 91(4), 1357–1392. https://doi.org/10.1152/physrev.00006.2011CrossRef Google Scholar PubMed

Friederici, A. D. (2017). Language in our brain: The origins of a uniquely human capacity. The MIT Press.CrossRef Google Scholar

Friederici, A. D., Steinhauer, K., & Frisch, S. (1999). Lexical integration: Sequential effects of syntactic and semantic information. Memory & Cognition, 27(3), 438–453. https://doi.org/10.3758/bf03211539

Hahne, A., & Friederici, A. D. (2002). Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research, 13(3), 339–356. https://doi.org/10.1016/s0926-6410(01)00127-6

Huang, J., Pickering, M. J., Yang, J. H., Wang, S. P., & Branigan, H. P. (2016). The independence of syntactic processing in Mandarin: Evidence from structural priming. Journal of Memory and Language, 91, 81–98. https://doi.org/10.1016/j.jml.2016.02.005

Isel, F., Hahne, A., Maess, B., & Friederici, A. D. (2007). Neurodynamics of sentence interpretation: ERP evidence from French. Biological Psychology, 74(3), 337–346. https://doi.org/10.1016/j.biopsycho.2006.09.003

Kennison, S. M. (2009). The use of verb information in parsing: Different statistical analyses lead to contradictory conclusions. Journal of Psycholinguistic Research, 38(4), 363–378. https://doi.org/10.1007/s10936-008-9096-9

Li, C., & Thompson, S. (1989). Mandarin Chinese: A functional reference grammar. University of California Press.

Li, C. C., Midgley, K. J., & Holcomb, P. J. (2023). ERPs reveal how semantic and syntactic processing unfold across parafoveal and foveal vision during sentence comprehension. Language, Cognition and Neuroscience, 38(1), 88–104. https://doi.org/10.1080/23273798.2022.2091150

Liu, Y. Y., Li, P., Shu, H., Zhang, Q. R., & Chen, L. (2010). Structure and meaning in Chinese: An ERP study of idioms. Journal of Neurolinguistics, 23(6), 615–630. https://doi.org/10.1016/j.jneuroling.2010.06.001

Loberg, O., Hautala, J., Hamalainen, J. A., & Leppanen, P. H. T. (2018). Semantic anomaly detection in school-aged children during natural sentence reading - A study of fixation-related brain potentials. PLOS ONE, 13(12), e0209741. https://doi.org/10.1371/journal.pone.0209741

Mancini, S., Molinaro, N., Davidson, D. J., Aviles, A., & Carreiras, M. (2014). Person and the syntax-discourse interface: An eye-tracking study of agreement. Journal of Memory and Language, 76, 141–157. https://doi.org/10.1016/j.jml.2014.06.010

Pinker, S. (1991). Rules of language. Science, 253(5019), 530–535.

Pylkkanen, L. (2019). The neural basis of combinatory syntax and semantics. Science, 366(6461), 62–66. https://doi.org/10.1126/science.aax0050

Satterthwaite, F. E. (1941). Synthesis of variance. Psychometrika, 6(5), 309–316.

Sun, C., Wang, H., & Zhang, H. (2019). Hanyu jugou “yiyi tongxing”, yin’ouyu jugou “yixing zhiyi”: laizi ERP de zhengju [A contrastive study of syntactic configurations in Chinese and Indo-European languages based on ERP evidence]. Waiyu jiaoxue yu yanjiu [Foreign Language Teaching and Research], 51(3), 396–408.

R Core Team. (2022). R: A language and environment for statistical computing (Version 4.2.1). R Foundation for Statistical Computing. http://www.r-project.org

Veldre, A., & Andrews, S. (2018). Beyond cloze probability: Parafoveal processing of semantic and syntactic information during reading. Journal of Memory and Language, 100, 1–17. https://doi.org/10.1016/j.jml.2017.12.002

Wang, F., Ouyang, G., Zhou, C. S., & Wang, S. P. (2015). Re-examination of Chinese semantic processing and syntactic processing: Evidence from conventional ERPs and reconstructed ERPs by residue iteration decomposition (RIDE). PLOS ONE, 10(1), e0117324. https://doi.org/10.1371/journal.pone.0117324

Wang, S. P., Mo, D. Y., Xiang, M., Xu, R. P., & Chen, H. C. (2013). The time course of semantic and syntactic processing in reading Chinese: Evidence from ERPs. Language and Cognitive Processes, 28(4), 577–596. https://doi.org/10.1080/01690965.2012.660169

Yang, J. M., Wang, S. P., Chen, H. C., & Rayner, K. (2009). The time course of semantic and syntactic processing in Chinese sentence comprehension: Evidence from eye movements. Memory & Cognition, 37(8), 1164–1176. https://doi.org/10.3758/mc.37.8.1164

Yang, S. Q., Cai, Y. Y., Xie, W., & Jiang, M. H. (2021). Semantic and syntactic processing during comprehension: ERP evidence from Chinese QING structure. Frontiers in Human Neuroscience, 15, 701923. https://doi.org/10.3389/fnhum.2021.701923

Yang, Y., Wu, F. Y., & Zhou, X. L. (2015). Semantic processing persists despite anomalous syntactic category: ERP evidence from Chinese passive sentences. PLOS ONE, 10(6), e0131936. https://doi.org/10.1371/journal.pone.0131936

Ye, Z., Luo, Y. J., Friederici, A. D., & Zhou, X. (2006). Semantic and syntactic processing in Chinese sentence comprehension: Evidence from event-related potentials. Brain Research, 1071(1), 186–196. https://doi.org/10.1016/j.brainres.2005.11.085

Yu, J., & Zhang, Y. X. (2008). When Chinese semantics meets failed syntax. Neuroreport, 19(7), 745–749. https://doi.org/10.1097/WNR.0b013e3282fda21d

Zeng, T., Li, Y. X., & Wu, M. J. (2020). Syntactic and semantic processing of passive BEI sentences in Mandarin Chinese: Evidence from event-related potentials. Neuroreport, 31(13), 979–984. https://doi.org/10.1097/wnr.0000000000001507

Zeng, T., Mao, W., & Lu, Q. (2016). Syntactic and semantic processing of Chinese middle sentences: Evidence from event-related potentials. Neuroreport, 27(8), 568–573. https://doi.org/10.1097/wnr.0000000000000569

Zhang, W., Chow, W. Y., Liang, B., & Wang, S. (2019). Robust effects of predictability across experimental contexts: Evidence from event-related potentials. Neuropsychologia, 134, 107229. https://doi.org/10.1016/j.neuropsychologia.2019.107229

Zhang, W., Xiang, M., & Wang, S. (2022). The role of left angular gyrus in the representation of linguistic composition relations. Human Brain Mapping, 43(7), 2204–2217. https://doi.org/10.1002/hbm.25781

Zhang, Y., Yu, J., & Boland, J. E. (2010). Semantics does not need a processing license from syntax in reading Chinese. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(3), 765–781. https://doi.org/10.1037/a0019254

Zhang, Y. X., Li, P., Piao, Q. H., Liu, Y. Y., Huang, Y. J., & Shu, H. (2013). Syntax does not necessarily precede semantics in sentence processing: ERP evidence from Chinese. Brain and Language, 126(1), 8–19. https://doi.org/10.1016/j.bandl.2013.04.001

Zhu, Y., Xu, M., Lu, J., Hu, J., Kwok, V. P. Y., Zhou, Y., Yuan, D., Wu, B., Zhang, J., Wu, J., & Tan, L. H. (2022). Distinct spatiotemporal patterns of syntactic and semantic processing in human inferior frontal gyrus. Nature Human Behaviour, 6(8), 1104–1111. https://doi.org/10.1038/s41562-022-01334-6

Table 1. Example sentences following the ‘ba’ construction for the four experimental conditions

Table 2. Mean number of strokes, mean word frequency, and cloze probability of the critical words and sentence comprehensibility scores across the four different conditions

Table 3. Mean reaction times (SD) for each unit of analysis across the four conditions

Table 4. Means (SD) of each eye-tracking measure for each area of interest and whole sentences across the four conditions

