Shared book reading is a critical context for children’s word learning. The statistical properties of written language substantially extend the range of words in the child’s input (e.g. Dawson et al., 2021; Montag et al., 2015), but it is the socially interactive nature of shared reading that enables meaningful learning to take place. Through the lens of sociocultural and constructivist theories (Vygotsky, 1978; Wood et al., 1976), shared book reading can be viewed as a social process of knowledge construction, where more knowledgeable caregivers gradually scaffold children’s learning within their zone of proximal development. While the benefit of scaffolding for word learning during shared reading is well established (e.g. Blewitt et al., 2009; Whitehurst et al., 1988), our understanding of the underlying mechanisms is still limited, particularly in relation to the role of child engagement. Wood and colleagues’ seminal work (1976) highlighted several critical elements in the scaffolding process. Recruitment, that is, enlisting and maintaining children’s interest in the task, is one such element.
While there is agreement that shared book reading affords unique word-learning opportunities (e.g. Dawson et al., 2021; Montag et al., 2015), for actual learning to happen, children must be actively engaged. Caregiver scaffolding may be critical to encouraging child engagement when the book provides opportunities to learn new words. However, this hypothesis has yet to be empirically tested. Observational and experimental shared-reading studies have offered important and complementary insights into the links between scaffolding, engagement, and word learning (e.g. Blewitt & Langan, 2016; Son & Tineo, 2016; Wicks et al., 2020), but they have not directly assessed the coupling of scaffolding and engagement during critical word-learning moments, that is, when the book introduces words that are unknown to the child. Enhancing the level of granularity of both coding schemes and analyses may provide critical insights into the engagement-related mechanisms that underlie the word-learning advantages associated with scaffolding.
In the remainder of the introduction, we review the literature on scaffolding, engagement, and word learning from shared book reading in observational and experimental settings. We then turn to an examination of the consistency and variability of these processes across print and digital reading media, before outlining the present study’s aims and hypotheses.
1. Scaffolding and engagement during shared book reading
The key constructs of interest are scaffolding and engagement. We define scaffolding as the process by which caregivers or more knowledgeable individuals enable children or novices to solve a problem, perform a task, or achieve a goal that is just beyond their unassisted efforts (Wood et al., 1976), but within their zone of proximal development, that is, within what they can achieve with the support of others (Vygotsky, 1978). In the context of shared book reading, caregivers’ scaffolding enables young children, who are not yet able to read independently, to understand and learn from the information within the book. For example, in addition to reading the text aloud, caregivers provide verbal and gestural scaffolds, including repetitions, questions, and pointing, to support the acquisition of new word meanings (Blewitt et al., 2009; Flack et al., 2018; Flack & Horst, 2018; Lenhart et al., 2019).
Engagement is a multidimensional construct involving behavioural, emotional, and cognitive components (Fredricks et al., 2004). It has been defined as a psychological state of activity that allows one to feel activated, exert effort, and be absorbed during learning activities (Wong & Liem, 2022). Some prior studies described both caregiver and child behaviours during shared reading in terms of engagement (Clinton-Lisell et al., 2024). Here, we find it helpful to differentiate between caregiver scaffolding and child engagement, because shared book reading typically provides learning opportunities for the child rather than the caregiver, who, as discussed above, represents the more knowledgeable individual in this scaffolding interaction.
We contend that enhancing engagement might be a crucial mechanism through which scaffolding benefits child word learning. Observational research points to the links between scaffolding, engagement, and word learning during shared book reading. For instance, Son and Tineo (2016) found that mothers’ attention-getting talk was associated with children’s verbal engagement during shared book reading. Similarly, Wicks et al. (2020) found strong and significant associations between parents’ use of verbal scaffolds, such as questions and prompts, and verbal engagement of children with autism spectrum disorder during shared reading. It is important to point out that these studies analysed global measures of scaffolding and engagement, that is, measures aggregated over the entire shared-reading session. Therefore, they are not directly informative when it comes to pinpointing the coupling of scaffolding and engagement during word-learning moments.
Surprisingly few studies have focused on these critical learning moments to date. One notable exception is a study by Hadley and Dickinson (2019). The authors specifically examined the cues supporting word learning during shared reading and play, in the context of small group activities in the preschool classroom. They coded whether the interaction was instructional (i.e. initiated by the adult), responsive (i.e. initiated by the child), or involving active processing (i.e. the adult encouraged the child’s active participation). These interaction types were assumed to differ in their relative emphases on child engagement. The findings indicated that adult use of target words in responsive interaction was associated with word learning, suggesting that scaffolding–engagement coupling may be a critical mechanism supporting word learning. However, it is important to note that child engagement behaviours were not directly coded in this study.
Lingwood et al.’s (2023) study of shared book reading found that caregivers used more prompts during moments of high relative to low child engagement. In contrast, other language-boosting behaviours, such as questions and expansions, were comparable during moments of high and low child engagement. These findings support the link between specific scaffolding behaviours (i.e. prompts) and child engagement in the moment. They also underscore the importance of focusing on critical episodes of shared reading rather than aggregating behaviours over the entire session. However, this study did not specifically examine word-learning moments.
Experimental studies also indicate possible links between scaffolding, engagement, and word learning during shared reading. For instance, Blewitt and Langan (2016) manipulated extratextual talk across conditions to elicit low, moderate, or high levels of child engagement when target words appeared in the book. They found that children in the high engagement condition demonstrated better word learning compared to those in the low and moderate engagement conditions, suggesting that scaffolding may support learning by enhancing engagement. However, it is important to note that child engagement was not directly measured in this study. Therefore, it remains unclear whether child engagement was actually boosted as intended by the experimental manipulation and whether enhanced engagement explains the observed word-learning advantages.
2. Consistency and variability between print and digital reading media
The comparison of engagement and learning from print versus digital reading media is a topic of current interest in developmental psychology and early childhood education (Kucirkova, 2019). This is because digital media, such as digital books, are increasingly prevalent in young children’s lives and bring a range of potential benefits and drawbacks compared to traditional print books. Digital books can be enriched with multimodal enhancements to support learning (e.g. Bus et al., 2015; Sun et al., 2022). However, in the absence of (carefully designed) enhancements, digital books may bring specific challenges given the multi-purpose nature of the medium. Digital books are accessed through handheld devices that serve purposes other than reading, such as streaming and playing. Relevant to our focus is the meta-analysis by Clinton-Lisell et al. (2024), which quantified the effect of reading medium on various measures of child engagement. They found no reliable differences in child global engagement as a function of reading medium, when examining both behavioural engagement, assessed via pointing and visual attention, and cognitive engagement, assessed through relevant verbalisations. However, when considering the effect of reading medium on parental behaviours, the authors found that caregiver–child dyads were aligned for cognitive, but not behavioural, engagement.
Individual studies suggest that the reading medium can have opposite effects on caregiver and child behaviours. For instance, prior research has reported that compared to print books, digital books are associated with fewer communicative initiations, responses, and less expanding talk by mothers (e.g. Korat & Or, 2010), but enhanced communicative initiations, responsiveness, and visual attention in children (Korat & Or, 2010; Richter & Courage, 2017; Wainwright et al., 2020). This is not surprising when we consider parents’ preference for print media (Eun Kim & Hassinger-Das, 2019, pp. 89–100; Strouse & Ganea, 2017), which is likely due to prolonged and accumulating experience with reading print. In contrast, young children seem to prefer digital over print books (Richter & Courage, 2017), even if most of their shared-reading experience is still happening via print. There may be multiple mechanisms underlying children’s preferences for digital books, such as their relative novelty and the opportunity for interactivity afforded by such devices. This complex set of observations underscores the importance of considering the reading medium in our analysis of scaffolding–engagement coupling during the word-learning moments of shared book reading. Including both media also allows us to maximise the ecological validity of our findings in the light of contemporary literacy practice.
3. The present study
The present study investigates the coupling of scaffolding and engagement during critical word-learning moments in the context of shared reading with a caregiver. It also assesses whether any coupling of scaffolding and engagement is robust or variable across print and digital reading media to provide insights into commonalities and differences in caregiver–child interaction across book formats. Our overarching aim was to probe engagement as a critical underlying mechanism in the scaffolding process. First, we asked whether scaffolding and engagement are coupled during the word-learning moments of shared reading (aim 1). Second, we asked whether the coupling of scaffolding and engagement is robust across print and digital reading media (aim 2).
We hypothesised that higher levels of scaffolding should be linked to higher levels of engagement during critical word-learning moments, defined by the pages of storybooks introducing target words. This link should be observed above and beyond the contribution of other potential predictors of engagement, such as children’s age, gender, and their reported knowledge of the target word. We also assessed the consistency versus variability of our findings across reading media. In this case, we did not make a specific prediction because the evidence to date is mixed.
4. Methods
4.1. Participants
Seventy-eight British English-speaking caregiver–child dyads provided data for this study. Data collection took place between January and June 2023 in the context of a larger study (Diprossimo & Cain, 2025), which included 99 dyads. Child participants were typically developing, as reported by their caregivers, and aged between 4;0 and 5;11 (years; months). Children’s mean age was 57.74 months (SD = 7.00; 55.13% girls; 44.87% boys). The age range of caregivers was 29;0–45;0 years (M = 37.69; SD = 3.67; 94.87% mothers; 5.13% fathers). Caregivers were predominantly highly educated, with 80.77% achieving an undergraduate degree or higher. The socioeconomic status (SES) of our participants was derived from their postcode (Government of the UK, 2019). The derived Index of Multiple Deprivation reflected various domains of deprivation, which include income, employment, health, education, barriers to housing and services, crime, and living environment (Ministry of Housing, Communities and Local Government, 2019). According to this comprehensive index, 24% of our sample was below the 5th decile, 64% was above the 5th decile, and the remaining 12% was within the 5th decile. Caregivers reported reading print books with their child frequently: the vast majority (96.2%) on a daily basis, and the remainder either several times a week (2.6%) or once or twice a week (1.3%). The pattern differed for digital shared book reading: the majority (75.6%) of caregivers reported never reading digital books with their child, and only a small proportion reported reading digital books with their child more than once a month (9%), once or twice a week (10.3%), several times a week (3.8%), or daily (1.3%).
Participants were recruited via the university database and flyers distributed in public libraries in a middle-sized town in North West England. To complement this strategy, participating caregivers were also asked to share the study flyer with their own social networks. Prior to data collection, written informed consent was obtained from caregivers. Children received a book, and caregivers received a travel reimbursement for their participation. This research received ethical approval from the Faculty of Science and Technology, Lancaster University (reference number: FST-2022-0791-RECR-3).
4.2. Materials
To ensure the novelty of the storyline and comparability of target words, two custom storybooks were designed in Canva Pro. Our plots featured a canonical Western structure of exposition, conflict, and resolution. A similar structure has been successfully used in previous studies with 3.5- to 4.5-year-olds (e.g. Piazza et al., 2021). Each story introduced an animal protagonist and an activity they enjoyed very much. As the protagonist engaged with the activity, an unexpected event occurred that needed to be resolved. While searching for a solution, the protagonist came across the target items and corresponding words. Each story ended with the protagonist finding what they were looking for. Each storybook was available in print and digital format. Each story served as the print condition for half of the participating dyads and the digital condition for the other half in a within-subjects design, with the order of story and format presentation counterbalanced across participants. Digital books did not include any hotspots or additional features. The size of the book was matched across formats (single page size: 126 × 113 mm; open book/iPad screen size: 126 × 226 mm). Each target word was depicted in the visual storyline twice on two successive pages. Target words were repeated three times in the text across the same two consecutive pages where they were illustrated. Each target word was accompanied by an adjective on the second mention. Storybook materials can be found on OSF under Creative Commons Attribution 4.0 International (https://doi.org/10.17605/OSF.IO/365VC).
To ensure high levels of ecological validity, target words were real words likely unfamiliar to children in our age range. The selection of the low-frequency target words was informed by several criteria (see Lenhart et al., 2020 and Sarı et al., 2019 for a similar approach), including their frequency in the SUBTLEX-UK corpus of children’s TV programmes (van Heuven et al., 2014) and their age of acquisition (Kuperman et al., 2012). We selected concrete nouns for animals (myna, okapi, sloth, toucan) and tools (clamp, valve, chisel, screw). We included one word in each category that was more likely to be known by children (i.e. toucan, screw) to support motivation and engagement with the storyline. Each story contained four target words. Words across stories were closely matched on the age of acquisition and frequency (Kuperman et al., 2012; van Heuven et al., 2014). Psycholinguistic properties of the target words and accompanying adjectives are reported in Supplementary Table S1.
4.3. Measuring target word knowledge
Caregivers completed a vocabulary checklist modelled on the MacArthur Communicative Development Inventories (Fenson, 2002) as a proxy for their child’s knowledge of target words (see Shi et al., 2022 for a similar approach). For each target word, caregivers stated whether their child understood (receptive knowledge) or understood and said (receptive and expressive knowledge) the target word or not. A score of 1 was assigned if caregivers marked either receptive knowledge or receptive and expressive knowledge as present; otherwise, a score of 0 was assigned.
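The scoring rule can be illustrated with a minimal sketch. The response labels and the example responses below are hypothetical and are not taken from the study materials; only the mapping to 1 or 0 follows the description above.

```python
from enum import Enum


class ReportedKnowledge(Enum):
    """Hypothetical labels for the caregiver's checklist response."""
    NOT_KNOWN = "not_known"
    UNDERSTANDS = "understands"                    # receptive only
    UNDERSTANDS_AND_SAYS = "understands_and_says"  # receptive and expressive


def score_target_word(response: ReportedKnowledge) -> int:
    """Return 1 if any knowledge is reported, otherwise 0."""
    return 0 if response is ReportedKnowledge.NOT_KNOWN else 1


def score_checklist(responses: dict) -> dict:
    """Score every target word on one caregiver's checklist."""
    return {word: score_target_word(r) for word, r in responses.items()}


# Invented responses for one child (illustration only).
example_responses = {
    "myna": ReportedKnowledge.NOT_KNOWN,
    "okapi": ReportedKnowledge.NOT_KNOWN,
    "sloth": ReportedKnowledge.UNDERSTANDS,
    "toucan": ReportedKnowledge.UNDERSTANDS_AND_SAYS,
}
example_scores = score_checklist(example_responses)
print(example_scores)
```

Receptive-only and receptive-plus-expressive reports are deliberately collapsed into the same score, which yields the binary lexical knowledge variable.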
4.4. Procedure and study design
After completing the vocabulary checklist, caregiver–child dyads engaged in shared book reading. The session took place in an observation room that enabled non-intrusive audio and video recording of the interaction. Dyads received the following instructions: “I would like you to read together as you would do at home. Please take your time, I will be back when you are finished.” In a within-subjects design, each dyad read one of the two books presented on paper and the other book presented on an iPad, with the order of book and format presentation counterbalanced across participants. A detailed description of the sessions is available in Diprossimo and Cain (2025).
4.5. Coding scheme
A coding scheme was developed to quantify caregivers’ verbal and gestural scaffolds during word-learning moments (adapted from Hadley & Dickinson, 2019). Word-learning moments were defined by the pages of the storybook introducing the target words. For each unique target word–child combination (hereafter observation), several behaviours were coded: (1) the number of target word repetitions by the caregiver in extra-textual talk; (2) whether definitional information, including synonyms, perceptual, or conceptual information, in relation to the target word was provided by the caregiver in extra-textual talk; the number of (3) comments (e.g. “Look at that!”) and (4) questions that were related to the target word in extra-textual talk (e.g. “Can you find the [X]?”); and gestural behaviour for each observation, specifically, (5) the presence of pointing and (6) iconic gestures (i.e. a gesture that illustrates word meaning such as opening and closing one’s hand with fingers straight to mimic a clamp).
Children’s verbal and gestural engagement was coded for each observation. Mirroring caregiver scaffolding, the following behaviours were coded: (1) the number of target word repetitions by the child; the number of (2) child comments and (3) questions related to the target words (e.g. “What’s that?”); and gestural behaviour for each observation, specifically, the presence of (4) pointing and (5) iconic gestures. Definitional information was not coded for the child because target words were selected to be likely unknown to children.
After being trained with a pilot dataset, two student assistants, blind to our hypotheses, independently coded the video recordings of the shared-reading interactions. To assess inter-rater reliability, 20% of videos were double-coded. Intra-class correlation (ICC) analysis revealed that levels of agreement ranged from good to excellent (Cicchetti, 1994): caregiver repetition (ICC = .95), caregiver definition (ICC = .85), caregiver comment (ICC = .83), caregiver question (ICC = .97), caregiver pointing (ICC = .95), caregiver iconic gesture (ICC = .72), child repetition (ICC = .89), child comment (ICC = .90), child question (ICC = .80), child pointing (ICC = .72). The code child iconic gesture was excluded from further analyses as it was extremely rare in the data and the level of agreement was insufficient (ICC = .45).
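As an illustration, the reported ICC values can be mapped onto the commonly cited Cicchetti (1994) bands (below .40 poor, .40 to .59 fair, .60 to .74 good, .75 and above excellent). This is a sketch rather than the authors’ procedure: treating “good” or better as the retention threshold is an assumption made here for illustration, since the text states only that agreement for child iconic gesture was insufficient.

```python
# ICC values as reported in the text above.
REPORTED_ICC = {
    "caregiver repetition": 0.95,
    "caregiver definition": 0.85,
    "caregiver comment": 0.83,
    "caregiver question": 0.97,
    "caregiver pointing": 0.95,
    "caregiver iconic gesture": 0.72,
    "child repetition": 0.89,
    "child comment": 0.90,
    "child question": 0.80,
    "child pointing": 0.72,
    "child iconic gesture": 0.45,
}


def cicchetti_band(icc: float) -> str:
    """Map an ICC value onto the Cicchetti (1994) qualitative bands."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


bands = {code: cicchetti_band(icc) for code, icc in REPORTED_ICC.items()}

# Assumed retention rule (not stated in the source): keep "good" or better.
retained_codes = [c for c, b in bands.items() if b in ("good", "excellent")]

for code, band in bands.items():
    print(f"{code}: {band}")
```

Under this rule, all codes except child iconic gesture are retained, which is consistent with the exclusion described above.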
4.6. Analytic strategy
First, we assessed the correlation between the coded behaviours. Then, we assessed the dimensionality of scaffolding and engagement codes using Velicer’s Minimum Average Partial (MAP) criterion (Velicer, 1976) and applied the scree test to identify a sudden drop in eigenvalues (Cattell, 1966). This allowed us to determine the number of principal components to retain. Principal component analyses (PCA) were conducted in R version 4.1.3 (2022-03-10) using the functions vss, fa.parallel, and principal of the packages psych and psychTools. After dimension reduction, we fit linear mixed models (Baayen et al., 2008) to predict child engagement at the level of each word-learning episode, including the random intercept for each subject to account for inter-individual variation in the baseline level of engagement. Convergence issues were addressed by increasing the number of iterations and using different optimisers. To address our first research aim, we compared two models: a null model (M0) including book format, reported knowledge of individual words, child age, and gender as control predictors of engagement, and a full model (M1) including caregiver scaffolding as the test predictor in addition to the control predictors. To address our second research aim, we compared M1 with an interaction model (M2) that allowed the effect of scaffolding to vary across the levels of book format. Table 1 provides a schematic view of the model comparison strategy. The models were implemented in the same R version with the function lmer of the R package lme4 (version 1.1–33) (Bates et al., 2015). Predicted values were computed using the function ggpredict of the R package ggeffects, version 1.3.2 (Lüdecke, 2018). This study was not pre-registered.
Data and code necessary to reproduce the analyses presented here are available at the OSF project repository: https://doi.org/10.17605/OSF.IO/365VC.
Table 1. Model comparison strategy.

Note: Test predictors in italics. M = model, A = aim.
5. Results
5.1. Preliminary analyses
Reading time was comparable in print (M = 4.65 minutes; SD = 2.44) and digital format (M = 4.70 minutes; SD = 2.91). We report descriptive statistics of scaffolding and engagement behaviours for each word-learning episode in Tables 2 and 3. There were significant albeit moderate correlations between scaffolding and engagement behaviours (see Table 3), motivating the need to assess the dimensionality of these two constructs.
Table 2. Means and standard deviations of caregiver scaffolding and child engagement behaviours per word-learning episode by book format

a refers to count variables.
b refers to binary variables.
Table 3. Means, standard deviations, and correlations with confidence intervals of caregiver scaffolding and child engagement behaviours per word-learning episode

a refers to count variables.
b refers to binary variables.
Note: Values in square brackets indicate the 95% confidence interval for each correlation. The confidence interval is a plausible range of population correlations that could have caused the sample correlation. * indicates p < .05. ** indicates p < .01.
For the scaffolding variables, the Kaiser–Meyer–Olkin measure of sampling adequacy was 0.76, which was above the recommended minimum value of 0.50 (Kaiser, 1974). We assessed the dimensionality of scaffolding using Velicer’s Minimum Average Partial (MAP) criterion (Velicer, 1976) and the scree test, which identifies a sudden drop in eigenvalues (Cattell, 1966). According to these criteria, a single scaffolding component emerged. PCA revealed that the proportion of variance explained by this component was 44%. As expected, loadings were all positive: caregiver repetitions (.71), comments (.84), definition (.72), questions (.67), pointing (.38), and iconic gesture (.55). Given the extraction of a single component for scaffolding, a single score was used in subsequent analyses.
For the engagement variables, the Kaiser–Meyer–Olkin measure was 0.59, exceeding the recommended minimum value of 0.50 (Kaiser, 1974). The same analysis was conducted to assess the dimensionality of engagement. A single engagement component emerged according to the MAP criterion (Velicer, 1976) and the scree test (Cattell, 1966). PCA revealed that this component explained 40% of the variance. As expected, loadings were all positive: child repetitions (.62), comments (.73), questions (.65), and pointing (.51). Given the extraction of a single component for engagement, a single score was used in subsequent analyses.
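The reported proportions of variance can be cross-checked from the loadings. The sketch below assumes that the PCA was run on standardised variables (i.e. on the correlation matrix), in which case the variance explained by a single component equals the sum of its squared loadings divided by the number of variables. Because the loadings are rounded to two decimals, the result is approximate.

```python
def variance_explained(loadings):
    """Proportion of variance explained by one component.

    Assumes standardised variables, so each variable contributes a
    total variance of 1 and the component's eigenvalue is the sum of
    its squared loadings.
    """
    eigenvalue = sum(loading ** 2 for loading in loadings)
    return eigenvalue / len(loadings)


# Loadings as reported in the text (rounded to two decimals).
scaffolding_loadings = [0.71, 0.84, 0.72, 0.67, 0.38, 0.55]
engagement_loadings = [0.62, 0.73, 0.65, 0.51]

scaffolding_share = variance_explained(scaffolding_loadings)
engagement_share = variance_explained(engagement_loadings)

print(f"Scaffolding: {scaffolding_share:.0%}")  # prints "Scaffolding: 44%"
print(f"Engagement: {engagement_share:.0%}")    # prints "Engagement: 40%"
```

Both values agree with the percentages reported above.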
Both PCAs were conducted on a nested dataset containing multiple observations per child. This is because word-learning episodes were the unit of interest for our subsequent analyses. The repeated-measures nature of the data was handled during modelling. As a sanity check, we also conducted PCA after computing an average score per child for each behaviour coded. The loadings were comparable (see Supplementary Table S2, Supplementary Materials), suggesting that within-participant variation did not bias the extraction of components.
5.2. Confirmatory analyses
To address our first aim, we compared M1, including caregiver scaffolding, with M0, lacking this predictor but being otherwise identical. The likelihood ratio test revealed that M1 was a significantly better fit to the data compared to M0 (χ² = 120.98, df = 1, p < .001). Caregiver scaffolding was significantly coupled with child engagement during word-learning episodes (β = 0.49, SE = 0.04, CI = 0.41–0.57, p < .001) after controlling for reading medium, child individual lexical knowledge, age, and gender. We note that child age was also significantly and positively associated with engagement (β = 0.11, SE = 0.05, CI = 0.01–0.20, p = .037).
To address our second aim, we compared the interaction model M2 with M1, which lacked the interaction between scaffolding and book format but was otherwise identical. The likelihood ratio test revealed that M2 was not a better fit for the data than M1 (χ² = 1.892, df = 1, p = .169). Therefore, there was no evidence that the coupling of scaffolding and engagement was conditional on the reading medium, supporting its robustness across book formats. Table 4 reports the results of the linear mixed models estimating child engagement. The best model, M1, explained 37% of the variance as illustrated by the conditional R². Figure 1 depicts the predicted and observed values for scaffolding and engagement.
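The p-values of the two likelihood ratio tests can be recovered from the reported test statistics. The sketch below is not the authors’ code; it relies on the fact that, for one degree of freedom, the upper tail of the chi-square distribution equals erfc(√(x/2)), so only the Python standard library is needed. The function name is introduced here for illustration.

```python
import math


def chi2_sf_df1(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with df = 1.

    Valid for one degree of freedom only: a chi-square(1) variable is
    the square of a standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("A chi-square statistic cannot be negative.")
    return math.erfc(math.sqrt(statistic / 2.0))


# Test statistics as reported in the text.
p_m1_vs_m0 = chi2_sf_df1(120.98)  # aim 1: M1 versus M0
p_m2_vs_m1 = chi2_sf_df1(1.892)   # aim 2: M2 versus M1

print(f"M1 vs M0: p = {p_m1_vs_m0:.3g}")
print(f"M2 vs M1: p = {p_m2_vs_m1:.3f}")  # prints "M2 vs M1: p = 0.169"
```

The first value is far below .001 and the second rounds to .169, in line with the reported results.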
Table 4. Linear mixed models predicting child engagement in each word-learning episode as a function of scaffolding, after controlling for book format, individual lexical knowledge, age, and gender.

Note: Book format was dummy-coded with print as the reference level. Individual lexical knowledge is a binary variable where 0 refers to reported unknown and 1 to reported known target words. Age is expressed in months and z-transformed. Gender is dummy-coded with girls as the reference level. Scaffolding and engagement are both factor scores derived from the PCA. Significant predictors are signalled in bold.

Figure 1. Predicted and observed values for scaffolding and engagement components during individual word-learning episodes.
We checked for order effects by subsetting our dataset into two subsamples: one sample composed of children who started with the print reading condition and the other sample of children starting with the digital reading condition. The pattern of results did not differ (see Supplementary Tables S3–S4). These checks suggest that consistency in scaffolding–engagement coupling across reading media is not a byproduct of carryover effects.
As a robustness check, we complemented our analyses with a composite score approach, where the binary version of codes was arithmetically summed and modelled as such. The pattern of results remained consistent (see Supplementary Table S5, Supplementary Materials).
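A minimal sketch of such a composite score follows, assuming that count codes are binarised as present (1) versus absent (0) before summing. The code names and the example observation are hypothetical; the exact implementation is available in the authors’ OSF repository.

```python
def binarise(value: int) -> int:
    """Collapse a count or binary code to 1 (present) or 0 (absent)."""
    return 1 if value > 0 else 0


def composite_score(codes: dict) -> int:
    """Sum the binarised codes for one word-learning episode."""
    return sum(binarise(value) for value in codes.values())


# Invented caregiver codes for a single observation (illustration only).
example_observation = {
    "repetitions": 3,     # count code
    "definition": 1,      # binary code
    "comments": 0,        # count code
    "questions": 2,       # count code
    "pointing": 1,        # binary code
    "iconic_gesture": 0,  # binary code
}

example_composite = composite_score(example_observation)
print(example_composite)  # prints 4
```

Unlike the PCA-derived scores, this composite weights every behaviour equally, so agreement between the two approaches suggests that the findings do not depend on the component weights.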
6. Discussion
This study assessed the coupling of scaffolding and engagement during word-learning moments in the context of shared reading between caregiver–child dyads. The key aim was to establish whether engagement is a critical underlying mechanism in the scaffolding process. We found that scaffolding and engagement were tightly coupled during critical word-learning episodes. This effect was robust across print and digital reading media. Child age was a significant predictor of engagement. These results support the hypothesis that the word-learning advantage typically associated with scaffolding might be explained through the mechanism of enhanced child engagement during the task. These findings have important theoretical, methodological, and practical implications. We discuss each of these perspectives in turn.
From a theoretical standpoint, these findings support and extend prior models and evidence on the crucial role of engagement in the scaffolding process (Blewitt & Langan, 2016; Son & Tineo, 2016; Wood et al., 1976). Specifically, the critical contribution of the present study stems from the fine-grained investigation of word-learning episodes during naturalistic shared reading. This allowed us to pinpoint the coupling of scaffolding and engagement during critical moments for learning. In terms of language development, it is well established that shared book reading provides a valuable source of input (Dawson et al., 2021; Montag, 2019; Montag et al., 2015; Nation et al., 2022; Noble et al., 2018). This uniquely rich input might be optimally accessible to children in the light of the joint attentional focus and active engagement afforded by the shared-reading context (e.g. Farrant & Zubrick, 2011). The present work empirically supports this view by clearly demonstrating the coupling of scaffolding and engagement during critical word-learning episodes in naturalistic shared-reading settings.
These findings can also be interpreted in the light of recent accounts of agency as a driver of cognitive development (Tomasello, Reference Tomasello2024). Specifically, our results speak to the idea that early childhood is marked by a qualitative shift from joint agency to metacognitive agency, reflected in an increasingly sophisticated ability to coordinate actions with others towards a shared goal. Crucially, the emergence of metacognitive agency allows children to reflect on and regulate their own cognitive processes. The fine-grained coupling of scaffolding and engagement during word-learning moments that we observed may thus reflect children’s increasingly sophisticated ability to coordinate with their caregivers towards the shared goal of maximising comprehension and learning from shared book reading. Such coordination may be mediated by children’s developing metacognitive skills. Our analyses indeed revealed that child age was significantly and positively associated with engagement. Future work may usefully adopt a longitudinal design to map the development of this scaffolding–engagement coordination in relation to emerging metacognition. Furthermore, child engagement was significantly higher for words reported to be unknown to the child by their caregiver. This provides additional support for the idea that children actively participate to maximise their learning progress during shared book reading, by signalling their knowledge gaps through questions such as “What’s that?”, commenting “I don’t know what that is” or pointing at an unknown entity in the book.
It is important to note that multiple underlying mechanisms might explain the learning advantages typically associated with scaffolding. The focus of this paper was engagement. Other mechanisms at play likely involve the reduction in cognitive load via modulation or fine-tuning (Leung et al., Reference Leung, Tunkel and Yurovsky2021; Shi et al., Reference Shi, Gu and Vigliocco2022), simplifying the task by reducing degrees of freedom, marking critical features by accentuating what is most relevant, controlling frustration, and demonstrating or modelling the desired behaviours (Wood et al., Reference Wood, Bruner and Ross1976). Given the many elements interacting in the scaffolding process (Carranza-Pinedo & Diprossimo, Reference Carranza-Pinedo and Diprossimo2025; Diprossimo et al., Reference Diprossimo, Ushakova, Zoski, Gamble, Irey and Cain2023; Diprossimo & Cain, Reference Diprossimo and Cain2026), future work employing computational models may allow us to simulate engagement and learning under (a combination of) different scaffolding parameters and offer new insights to be tested empirically (Cheung et al., Reference Cheung, Hartley and Monaghan2021).
Methodologically, this research adds to an emerging body of work moving beyond global analysis of behaviour over entire shared-reading sessions (Hadley & Dickinson, Reference Hadley and Dickinson2019; Lingwood et al., Reference Lingwood, Lampropoulou, De Bezenac, Billington and Rowland2023). Our work indicates that analysing specific word-learning episodes is a promising way to advance our understanding of how caregiver–child dyads coordinate their behaviours at fine-grained levels. The granularity of our coding scheme did not permit a more detailed exploration of temporal dynamics beyond the defined word-learning episodes that were the focus of the current study. An important next step is to further increase the granularity with sequential analysis, which may be particularly helpful for assessing the directionality of relations (for an example in the classroom context, see Deshmukh et al., Reference Deshmukh, Zucker, Tambyraja, Pentimonti, Bowles and Justice2019). For instance, it may be useful to explore the extent to which scaffolding drives engagement and engagement drives scaffolding during caregiver–child shared reading. It is plausible that bidirectional relations will be observed, yet one behaviour may be a stronger predictor of the other.
From a more applied perspective, these findings provide important insights into naturally occurring behaviours that could be fostered via direct or indirect interventions to support early literacy. For example, caregivers’ awareness of scaffolding–engagement coupling may be boosted through observation and small group discussion of videorecorded shared-reading interactions. Increased awareness of scaffolding–engagement coupling may maximise the extent to which caregivers attend to children’s engagement behaviours, enhancing the quality of shared reading. Furthermore, these findings contribute to a more nuanced understanding of the similarities and differences in scaffolding and engagement during print and digital shared book reading. We identified both consistency and variability across reading formats. A critical novel contribution of this study is the identification of comparable scaffolding–engagement coupling across formats. Descriptive analyses revealed comparable levels of child engagement, which is in line with the review by Clinton-Lisell et al. (Reference Clinton-Lisell, Strouse and Langowski2024). They also revealed a higher overall level of scaffolding in the print compared to the digital reading condition, which is consistent with prior research (e.g. Korat & Or, Reference Korat and Or2010). Thus, we did not find a negative impact of digital media across the board. Rather, the negative impact may be specific to some aspects of caregiver scaffolding and may not extend to scaffolding–engagement coupling or child engagement. The lower level of scaffolding in the digital condition might be attributed to caregivers’ preferences for and familiarity with the print format, as indicated by prior survey studies (Strouse & Ganea, Reference Strouse and Ganea2017).
We believe that informing caregivers about the potential of digital shared book reading to extend, rather than replace, print-based experiences might mitigate the detrimental effects of the digital format on caregiver scaffolding observed here and in prior work.
6.1. Limitations and future directions
Our study has limitations that should be considered. Our findings are limited to the characteristics of our sample and learning materials. Our caregivers were primarily highly educated mothers who reported reading frequently to their child. Hence, these results may not replicate in other samples and contexts, which is a target for future research. With respect to the learning materials, we did not find differences in scaffolding–engagement coupling across print and digital formats. Nevertheless, future work should consider the potential moderating role of different features of print and digital books, as well as book genres. This would allow a more nuanced account of the many elements that interact in the scaffolding process.
From a methodological point of view, we note that the components extracted from the PCA explained a relatively small proportion of variance (40–44%). However, these values align with those observed in the same age group for the coding of narrative production after shared reading (Silva & Cain, Reference Silva and Cain2019) or behavioural regulation measures (Schmitt et al., Reference Schmitt, Pentimonti and Justice2012). Our robustness checks using composite scores revealed the same pattern of results. This consistency across different coding and analytic approaches supports the robustness of the main study findings.
Finally, the focus of the present study was to assess the relationship between caregiver scaffolding and child engagement during word-learning moments. Word-learning measures were not analysed. An important target for future research is to assess the links between scaffolding–engagement coupling and word-learning outcomes.
7. Conclusions
Overall, this study makes a unique contribution to the extant literature by providing the first empirical evidence of scaffolding–engagement coupling during word-learning moments in shared reading. We hope that future work will continue to examine the multifaceted and fine-grained aspects of caregiver–child interactions that may explain the remarkable pace of vocabulary development in early childhood.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0305000926100555.
Acknowledgements
We sincerely thank all the participating caregivers and children. We are grateful to Simya Aravamuthan, Gracey Caller, Arwen Hon, Phoebe Schaw, and Ffion Jones for their support with data scoring, annotations, and participant recruitment. We thank Anastasia Ushakova for her valuable advice on data analysis. This work has received funding from the Marie Sklodowska-Curie Actions grant agreement no. 857897 and the Women in Research (WiRe) Postdoctoral fellowship.
Author contribution
Conceptualisation: L.D., K.C.; Data Curation: L.D.; Formal Analysis: L.D.; Funding Acquisition: L.D., K.C.; Project Administration: L.D.; Supervision: K.C.; Visualisation: L.D.; Writing – Original Draft: L.D.; Writing – Review and Editing: L.D., K.C.
Funding statement
This work has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions grant agreement no. 857897 and the Women in Research (WiRe) Postdoctoral fellowship.
Competing interests
We have no competing interests to disclose. AI tools were not used in this research.
