The association between screen media quantity, content, and context and language development

This study investigates the influence of the quantity, content, and context of screen media use on the language development of 85 Saudi children aged 1 to 3 years. Surveys and weekly event-based diaries were employed to track children's screen use patterns. Language development was assessed using the JISH Arabic Communicative Development Inventory (JACDI). Findings indicate that the most significant predictor of expressive and receptive vocabulary in 12- to 16-month-olds was screen media context (as measured by the frequency of interactive joint media engagements). In older children (17- to 36-month-olds), more screen time (as measured by the amount of time spent using screens, the prevalence of background TV at home, and the onset age of screen use) had the highest negative impact on expressive vocabulary and mean length of utterance. These findings support health recommendations on the negative effects of excessive screen time and the positive effects of co-viewing media with children.


Introduction
Screen time is now an integral part of young children's lives in many parts of the world, wherever they have access to content delivered via TVs and, increasingly, via computers, tablets, and smartphones. Examining the relation between screen media exposure and language development in the first few years of life is of extreme importance, as it is well established that these early stages are a time of rapid and substantial neural, cognitive, and linguistic development (Bornstein, 2015; Rodriguez et al., 2009). The reported findings to date on the association between screen media use and children's language development vary widely across studies based on their differing foci on variables such as the child's age, the screen medium examined (e.g., television, touchscreen, computer), screen media parameters (quantity, content, or context), and language outcomes (e.g., expressive language skills, receptive language skills, novel word learning, imitation). The aim of this study is to examine the predictive relationship between the aforementioned three screen media use parameters (quantity, content, and context) and language outcomes among children under 3.

Screen media use and language development
The association between screen media exposure, which we consider as an integral element of the environment, and language development is examined by drawing on Bronfenbrenner (1979)'s ecological systems theory. Bronfenbrenner (1979) focuses on the social contexts in which children live and posits that their development is influenced by the reciprocal interactions between a series of nested ecological systems (i.e., MICROSYSTEM, MESOSYSTEM, EXOSYSTEM, MACROSYSTEM, and CHRONOSYSTEM). When ecological systems theory was first introduced in 1979, television was the prevalent technology available to children. Television was considered by Bronfenbrenner (1979) to be part of the child's exosystem because it enters the child's home from an external source. This powerful medium influences parents and parent-child interactions, and thus it operates not just within the child's microsystem, but rather across the child's ecological borders. Johnson and Puplampu (2008) introduced the ECOLOGICAL TECHNO-SUBSYSTEM as a dimension of the microsystem to account for the presence of the internet within the ecological system. The techno-subsystem includes children's interaction with their microsystem (i.e., immediate environments) via technology. Bronfenbrenner (1979) coined the term MOLAR activity, defining it as "an ongoing behavior possessing a momentum of its own and perceived as having meaning or intent by the participants in the setting" (Bronfenbrenner, 1979, p. 45), and highlighted the significance of molar activities on learning and development. Experiences and activities do not play equal roles in children's development; some occur infrequently and/or are not very significant, whereas others (molar activities) occur frequently and have more notable influences on development (Lauricella, Wartella & Rideout, 2015). 
Given the important presence of screen media across different contexts within children's ecological systems, along with the increasing amount of time spent by children and adults engaged with screens, children's and parents' media use, parental attitudes towards media, and parental media mediation practices can all be regarded as molar activities in children's environments (Lauricella et al., 2015).
One of the major concerns associated with children's excessive use of screens is the reduction, or even possible displacement, of real-life social interactions (Dore, Logan, Lin, Purtell & Justice, 2020; Dynia, Dore, Bates & Justice, 2021). This possibility is particularly important because these types of interactions are an essential component of ecological systems theory. Indeed, the concern about TIME DISPLACEMENT is not new. Bronfenbrenner (1979) talked about the lure of television and of its power to turn children into passive and silent spectators.

Association between screen media use and language development
In this section, we review the literature on three screen media parameters: quantity, content, and context for children under the age of 5.

Quantity of screen media exposure
Research on the quantity of screen media exposure typically involves (a) the onset age at which children start using screens, which contributes to the total cumulative amount of time children are exposed to screens, and (b) the amount of FOREGROUND EXPOSURE (time children spend actively engaged with screens) and BACKGROUND EXPOSURE (time children spend being exposed to screens in the background without actively viewing or using them). Few studies have specifically looked at the association between the onset age at which screen media viewing starts and language outcomes. So far, the evidence suggests that children who start using screens at earlier ages have lower language outcomes than those who start later (Chonchaiya & Pruksananonda, 2008; Hudon, Fennell & Hoftyzer, 2013; Supanitayanon, Trairatvorakul & Chonchaiya, 2020).
A major concern in the literature is that excessive screen media use may displace other stimulating activities that are established in the literature as predictive of better language outcomes such as reading and play (e.g., Bus, van Ijzendoorn & Pellegrini, 1995; Cameron-Faulkner, MacDonald, Serratrice, Melville & Gattis, 2017; Farrant & Zubrick, 2011; Weisberg, Zosh, Hirsh-Pasek & Golinkoff, 2013; Whitehurst & Lonigan, 1998). Therefore, studies that quantify the amount of time children spend on using screens should also take into consideration the amount of time children spend on other daily activities such as reading and play.

Content of screen media
Screen media content types available to children vary and they differ in several aspects including the audience that they target (i.e., child-directed vs. adult-directed), their educational value (i.e., educational vs. non-educational), and the languages that they use (e.g., child's first language, child's second language, a foreign language). Variation in these aspects of screen media content has been linked to a range of language and learning outcomes.
A number of longitudinal studies have shown that young children's viewing of programs that are not age-appropriate is negatively associated with language and cognitive outcomes (e.g., Barr, Danziger, Hilliard, Andolina & Ruskis, 2010; Wright, Huston, Murphy, St. Peters, Pinon, Scantlin & Kotler, 2001a). Adult-directed programs have also been found to reduce the quality and quantity of parent-child interactions, which are significant predictors of child language development (e.g., Kirkorian, Pempek, Murphy, Schmidt & Anderson, 2009).
Educational benefits are among the most frequent motives that parents cite for allowing their children to use screens (Bentley, Turner & Jago, 2016; Li, Mendoza & Milanaik, 2017). Several studies have found positive associations between viewing specific children's educational programming (e.g., Sesame Street, Arthur, Clifford, Dragon Tales, Dora the Explorer, and Blue's Clues) and language outcomes in children above the age of 2 years (e.g., Linebarger & Walker, 2005; Wright, Huston, Scantlin & Kotler, 2001b). However, children younger than 2 do not seem to gain similar benefits from watching educational shows (e.g., DeLoache, Chiong, Sherman, Islam, Vanderborght, Troseth, Strouse & O'Doherty, 2010; Krcmar, 2014; Tomopoulos, Dreyer, Berkule, Fierman, Brockmeyer & Mendelsohn, 2010). This adds to accumulating evidence that children under 2 do not learn as effectively from screen media as they do from live presentations (e.g., Neuman, Kaefer, Pinkham & Strouse, 2014; Roseberry, Hirsh-Pasek, Parish-Morris & Golinkoff, 2009), in what has been named the VIDEO DEFICIT EFFECT (VDE; Anderson & Pempek, 2005). The VDE is also known as a TRANSFER DEFICIT (Barr, 2010) as young children experience difficulties in transferring learning from a 2D to a 3D context. In a recent systematic review and meta-analysis, Madigan et al. (2020) concluded that, in older children, better quality of screen exposure (i.e., educational and co-viewing) appears to be beneficial for language skills.
The language input received from screen media is an important aspect to consider, especially in communities where the language variety used in screen media is different from the variety that children hear in daily conversations around them. Some studies have found negative associations between children viewing screen media in a language other than the language spoken at home and their first language development (e.g., Duch et al., 2013).

Context of screen media exposure
The social context of screen media use refers to whether screen media is viewed with other people or individually. Two or more people watching television together has been described for many years as CO-VIEWING (Austin, 1993; Valkenburg, Krcmar, Peeters & Marseille, 1999). JOINT MEDIA ENGAGEMENT (JME), a more recent term, is sometimes used to refer to both TV co-viewing and mobile media co-using (Takeuchi & Stevens, 2011). Co-viewing of educational programs with contingently responsive adults has been found to have better outcomes such as better attentiveness, novel word learning, and expressive vocabulary growth than solitary viewing (e.g., Myers, Crawford, Murphy, Aka-Ezoua & Felix, 2018; Rasmussen, Keene, Berke, Densley & Loof, 2017; Strouse, Troseth, O'Doherty & Saylor, 2018). CONTINGENCY refers to the follow up on the child's current focus of attention whereas RESPONSIVENESS refers to a caregiver's sensitivity to a child's attempts to interact, recognition of child's cues and needs, and responding to these attempts, signals, and needs appropriately and promptly (Matthews, McGillion & Pine, 2016; McGillion, Pine, Herbert & Matthews, 2017). According to the latest screen time recommendations from the American Academy of Pediatrics (2016), interactive co-viewing is the primary factor in facilitating toddlers' word learning from screens. A recent review of the literature on JME by Ewin, Reupert, McLean and Ewin (2021) found that the effect of JME on parents' and children's language quantity and quality was mixed. However, it should be noted that most of the studies reviewed in Ewin et al. (2021) on the impact of JME on language compared shared print book reading to shared e-book reading rather than JME versus solitary viewing or solitary screen media use.
The review also found that child's age is one of the factors that influence parent-child interactions during JME with parents providing less verbal and physical support as children get older and are more capable of using devices on their own. This reduction in parental scaffolding may decrease associated positive interactions (Ewin et al., 2021).

The current study
Our review of the literature revealed several gaps that require attention. First, children under 3 years are under-represented in research and in governmental and think tanks' reports into the impact of screen media use on children's health and development in general, and on language development specifically. For example, the UK's Office of Communications (Ofcom) publishes annual reports on adults and children's media use and attitudes, but they only examine children aged 3 years and older. Similarly, EU Kids Online publishes yearly reports on children's media use in Europe but does not report data on children under 3. By focusing on children under the age of 3 years in this study, we are targeting a critical time in children's emergent receptive and expressive language skills. Early childhood is also a crucial period for the establishment of lifelong media habits and routines and a critical window for intervention (Radesky & Christakis, 2016).
Second, non-Western cultures are under-represented in the literature, as most of what is known about children's screen media use and its effects on language development comes from North America and Europe. To the best of our knowledge, this is the first study to investigate the association between screen media exposure and language development in young children in the Middle East and North Africa (MENA) region. Saudi Arabia, the location of our fieldwork, provides a unique setting for this study as it is the largest media market in the MENA region (Dubai Press Club & Dubai Media City, 2016), and the world's highest per capita consumer of YouTube (Smith, 2013). Research on children under the age of 3 in Saudi Arabia has often been a neglected area in child development research. Most previous studies have focused on school-aged children or older and on children's physical health rather than their regular daily routines and practices (e.g., parent-child interactions, screen time, reading, and play) and the impact of these practices on children's development. Only 17% of children aged 3-5 years in the country attend kindergarten (Saudi Ministry of Education, 2021). No data is available on children under 3 years. The focus of the Ministry of Education efforts does not include children under 3. In fact, early childhood did not receive much attention in the country until recently, in March 2021, when the Saudi Affairs Council, established in 2017, and the UNICEF partnered in a campaign to raise awareness of the importance of the first three years of life. The significance of conducting research on this young population is particularly important in Saudi Arabia, a demographically young country with a population of over 35 million, where almost 30% are under the age of 15, with 11% under the age of 5 (Saudi General Authority for Statistics, 2019, 2020).
Third, research to date has tended to focus on the association between television and children's health and development. Recent investigations that examined effects of new media on children seemed to exclude traditional media, although it is important to understand that children and adults today usually engage in multitasking with media. For example, the family could be sitting in the living room watching television together and at the same time each of the family members could be engaged with their mobile media device. Therefore, any investigation of screens should consider the various outlets used to access media content.
Finally, most of the public debate and research efforts to date have focused on only one of the quantity, content, or context of screen media use. Very few studies have comprehensively examined the impacts of all three aspects together. To better understand the screen media use practices of children and the associations between these practices and their language development, we examined the extent to which each of the three screen media use parameters (quantity, content, and context) predicts language outcomes among children under 3.

Participants
The final sample in the present study consisted of 85 1- to 3-year-old Saudi children residing in Saudi Arabia. The study started with an initial sample of 139 participants. Nearly 75% (n = 104) of the potential participants were eligible to participate in the study. Only stay-at-home children (those who do not attend day care) were eligible to enter the study. In Saudi Arabia, children typically do not start day care before the age of 3; therefore, our sample is representative of the population. The attrition rate over the course of the study was 18% (n = 19).
The mean age of the 85 children in the final sample was 24.92 months (SD = 7.67 months). Table 1 provides details of the socioeconomic characteristics of the target children and their parents.

Procedures
All participation in this study was voluntary. Ethical approval for the study was received from the University of Manchester Research Ethics Committee. Participants were recruited via several social media platforms. All materials used in this study were administered in Arabic. Data were collected between March and June 2017.
Each participant was sent (via email) a Home Literacy and Media Diary (described below), with detailed instructions on how to complete it and an example of a completed 1-day diary. Each participant was asked to complete seven daily diaries over a period of 7 weeks. Completed diaries were collected (via email or WhatsApp) from participants on a weekly basis and checked regularly to address any immediate problems or incorrect entries. After completing the fourth diary, each participant was asked to complete a hard copy of the JISH Arabic Communicative Development Inventory (JACDI; Dashash & Safi, 2014), either the Words and Gestures (JACDI-WG) version for children between 12 and 16 months, or the Words and Sentences (JACDI-WS) version for children between 17 and 36 months. Participants were contacted by phone by the first author and were given instructions on how to complete the JACDI. After submitting the last diary, each participant was sent a link to complete the Home Literacy and Media Survey (described below) via the online data collection engine Survey Monkey.
The final sample included all participants who submitted at least two diaries for at least one weekday and one weekend day, completed the language assessment tool, and completed the online survey. Out of the 85 participants in the final sample, 52 submitted seven diaries, 24 submitted two diaries, and nine submitted more than two but fewer than seven diaries.
Note on income categories: For the purpose of this study, lower-income was defined as families earning less than SAR 10,000 a month; middle-income was families earning between SAR 10,000 and SAR 19,999 a month, and higher-income was families earning over SAR 20,000 a month (SAR 1 = USD 0.267 as of March 1, 2021; SAR = Saudi Arabian Riyal). According to the Saudi General Authority for Statistics, the median monthly household income in 2013 was SAR 10,723 (The Saudi General Authority for Statistics, 2013).

Materials and measures
The primary caregivers completed three measures relating to their child's screen media exposure: (1) a home literacy and media diary, (2) a vocabulary assessment tool, and (3) a home literacy and media survey. Details of each measure are described below.

The Home Literacy and Media Diary (HLM Diary)
The Home Literacy and Media (HLM) Diary (see Supplementary Materials) is a 24-hour, event-based, parent-report diary that aims at collecting data on target children's media use, reading, and play activities. It was adapted from the Child Development Supplement to the Panel Study of Income Dynamics (PSID-CDS; University of Michigan Institute for Social Research, 2014). Each participant was asked to log their target child's activities as they occurred over the course of one chosen day each week for a period of 7 weeks with the aim of having a total of 5 different weekdays and 2 different weekend days. Screen time per day was calculated by adding up the number of minutes spent daily in viewing screens then dividing it by the number of diaries submitted.
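The averaging step described above can be sketched as follows. This is only an illustration, not the authors' analysis script (which was written in R); the function name and the example minutes are hypothetical.

```python
def average_daily_screen_minutes(diary_minutes):
    """Mean screen minutes per diary day.

    diary_minutes: one total (in minutes) per submitted diary,
    so the divisor is the number of diaries actually returned.
    """
    if not diary_minutes:
        raise ValueError("at least one submitted diary is required")
    return sum(diary_minutes) / len(diary_minutes)


# Hypothetical child who returned four of the seven diaries.
example_diaries = [120, 95, 180, 60]
print(average_daily_screen_minutes(example_diaries))  # 113.75
```

Dividing by the number of diaries submitted, rather than by seven, keeps the estimate comparable across participants who returned different numbers of diaries.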

The JISH Arabic Communicative Development Inventory (JACDI)
The JISH Arabic Communicative Development Inventory (JACDI; Dashash & Safi, 2014) is a standardized, norm-referenced measure designed to assess Saudi Arabic vocabulary development in infants and toddlers aged 8 to 36 months. It is the Saudi Arabic adaptation of the MacArthur-Bates Communicative Development Inventories (CDI; Fenson, Dale, Reznick, Thal, Bates, Hartung, Pethick & Reilly, 1993). It includes the JACDI-WG for 8- to 16-month-old children (which was used in this study to assess receptive and expressive vocabulary) and the JACDI-WS for 17- to 36-month-old children (which was used in this study to assess expressive vocabulary and mean length of utterance).

The Home Literacy and Media Survey (HLM Survey)
The Home Literacy and Media (HLM) Survey (see Supplementary Materials) was used to collect specific information about the target children's screen media and literacy environment. The HLM Survey consists of 84 questions; thirteen survey items were adapted from the Common Sense Media Zero to Eight Survey (Rideout, 2013) and the Parenting in the Age of Digital Technology Survey (Wartella, Rideout, Lauricella & Connell, 2014). The remainder of the survey items were developed by the first author.
To limit the scope of this paper, we did not analyse the survey items that pertain to parental attitudes toward their children's media use or parental screen media mediation practices and styles, which are not investigated in this study. Items were presented in various formats including yes/no questions, checklists, open-ended questions, and Likert scales.
Both data collection tools (the diary and the survey) were utilized to collect in-depth information from the respondents and contributed to specific research questions. For example, the amount of foreground screen media exposure, the types of contents viewed on screens, details of the social context of screen media use, and the frequency of reading and play activities were all captured using the diary. However, some information could not be collected using the diary, such as demographic information of children and their parents, the age at which children started using screens, the availability of internet connection at home, and the number of media devices and books at home. The survey was therefore used to collect these specific data.
To ensure face validity, the survey and the diary were pilot-tested on a small group of mothers of 1- to 3-year-olds for clarity, readability, errors, and completion time, and changes were made accordingly.

Analysis
Statistical analysis was performed in R (version 3.4.2). Descriptive statistics were used to assess measures of central tendency and variability. When examining the amount of time children spend using screens, we divided the children into two age groups based on international guidelines on screen time that make a distinction between screen time recommendations for children above and below 2 years. Thus, we divided the children here into a younger group aged 1 to 2 years (n = 42, M = 18.17 months, SD = 3.87 months) and an older group aged 2 to 3 years (n = 43, M = 31.51 months, SD = 3.61 months).
Regression analyses were utilized to answer our primary research question. For the regression analyses, the sample was divided into two age groups according to the two JACDI versions: younger children aged 12-16 months (n = 18, M = 14.39 months, SD = 1.33 months) and older children aged 17-36 months (n = 67, M = 27.75 months, SD = 6.01 months). To select the best regression model, we used stepwise model selection, which uses the Akaike Information Criterion (AIC) to drop predictors that do not improve the model. In addition, we used the F-ratio test to help us decide whether to use the full or reduced model. The predictor and outcome variables are described below.
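To make the selection procedure concrete, the following is a minimal sketch of AIC-based stepwise selection. It rests on several assumptions that are not stated in the text: selection runs backward from the full model, the models are ordinary least-squares regressions, and AIC is computed as n ln(RSS/n) + 2k, the usual form for Gaussian linear models up to an additive constant. The function names, variable names, and toy values are all hypothetical; the actual analysis was run in R.

```python
import math


def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    a = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    c = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[piv] = a[piv], a[col]
        c[col], c[piv] = c[piv], c[col]
        for k in range(col + 1, p):
            f = a[k][col] / a[col][col]
            for j in range(col, p):
                a[k][j] -= f * a[col][j]
            c[k] -= f * c[col]
    # Back substitution
    b = [0.0] * p
    for i in reversed(range(p)):
        b[i] = (c[i] - sum(a[i][j] * b[j] for j in range(i + 1, p))) / a[i][i]
    return sum((yi - sum(bi * ri for bi, ri in zip(b, r))) ** 2
               for r, yi in zip(design, y))


def model_aic(data, y, cols):
    """AIC = n * ln(RSS / n) + 2k, with k counting the intercept."""
    n = len(y)
    rows = [[data[c][i] for c in cols] for i in range(n)]
    return n * math.log(ols_rss(rows, y) / n) + 2 * (len(cols) + 1)


def backward_stepwise(data, y, predictors):
    """Drop one predictor at a time while doing so lowers the AIC."""
    current = list(predictors)
    best = model_aic(data, y, current)
    while current:
        trial_aic, drop = min(
            (model_aic(data, y, [c for c in current if c != d]), d)
            for d in current
        )
        if trial_aic >= best:
            break
        best = trial_aic
        current.remove(drop)
    return current, best


# Invented toy data: vocabulary tracks the context score closely.
toy = {
    "context": [1, 2, 3, 1, 2, 3, 1, 2, 3, 2],
    "quantity": [3, 1, 2, 2, 3, 1, 1, 2, 3, 2],
}
noise = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, 0.4, -0.1, -0.5]
vocab = [10 * c + e for c, e in zip(toy["context"], noise)]

selected, final_aic = backward_stepwise(toy, vocab, ["context", "quantity"])
print(selected, round(final_aic, 2))
```

Note that AIC compares models by penalized fit rather than by significance tests, which is why a separate F-ratio test between the full and reduced models remains a useful complement.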

Predictor variables
Our main predictor variables were screen media quantity, screen media content, and screen media context. We were also interested in comparing the prevalence of screen media in children's home environments with the prevalence of reading, which has been long regarded as the most substantial component of the Home Literacy Environment (HLE). Reading, family socioeconomic status (SES; as indexed by parental education, parental employment, and household income), child gender, and age were included in the model as predictors to explore their effects on the outcome variables.
The predictor variables were grouped into five broad composite categories: (1) screen media quantity, (2) screen media content, (3) screen media viewing context, (4) reading prevalence, and (5) family SES. Gender and age were later added to the regression model as factors. Table 2 provides more details on the variables included within each category and the scores that were assigned to each variable. Each composite category was given a composite global score. For the screen media categories, higher scores were given to conditions that have been described in the literature as "more positive" screen media viewing experiences. For example, a higher score was given to a child who views screens for less than 2 hours a day, who rarely has TV on at home when no one is watching, who started viewing screens after the age of 2, who watches child-directed educational content more than other content types, who mostly watches screen media content in their mother tongue, and who is mostly accompanied by an interacting adult while watching. On the other hand, lower scores were given to negative practices as per previous literature (e.g., higher amount of screen time at this young age, solitary viewing, watching adult-directed content). With regard to the reading prevalence category, higher scores were given to conditions where reading was more frequent, and children had access to more books at home. As for the family SES category, higher scores were given to conditions where parents were more highly educated, were employed, and had a higher monthly income.

Screen media quantity
In order to determine quantity of screen media exposure, we considered: (1) the average amount of time a child spends daily viewing screens (TV and mobile media devices) as per the diary data, (2) the frequency of background TV exposure (which adds to the total screen media exposure time) as per the survey data, and (3) the onset age of screen media viewing (TV and mobile media devices) as per the survey data. For each of the three subcategories, scores were given based on which option each child fell into as shown in Table 2. For example, for the first subcategory (i.e., overall screen time), to calculate the average daily screen time for a given child, the total minutes the child spent using all types of screens in the seven days was divided by the number of completed diaries (ideally 7 diaries for each child), then a score was given for the child based on which option the child's use fell into. For example, 1 point was given for those whose average daily overall screen time was more than 2 hours, 2 points for those who spent 2 hours or less, and 3 points for those who never used screens. This means that higher scores were given to better practices for each subcategory as per previous literature. This method of calculating scores was used in all other categories and subcategories.
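The scoring rule for the overall screen time subcategory can be sketched as below. The 2-hour cut-off and the point values follow the example in the text; the function name is hypothetical, and treating an average of exactly zero minutes as "never used screens" is an assumption made for illustration.

```python
def screen_time_score(avg_daily_minutes):
    """Score for the overall screen time subcategory (higher = better practice)."""
    if avg_daily_minutes < 0:
        raise ValueError("minutes cannot be negative")
    if avg_daily_minutes == 0:
        return 3  # never used screens (assumed to mean a zero average)
    if avg_daily_minutes <= 120:
        return 2  # 2 hours or less per day
    return 1      # more than 2 hours per day


print(screen_time_score(149.26))  # 1
```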

Screen media content
To determine quality of screen media content, we used three variables: (1) target audience (i.e., child-directed content vs. adult-directed content); (2) content genre (i.e., child-directed educational content, child-directed non-educational content, child-directed songs and rhymes); and (3) language/language variety of the content viewed (i.e., Modern Standard Arabic (MSA), Saudi/Gulf Colloquial Arabic, Non-Saudi/Gulf Colloquial Arabic, English, no speech [silent, noise or music only]). For each child, we identified the most viewed/used screen media content type (i.e., the content type viewed/used for periods longer than the other types). For example, if a child watched child-directed content for a total of 660 minutes across the seven days and watched adult-directed content for a total of 360 minutes across the seven days, child-directed content would be considered the most viewed content type for that child.
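Identifying the most viewed content type amounts to picking the category with the largest total minutes. The sketch below reproduces the 660 versus 360 minute example; the function name is hypothetical, and because the text does not say how ties were handled, the tie behaviour here (first category listed wins) is an assumption.

```python
def most_viewed(minutes_by_type):
    """Return the content type with the most total minutes across diaries."""
    if not minutes_by_type:
        raise ValueError("no viewing recorded")
    # max() keeps the first key on ties; tie handling is an assumption.
    return max(minutes_by_type, key=minutes_by_type.get)


totals = {"child-directed": 660, "adult-directed": 360}
print(most_viewed(totals))  # child-directed
```

The same rule applies to any of the categorical variables that are summarized by total minutes, such as genre or language variety.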
To decide whether a show was educational or non-educational, we used the Common Sense Media (CSM) evaluation of educational value for each show (Common Sense Media, 2017). If a show was rated by CSM at least 3 out of 5 for educational value, it was considered educational. It is worth noting that most of the shows that Saudi children watch on TV and mobile media devices are international shows that are also aired on American and British channels but are dubbed in Arabic. For the shows that could not be found on CSM (e.g., local shows and shows produced specifically for an Arabic-speaking audience), we viewed five different episodes or video clips of each show and determined its educational value. We followed Zimmerman and Christakis' (2007) method of content classification. Any show that was designed to have primarily educational value for children was considered educational. Any show that was designed to be primarily entertaining for children was considered non-educational.
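The two-step classification can be expressed as a small rule. In this sketch the threshold of 3 out of 5 comes from the text, while the function and parameter names are hypothetical; `manual_judgement` stands in for the researchers' own rating of shows that are not listed on CSM.

```python
def is_educational(csm_rating=None, manual_judgement=None):
    """Classify a show as educational (True) or non-educational (False).

    csm_rating: Common Sense Media educational value (out of 5), if listed.
    manual_judgement: fallback verdict from viewing episodes, if not listed.
    """
    if csm_rating is not None:
        return csm_rating >= 3
    if manual_judgement is None:
        raise ValueError("need a CSM rating or a manual judgement")
    return bool(manual_judgement)


print(is_educational(csm_rating=4))            # True
print(is_educational(csm_rating=2))            # False
print(is_educational(manual_judgement=True))   # True
```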

Screen media viewing context
To determine the social context of viewing, we looked at two variables: (1) solitary viewing vs. co-viewing; and (2) interactive co-viewing (verbal interaction while co-viewing) vs. passive or silent co-viewing (no verbal interaction while co-viewing). Similarly to how we calculated the most frequently viewed content types, for each child, we identified the most frequent type of viewing context based on the number of total minutes that they engaged in each type. For example, if a child spent more time viewing media alone than co-viewing media with another person, their most frequent type of social context would be solitary viewing.

Reading prevalence
To determine the prevalence of reading in the child's environment, we looked at two variables: (1) how often the child is read to at home; and (2) the number of books available to the child at home.

Outcome variables
The outcome variables were derived from the JACDI. For the children aged 12 to 16 months (n = 18, M = 14.39 months, SD = 1.33 months), the outcome variables were expressive and receptive vocabulary size as measured by the JACDI-WG. For the children aged 17 to 36 months (n = 67, M = 27.75 months, SD = 6.01 months), we used the JACDI-WS to assess expressive vocabulary size (i.e., the number of words produced) and the mean length of the three longest utterances (M3L). Receptive vocabulary was not included for this group because the JACDI-WS does not assess receptive vocabulary for children above 16 months.

Results
In this section, we first report descriptive information on the screen media use and reading practices among children in our sample and then move on to address our primary research question: to what extent does each of the three screen media use parameters (quantity, content, and context) predict language outcomes among children under 3?

Descriptive statistics
Quantity of screen media exposure

Quantity of foreground screen media exposure
Based on the diary data, compared to time spent in book reading and indoor play and/or outdoor play, screen media viewing/using was the most prevalent activity among Saudi young children (Figure 1). Children in the sample (including those who had never engaged in one or more of the activities) spent an average of 149.26 minutes (SD = 108.32 minutes) daily exposed to screens. The boxplots in Figure 2 show the distribution of time spent in different activities as per the diary data (including children who were never engaged in any of the activities). It should be noted that 7% of the children in the sample had never watched TV, 14% had never viewed mobile media, 2% were never exposed to screen media (TV and mobile media), 60% were never read to, and 47% had never played outdoors. The diary data is in stark contrast to what mothers reported in the survey about their evaluation of their children's screen time; the majority indicated that their children watch TV and use mobile media "moderately" or "rarely" (TV: 79%; mobile media: 72% according to the survey data). We compared the two age groups with regard to their screen time and found that 95% of all children below 2 and 91% of all children above 2 in the sample exceeded the screen time recommendations of the World Health Organization (2019) and the American Academy of Pediatrics (2016), which both call for no screen time for children under 2 and no more than 1 hour for children aged 2-5 years. Older children (2 to 3 years) spent significantly more time viewing TV and using mobile media devices than younger children (1 to 2 years) (younger age group: M = 117.24 minutes, SD = 86.43 minutes; older age group: M = 180.53 minutes, SD = 118.95 minutes; t(83) = 2.80, p = .006).
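The reported group comparison can be checked from the summary statistics alone. The sketch below assumes a pooled-variance (Student's) independent-samples t-test, which is consistent with the reported 83 degrees of freedom, and takes the group sizes (n = 42 and n = 43) from the Analysis section; the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from group summaries."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df


# Younger group (1 to 2 years) vs. older group (2 to 3 years), minutes per day.
t_stat, df = pooled_t(117.24, 86.43, 42, 180.53, 118.95, 43)
print(round(t_stat, 2), df)  # 2.8 83

# The weighted mean of the two groups recovers the overall sample mean.
overall_mean = (42 * 117.24 + 43 * 180.53) / 85
print(round(overall_mean, 2))  # 149.26
```

Both values agree with the figures reported in the text: t(83) = 2.80 and an overall mean of 149.26 minutes per day.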

Prevalence of background screen media exposure
Over half of the mothers (59%) indicated in the survey that TV was "often" (32%) or "always" (27%) left on in the background at their homes even if no one was actually watching it.
Onset age of screen media exposure
As per the survey data, the average age of starting to watch TV among 1- to 3-year-old children was about 13 months (M = 12.76; SD = 7.38), while the average age of starting to view or use mobile media was about 18 months (M = 17.82; SD = 7.43). Fifty-six percent of the children in the sample started watching TV at the age of 2 years or earlier, and 78% started using mobile media at the age of 2 years or earlier.

Content of screen media exposure
Content of screen media based on target audience
Based on the diary data, children in the sample watched child-directed media more than adult-directed media on both screen types (TV: 83%; mobile media: 87%).

Content of screen media based on genre
The most viewed media content genre on all screens, as per the diary data, was child-directed non-educational content (viewed the most by 56% of the sample), followed by children's songs and rhymes (Arabic songs on TV and Arabic and English songs on mobile media, viewed the most by 35% of the sample). The most frequently viewed content genre on TV alone was child-directed non-educational content (55%), which was the most frequently viewed content type among only 24% of mobile media users. Child-directed educational programming was more often viewed on mobile media screens (13%) than on TV screens (5%). The most frequently viewed content type on mobile media devices was children's songs and rhymes (44%), viewed the most on TV by 40% of the sample. Two additional content genres were included when exploring the types of content young children viewed on mobile media devices, namely unboxing videos and browsing photos and videos, as we found that children frequently view these genres on mobile media devices. Browsing photos and videos was the most frequently viewed content on mobile media devices in 13% of the sample, while watching unboxing videos was the most frequently viewed content in 7% of the sample.

Content of screen media based on its language variety
The diary data showed that the most viewed language variety on TV was MSA (58% vs. 32% among mobile media users), followed by Non-Saudi/Gulf Colloquial Arabic (22% vs. 5% among mobile media users). The most viewed language variety on mobile media devices was English (37%), though English accounted for only 6% of TV viewing.

Social context of screen media viewing
Solitary viewing vs. co-viewing
Co-viewing/co-using screens with mothers was the most frequent type of viewing among both TV viewers (55%) and mobile media users (43%) as per the diary data. On all screen types, co-viewing media with fathers was not the most frequent type of viewing for any of the children in the sample. Co-viewing/co-using media with both parents was the most frequent form of viewing in 12% of TV viewers, while no mobile media users were found to have co-viewing with both parents as being the most frequent type of viewing. Co-viewing was more frequent on all screen types than solitary media use. Solitary viewing/using was far more common in mobile media use (36%) than in TV viewing (3%).
Interactive co-viewing vs. passive co-viewing
Passive co-viewing of TV was more common than interactive co-viewing, as the diary data revealed that it was the most frequent type of co-viewing in 73% of the sample. No comparable data were available for mobile media co-use.

Prevalence of reading
Reading to young children in our sample was very infrequent as per the diary data. Sixty percent of the mothers in this study "never" read to their children, and only 9% read to their children every day. In addition, about one third (34%) of the children in the sample had no books at home, 15% had 1-2 books, and 29% had 3-9 books. Table 3 and Table 4 show the descriptive statistics for the predictor variables and outcome variables that were included in the regression models for each age group.

Regression analyses
Younger children: 12 to 16 months
We conducted a multiple linear regression examining the association between the predictors in Table 2, as well as gender and age, and the number of words understood by children aged 12 to 16 months. The multiple linear regression was fitted to the data of the younger children to estimate the degree of influence of each predictor on the number of words that these children understood. The parameter estimates of the model are shown in Table 5. The p-values for all predictors showed non-significant effects.
The F-ratio test indicated no significant difference between the fitted regression model and the null model, F(7, 3) = 1.853, p = .330. Using stepwise selection, the best model was the one that included media content, media context, reading, SES, and age as predictors (AIC = 99.71). The adjusted R² values for the full model (adjusted R² = 0.374) and the reduced model (adjusted R² = 0.544) indicated that the reduced model was a better fit in describing the variation in the number of words understood by the younger children group. There was no significant difference between the full model and the reduced model, which indicates that the additional variables in the full model did not contribute to explaining the variation in the response, F(3, 5) = 0.32, p = .749. Table 6 shows the parameter estimates of the reduced model. Screen media context contributed significantly to explaining the variation in the number of words understood by the younger children group. A one-unit increase in the composite score of screen media context is expected to increase the number of words understood by 116.76 words.
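As a consistency check on the full-model statistics above, the overall F-ratio can be recovered from the adjusted R². The sketch below assumes an ordinary least-squares model with an intercept, k = 7 predictors, and n = 11 children (both implied by the reported degrees of freedom, F(7, 3)). The function names are illustrative and are not part of the analysis reported here.

```python
def r2_from_adjusted(adj_r2, n, k):
    """Invert adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - adj_r2) * (n - k - 1) / (n - 1)

def overall_f(r2, n, k):
    """F-ratio of a k-predictor OLS model against the intercept-only (null) model."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# ASSUMED from F(7, 3): k = 7 predictors and n - k - 1 = 3 residual df, so n = 11.
n, k = 11, 7
r2_full = r2_from_adjusted(0.374, n, k)
f_full = overall_f(r2_full, n, k)
print(f"R^2 = {r2_full:.3f}, F({k}, {n - k - 1}) = {f_full:.3f}")
```

With these inputs the sketch gives F of about 1.85, in line with the reported F(7, 3) = 1.853. With only three residual degrees of freedom, estimates from the full model are necessarily imprecise.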
Next, we ran a multiple linear regression examining the association between the predictors and the number of words produced by children aged 12 to 16 months. Table 7 shows the parameter estimates of the fitted model.
The F-ratio test indicated no significant difference between the full model including the predictors and the null model, F(7, 3) = 1.30, p = .454. The stepwise regression only retained screen media context and age in the regression model. This reduced model gave the lowest AIC of 55.98. The adjusted R² values for the full regression model (adjusted R² = 0.172) and the reduced regression model (adjusted R² = 0.585) indicated that no additional information was explained by adding other variables to the reduced model, F(3, 8) = 0.20, p = .941. Table 8 shows the parameter estimates of the reduced model. Screen media context had a significant positive association with the number of words produced by the younger children group. The number of words produced is expected to increase by 7.54 words with a one-unit increase in the composite score of screen media context. This effect is significant at α = .05. In summary, children whose caregivers co-engaged with them in viewing/using screens, and verbally interacted with them while co-viewing, had larger expressive vocabulary scores on the JACDI-WG than their counterparts.
Older children: 17 to 36 months
As described above, we conducted a multiple linear regression examining the association between the predictors and the number of words produced by children aged 17 to 36 months. Table 9 presents the effect of each predictor on the number of words produced, as described by the full multiple regression model.
The F-ratio test was used to compare the contribution of the full regression model in describing the relationship between the response and the independent variables against the null model. A significant difference was found, F(7, 48) = 11.62, p < .001. A stepwise regression was carried out to select variables that decreased the AIC value. The lowest AIC value (573.50) was obtained when media context, family SES, and child gender were removed from the model. To compare regression models with different numbers of predictors, the adjusted R² was obtained. The adjusted R² values suggest no differences in model fit in describing the variation in the raw number of words produced between the reduced model (adjusted R² = 0.571) and the full model (adjusted R² = 0.575). There was no significant difference between the full model and the reduced model, F(48, 51) = 1.15, p = .340. Table 10 presents the effects of each predictor on the number of words produced, as described by the reduced multiple regression model. Reading had the largest positive impact on the number of words produced; the number of words produced is expected to increase by 36.36 words with a one-unit increase in the reading composite score. As explained in the Analysis section, the reading composite score would increase if the frequency of reading to the child and the number of books at home increased. Screen media quantity also showed a significant positive association with the number of words produced. A one-unit increase in the screen media quantity composite score is expected to increase the number of words produced by 34.51 words. It should be noted that a higher screen media quantity score does not mean more screen time, but rather means better screen media use practices (as detailed in the Analysis section).
In other words, the less children aged 17 to 36 months were exposed to foreground and background screen media, and the older they were when they started viewing screens, the higher their expressive vocabulary scores. Age was also a significant predictor of the number of words produced: the older the children, the higher their expressive vocabulary scores. Finally, we conducted a multiple linear regression for the association between the predictors and the mean length of the three longest utterances (M3L) produced by children aged 17 to 36 months. Table 11 presents the effect of each predictor on the M3L produced, as described by the full multiple regression model.
The F-ratio test indicated a significant improvement in the prediction of the fitted regression model against the null model, F(7, 48) = 9.88, p < .001. Using stepwise selection, we found that dropping screen media quantity, screen media content, and SES gave the best model (the lowest AIC value of 47.77). The adjusted R² values suggest that the reduced model (adjusted R² = 0.54) was slightly better than the full model (adjusted R² = 0.53) in describing the variation of the M3L. There was no significant difference between the full model and the reduced model, F(48, 51) = 0.51, p = .676. Table 12 shows the parameter estimates of the reduced model. The reading prevalence score and age both had positive effects on the M3L outcomes, while the screen media context score had a negative effect. A one-unit increase in the composite score of reading prevalence is expected to increase the M3L by 0.29 words. A one-unit increase in the screen media context composite score is expected to decrease the M3L by 0.32 words.
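The full-versus-reduced comparisons reported in this section are nested-model F-tests. The sketch below illustrates the calculation for the number of words produced in the older group, using R² values back-calculated from the rounded adjusted R² values. It assumes n = 56 children (implied by F(7, 48)), 7 predictors in the full model, and 4 in the reduced model. The function names are hypothetical, and the numerator and denominator degrees of freedom used here (3 and 48) are an assumption about the comparison rather than figures stated in the text.

```python
def r2_from_adjusted(adj_r2, n, k):
    """Invert adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - adj_r2) * (n - k - 1) / (n - 1)

def nested_f(r2_full, r2_reduced, n, k_full, k_reduced):
    """Partial F statistic for dropping (k_full - k_reduced) predictors from an OLS model."""
    df_num = k_full - k_reduced      # number of predictors dropped
    df_den = n - k_full - 1          # residual df of the full model
    f = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f, df_num, df_den

n = 56  # ASSUMED from F(7, 48): 7 predictors, 48 residual df
r2_full = r2_from_adjusted(0.575, n, 7)      # full model
r2_reduced = r2_from_adjusted(0.571, n, 4)   # context, SES, and gender removed
f_stat, df_num, df_den = nested_f(r2_full, r2_reduced, n, 7, 4)
print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
```

With these rounded inputs the statistic is about 1.16, close to the reported value of 1.15. A small F of this kind indicates that the three dropped predictors add little explanatory power.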

Discussion
The aim of the current study was to examine the extent to which each of the three screen media use parameters (quantity, content, and context) predicts language outcomes among children under 3 in a group of 85 Saudi Arabic-speaking toddlers. Our analysis revealed two main findings. First, for children aged 12-16 months, screen media context (i.e., the frequency of interactive joint media engagements with the child) correlated positively with expressive and receptive vocabulary size. Second, for 17- to 36-month-olds, screen media quantity correlated negatively with expressive vocabulary scores, whereas reading correlated positively with both expressive vocabulary scores and the mean length of the three longest utterances children produced (M3L). Our first main finding shows that children whose caregivers co-engaged with them in viewing/using screens and verbally interacted with them while co-viewing had larger expressive and receptive vocabularies than their counterparts who engaged in solo media use or had passive non-interactive co-viewing. This finding adds to emerging evidence suggesting a positive association between interactive joint media engagements and early language development (e.g., American Academy of Pediatrics, 2016; Courage, 2017; Dore et al., 2020; Myers et al., 2018; Strouse et al., 2018). In line with Bronfenbrenner's (1979) ecological systems theory, this study shows that the involvement of caregivers, who are part of the child's most immediate context (i.e., microsystem), with their children during screen media use positively mediates the effects screens can have on children. It is well known that parent-child interactions are exceptionally important for early language development (e.g., Hart & Risley, 1995; Rodriguez & Tamis-LeMonda, 2011). Therefore, maintaining positive caregiver-child interactions through verbally interacting during and/or after co-viewing can mitigate adverse effects of screen media use on early language development.
Interactive joint media engagements provide children with opportunities for receiving contingent responses and increase conversational turn-taking, both of which are conducive to vocabulary growth (Gilkerson et al., 2018; McGillion et al., 2017). Conversational turns, and not just the sheer quantity of words, have been recently highlighted as key in affecting children's verbal skills (Donnelly & Kidd, 2021; Romeo et al., 2018). Screens, in that sense, can be utilized as prompts for additional, more diverse parent-child interactions. It should be noted that interactive co-viewing may not be beneficial in and of itself but may indicate a generally more supportive home environment. It may be that caregivers who engage in interactive co-viewing of media with their young children are generally more engaged in verbal social interactions with their children throughout the day and/or are more sensitive and warm caregivers, which is known to have a positive effect on child development.
Unlike its positive relation to language outcomes in younger children, screen media context (i.e., the frequency of interactive joint media engagements with the child) was negatively correlated with the M3L of toddlers older than 16 months. The child's age is an important factor that could explain the discrepancies found for the effects of JME on younger versus older children. Previous studies (e.g., Ewin et al., 2021; Wood et al., 2016) have found that parents provide less scaffolding during JME with older children than with younger children. Hence, this reduction in supportive interactions may result in negative language outcomes. Another possible explanation is that as children get older, they become more capable of understanding media content and following programming, and having a parent interact with them during viewing may disrupt the flow of the program. As the majority of programming viewed on TV was in MSA, and the majority of programming viewed on mobile media devices was either in MSA or English, it could be confusing or distracting for older children, who are more capable of understanding MSA or English, to be spoken to in their home language variety (i.e., Saudi Arabic) during viewing, which is another possible explanation for the negative link found here. This is further complicated by the diglossic contexts (Ferguson, 1959) in which Arabic-speaking children grow up and is worthy of future investigation.
It should also be remembered that our study did not include an analysis of the interactional features of parental talk while co-viewing. Previous research has indicated that features of parent-child interactions while co-viewing vary depending on several factors, including the types of screen media content and the child's age, and that this variation in parental speech has been linked to differences in outcome measures. For example, Sims and Colunga (2013) found that parents of 30- to 36-month-old children used four types of language when talking to their children during co-viewing: tag questions, label elicitation and feedback, narrating, and wh-questions and explicit labelling. Co-viewing was negatively associated with retention of word learning only when parents used more narrating during the co-viewing (Sims & Colunga, 2013). Similarly, it is possible that certain interactional features that caregivers in our sample used when co-viewing with their older children might have contributed to the more negative language outcomes. Future research is needed to further explore these possibilities.
Our second main finding shows that among the older children within our sample (17- to 36-month-olds), screen media quantity (i.e., the amount of time a child spends daily viewing screens, the prevalence of background TV in the child's environment, and the onset age of screen media viewing) correlated negatively with expressive vocabulary. This finding highlights the influence of the sheer volume of screen media time over and above the other variables studied. This finding also supports previous research indicating a negative association between the amount of screen time and language outcomes (e.g., Chonchaiya & Pruksananonda, 2008; Duch et al., 2013; Dynia et al., 2021; Hill et al., 2020; Supanitayanon et al., 2020; Tomopoulos et al., 2010; van den Heuvel et al., 2019). The influence of screen media use, as a molar activity the children engage in on a daily basis, further supports Bronfenbrenner's views.
Despite its significant effect on the expressive vocabulary size of the older group, screen media quantity was not significantly correlated with language outcomes in the younger group. This is surprising as the majority of previous research indicates a negative association between the amount of screen time and language skills in infants and toddlers under two years (see Madigan et al., 2020 for a review). It is not clear why there was an association between screen media quantity and expressive vocabulary size in the older group, but not in the younger group. Thus, investigating the effects of the amount of screen time on language skills in larger samples of Arabic-speaking children under 16 months of age is worthy of future research.
There are concerns that the increasing use of technology is leading to a notable decline in reading and play among children (American Academy of Pediatrics, 2016; Anderson & Subrahmanyam, 2017; Frost, 2012; Seo & Lee, 2017). In the current study, we found that screen time was the most prevalent activity among Saudi children under 3 years of age when compared to time spent in reading or playing outdoors, which are two activities that have been found to support language development. Confirming previous findings, this study found that the frequency of reading to toddlers is very low in Saudi Arabia. It has been well established that reading is one of the home literacy environment components that are most significantly and positively linked with concurrent and long-term literacy and language outcomes (e.g., Bus et al., 1995; Farrant & Zubrick, 2011; Whitehurst & Lonigan, 1998). In our study, time spent in shared reading activities was predictive of expressive vocabulary and M3L in the older age group, though it was not predictive of vocabulary outcomes in the younger age group, likely because it was a very infrequent activity in younger children. In addition, outdoor play and direct experiences in outdoor settings foster opportunities for child-directed speech, verbal communications, and language development (e.g., Cameron-Faulkner et al., 2017; Cameron-Faulkner, Melville & Gattis, 2018; O'Brien & Murray, 2007). Playing outdoors was low among the children in our sample, which could be attributed to the hot weather in the country and the lack of green space, parks, and outdoor play areas. Although our findings are not able to shed light on whether the increased screen media use directly displaces time spent reading and playing outdoors, our results regarding the discrepancies between time spent on these activities warrant further investigation.
The type of screen media content, as measured by the screen media's target audience, educational value, and content language, was not significantly correlated with language outcomes in either age group. This finding further supports the notion that infants and toddlers do not seem to benefit from educational content viewed on screens (DeLoache et al., 2010; Krcmar, 2014; Neuman et al., 2014; Roseberry et al., 2009; Tomopoulos et al., 2010). It should be noted that there are other variables that could have been included within the content parameter and could have shown different results. For instance, we did not examine the formal features of the content viewed (e.g., rapid pacing, visual special effects, frequent camera cuts, loud music, non-speech vocalizations), the interactivity and contingency features of the content viewed, or the language- and literacy-promoting strategies employed in the content viewed. To examine these variables, a more detailed qualitative multimodal content analysis would be necessary.
Child age was a significant predictor of language outcomes in the older group but not in the younger group. This is probably attributable to the wider range of ages in the older group (17-36 months) compared to the narrower range in the younger group (12-16 months). It is also in line with studies confirming a vocabulary spurt, a rapid increase in the rate of vocabulary acquisition starting at around 18 months of age (Fenson, Dale, Reznick, Bates, Thal & Pethick, 1994; Goldfield & Reznick, 1990).
Child gender was not significantly correlated with language outcomes in either age group. This finding is in contrast with findings in English-speaking countries (e.g., Fenson et al., 1994; Lange, Euler & Zaretsky, 2016) as well as non-English-speaking European countries indicating that young girls typically outperform boys on language measures in general and expressive vocabulary in particular. However, our finding is in line with studies on Arabic-speaking children. For example, Al-Akeel (1998) found no gender differences in comprehension skills in Saudi children aged 3 to 6 years. More recently, Abdelwahab, Forbes, Cattani, Goslin and Floccia (2021) also did not find significant gender differences in the vocabulary outcomes of 8- to 30-month-old Arabic-speaking children.
There are a number of key strengths associated with the current study. First, this is one of few studies that have attempted to provide a comprehensive understanding of children's screen media exposure by taking into account not only the amount of time children spend with screens (quantity), but also what children watch (content), and how they watch it (context), as well as the associations between each of these variables and children's language outcomes. Second, this study used an extended version of detailed weekly diaries over a period of 7 weeks to track children's screen media use. Most diary studies have utilized only 1 to 2 days of data and assumed they were representative of other weekdays. Finally, we collected data by using both a diary and a survey. The use of both instruments enabled us to collect rich information about children's daily routines through diaries as well as information about children, parents, and home environments that were not possible to collect with the diaries such as demographic information, the onset age of screen media use, and the number of books at home.
There are, however, also some limitations. First, parent-report measures, in general, are susceptible to socially desirable answers, recall bias, and memory lapses. Secondly, the contradictory results found between the two age groups with regard to the association between the social context of media use and language outcomes call for further research on this topic. Future studies may benefit from directly observing what caregivers actually do or say during co-viewing. Thirdly, the sample size in the younger age group means that the findings are, to some extent, exploratory and that further research is needed. Finally, we did not ask parents to report in the diaries whether they verbally interacted with their children during co-viewing of mobile media. This data point would have been valuable for the study and should be included in any future research on this topic.
In today's rapidly changing media landscape, understanding children's media use patterns (especially in the early critical developmental years) and examining their association with children's health and development are of extreme importance. This study provides a comprehensive picture of the screen media environment of young children by considering the quantity of the time spent with traditional and new media, several content features of the screen media available to children, and the social context of screen media engagements among children. An important take-home message from our study is that WHAT young children watch or HOW MUCH they watch it is not as important to their language development as HOW they watch. Findings from this study and from a large body of prior research continue to show that talking to children matters.