Investigating the Mechanisms Driving Referent Selection and Retention in Toddlers at Typical and Elevated Likelihood for Autism Spectrum Disorder

Abstract It has been suggested that children's referent selection may not lay memory traces strong enough to support retention of new word-object mappings. If this were the case, we would expect incorrect selections to be easily rectified through feedback. Previous work suggested this to be the case in toddlers at typical likelihood (TL) but not in those at elevated likelihood (EL) for autism spectrum disorder (ASD) (Bedford et al., 2013). Yet group differences in lexical knowledge may have confounded these findings. Here, TL (N = 29) and EL toddlers (N = 75) chose one of two unfamiliar objects as a referent for a new word. Both groups retained the word-referent mapping above chance when their choices were immediately reinforced but were at chance after corrective feedback. The same pattern of results was obtained when children observed an experimenter make the initial referent choice. Thus, children's referent choices lay memory traces that compete with subsequent correction; these strong word-object associations are not a result of children actively choosing potential referents for new words.


Introduction
Naturalistic word learning situations are often ambiguous, with many competing, equally plausible referents (Trueswell, Medina, Hafri, & Gleitman, 2013). A first step in word learning is to determine which of the many objects in the visual scene may be the intended referent of a new word. In some instances, the identity of the referent is ostensively indicated by the speaker holding, pointing, or gazing towards it; in the absence of these cues, the child needs to make use of alternative strategies. Previous research has shown that infants as young as 12 months may be able to track co-occurrences of words and objects to infer the most likely mappings (Smith & Yu, 2008). Later in development, toddlers were shown to use various heuristics: for example, they build on their emerging lexical knowledge to rule out potential referents and fast map new words onto nameless objects, a strategy sometimes referred to as 'mutual exclusivity' (Halberda, 2003; Merriman & Schuster, 1991).
Despite the advantages that many referent selection strategies seem to confer on word learning, in a seminal 2008 paper, Horst and Samuelson showed that although 24-month-olds made correct referent choices through mutual exclusivity, they could not remember the word-referent associations after a 5-minute delay. Successful retention was observed only when the child's choice had been immediately reinforced by an experimenter ostensively labelling the object. Horst and Samuelson (2008) suggest that poor retention of new name-object mappings formed during referent selection may be due to competition for attention as children select a potential referent, or competition for memory, as children have to hold the referent's (but not the distractor's) properties and name in mind. In contrast to the ambiguity associated with children's referent mapping, it was suggested that ostensive referential cues, such as holding an object while labelling it, may support retention by reducing the competition created by other objects present when the naming event occurs (McMurray, Horst & Samuelson, 2012; Axelsson, Churchley & Horst, 2012). Other researchers suggested that, when children make a referent choice, they set probabilistic links between words and alternative referents (Fazly, Alishahi & Stevenson, 2010; Kachergis, Yu & Shiffrin, 2012). Accordingly, poor retention following children's referent selection may result from association strength reflecting this probabilistic distribution, in contrast to the exclusive association set when one referent is singled out through ostension. Yet another line of work has challenged the view that children can entertain multiple possible word-referent mappings and proposed that referent disambiguation proceeds by testing unique hypotheses, which are subsequently rejected or confirmed with further experience or when feedback is provided: the "propose-but-verify" account (Trueswell et al., 2013; Berens, Horst & Bird, 2018). In support of this latter view, 2-year-olds were shown to retain no memory trace of associations between words and alternative referents (Woodward, Gleitman & Trueswell, 2016).
The idea that children's choices in referent disambiguation studies reflect unique hypotheses seems at odds with findings that these choices bear little weight in the long-term retention of word-referent mappings. However, assessing the relative weight given to children's referent choices is limited if we only investigate the impact of reinforcement on retention, as previous studies have done; Horst and Samuelson's (2008) findings are compatible both with (1) children's referent choices making no contribution to retention and with (2) children's referent choices making a substantial contribution that nonetheless needed an extra nudge to lead to successful retention. Only by pitting the child's choices and feedback against each other, through corrective feedback, can we tease these options apart.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder, characterized by social communication impairments and restricted, repetitive behaviors. Speech and language delays are the most common reason for referral of toddlers for assessment for ASD and predict long-term outcomes (for a review, Eigsti, de Marchena, Schuh & Kelley, 2011). By 12 months of age, toddlers with an older sibling with ASD, who have a 20% likelihood of being diagnosed themselves (henceforth at elevated likelihood, EL; Ozonoff, Young, Carter, Messinger, Yirmiya, Zwaigenbaum, Bryson, Carver, Constantino, Dobkins, Hutman, Iverson, Landa, Rogers, Sigman & Stone, 2011), already have smaller vocabularies than their peers (for a systematic review, Garrido, Petrova, Watson, Garcia-Retamero & Carballo, 2017). To understand whether the use of feedback may explain these differences in vocabulary, Bedford, Gliga, Frame, Hudry, Chandler, Johnson, Charman, and BASIS Team (2013) compared the impact of ostensive feedback that was either reinforcing or corrective, depending on whether or not the child's initial referent choice was correct, in 24-month-olds with and without a family history of ASD. Following ostensive feedback, typically developing toddlers with no family history of ASD retained the correct word-referent mapping at above chance level even when their first choice had been incorrect, suggesting that the child's choice bore relatively little weight. In contrast, toddlers with an older sibling with ASD performed at chance when their first choice was incorrect.
In addition to social and communication differences, longer visual disengagement latencies were described in infants at EL for ASD (Elsabbagh, Volein, Holmboe, Tucker, Csibra, Baron-Cohen, Bolton, Charman, Baird & Johnson, 2009; Bryson, Zwaigenbaum, McDermott, Rombough & Brian, 2008; but see Fischer et al., 2016, for typical disengagement in toddlers with ASD). Therefore, to explain the performance of toddlers at EL for ASD, Bedford et al. (2013) reasoned that although experimenters made sure that the child was attending to the object when they provided feedback by labelling the referent, covert attention shifts from one object to the other, when receiving corrective feedback, may have lagged in EL toddlers compared to toddlers at typical likelihood for ASD (henceforth TL), impairing their ability to use corrective input.
However, there is another potential interpretation of the differential performance between TL and EL toddlers following corrective feedback; making an incorrect choice in Bedford et al. (2013) meant that children selected the familiar object as a potential referent for the new word. This could have been the result of a delay or failure to access the familiar object's label. While the frequency of these errors was similar in both groups, eventually, or on correction, TL toddlers but not EL toddlers may have successfully retrieved the familiar label, which would have helped with re-assigning the new label to the unfamiliar object, during corrective feedback. Bedford et al. (2013) found an association between performance when receiving corrective feedback and vocabulary size. This was interpreted as evidence that using corrective feedback may be consequential for vocabulary growth and might explain in part why children at elevated familial likelihood of ASD have smaller vocabularies. Yet this association is also compatible with the opposite direction of causality, which is that larger vocabularies, associated with faster lexical access (Fernald, Perfors & Marchman, 2006), helped TL toddlers' performance in the correction trials.
In order to address this potential confound, in the current study we made changes to the procedure used in Bedford et al. (2013). Here, both objects in the referent selection trials were novel. The child's referent choice in response to a new label was then either reinforced or corrected by an experimenter. This design removed the confounding influence of lexical knowledge while also allowing us to control the number of the objects for which correction or reinforcement was received. As in Bedford et al. (2013) participants were toddlers with typical or elevated likelihood for ASD. If differences between EL and TL participants were previously confounded by lexical access, both EL and TL participants are expected to show above chance retention only when their choices are reinforced, not corrected. This outcome would suggest that children's choices contribute more to retention than previously suggested. It would also suggest that use of feedback is not atypical in toddlers with familial history for ASD.
Contingent on this potential outcome and to further shed light on the processes underlying children's referent selection, a second experimenter took turns with the participant in making an initial choice. Evidence brought in support of propose-but-verify accounts was criticized because it could not tease apart two mechanisms underlying unique referent-word mapping: the active generation of hypotheses versus an attentional bias towards one of the referents, either due to not having enough time to attend to more than one of the objects present (Yu & Smith, 2012) or resulting from attention being drawn post-selection to a randomly selected referent (Samuelson, personal communication). Previous work suggested that active involvement in learning (Begus, Gliga & Southgate, 2014), in particular actively generating a hypothesis, leads to better knowledge consolidation than being presented with the same information passively (Markant & Gureckis, 2014). If children's referent choices are indeed a manifestation of hypothesis testing, we expect them to contribute more to retention compared to a condition in which toddlers passively observed the same choice made by another person. This would manifest in better performance after correction, and worse performance after reinforcement, when the choice was made by an experimenter as compared to when it was made by the child. Given limited prior work on active learning in toddlers with a family history of ASD, we made no prediction with respect to how this group would perform in this condition.

Methods
Ethical approval was obtained from the NHS Health Research Authority (NHS RES London REC 06/MRE02/73) and Birkbeck, University of London ethics committee.

Participants
The majority of toddlers took part in this study at their 24-month visit, as part of a longitudinal study, the British Autism Study for Infant Siblings (BASIS). The BASIS cohort has 143 participants, 116 EL and 27 TL; of these, 104 children (81 EL, 23 TL) took part in the current study. A power calculation using the means and standard deviations from Bedford et al. (2013) showed that a sample size of N = 28 was required to detect an effect of d = 0.56 with 90% power at alpha = 0.05 for the Reinforced condition, and N = 21 to detect an effect of d = 0.67 with 90% power at alpha = 0.05 for the Corrected condition. We therefore recruited an additional 10 typically developing children for this particular study, from a volunteer database at Birkbeck, University of London. These children also had at least one older sibling and no reported first-degree relatives with an ASD diagnosis. Of the 114 children (81 EL, 33 TL) who took part in the current study, 3 were excluded due to technical problems (2 EL, 1 TL) and 7 children (4 EL, 3 TL) were excluded because they completed no valid memory trials (2 EL and 1 TL had not completed the familiarization trials and a further 2 EL and 2 TL passed the familiarization trials but had no other valid trials), leaving a final sample size of N = 104 (75 EL, 29 TL).
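The sample sizes quoted above can be roughly reproduced with a textbook normal approximation. The sketch below is not the authors' calculation: it assumes a one-sided, one-sample test of performance against chance, which the text does not state, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def approx_sample_size(d, alpha=0.05, power=0.90):
    """Normal-approximation n for a one-sided, one-sample test of effect size d.

    Assumption: the test is one-sided against chance; the paper does not
    specify which test was used for its power calculation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # one-sided critical value
    z_power = z.inv_cdf(power)      # quantile matching the desired power
    return ceil(((z_alpha + z_power) / d) ** 2)


for label, d in [("Reinforced", 0.56), ("Corrected", 0.67)]:
    print(label, approx_sample_size(d))
```

Under these assumptions the approximation gives 28 and 20, close to the reported 28 and 21. An exact calculation based on the t distribution typically asks for about one more participant than the normal approximation, which may account for the difference.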
EL participants were assigned to the group as a result of having an older sibling (proband) diagnosed with ASD. Proband diagnoses were confirmed by a clinician using the parent-report Social Communication Questionnaire (SCQ; Rutter, Bailey, & Lord, 2003) and Development and Wellbeing Assessment (DAWBA; Goodman, Ford, Richards, Gatward, & Meltzer, 2000). 77 probands met the criteria for ASD on both SCQ and DAWBA, while 8 did not meet the threshold for SCQ but were included due to meeting criteria for DAWBA. For 19 probands, data were missing for the SCQ (5) or DAWBA (19). Exclusion criteria for both groups included known medical or neurological conditions and prematurity.

Materials
Stimuli were four small familiar objects and eight unfamiliar objects (figure S1). Familiar objects were a baby shoe, a toy car, a toy dog and a toy duck, chosen on the basis of children being familiar with their names, based on the MacArthur Bates Communicative Development Inventory estimates for 24 months of age (CDI; Dale & Fenson, 1996) and confirmed by parental report. The eight unfamiliar objects (a bottle stop, an egg poacher, an egg shaper, a lemon juicer, a black ring, a watering can head, a whisk and a honey dipper) were objects toddlers were unlikely to have been familiar with (see Supplementary Materials). We relied on the parent or the child indicating whether they knew any of the objects and removed trials where this was the case (one trial was removed). The four novel words were pseudo-words chosen for their compliance with English phonetics and phonotactics: dax, sefo, neem and moxi. Objects were presented on a rectangular tray that was covered with a cloth to prevent them from moving.

Procedure
The participant sat at a table either alone or on a parent's lap, with one experimenter (the administering experimenter) to their side and the other (the participating experimenter) facing them. The session was video-recorded using a set of two cameras, providing different angles of the experimenters and the child. There were three phases to the procedure: FAMILIARISATION, REFERENT SELECTION and RETENTION.

Familiarisation
In the first familiarization trial, the administering experimenter presented two of the familiar objects (e.g., duck and shoe) on the tray to the participating experimenter and, when the child was attending to the tray and the administering experimenter, addressed the participating experimenter "Can you see the shoe? Can you give me the shoe please?". The participating experimenter then put the shoe in the administering experimenter's hand, who said "Well done, this is the shoe, thank you." whilst showing the shoe to the child. In the second trial, the administering experimenter now addressed the child, asking for the other object, in a similar manner. If the child did not give, point or touch the object, the procedure was repeated, this time using the other pair of objects. If the child again did not engage with the objects, the study was discontinued.

Referent selection
There were four referent selection trials, two involving the child and two involving the participating experimenter. Before each trial, the administering experimenter made eye contact with the child and announced whether it was the child's or the participating experimenter's turn to answer a question, e.g., "Now it's [experimenter name/child's name] turn". On each trial, two unfamiliar objects were presented on the tray and the child or the participating experimenter was asked "Can you see the [novel word]? Can you give me the [novel word]? Can you give it to me?". These questions were only asked once the administering experimenter had ensured that the child was attending to the tray. After a choice was made, the response to the chosen object was manipulated so that each participant received both reinforcing and corrective feedback, in different trials. When a 'correct' choice had been made, the administering experimenter took the object from the chooser and, ensuring the child was looking at her, said "This is the [target word]. This is the [target word]" before passing both objects to the child to play with for a few seconds before the next trial. When an 'incorrect' choice had been made, the administering experimenter took the object from the chooser, placed it back on the table, then lifted the other object saying, again in view of the child, "This is the [target word]. This is the [target word]" before passing both objects to the child to play with for a few seconds before the next trial. For both types of feedback, and for choices made by either the child or the participating experimenter, the administering experimenter addressed the child, with mutual gaze, when providing the feedback. The trials were presented in a fixed order: Experimenter corrected, Participant reinforced, Participant corrected, Experimenter reinforced.
Two versions were created, in which different pairs of objects were assigned to each trial type (see Supplementary Materials), with half of the participants doing one version and the other half doing the other version. Which object was the correct choice was fixed within a version but counterbalanced across versions.
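The feedback rule described above can be summarised in a few lines of code. This is only an illustrative sketch: the function names are hypothetical, and the object pairing in the usage example is a placeholder, since the actual pairings are given in the Supplementary Materials.

```python
# Fixed trial order reported in the text: (who makes the initial choice, feedback type).
TRIAL_ORDER = [
    ("experimenter", "corrected"),
    ("child", "reinforced"),
    ("child", "corrected"),
    ("experimenter", "reinforced"),
]


def labelled_object(pair, chosen, feedback):
    """Return the object the administering experimenter labels as the target.

    Reinforcing feedback labels the chosen object; corrective feedback
    labels the other object of the pair.
    """
    if chosen not in pair:
        raise ValueError("chosen object must be one of the presented pair")
    if feedback not in ("reinforced", "corrected"):
        raise ValueError("feedback must be 'reinforced' or 'corrected'")
    other = pair[0] if chosen == pair[1] else pair[1]
    return chosen if feedback == "reinforced" else other


def score_retention(target, retention_choice):
    """Score a retention trial: 1 if the child picks the labelled object, else 0."""
    return int(retention_choice == target)


# Placeholder pairing, for illustration only.
pair = ("whisk", "honey dipper")
target = labelled_object(pair, chosen="whisk", feedback="corrected")
print(target, score_retention(target, "whisk"))
```

In a corrected trial, a child who later returns to the initially chosen object scores 0, which is why chance-level retention after correction suggests that the initial choice competes with the corrected label.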

Retention
After a 5-minute break in which the child freely played in the testing room with a set of toys which were not labelled, the retention trials began, with only the child and the administering experimenter participating. The pairs of novel objects were presented to the child in the same order as in the referent selection phase. The administering experimenter presented the objects on the tray to the child and said, "Can you give me the [target word]?"; she then placed her hand above the tray and in between the two objects. Irrespective of the choice made by the child, the experimenter said 'thank you' and moved on to the next trial until all four were complete.

Data analysis
Choices were video-coded for both referent selection and retention trials. Referent selection trials were coded for validity, with 14.9% (62/416) of trials considered invalid either due to error in trial administration or because the child made an invalid response (i.e., no object touched or given, both objects given simultaneously). One trial was removed because the child spontaneously named one of the unfamiliar objects (the whisk). Only those retention trials for which there was a valid initial referent choice were coded as correct/incorrect, based on the object given to the administering experimenter (or to the parent if the child was reluctant to give an object to the experimenter) or on the basis of the first object touched if no object was given. A further 29 retention trials (8.2%) were excluded because the child made no valid response (i.e., not touching an object, or giving both objects), leaving a total of 325/416 valid trials, 78% (83.6% for the TL group and 76% for the EL group; chi-square p = .112) for the analysis. Following data reduction, seven participants had no valid trials (4 EL and 3 TL participants), which meant that 97 participants were entered in the analysis. Thirty-six trials (13%) were rated by a second coder, and Cohen's κ was computed to determine agreement. Inter-rater agreement was very good, κ = .82, p < .001.
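For reference, Cohen's κ compares the observed agreement between two coders with the agreement expected by chance from each coder's marginal frequencies. The sketch below is a generic implementation, and the ratings in it are invented toy values, not the study's data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders rating the same trials."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of trials")
    n = len(coder_a)
    # Proportion of trials on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Toy ratings (1 = correct, 0 = incorrect), hypothetical values for illustration.
coder_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
coder_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(round(cohens_kappa(coder_a, coder_b), 2))
```

In this toy example the coders agree on 8 of 10 trials and chance agreement is 0.5, so κ = (0.8 − 0.5) / (1 − 0.5) = 0.6.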
Data were analysed using binary logistic Generalised Estimating Equation (GEE) models with a logit link function and an unstructured correlation matrix. This allows inclusion of participants even in the event of missing trials (Zeger & Liang, 1986). Word learning score was the outcome and predictors were Feedback type (Reinforced versus Corrected), Participant (Child versus Experimenter) and Group (Elevated Likelihood versus Typical Likelihood). In a second stage, two- and three-way interactions were added to the model. Models including child's age and sex as covariates are reported in the Supplementary Materials, but adding covariates did not change the significance level of any of the results. Models were also re-run excluding any children who went on to receive an ASD diagnosis (n = 7), to test whether these children were driving the effects. Again, removing these participants did not change the results (see Supplementary Materials).
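For readers less familiar with GEE, the main-effects model described above can be written out as follows. This is a sketch of the standard formulation (Zeger & Liang, 1986), not a transcription of the authors' analysis script; the 0/1 dummy coding of the predictors and the symbol names are assumptions made for illustration.

```latex
% Marginal model for retention trial j of participant i (Y_{ij} = 1 if correct).
% Assumed coding: Feedback, Participant and Group are 0/1 dummy variables.
\operatorname{logit} \mu_{ij}
  = \beta_0
  + \beta_1 \,\mathrm{Feedback}_{ij}
  + \beta_2 \,\mathrm{Participant}_{ij}
  + \beta_3 \,\mathrm{Group}_{i},
\qquad \mu_{ij} = \Pr(Y_{ij} = 1)

% Estimating equations solved for \beta, summing over participants i.
\sum_{i} D_i^{\top} V_i^{-1} \left( Y_i - \mu_i \right) = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \beta},
\qquad V_i = A_i^{1/2} \, R(\alpha) \, A_i^{1/2},
\qquad A_i = \operatorname{diag}\!\left\{ \mu_{ij} (1 - \mu_{ij}) \right\}
```

With an unstructured working correlation, each pairwise correlation in R(α) among a participant's (up to four) trials is estimated separately. The second-stage models add products of the predictors to the linear predictor.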

Communicative Development Inventory (CDI; Dale & Fenson, 1996)
During the visit, parents completed the CDI (Dale & Fenson, 1996), a parent-report measure of vocabulary. Receptive vocabulary was calculated as the total number of words marked 'understood' plus those marked 'understood and said' (see Table 1).
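The receptive vocabulary score described above amounts to a simple count. A minimal sketch, assuming each CDI item is stored as a word mapped to the parent's response; the data layout, the function name and the "not known" label are hypothetical.

```python
def receptive_vocabulary(responses):
    """Count words marked 'understood' or 'understood and said'."""
    counted = {"understood", "understood and said"}
    return sum(1 for answer in responses.values() if answer in counted)


# Hypothetical responses for illustration only.
example = {
    "shoe": "understood and said",
    "car": "understood",
    "duck": "understood and said",
    "whisk": "not known",
}
print(receptive_vocabulary(example))  # 3
```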

Results
Sample descriptive statistics show that toddlers at elevated likelihood of ASD had smaller vocabularies and more ASD traits than toddlers at typical likelihood for ASD (see Table 1). A binary logistic GEE showed a significant main effect of Feedback on word learning (p = 0.002), with better word learning following reinforcement (i.e., when the initial choice was correct) versus correction (i.e., when the initial choice was incorrect). There was no significant effect of Participant (i.e., whether the initial choice was made by the child or the experimenter), and no main effect of Group. There were also no significant 2- or 3-way interactions (see Table 2 for full results).

Discussion
The present study aimed to clarify the mechanisms underlying children's fast mapping of words onto referents. A previous study found that typically developing toddlers, with no family history of ASD, readily used corrective feedback to update an incorrect word-object mapping. This finding was in line with the suggestion that children's referent selections in conditions of ambiguity yield weak word-object associations, which strengthen gradually as children encounter correct word-object co-occurrences again and again (Kucker & Samuelson, 2012). In contrast, the same study showed that toddlers at elevated likelihood for ASD performed at chance following corrective feedback, consistent with them not updating their knowledge following correction. In the present study we aimed to rule out the possibility that this group difference resulted from differences in lexical knowledge, by asking toddlers at typical or elevated likelihood for ASD to choose the referent of a new word from two unfamiliar objects. As in Bedford et al. (2013), having received reinforcing feedback on their initial word-object mappings, participants scored above chance on memory retention trials 5 minutes later. Importantly, and in contrast to Bedford et al. (2013), when receiving corrective feedback, all participants, not only EL participants, performed at chance level in the memory trials, with no effect of group. Bedford et al. (2013) explained EL toddlers' performance in corrective feedback trials as resulting from perseveration or difficulties with shifting attention, characteristic of this population. Yet, TL toddlers' successful use of the corrective feedback may have been confounded by their more robust lexical knowledge (realising that their choice of a familiar object, as a referent for the novel word, was actually incorrect).
With that confound removed, in the current study, we find evidence that supports the view that word-object mappings created when children make a referent selection are strong enough to compete with ostensive corrective feedback.
We have laid out in the introduction various accounts of the mechanisms underlying children's referent disambiguation in word learning. Some had suggested that self-generated mappings are probabilistically distributed over potential referents (Kachergis et al., 2012). Although our findings do not provide a strong case against this view, they do not support a scenario in which weights would be distributed equally between the two potential referents; had this been the case, we would expect similar, above chance performance following corrective or reinforcing feedback. While performance following correction versus reinforcement of a child's choice does suggest that children encode the chosen word-object association in memory, these trials cannot tell us whether children's choices reflect active hypothesis testing or attention biases (e.g., the inability to attend to more than one referent, or attention being enhanced once a referent is chosen and handed over). Based on previous work suggesting an advantage for information actively learned, we reasoned that if children's referent choices reflect active hypothesis testing, they should bear more weight in long-term retention than when children passively observe another person making the selection. Our prediction was not confirmed, since feedback on the child's choices did not result in significantly better performance with reinforcing feedback or worse performance with corrective feedback, when compared to feedback on the experimenter's choices. Thus, while our findings support the view that children's referent selection contributes to long-term retention, they do not speak for the involvement of active "propose-but-verify" strategies. Rather, it seems that the act of choosing one referent over the other (which children experienced both when they and when another person made that choice) was sufficient to bias attention and lay that particular word-object association into memory.
Previous work on word learning in ASD had found particular challenges when children's own attention was in conflict with referential cues provided by an experimenter (Baron-Cohen, Baldwin & Crowson, 1997), a finding compatible with more weight being given to own referent choices in this population. Yet more recent work challenged these initial results, suggesting they may be specific to children with ASD and intellectual disability (Luyster & Lord, 2009). Just like their typically developing peers, children with ASD were shown to learn words from cross-situational and ostensive cues (Venker, 2019) and to use a variety of referential cues to find the referents of new words (Field, Lewis & Allen, 2019; Hartley, Bird & Monaghan, 2019). Our findings are in line with these studies since we did not find evidence to suggest that the use of feedback in word learning is atypical in EL toddlers. While we acknowledge that in our group only a small proportion of toddlers later received a diagnosis of ASD, our sample, just like other samples of children with a family history of ASD (e.g., Hudry, Chandler, Bedford, Pasco, Gliga, Elsabbagh, Johnson & Charman, 2013), had both reduced vocabularies and increased ASD symptoms. If the mechanisms of word learning are not atypical, what then might explain smaller vocabularies in children with ASD or with a family history of ASD? Many of the studies showing success in referent mapping have placed participants in conditions optimal for attention (for example, by minimizing distraction), which may overestimate learning abilities and may not reflect learning outside the lab.
We know from previous work that participants at elevated likelihood for ASD have more difficulty distributing attention away from salient features to the referents of communication (Parsons et al., 2019) and that the proportion of time spent on distractors during a word learning task predicted subsequent word learning measured with a looking-while-listening paradigm (Gliga, Elsabbagh, Hudry, Charman, Johnson & BASIS Team, 2012). Yet, an alternative explanation for reduced vocabularies is possible. Verbal responsiveness from caregivers is critical for vocabulary growth, in both typical development and ASD (Edmunds, Kover & Stone, 2019). However, parental responsiveness depends on how much children themselves elicit conversation. Indeed, infants at EL for ASD, who communicated less or were generally hypo-reactive, had parents who were themselves less responsive verbally, preferring instead to engage in play (Kinard et al., 2017). Parents' perception of the competence of their child with ASD, which may also reflect fewer attempts from the child to initiate communication, predicted less parental verbal input and, as a consequence, smaller vocabularies (Fusaroli, Weed, Fein & Naigles, 2019). Thus, smaller vocabularies in children with ASD or a family history of ASD may be a result not of difficulties with word learning but of atypical engagement in social interaction and communication, an important hypothesis to be tested in further studies.
What do our findings mean for our understanding of how children learn words? Earlier reports showing poor retention of referent choices were corroborated by findings from cross-situational word learning studies, which require many co-occurrences of word-object pairings, to suggest that children do not need referential cues to learn words but that learning in the absence of these cues is a slow and gradual process (McMurray et al., 2012). One advantage of this incremental process is that instances of mapping errors can be easily corrected by an accumulation of additional accurate evidence. In contrast, our findings suggest that, when children make a referent choice, this is laid in memory and will interfere with correction. This raises questions about how learning occurs in real life, where children may make many incorrect inferences about word referents. From the limited prior work on children's experience and use of feedback we know that parents often accept and adopt children's incorrect usages, but when correcting they either only offer the actual object name or provide additional explanation (e.g., "It is a truck, not a car, see it has a place to put things in", Gruendel, 1977; Chouinard & Clark, 2003). In a series of bi-weekly experimental play sessions, corrective feedback that was accompanied by an explanation was most successful in making infants eventually adopt the correct labels (Chapman, Leonard & Mervis, 1986). In our study, toddlers heard the label only twice; while this may not approximate how children typically make use of corrective feedback, our study provides a first step to understanding children's use of feedback and lays the groundwork for more ecologically valid studies.
In conclusion, children's referent mapping errors are costly, with ostensive labelling of the correct referents not overriding these initial incorrect associations. Contrary to a previous study, we found no evidence that learning words from reinforcing or corrective feedback is atypical in toddlers at elevated likelihood for ASD. Further studies are needed to clarify why children at EL for ASD have slower growing vocabularies.