The advantage of story-telling: children's interpretation of reported speech in narratives*

Abstract Children struggle with the interpretation of pronouns in direct speech (Ann said, “I get a cookie”), but not in indirect speech (Ann said that she gets a cookie) (Köder & Maier, 2016). Yet children's books consistently favor direct over indirect speech (Baker & Freebody, 1989). To reconcile these seemingly contradictory findings, we hypothesize that the poor performance found by Köder and Maier (2016) is due to the information-transmission setting of that experiment, and that a narrative setting facilitates children's processing of direct speech. We tested 42 Dutch children (4;1–7;2) and 20 adults with a modified version of Köder and Maier's referent selection task, where participants interpret speech reports in an interactive story book. Results confirm our hypothesis: children are much better at interpreting pronouns in direct speech in such a narrative setting than they were in an information-transmission setting. This indicates that the pragmatic context of reports affects their processing effort.


INTRODUCTION
The paradox of direct speech
Consider the following passage from Arnold Lobel's classic Frog and Toad are Friends (Lobel, ), in which Frog and Toad express radically different emotions towards the advent of spring:
() "Toad, Toad," shouted Frog, "wake up. It is spring!" "Blah," said a voice from inside the house. "Toad, Toad," cried Frog. "The sun is shining! The snow is melting. Wake up!" "I am not here," said the voice.
As the passage in () illustrates, speech reports can make up a rather big part of children's books, and they provide a window into the characters' thoughts and emotions. In order to understand children's comprehension of literary texts, it is therefore important to know at what age children are able to understand speech reports and what difficulties they encounter in their acquisition.
Linguists divide speech reports into (at least) two distinct types: direct speech (John said, "I'm happy") and indirect speech (John said that he is happy). Like many stories written for children, the Frog and Toad stories contain far more direct speech quotations than indirect speech reports, as illustrated in the passage in () (cf. Baker & Freebody, ; Kümmerling-Meibauer & Meibauer, ). To bring out the differences between direct and indirect speech, consider the following attempt at rephrasing the original passage above using only indirect speech reports.
() Frog shouted that Toad should wake up because it was spring. An uninterested grunt came from inside the house. Frog cried that the sun was shining, and that the snow was melting, so Toad should wake up. Toad said that he was not there.
The fundamental difference between the original direct speech reports in () and their indirect counterparts in () is the perspective from which the narrator presents Frog's and Toad's utterances. In direct speech, the narrator shifts completely to the character's perspective, presenting what it would be like to listen to the characters directly. In indirect speech, the narrator describes the content of the reported speech act from a more detached third person perspective (Clark & Gerrig, ). This fundamental perspective difference between direct and indirect speech explains the various linguistic differences between them. First, in direct speech, pronouns and other deictic expressions are interpreted from the reported speaker's (i.e. the character's) perspective, while in indirect speech they are interpreted from the reporting speaker's (i.e. the narrator's) perspective (Kaplan, ). Hence, I am not here in () becomes that he was not there in (). Second, in direct speech, expressive elements like vocatives (Toad, Toad) and imperatives (wake up!) can be used to vividly express the emotional state of the character, while in indirect speech we must resort to non-expressive paraphrases (that Toad should wake up) (cf. Banfield, ; Coulmas, ). Finally, when adults read stories to children, direct speech allows them to imitate the voices of the characters and thereby provide additional information about the characters and their emotions (Couper-Kuhlen, ; Günthner, ).
The linguistic characteristics of direct speech, especially the use of expressives and theatrical voice imitations, make it a more vivid mode of presentation of what the characters are saying and feeling than indirect speech (Groenewold, (Gerrig, ) into the fictional world. Based on the above, one would expect that children find direct speech reports easier to understand than indirect speech reports, which would explain the observed tendency for authors to use direct speech in children's books.
Surprisingly, our recent study (Köder & Maier, ) on children's processing of pronouns in speech reports shows the exact opposite: direct speech is more difficult to understand for children than indirect speech. This leads to a paradoxical situation. On the one hand, children's book authors' preference for direct speech can be explained by its inherent vividness, which should make it easier for children to follow along. On the other hand, experimental results indicate that children find it very difficult to interpret simple direct speech reports. In the current paper we show how to solve this 'paradox of direct speech' by taking the conversational context of the report into account, but first, in the remainder of this section, we briefly summarize our earlier experiment that motivated and inspired this follow-up study.

Köder and Maier (): the hidden costs of perspective shifting
In Köder and Maier (), children and adults played an animated tablet game called Who gets what?, in which they interpreted speech reports to figure out which of three animals gets a certain object. A typical experimental item in the game looked like this: first Dog whispers something (inaudibly for the participant) to Elephant, then Elephant walks over to Monkey to report what Dog told him, using either direct (a) or indirect speech (b):
() a. Dog said: "I/you/he get(s) the football".
b. Dog said that I/you/he get(s) the football.
Based on the report, the participants then had to select either Dog, Elephant, or Monkey as the intended recipient of the object. Note that these items always involve two speech events, taking place in different spatio-temporal contexts. The first speech event, Dog whispers to Elephant, we call the REPORTED SPEECH EVENT, taking place in the REPORTED SPEECH CONTEXT; and the second, Elephant reports what he heard, is called the REPORTING SPEECH EVENT, situated in the REPORTING SPEECH CONTEXT. The main findings of this experiment are that, at the age of four, children were already at ceiling for pronoun interpretation in indirect speech, whereas at the age of eleven, they still struggled with pronoun interpretation in direct speech. More specifically, children seemed to systematically interpret pronouns in direct speech as in indirect speech. For instance, when listening to a direct speech report, as in (a), with the first person pronoun I, children tended to select the speaker of the reporting speech context (Elephant) rather than the speaker of the reported speech context (Dog).
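The perspective-dependence of pronoun interpretation described above can be made concrete with a minimal sketch. The names `SpeechContext` and `resolve_pronoun` are illustrative and not part of the original experimental materials, and the treatment of he as "the one participant who is neither speaker nor addressee of the relevant context" is a simplifying assumption that only fits the three-animal game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechContext:
    """A speech event, reduced to who speaks and who is addressed."""
    speaker: str
    addressee: str


def resolve_pronoun(pronoun, report_type, reported, reporting, participants):
    """Return the referent of a pronoun in a speech report.

    Direct speech is evaluated in the reported context,
    indirect speech in the reporting context.
    """
    if report_type not in ("direct", "indirect"):
        raise ValueError(f"unknown report type: {report_type}")
    ctx = reported if report_type == "direct" else reporting
    if pronoun == "I":
        return ctx.speaker
    if pronoun == "you":
        return ctx.addressee
    if pronoun == "he":
        # Assumption: exactly one participant is outside the relevant context.
        others = [p for p in participants if p not in (ctx.speaker, ctx.addressee)]
        if len(others) != 1:
            raise ValueError("third person referent is not unique")
        return others[0]
    raise ValueError(f"unsupported pronoun: {pronoun}")


# The typical item: Dog whispers to Elephant, Elephant reports to Monkey.
ANIMALS = ("Dog", "Elephant", "Monkey")
REPORTED = SpeechContext(speaker="Dog", addressee="Elephant")
REPORTING = SpeechContext(speaker="Elephant", addressee="Monkey")

target = resolve_pronoun("I", "direct", REPORTED, REPORTING, ANIMALS)
typical_error = resolve_pronoun("I", "indirect", REPORTED, REPORTING, ANIMALS)
```

Under these assumptions, `target` is Dog, while `typical_error` (a direct report interpreted as if it were indirect) is Elephant, which matches the error pattern reported for children.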

Narrative vs. information transmission: a new hypothesis
In the Who gets what? game, the sole purpose of the utterances is to convey information about objects and animals in the world around the speaker, viz. who is supposed to get the object. We will refer to this general type of discourse, where a speaker is sharing information about the world around him, as an INFORMATION-TRANSMISSION SETTING. In Köder and Maier's () task the information transmitted directly concerns the objects and animals present in the reporting speech context, which is therefore more salient and relevant than the reported speech context that precedes it. Köder, Maier, and Hendriks () argue that this reduced salience of the reported speech context relative to the reporting speech context makes the required shift to the reported speaker's perspective in direct speech exceptionally hard. This is what led children, and some adults, to incorrectly interpret pronouns in direct speech relative to the more salient reporting speaker's perspective.
Children's stories are not vehicles for transmitting information about the world. They constitute what we will call a NARRATIVE SETTING, where the purpose of the discourse is to induce a game of pretense or make-believe (Walton, ), or perhaps to describe or create some imaginary world(s) (Lewis, ; Werth, ). In a narrative setting, the reporting speaker is the so-called narrator, a minimally intrusive presenter of the characters' actions. The focus of attention then is not on this narrator situated in the reporting speech context, but on the reported speakers, i.e. the fictional characters in the story (Baker & Freebody, ; Banfield, ). In other words, when we encounter a report in a narrative setting, the reported speech context is more salient than the reporting speech context. We can thus expect the perspective shift from the reporting speaker (narrator) to the reported speaker (character) to be much easier for children in a typical narrative setting like a children's book than in an information-transmission setting like the Who gets what? game.
Switching from Köder and Maier's () information-transmission setting to a narrative setting should therefore make direct speech interpretation easier.
Our hypothesis then is that the pragmatic context in which a direct speech report is used influences its processing costs. We predict that, in a narrative setting, children will be much better at pronoun interpretation in direct speech than they were in the information-transmission setting used by Köder and Maier (). Since correct indirect speech interpretation does not involve separating multiple perspectives, we do not expect a narrative bonus there. To test this hypothesis, we developed a pronominal referent-selection task in which direct and indirect speech reports are presented as part of a narrative.

Participants
The participants of this study were forty-two monolingual Dutch-speaking children between ages 4;1 and 7;2 (see Table ). The data of one additional child were not saved due to technical problems. We focus on children at the younger end of the age range tested by Köder and Maier () because we expect the narrative discourse context to have the biggest facilitating effect there. The participating children were recruited from an elementary school in the north of the Netherlands. Written parental consent was obtained prior to the experiment. Children received a small reward (a sticker) for participating. In addition, twenty adult native speakers of Dutch, mostly students, participated without compensation. All participants were tested individually in a quiet room at their school or at the university.

Stimuli and Procedure
The experiment was built as an Android application and presented to participants on a touchscreen tablet. In its design, we tried to simulate children's everyday experience of picture-book reading. Participants listened to a story read by a male speaker (i.e. the narrator) and saw illustrating pictures. The experiment consists of three parts that all form one coherent narrative: the introduction phase, the pronominal gender pre-test, and the speech report test. An online version of the experiment can be found at <http://www.philos.rug.nl/cgm/story-demo/> (Google Chrome required).

Phase I: introducing the protagonists. At the beginning of the story, the narrator introduces the two main protagonists: a girl monkey called Anita Aap 'Anita Monkey' and a boy elephant called Oscar Olifant 'Oscar Elephant' (see Figure a). The participants were asked questions about the names and gender of the protagonists (e.g. Who is Anita Aap? and Who is a boy?) and gave their answer by touching one of the highlighted protagonists (see Figure b). In cases where participants responded incorrectly, they received negative feedback ("No, that was incorrect. Please try again") and were asked the question again. All participants answered these initial comprehension questions correctly on the first trial.


Phase II: pronominal gender pre-test. After the introduction of the protagonists, we tested whether participants are able to use the gender feature of third person singular pronouns as a cue for reference identification. The story continues with the narrator saying that Anita and Oscar are best friends and live next to each other in two houses. One day, Oscar and Anita wake up early and, as always, start the day with a morning workout. To find out who did which exercise, participants had to interpret four sentences that contain either the Dutch masculine pronoun hij 'he' (see ()) or the feminine pronoun zij 'she' (see ()).
'She did a handstand.'

Phase III: speech report test. The following part of the narrative contains the speech report test, in which the participants had to interpret personal pronouns in twenty-four speech reports ( direct speech,  indirect speech). The narrator describes that Oscar and Anita go for a walk after their morning workout. On their trip, Anita and Oscar come across twenty-four objects at different locations such as in a tree, in a pond, and in a cave. All of these objects look exactly like things that they possess themselves. After the discovery of each object, the narrator reports Anita's or Oscar's utterance in either direct or indirect speech. Consider, for instance, the scene in which Oscar and Anita discover a backpack hanging in a tree. In this case, the participants would hear the text in (). After the auditory presentation of the speech report, participants had to answer the narrator's question Who has a backpack like that too? by touching one of the highlighted protagonists on the screen (the correct answer for () is Oscar Olifant). The software records the accuracy of referent selection. At the end of the story, it turns out that the objects that Oscar and Anita find all over the place do not just look like theirs, but actually ARE theirs. A naughty dog has taken their things to play with. Oscar and Anita confront the dog, who then promises to collect and return all their belongings.
Children's stories typically have a narrator who describes the actions of the story characters, but does not participate in the story himself. In order to create an ecologically valid narrative context, we therefore avoided pronouns that refer to the narrator or the listener of the story and included only those that refer to the story protagonists. These are the pronouns ik 'I' and jij 'you' in direct speech and hij 'he' and zij 'she' in indirect speech. We introduced a female character and feminine third person pronouns in order to make it possible to unambiguously refer to both the speaking and addressed story character in an indirect speech report. Note that the kind of pronouns tested is a departure from the non-narrative paradigm of Köder and Maier's () Who gets what? game, which included all six combinations of person (I, you, he) and report type (direct, indirect), but lacked feminine pronouns.
Here is an example of each of our four types of stimuli:
() DIRECT SPEECH (ik): Anita Aap zei tegen Oscar Olifant, "Ik heb ook zo'n voetbal". 'Anita Aap said to Oscar Olifant, "I have a football like that too".'
() DIRECT SPEECH (jij): Anita Aap zei tegen Oscar Olifant, "Jij hebt ook zo'n auto". 'Anita Aap said to Oscar Olifant, "You have a car like that too".'
() INDIRECT SPEECH (zij): Anita Aap zei tegen Oscar Olifant dat zij ook zo'n hoed heeft. 'Anita Aap said to Oscar Olifant that she has a hat like that too.'
() INDIRECT SPEECH (hij): Anita Aap zei tegen Oscar Olifant dat hij ook zo'n klok heeft. 'Anita Aap said to Oscar Olifant that he has a clock like that too.'
Note that the pronouns zij and hij in indirect speech are used with two different functions, namely to refer to either the subject (see ()) or the object of the main clause (see ()). In contrast to Köder and Maier's () study, the reporting clauses in our study (e.g. Anita Aap said to Oscar Olifant) mention not only the reported speaker but also the addressee. This was necessary to make the use of a third person pronoun referring to the addressee, like in example (), felicitous.
In half of the reports, Anita is addressing Oscar, as in () to (), in the other half, Oscar is addressing Anita. While the order of scenes of the protagonists' journey is the same for all participants, we randomized the order of the objects found at each location and the speech reports associated with them. The spatial position of Anita and Oscar in the pictures is counterbalanced. To control for the possibility that participants have general preferences for associating a certain object with a certain protagonist (e.g. the car with the male elephant), we created two versions of the experiment and assigned participants randomly to one of them at the outset. The two versions differ in the following respects: in version A, all objects belong to the opposite story character (Anita, Oscar) than in version B, and are associated with the opposite type of speech report (direct, indirect). The experiment took participants about  to  minutes to complete.
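The counterbalancing across the two versions can be sketched as follows. This is only an illustration of the logic described above, not the code of the actual application: the function names are hypothetical, only four of the twenty-four objects are listed, and which concrete assignment belonged to version A is an assumption, since the text does not specify it.

```python
import random

OTHER_OWNER = {"Anita": "Oscar", "Oscar": "Anita"}
OTHER_REPORT = {"direct": "indirect", "indirect": "direct"}

# Hypothetical version-A assignment: object -> (owner, report type).
VERSION_A = {
    "football": ("Anita", "direct"),
    "car": ("Oscar", "direct"),
    "hat": ("Anita", "indirect"),
    "clock": ("Oscar", "indirect"),
}


def mirror(version):
    """Build the other version: opposite owner and opposite report type."""
    return {
        obj: (OTHER_OWNER[owner], OTHER_REPORT[report])
        for obj, (owner, report) in version.items()
    }


def assign_version(rng):
    """Randomly assign a participant to version A or B at the outset."""
    return rng.choice(["A", "B"])


def trial_order(version, rng):
    """Return the objects of one version in a freshly randomized order."""
    objects = list(version)
    rng.shuffle(objects)
    return [(obj, *version[obj]) for obj in objects]


VERSION_B = mirror(VERSION_A)
rng = random.Random(2016)  # fixed seed only to make the sketch reproducible
participant_version = assign_version(rng)
trials = trial_order(VERSION_A if participant_version == "A" else VERSION_B, rng)
```

Because `mirror` flips both properties, any object-specific bias (e.g. associating the car with the male elephant) favours the correct answer in one version and the incorrect answer in the other.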

Pronominal gender pre-test
In the pronominal gender pre-test, we tested whether all participants are able to use the gender information of third person singular pronouns for determining the correct referent. Figure  shows children's and adults' interpretation of the pronouns hij 'he' and zij 'she' in simple non-embedded sentences (e.g. He skipped rope).
While six- to seven-year-old children and adults are at ceiling for the interpretation of both third person singular pronouns, four- to five-year-old children are at chance for the masculine pronoun (t() = , p = ), and slightly above chance for the feminine pronoun (t() = ·, p = ·). This means that the youngest age group did not use the gender feature reliably as an interpretational cue.
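A comparison against chance of this kind can be sketched as a one-sample t-test on per-child accuracy scores. The scores below are invented for illustration and are not the data of this study; the chance level of 1/2 follows from there being two possible referents (Anita and Oscar), and the helper name `one_sample_t` is our own. The sketch returns only the t statistic and the degrees of freedom; obtaining a p-value would additionally require the t distribution, which is left out here.

```python
import math
import statistics


def one_sample_t(scores, chance):
    """Return (t, df) for a one-sample t-test of mean(scores) against chance."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("t is undefined when all scores are identical")
    standard_error = sd / math.sqrt(n)
    t = (statistics.fmean(scores) - chance) / standard_error
    return t, n - 1


CHANCE = 1 / 2  # two candidate referents

# Hypothetical proportions correct for eight children on one pronoun.
hypothetical_scores = [0.5, 1.0, 0.5, 0.0, 1.0, 0.5, 0.5, 1.0]

t_value, df = one_sample_t(hypothetical_scores, CHANCE)
```

With these invented scores the mean is 0.625, giving t(7) = 1.0, i.e. a group mean that is numerically above 0.5 but far from reliably so.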
Speech report test
Figure  shows the accuracy of pronoun interpretation in direct and indirect speech, distinguishing between pronouns that refer to the speaker and pronouns that refer to the addressee of the reported context. Adults are at ceiling in all conditions. Six- to seven-year-olds show a mean accuracy between % and %. In four- to five-year-olds, the percentage of correct pronoun interpretation is on average between % and %. Since the pronominal gender pre-test showed that four- to five-year-old children have difficulties using the gender feature, we split up the indirect speech results of these younger children by pronominal gender in Figure . This reveals that the only condition where the four- to five-year-olds' performance differs significantly from chance is when the feminine pronoun zij 'she' refers to the subject of the matrix clause (i.e. to the speaker) (t() = ·, p = ·).
We analyzed the accuracy data in the speech report test with mixed-effects logistic regression modeling. Using a procedure of model comparison, we added stepwise fixed-effect factors to the baseline model (including random intercepts for subjects). AGE and GENDER of the participants turned out to predict accuracy of pronoun interpretation. All other factors (REPORT TYPE (direct, indirect), REFERENT (speaker, addressee), PRONOUN (I, you, he, she), EXPERIENCE (-; a number indicating how often a certain type of stimulus such as 'direct speech (ik)' has been encountered before), SEQUENCE NUMBER (-), SPATIAL POSITION of the protagonists (monkey left vs. right of elephant), and VERSION (A, B)) did not improve the goodness of fit of the model. The index of concordance of the model is ·, which indicates that it has real predictive power (Baayen, ). Table  shows that participants' accuracy of pronoun interpretation improves with age (p < ·). Female participants performed significantly better than male participants (p = ·). A closer look at the data reveals that in the age group of the four- to five-year-olds, girls had a mean accuracy of . (SD = ·), boys of only . (SD = ·). Among six- to seven-year-old children, girls outperformed their male peers with an accuracy of . (SD = ·) in comparison to . (SD = ·). In adults, there were no gender-related differences in accuracy.
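For readers unfamiliar with the index of concordance, the following minimal sketch shows how such an index can be computed, assuming the standard pairwise definition: the proportion of (correct response, incorrect response) pairs in which the model assigns the higher predicted probability to the correct response, with ties counted as one half. The function name and the example values are hypothetical and are not taken from our model.

```python
def concordance_index(predicted, observed):
    """Pairwise index of concordance for binary outcomes.

    predicted: model probabilities of a correct response
    observed:  1 for a correct response, 0 for an incorrect one
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    correct = [p for p, y in zip(predicted, observed) if y == 1]
    incorrect = [p for p, y in zip(predicted, observed) if y == 0]
    if not correct or not incorrect:
        raise ValueError("need at least one correct and one incorrect response")
    concordant = 0.0
    for p in correct:
        for q in incorrect:
            if p > q:
                concordant += 1.0   # pair ordered as the outcomes are
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(correct) * len(incorrect))


# Hypothetical predictions for four responses (two correct, two incorrect).
example_c = concordance_index([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

A value of 0.5 corresponds to a model that orders responses no better than chance, and a value of 1 to a model that ranks every correct response above every incorrect one.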

Comparison with Köder and Maier's () results
To find out how the setting (narrative vs. information transmission) influences the accuracy of pronoun interpretation in speech reports, we analyzed the data of the current experiment together with that of children of comparable ages in Köder and Maier (). A note of caution is required concerning this comparison as these two experiments differ in several respects as pointed out above: (i) the number of pronouns tested in direct and indirect speech; (ii) the number of referential candidates ( in the current study,  in Köder and Maier, ) and therefore the chance level (. for our study, . for Köder and Maier, ); and (iii) the mention of the addressee in the reporting clause. Both experiments contain direct speech reports with first and second person pronouns and indirect speech reports with a third person pronoun referring to the subject of the matrix clause. The results for these cases are presented in Table . As one-sample t-tests show, in Köder and Maier's () study, the mean accuracy for direct speech interpretation for both four- to five-year-olds and six- to seven-year-olds is, with values between . and ., significantly below the chance level of .. By contrast, in the current study, the mean accuracy for direct speech interpretation is well above the chance level of . for both four- to five-year-olds (with a mean accuracy of . on ik and . on jij) and six- to seven-year-olds (with a mean accuracy of . on ik and . on jij). We directly compared the accuracy of pronoun interpretation in the two experiments with a multiple comparison analysis, using the 'multcomp' package in R (Hothorn, Bretz & Westfall, ). The results indicate that children were better at interpreting pronouns in direct speech in the narrative setting of this study compared to the information-transmission setting of Köder and Maier ().

TABLE . Fixed-effects coefficients of the model fitted to accuracy of pronoun interpretation in speech reports

In particular, in the current study, four- to five-year-old children exhibited a higher accuracy for interpreting the pronouns ik (β = ·, z = ·, p < ·) and jij (β = ·, z = ·, p < ·) in direct speech. Similarly, six- to seven-year-olds in this study performed better in their interpretation of ik (β = ·, z = ·, p < ·) and jij (β = ·, z = ·, p < ·) in direct speech. In indirect speech, four- to five-year-olds' accuracy with hij/zij was lower in this study than in Köder and Maier's (β = -·, z = -·, p = ·), while for six- to seven-year-olds indirect speech performance did not differ between experiments (β = -·, z = -·, p = ·). For adults, there were no significant differences between the two experiments for any type of report.

DISCUSSION
In this experiment, we investigated the interpretation of pronouns in direct and indirect speech reports in the context of a narrative. Set against the background of Köder and Maier's () finding that children have great difficulty interpreting pronouns in direct speech, our hypothesis was that children's performance on direct speech will improve if we integrate the task into a narrative rather than an information-transmission setting. The motivation for this hypothesis is that in an information-transmission setting the focus is on the here and now, i.e. the reporting speech context. This makes the direct speech perspective shift to the less salient reported speech context cognitively demanding. By contrast, in a narrative setting, the focus is on the story world, i.e. the reported speech context, facilitating the direct speech perspective shift. This hypothesis is confirmed by the results. In the narrative setting of the current study, children's accuracy of direct speech interpretation was significantly higher than in the information-transmission setting of Köder and Maier (). The effect of setting is even more striking if one takes into consideration that children in Köder and Maier's study showed a strong preference for evaluating pronouns in direct speech with respect to the reporting speech context, which resulted in a below chance performance. In the current study, even four- to five-year-olds were above chance in direct speech interpretation, i.e. they tended to evaluate pronouns with respect to the reported speech context. This provides strong evidence that a narrative setting makes it easier for children to shift to the reported speaker's perspective in direct speech.
We propose that, when presented with a construction involving multiple perspectives, such as direct speech reports, children follow a strategy of interpreting deictic pronouns relative to the most salient perspective. In the information-transmission setting of Köder and Maier's () experiment, the actual reporting speaker's perspective is most salient, which is why children used it as the deictic orientation point. While this facilitates pronoun interpretation in indirect speech, it makes pronoun interpretation in direct speech harder, as this requires a (cognitively demanding; cf. Köder et al., ) perspective shift to the less salient perspective of the reported speaker.
In the narrative setting used in the current study, the salience of reported and reporting speech context is reversed. The reported speech context with the story protagonists is the focus of attention, while the reporting speech context including the narrator and his audience is backgrounded. Because of this backgrounding, the reporting speech context is less salient and therefore does not 'attract' (Evans, ; Maier, to appear) I and you in direct speech. As a result, the pronouns can be straightforwardly linked to the reported speaker or addressee, i.e. the story protagonists.
An interesting difference between the two experiments is that in our earlier study (Köder & Maier, ), first person pronouns in direct speech were easier to interpret than second person pronouns, while there were no significant differences between these pronouns in the current study. We explained the apparent ease of direct speech I in the previous study with the fact that its referent, i.e. the reported speaker, is linguistically mentioned in the reporting clause. This advantage of I compared to you disappears if the addressee is mentioned as well, which is the case in the stimuli of the current experiment (cf. Anita Aap said to Oscar Olifant).
While children performed better in direct speech comprehension in the current study compared to Köder and Maier (), indirect speech performance of the youngest children decreased from ceiling performance to chance in the current study, while we originally expected no effect. This difference could be the result of introducing gender-marked pronouns. As the gender pre-test indicates, the four- to five-year-old participants were not yet able to reliably use the gender information of third person pronouns in their choice of referent. They were at chance for the interpretation of the masculine pronoun hij 'he' and slightly above chance for the interpretation of the feminine pronoun zij 'she'. The better performance on the feminine pronoun could be due to the fact that the gender-marking of zij is more salient than that of hij because hij is also used as a gender-neutral default form in Dutch (Audring, ; Booij, ). The fact that, in indirect speech, four- to five-year-olds did better on speaker-referring feminine pronouns (Anita Aap told Oscar Olifant that she . . .) than on addressee-referring ones (Oscar Olifant told Anita Aap that she . . .) could be due to children's (and adults') well-established bias for subject antecedents (Crawley, Stevenson & Kleinman, ; Smyth, ).
An unexpected result of this study, independent of our main result, is that girls outperformed boys in interpreting pronouns in speech reports. It is well documented cross-linguistically that girls are ahead of boys with respect to early communicative abilities and later reading abilities (Eriksson et al., ; Mullis, Martin, Foy & Drucker, ). However, since Köder and Maier () did not find a significant gender effect, we suggest that this difference could also be due to subtle differences in design. It is possible that children found our simple picture-book stimuli less engaging than the animation-based game of Köder and Maier. As a study by Oakhill and Petrides () indicates, a lack of interest in a topic has a bigger effect on boys' story comprehension than on girls'.

CONCLUSION
Speech reports have an important function in children's books as they provide insights into the characters' minds. Most children's books contain more direct than indirect speech reports, presumably because direct speech creates a more vivid impression of listening to the characters speaking to each other directly. In the current study, we investigated at what age Dutch children are able to understand direct and indirect speech in a narrative. Our results indicate that at the age of six, when entering school in many countries, children master both direct and indirect speech comprehension.
This result contrasts sharply with Köder and Maier's () findings that even eleven-year-old children still struggle with the interpretation of pronouns in direct speech. We suggest that the poor direct speech performance of this earlier study is due to the fact that speech reports were integrated into a game involving simple information transmission, rather than a fictional narrative. A narrative setting highlights the perspective of the characters and therefore facilitates a shift from the narrator's to the character's perspective, which is required for direct speech interpretation. We conclude that the pragmatic context of a report influences the processing cost associated with perspective shifting. Our findings provide support for the idea that narratives enhance our perspective-taking abilities, allowing readers or listeners to more easily understand complex interactions involving multiple distinct perspectives (cf. Kidd