Toward a mechanistic psychology of dialogue

Published online by Cambridge University Press: 01 April 2004

Martin J. Pickering*
Department of Psychology, University of Edinburgh, Edinburgh EH8 9JZ, United Kingdom
Simon Garrod*
Department of Psychology, University of Glasgow, Glasgow G12 8QT, United Kingdom


Traditional mechanistic accounts of language processing derive almost entirely from the study of monologue. Yet, the most natural and basic form of language use is dialogue. As a result, these accounts may only offer limited theories of the mechanisms that underlie language processing in general. We propose a mechanistic account of dialogue, the interactive alignment account, and use it to derive a number of predictions about basic language processes. The account assumes that, in dialogue, the linguistic representations employed by the interlocutors become aligned at many levels, as a result of a largely automatic process. This process greatly simplifies production and comprehension in dialogue. After considering the evidence for the interactive alignment model, we concentrate on three aspects of processing that follow from it. It makes use of a simple interactive inference mechanism, enables the development of local dialogue routines that greatly simplify language processing, and explains the origins of self-monitoring in production. We consider the need for a grammatical framework that is designed to deal with language in dialogue rather than monologue, and discuss a range of implications of the account.

Open Peer Commentary
Copyright © Cambridge University Press 2004

1. In more detail, the procedure is as follows. Two players are confronted with two computer-controlled mazes that do not differ in relevant ways. They are seated in different rooms but communicate via an audio link. The players each have a token representing their current position in their maze, which is only visible to them, and they take turns to move the tokens through the maze one position at a time until both players have reached their respective goal positions. At any time approximately half of the paths in each maze are closed. The closed paths are in different positions for each player and are only visible to that player. What makes the game collaborative is that the mazes are linked in such a way that when one player lands in a position where the other player's maze has a “switch” box, all of his closed paths open and open paths close. This means that the players have to keep track of each other's positions to successfully negotiate their mazes. The dialogue shown in Table 1 is taken from a conversation that occurred at the beginning of a game. Garrod and Anderson (1987) analyzed transcripts from 25 pairs of players to see how location descriptions developed over the course of each game. Some of the results of this analysis are considered in more detail in section 2.2.
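The switch rule described in this note can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used by Garrod and Anderson (1987): the `Maze` class, the path identifiers, the coordinate format, and the `move` function are all hypothetical. The note's wording ("all of his closed paths open") is ambiguous about whose maze changes; the sketch assumes it is the maze that contains the switch box.

```python
from dataclasses import dataclass, field


@dataclass
class Maze:
    # Hypothetical representation: path id -> True if the path is currently open.
    paths: dict
    # Positions that hold a "switch" box in this maze.
    switches: set = field(default_factory=set)

    def toggle(self):
        """Open every closed path and close every open one."""
        self.paths = {p: not is_open for p, is_open in self.paths.items()}


def move(mover_position, other_maze):
    """Apply the switch rule after one player moves.

    Assumption (not confirmed by the text): the maze that toggles is the
    one containing the switch box, i.e. the other player's maze.
    Returns True if a switch fired.
    """
    if mover_position in other_maze.switches:
        other_maze.toggle()
        return True
    return False


# Player A lands on (2, 3), where player B's maze has a switch box.
maze_b = Maze(
    paths={"p1": True, "p2": False, "p3": False, "p4": True},
    switches={(2, 3)},
)
fired = move((2, 3), maze_b)
```

Because a single move can invert every path in the partner's maze, neither player can plan a route without knowing where the other currently is, which is why the task forces the players to exchange location descriptions.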

2. Actually, Carlson-Radvansky and Jiang only found inhibition if the two trials used the same axis of the reference frame (e.g., the up-down axis). This limitation may be related to the fact that priming was assessed outside a dialogue situation. An interesting prediction is that interlocutors would align on reference frames, not just axes.

3. Critically, ordinals such as 4th can only quantify over ordered sets of items, whereas locative adjectives such as top or bottom usually modify unordered sets of items. Therefore when speakers say 4th row, they either have to give a post-modifying phrase such as from the bottom, which imposes a particular ordering on the set of rows, or they have to assume that row denotes an element in an implicitly ordered set of rows. In other words, they assume that row in the bare 1st row is to be interpreted like storey of a building in 1st storey. (Notice that it is odd to talk of the 2nd storey from the bottom or even the bottom storey of a building, but fine to talk about the bottom floor.)

4. A very interesting issue occurs when alignment at one level conflicts with alignment at another. Perhaps the most obvious cases arise when alignment at the level of the situation model requires nonalignment at the lexical level. For example, in Schober's (1993) example, two interlocutors who are facing each other use different terms to refer to similar locations (on the left vs. on the right) to maintain the same egocentric frame of reference. Likewise, Markman and Gentner (1993) show that successful use of analogy can require lexical misalignment. In Garrod and Anderson's (1987) maze game, if one player uses second row to refer to the second row from the top in a five-row maze, then the other player will tend to use fourth row to refer to the second row from the bottom. The player could lexically align by using second row in this way, but of course this would involve misalignment of situation models, and would therefore be misleading. The implication is that alignment at the situation level normally overrides alignment at lower levels.
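The arithmetic behind the maze example in this note can be made explicit. The sketch below is purely illustrative; the function name `from_other_end` is hypothetical, and it assumes rows are counted from 1. In a maze with n rows, the row that is k-th from one end is (n + 1 - k)-th from the other.

```python
def from_other_end(rank, n_rows):
    """Convert a 1-based row rank counted from one end of the maze
    into the rank of the same row counted from the opposite end."""
    if not 1 <= rank <= n_rows:
        raise ValueError("rank must lie between 1 and n_rows")
    return n_rows + 1 - rank


# The second row from the bottom of a five-row maze is the fourth from the top.
fourth = from_other_end(2, 5)  # → 4
```

Under a shared top-down ordering, then, the aligned description of the second row from the bottom is fourth row, which is why preserving the situation model forces a different ordinal.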

5. We assume that a case, for example, where the speaker could not remember who he meant by John (while speaking) would be pathological.

6. Most theories accept that a few dialogue phenomena do need to be explained. For example, “binding” theory (Chomsky 1981) can be invoked to explain why himself is coreferential with John in A: Who does John love? B: Himself; though see Ginzburg (1999) for evidence against an account in such terms. Rather than think of question-answer pairs as a marginal phenomenon that needs special explanation in a monological account, we regard them as a particularly orderly aspect of dialogue.

7. Roughly, Ginzburg and Sag assume feature structures taken from Head-Driven Phrase Structure Grammar (Pollard & Sag 1994), in which context is incorporated into the representation of the fragments using the critical notion of QUDs (“questions under discussion”).

8. Estimates from small group dialogues indicate that as many as 31% of turns are interrupted by the listener (Fay et al. 2000).

9. Jackendoff uses the term conceptual structures instead of semantic structures, for reasons that we shall ignore for current purposes.

10. Note that Jackendoff (2002) assumes interface rules between semantic (conceptual) structures and phonological structures (p. 127, Fig. 5.5). If this is correct, it suggests that Figure 2 should incorporate such a link as well. He also suggests that the lexicon should be regarded as part of the interface components (p. 131).

11. The tendency might even be stronger for young children than adults, at least when it is the verb that is repeated. According to the “verb island hypothesis,” syntactic information is more strongly associated with individual verbs in young children than it is in adults (e.g., children are often able to use a particular construction with some verbs but not others; Tomasello 2000).

