This chapter develops the hypothesis that perception is abduction in layers and that understanding spoken language is a special case. These rather grand hypotheses are rich with implications: philosophical, technological, and physiological.
We present here a layered-abduction computational model of perception that unifies bottom-up and top-down processing in a single logical and information-processing framework. In this model the processes of interpretation are separated into discrete layers, where at each layer a best-explanation composite hypothesis is formed to account for the data presented by the layer or layers below, with the help of information from above. The formation of such a hypothesis is an abductive inference process, similar to diagnosis and scientific theory formation. The model treats perception as a kind of frozen or “compiled” deliberation. It applies in particular to speech recognition and understanding, and models both how people process spoken language and how a machine can be organized to do it.
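The layered architecture just described can be sketched in code. The following is a minimal, illustrative sketch, not the chapter's actual model: all names and the toy data are hypothetical, hypothesis formation is reduced to a simple greedy covering step, and top-down influence from above is omitted for brevity. The point is only to show the shape of the idea: each layer forms a best-explanation composite of the findings from below, and that composite becomes the data to be explained at the next layer up.

```python
# Illustrative sketch of layered abduction (hypothetical names and data;
# top-down information flow is omitted for simplicity).

def best_explanation(findings, hypotheses):
    """Greedy composite formation: repeatedly add the hypothesis that
    explains the most still-unexplained findings."""
    composite, unexplained = [], set(findings)
    while unexplained:
        best = max(hypotheses, key=lambda h: len(h["explains"] & unexplained))
        covered = best["explains"] & unexplained
        if not covered:
            break  # remaining findings cannot be explained at this layer
        composite.append(best["name"])
        unexplained -= covered
    return composite

def interpret(findings, layers):
    """Run abduction layer by layer, bottom-up: the composite accepted
    at one layer becomes the findings for the layer above."""
    for hypotheses in layers:
        findings = best_explanation(findings, hypotheses)
    return findings

# Toy speech-like example: acoustic features -> phonemes -> a word.
layers = [
    [  # phoneme hypotheses explain acoustic features
        {"name": "k",  "explains": {"burst", "voiceless"}},
        {"name": "ae", "explains": {"formant_low"}},
        {"name": "t",  "explains": {"closure"}},
    ],
    [  # word hypotheses explain phonemes
        {"name": "cat", "explains": {"k", "ae", "t"}},
        {"name": "at",  "explains": {"ae", "t"}},
    ],
]

print(interpret({"burst", "voiceless", "formant_low", "closure"}, layers))
# → ['cat']
```

In a fuller treatment each layer would also receive expectations from the layer above, biasing which hypotheses are considered; the one-directional pass here shows only the bottom-up half of the framework.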
Perception is abduction in layers
There is a long tradition of belief in philosophy and psychology that perception relies on some form of inference (Kant, 1787; Helmholtz; Bruner, 1957; Rock, 1983; Gregory, 1987; Fodor, 1983). But this form of inference has typically been thought of as some form of deduction, simple recognition, or feature-based classification, not as abduction. In recent times researchers have occasionally proposed that perception, or at least language understanding, involves some form of abduction or explanation-based inference (Charniak & McDermott, 1985, p. 557; Charniak, 1986; Dasigi, 1988; Josephson, 1982, pp. 87-94; Fodor, 1983, pp. 88, 104; Hobbs, Stickel, Appelt and Martin, 1993).