While reasoning can produce temporary changes of location, learning produces persistent changes of mass or configuration. When someone temporarily responds to instruction or threat but then reverts to an old behavior when the teacher or threat departs, we say that person did not learn anything. Mechanically, we would identify such a response with an elastic material that rebounds on relief from compression, but such elastic behavior does not produce the permanent changes we associate with thought. True learning, involving change of mass or deformation of spatial configuration, constitutes plastic changes in the character of the material, including dynamogenetic changes that affect material response. In this chapter, let us consider learning involving changes of habits represented in the mass and changes of configuration represented in position. We distinguish types of reasoning and learning both by the types of changes involved and by the types of forces producing the change.
Accretion
The simplest sort of change to memory just adds new elements to the long-term memory represented by the mass of the agent. Such accretion also represents the effects of the most common sort of inference and learning mechanisms.
Many psychological theories view learning as transfer of information from short-term memory to long-term memory. Different theories of learning posit different means for effecting this transfer. Some theories transfer beliefs from short-term to long-term memory simply because those beliefs have persisted long enough in short-term memory.
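The persistence-based transfer just described can be made concrete with a toy model. This is an illustrative sketch only, not a mechanism from the text: the tick-based clock, the persistence threshold, and the class and method names are all assumptions made for the example.

```python
# Toy sketch (not from the text): accretion modeled as transfer of items
# from short-term to long-term memory once they persist long enough.
# The persistence threshold and tick-based clock are assumptions.

class AccretionMemory:
    def __init__(self, persistence_threshold=3):
        self.short_term = {}       # item -> number of ticks it has persisted
        self.long_term = set()     # accreted items; this set only ever grows
        self.threshold = persistence_threshold

    def perceive(self, items):
        """One time step: refresh perceived items, age and decay the rest."""
        for item in items:
            self.short_term[item] = self.short_term.get(item, 0) + 1
        # Items that persist past the threshold accrete into long-term memory.
        for item, age in self.short_term.items():
            if age >= self.threshold:
                self.long_term.add(item)
        # Items no longer perceived decay out of short-term memory.
        for item in list(self.short_term):
            if item not in items:
                del self.short_term[item]

memory = AccretionMemory(persistence_threshold=3)
for _ in range(3):
    memory.perceive({"flame is hot"})
memory.perceive(set())                     # the stimulus departs ...
assert "flame is hot" in memory.long_term  # ... but the belief persists
```

The sketch captures the elastic/plastic distinction drawn earlier: a briefly presented item leaves short-term memory without a trace, while one that persists long enough produces a permanent change.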
Researchers and developers of educational software have experimented with natural language processing (NLP) capabilities and related technologies since the 1960s. Automated essay scoring was perhaps the first application of this kind (Page 1966). Over a decade later, Writer's Workbench, a text-editing application, was developed as a tool for classroom teachers (MacDonald, Frase, Gingrich and Keenan 1982). Intelligent tutoring applications, though more in the spirit of artificial intelligence, were also being developed during this time (Carbonell 1970; Brown, Burton and Bell 1974; Stevens and Collins 1977; Burton and Brown 1982; Clancy 1987).
This book uses concepts from mechanics to help the reader understand and formalize theories of mind, with special concentration on understanding and formalizing notions of rationality and bounded rationality that underlie many parts of psychology and economics. The book provides evidence that mechanical notions including force and inertia play roles as important in understanding psychology and economics as they play in physics. Using this evidence, it attempts to clarify the nature of the concepts of motivation, effort, and habit in psychology and the ideas of rigidity, adaptation, and bounded rationality in economics. The investigation takes a mathematical approach. The mechanical interpretations developed to characterize mechanical reasoning and rationality also speak to other questions about mind, notably questions of dualism and materialism.
More generally, the exposition sketches the development of psychology and economics as subfields of mechanics by showing how one might formalize representative psychological and economic systems in such a way that these formalized systems satisfy modern axiomatic treatments of mechanics. This formalization explicates psychological and economic concepts under study by identifying corresponding properties of certain mechanical systems. Not all concepts of psychology and economics correspond to mechanical notions, and among those that do, not all concepts currently popular in psychology and economics correspond to natural mechanical ones.
Many readers will find the notions of materialism and reductionism familiar, and will recognize that these doctrines enjoy many adherents. Fewer will have heard of finitism, which presently has a smaller following, though many will recognize some of its aspects in current scientific and technological trends. This chapter tries to collect and address some of these issues as they relate to a broadened mechanics.
What is finitism?
I use the term finitism to refer to the thesis that the spatial and material world and its behavior are finite, not just finitely axiomatizable (as are the infinity of natural and real numbers) but actually finite in the sense of being composed of a finite number of bits of stuff that may undergo finite numbers of possible changes at each of a set of discrete temporal instants. The finitistic picture of the world in some locality thus resembles an enormous, possibly nondeterministic or probabilistic finite automaton, or more naturally, a cellular automaton.
One can consider strengthenings of this local notion of finiteness to finiteness of space and time as well. Finiteness of space means that at each instant there are only finitely many places at which events may occur, so that the entire universe looks instantaneously like a cellular automaton. Finiteness of time means that the event world contains only finitely many temporal instants. Thus the strongest notion of finitism, involving both spatial and temporal finiteness, views the entire universe as a gigantic finite automaton.
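The picture of a spatially finite world as a cellular automaton can be sketched concretely. The following toy example is an assumption-laden illustration, not part of the text: the choice of a one-dimensional world, its size, and the particular update rule (elementary rule 110) are arbitrary.

```python
# Minimal sketch of the finitistic picture: a one-dimensional cellular
# automaton with finitely many cells, finitely many states per cell, and
# discrete time steps.  The update rule (elementary rule 110) and the
# world size are arbitrary choices for illustration.

RULE = 110  # encodes the next state for each of the 8 neighborhood patterns

def step(cells):
    """Advance the finite world by one discrete temporal instant."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        pattern = (left << 2) | (center << 1) | right   # a number 0..7
        nxt.append((RULE >> pattern) & 1)
    return nxt

world = [0] * 15
world[7] = 1                    # a single live cell
for _ in range(5):
    world = step(world)
# The whole history is finite: only 2**15 possible configurations exist,
# and each has exactly one deterministic successor.
```

Under the strongest form of finitism described above, the entire universe would be a (vastly larger) system of this kind: finitely many places, finitely many states, finitely many instants.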
The mechanical understanding of mind bridges both the gap between the mental and the physical and the gap between the rational and the dynamical. In addition to seeking a better understanding of the relation of mind to body, one specific motivation in pursuing this understanding stems from an interest in finding new means with which to characterize and analyze limits to rationality, a central interest common to psychology, economics, and artificial intelligence. Pursuing this motivation requires facing philosophical problems that have puzzled people for millennia.
Although science has answered some of these philosophical questions about nature and mind, it has left others unanswered. For example, one ancient question concerns determinism, or more generally, lawfulness. Many views hold the mind to exhibit essential freedoms not enjoyed by matter; other views hold the mind subject to various laws of psychology, economics, sociology, and anthropology, and argue about the precedence of these competing regulations. Though scientific progress has inspired some of the competing variants and the development of quantum theories has complicated the stark alternatives contemplated by earlier generations, scientific evidence has done less than one might expect to support or weaken the cases for the fundamental alternatives. The liberty or lawfulness of the mind remains controversial.
Unresolved questions do not represent failures of science. They represent the human condition.
The axioms on forces given in the previous chapter characterize the nature of inertial forces and the structure of systems of forces in isolation, but otherwise say nothing about how forces arise in the evolution of mechanical systems. Although the special laws of forces depend on the specific class of material involved, Noll states three additional general axioms concerning dynamogenesis that bear on the general character of mechanical forces.
The first of Noll's general axioms on dynamogenesis states the principle of determinism, that the history of body and contact forces (or equivalently, the stress) at preceding instants determines a unique value for these forces at a given instant. The second axiom states the principle of locality, that the forces at a point depend only on the configuration of bodies within arbitrarily small neighborhoods of the point. The third axiom states the principle of frame indifference, that forces depend only on the intrinsic properties of motions and deformation, not on properties that vary with the reference frame.
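The principle of determinism is often written, in standard continuum-mechanics notation rather than this book's own symbols, as a constitutive functional of the history of the motion; the following is one common form, included here only as an illustrative sketch.

```latex
% Sketch of Noll's principle of determinism for the stress: the stress T at
% material point X and time t is determined by the entire past history of
% the motion \chi, summarized by the history functional \mathfrak{F}.
T(X, t) \;=\; \mathop{\mathfrak{F}}_{s=0}^{\infty}\bigl(\chi^{t}(X, s)\bigr),
\qquad \chi^{t}(X, s) := \chi(X, t - s)
```

Locality then restricts $\mathfrak{F}$ to depend only on the motion in arbitrarily small neighborhoods of $X$, and frame indifference requires its value to be unchanged under changes of observer.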
Although we follow the pattern set by Noll regarding frame indifference, the broader mechanics requires some adjustment in the conceptions of both determinism and locality. The discrete materials of psychology and economics provide different and somewhat weaker motivations for determinism and locality of dynamogenesis, even if one winds up making traditional determinism and locality assumptions in specific systems.
We present the RAGS (Reference Architecture for Generation Systems) framework: a specification of an abstract Natural Language Generation (NLG) system architecture to support sharing, re-use, comparison and evaluation of NLG technologies. We argue that the evidence from a survey of actual NLG systems calls for a different emphasis in a reference proposal from that seen in similar initiatives in information extraction and multimedia interfaces. We introduce the framework itself, in particular the two-level data model that allows us to support the complex data requirements of NLG systems in a flexible and coherent fashion, and describe our efforts to validate the framework through a range of implementations.
The recent scaling down of mobile device form factors has increased the importance of predictive text entry. It is now also becoming an important communication tool for the disabled. Techniques related to predictive text entry software are discussed in a generalized, language-independent manner. The essence of predictive text entry is twofold, consisting of (1) the design of codes for text entry, and (2) the use of adaptive language models for decoding. Code design is examined in terms of the information-theoretical efficiency. Four adaptive language models are introduced and compared, and experimental results on text entry with these models are shown for English, Thai and Japanese.
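The two halves of predictive text entry identified above, code design and language-model decoding, can be illustrated with a toy decoder. The familiar 9-key phone keypad serves as the code; the dictionary and unigram counts below are invented for the example, and a real system would use an adaptive model rather than fixed counts.

```python
# Toy sketch of predictive text entry: a code (the 9-key phone keypad, so
# several letters share one key) plus a language model (invented unigram
# counts) used to decode the ambiguous key sequence.

KEYPAD = {
    "a": "2", "b": "2", "c": "2", "d": "3", "e": "3", "f": "3",
    "g": "4", "h": "4", "i": "4", "j": "5", "k": "5", "l": "5",
    "m": "6", "n": "6", "o": "6", "p": "7", "q": "7", "r": "7", "s": "7",
    "t": "8", "u": "8", "v": "8", "w": "9", "x": "9", "y": "9", "z": "9",
}

def encode(word):
    """The 'code design' half: map a word to its ambiguous key sequence."""
    return "".join(KEYPAD[ch] for ch in word)

# The 'language model' half: invented unigram counts for a tiny dictionary.
UNIGRAM_COUNTS = {"good": 120, "home": 80, "gone": 40, "hood": 10}

def decode(keys):
    """Pick the dictionary word the language model considers most likely."""
    candidates = [w for w in UNIGRAM_COUNTS if encode(w) == keys]
    return max(candidates, key=UNIGRAM_COUNTS.get) if candidates else None

# "good", "home", "gone", and "hood" all share the key sequence 4663;
# the language model resolves the ambiguity.
assert encode("good") == "4663"
assert decode("4663") == "good"
```

The information-theoretic efficiency of the code shows up directly here: the more words that collide on one key sequence, the more work the language model must do to recover the intended text.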
Natural Language Generation (NLG) can be used to generate textual summaries of numeric data sets. In this paper we develop an architecture for generating short (a few sentences) summaries of large (100KB or more) time-series data sets. The architecture integrates pattern recognition, pattern abstraction, selection of the most significant patterns, microplanning (especially aggregation), and realisation. We also describe and evaluate SumTime-Turbine, a prototype system which uses this architecture to generate textual summaries of sensor data from gas turbines.
Folk wisdom holds that incorporating a part-of-speech tagger into a system that performs deep linguistic analysis will improve the speed and accuracy of the system. Previous studies of tagging have tested this belief by incorporating an existing tagger into a parsing system and observing the effect on the speed of the parser and accuracy of the results. However, not much work has been done to determine in a fine-grained manner exactly how much tagging can help to disambiguate or reduce ambiguity in parser output. We take a new approach to this issue by examining the full parse-forest output of a large-scale LFG-based English grammar (Riezler et al. 2002) running on the XLE grammar development platform (Maxwell and Kaplan 1993; Maxwell and Kaplan 1996), and partitioning the parse outputs into equivalence classes based on the tag sequences for each parse. If we find a large number of tag-sequence equivalence classes for each sentence, we can conclude that different parses tend to be distinguished by their tags; a small number means that tagging would not help much in reducing ambiguity. In this way, we can determine how much tagging would help us in the best case, if we had the “perfect tagger” to give us the correct tag sequence for each sentence. We show that if a perfect tagger were available, a reduction in ambiguity of about 50% could be achieved. Somewhat surprisingly, about 30% of the sentences in the corpus that was examined would not be disambiguated, even by the perfect tagger, since all of the parses for these sentences shared the same tag sequence. Our study also helps to inform research on tagging by providing a targeted determination of exactly which tags can help the most in disambiguation.
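The partitioning step described in the abstract can be sketched as grouping each sentence's parses by their tag sequence. The parses and tags below are invented for illustration; a real study would read them from the parser's packed forest output.

```python
# Sketch of the partitioning described above: group the parses of one
# sentence into equivalence classes keyed by their POS tag sequence.
# The parse identifiers and tags here are invented.
from collections import defaultdict

def partition_by_tags(parses):
    """Map each tag sequence to the list of parses that share it."""
    classes = defaultdict(list)
    for parse_id, tag_sequence in parses:
        classes[tuple(tag_sequence)].append(parse_id)
    return classes

# Four parses of one sentence, falling into two distinct tag sequences.
parses = [
    ("parse1", ["DT", "NN", "VBZ", "JJ"]),
    ("parse2", ["DT", "NN", "VBZ", "JJ"]),  # same tags, different structure
    ("parse3", ["DT", "NN", "NNS", "JJ"]),
    ("parse4", ["DT", "NN", "VBZ", "JJ"]),
]
classes = partition_by_tags(parses)
# A perfect tagger could narrow the 4 parses down to one class, but here
# the larger class still holds 3 parses, so tagging alone cannot fully
# disambiguate this sentence.
assert len(classes) == 2
```

Counting classes of size one versus larger classes over a whole corpus is what yields the best-case ambiguity-reduction figures the abstract reports.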
In spite of difficulty in defining the syllable unequivocally, and controversy over its role in theories of spoken and written language processing, the syllable is a potentially useful unit in several practical tasks which arise in computational linguistics and speech technology. For instance, syllable structure might embody valuable information for building word models in automatic speech recognition, and concatenative speech synthesis might use syllables or demisyllables as basic units. In this paper, we first present an algorithm for determining syllable boundaries in the orthographic form of unknown words that works by analogical reasoning from a database or corpus of known syllabifications. We call this syllabification by analogy (SbA). It is similarly motivated to our existing pronunciation by analogy (PbA) which predicts pronunciations for unknown words (specified by their spellings) by inference from a dictionary of known word spellings and corresponding pronunciations. We show that including perfect (according to the corpus) syllable boundary information in the orthographic input can dramatically improve the performance of pronunciation by analogy of English words, but such information would not be available to a practical system. So we next investigate combining automatically-inferred syllabification and pronunciation in two different ways: the series model in which syllabification is followed sequentially by pronunciation generation; and the parallel model in which syllabification and pronunciation are simultaneously inferred. Unfortunately, neither improves performance over PbA without syllabification. Possible reasons for this failure are explored via an analysis of syllabification and pronunciation errors.
This paper describes in detail an algorithm for the unsupervised learning of natural language morphology, with emphasis on challenges that are encountered in languages typologically similar to European languages. It utilizes the Minimum Description Length analysis described in Goldsmith (2001), and has been implemented in software that is available for downloading and testing.
There has been a recent surge of interest in the application of stochastic models of parsing. The use of tree-adjoining grammar (TAG) in this domain has been relatively limited due in part to the unavailability, until recently, of large-scale corpora hand-annotated with TAG structures. Our goals are to develop inexpensive means of generating such corpora and to demonstrate their applicability to stochastic modeling. We present a method for automatically extracting a linguistically plausible TAG from the Penn Treebank. Furthermore, we also introduce labor-inexpensive methods for inducing higher-level organization of TAGs. Empirically, we perform an evaluation of various automatically extracted TAGs and also demonstrate how our induced higher-level organization of TAGs can be used for smoothing stochastic TAG models.
A part-of-speech tagger is a fundamental and indispensable tool in computational linguistics, typically employed at the critical early stages of processing. Although taggers are widely available that achieve high accuracy in very general domains, these do not perform nearly as well when applied to novel specialized domains, and this is especially true with biological text. We present a stochastic tagger that achieves over 97.44% accuracy on MEDLINE abstracts. A primary component of the tagger is its lexicon which enumerates the permitted parts-of-speech for the 10000 words most frequently occurring in MEDLINE. We present evidence for the conclusion that the lexicon is as vital to tagger accuracy as a training corpus, and more important than previously thought.
To respond correctly to a free-form factual question given a large collection of text data, one needs to understand the question to a level that allows determining some of the constraints the question imposes on a possible answer. These constraints may include a semantic classification of the sought-after answer and may even suggest using different strategies when looking for and verifying a candidate answer. This work presents a machine learning approach to question classification. Guided by a layered semantic hierarchy of answer types, we develop a hierarchical classifier that classifies questions into fine-grained classes. This work also performs a systematic study of the use of semantic information sources in natural language classification tasks. It is shown that, in the context of question classification, augmenting the input of the classifier with appropriate semantic category information results in significant improvements to classification accuracy. We show accurate results on a large collection of free-form questions used in TREC 10 and 11.
For one aspect of grammatical annotation, part-of-speech tagging, we investigate experimentally whether the ceiling on accuracy stems from limits to the precision of tag definition or limits to analysts' ability to apply precise definitions, and we examine how analysts' performance is affected by alternative types of semi-automatic support. We find that, even for analysts very well-versed in a part-of-speech tagging scheme, human ability to conform to the scheme is a more serious constraint than precision of scheme definition. We also find that although semi-automatic techniques can greatly increase speed relative to manual tagging, they have little effect on accuracy, either positively (by suggesting valid candidate tags) or negatively (by lending an appearance of authority to incorrect tag assignments). On the other hand, it emerges that there are large differences between individual analysts with respect to usability of particular types of semi-automatic support.