We discuss in this chapter a class of approximate inference algorithms based on stochastic sampling: a process by which we repeatedly simulate situations according to their probability and then estimate the probabilities of events based on the frequency of their occurrence in the simulated situations.
Introduction
Consider the Bayesian network in Figure 15.1 and suppose that our goal is to estimate the probability of some event, say, wet grass. Stochastic sampling is a method for estimating such probabilities that works by measuring the frequency at which events materialize in a sequence of situations simulated according to their probability of occurrence. For example, if we simulate 100 situations and find out that the grass is wet in 30 of them, we estimate the probability of wet grass to be 3/10. As we see later, we can efficiently simulate situations according to their probability of occurrence by operating on the corresponding Bayesian network, a process that provides the basis for many of the sampling algorithms we consider in this chapter.
The statements of sampling algorithms are remarkably simple compared to the methods for exact inference discussed in previous chapters, and their accuracy can be made arbitrarily high by increasing the number of sampled situations. However, designing an appropriate sampling method may not be trivial, as we may need to focus the sampling process on a set of situations that are of particular interest.
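As a rough illustration of this idea (not an algorithm from this chapter), the following Python sketch simulates situations from a small, hypothetical two-variable network and estimates the probability of wet grass from the frequency of its occurrence; the network, the CPT numbers, and all names below are assumptions made only for this example:

import random

# A tiny, hypothetical network: Rain -> WetGrass (CPT numbers are made up).
P_RAIN = 0.2                                   # P(Rain = true)
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}     # P(WetGrass = true | Rain)

def simulate_situation():
    """Sample one complete situation, parents before children."""
    rain = random.random() < P_RAIN
    wet = random.random() < P_WET_GIVEN_RAIN[rain]
    return {"Rain": rain, "WetGrass": wet}

def estimate_probability(event, n_samples=10000):
    """Estimate P(event) as its frequency among the simulated situations."""
    hits = sum(event(simulate_situation()) for _ in range(n_samples))
    return hits / n_samples

# The estimate approaches the exact value (0.2*0.9 + 0.8*0.1 = 0.26)
# as n_samples grows, mirroring the accuracy claim above.
print(estimate_probability(lambda s: s["WetGrass"]))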
Automated reasoning has been receiving much interest from a number of fields, including philosophy, cognitive science, and computer science. In this chapter, we consider the particular interest of computer science in automated reasoning over the last few decades, and then focus our attention on probabilistic reasoning using Bayesian networks, which is the main subject of this book.
Automated reasoning
The interest in automated reasoning within computer science dates back to the very early days of artificial intelligence (AI), when much work was initiated on developing computer programs for solving problems that require a high degree of intelligence. Indeed, an influential proposal for building automated reasoning systems was put forward by John McCarthy shortly after the term “artificial intelligence” was coined [McCarthy, 1959]. This proposal, sketched in Figure 1.1, calls for a system with two components: a knowledge base, which encodes what we know about the world, and a reasoner (inference engine), which acts on the knowledge base to answer queries of interest. For example, the knowledge base may encode what we know about the theory of sets in mathematics, and the reasoner may be used to prove various theorems about this domain.
McCarthy's proposal was actually more specific than what is suggested by Figure 1.1, as he called for expressing the knowledge base using statements in a suitable logic, and for using logical deduction in realizing the reasoning engine; see Figure 1.2. McCarthy's proposal can then be viewed as having two distinct and orthogonal elements.
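As a rough sketch of this two-component architecture (purely illustrative, and not McCarthy's proposal itself), the following Python fragment encodes a tiny knowledge base as facts and if-then rules and answers a query with a simple forward-chaining reasoner; all facts, rules, and names are invented for the example:

# A toy knowledge base: known facts plus if-then rules (all invented).
facts = {"bird(tweety)"}
rules = [
    ({"bird(tweety)"}, "has_wings(tweety)"),     # bird(tweety) => has_wings(tweety)
    ({"has_wings(tweety)"}, "can_fly(tweety)"),  # has_wings(tweety) => can_fly(tweety)
]

def entails(facts, rules, query):
    """A minimal forward-chaining reasoner: keep deriving new conclusions
    until nothing changes, then check whether the query was derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return query in known

print(entails(facts, rules, "can_fly(tweety)"))   # True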
Pyramid annotation makes it possible to evaluate quantitatively and qualitatively the content of machine-generated (or human) summaries. Evaluation methods must prove themselves against the same measuring stick – evaluation – as other research methods. First, a formal assessment of pyramid data from the 2003 Document Understanding Conference (DUC) is presented; this addresses whether the form of annotation is reliable and whether score results are consistent across annotators. A combination of interannotator reliability measures of the two manual annotation phases (pyramid creation and annotation of system peer summaries against pyramid models), and significance tests of the similarity of system scores from distinct annotations, produces highly reliable results. The most rigorous test compares the peer system rankings produced from two independent sets of pyramid and peer annotations; these yield essentially the same rankings. Three years of DUC data (2003, 2005, 2006) are used to assess the reliability of the method across distinct evaluation settings: distinct systems, document sets, summary lengths, and numbers of model summaries. This functional assessment addresses the method's ability to discriminate systems across years. Results indicate that the statistical power of the method is more than sufficient to identify statistically significant differences among systems, and that the statistical power varies little across the 3 years.
We address in this chapter a number of problems that arise in real-world applications, showing how each can be solved by modeling and reasoning with Bayesian networks.
Introduction
We consider a number of real-world applications in this chapter drawn from the domains of diagnosis, reliability, genetics, channel coding, and commonsense reasoning. For each of these applications, we state a specific reasoning problem that can be addressed by posing a formal query with respect to a corresponding Bayesian network. We discuss the process of constructing the required network and then identify the specific queries that need to be posed.
There are at least four general types of queries that can be posed with respect to a Bayesian network. Deciding which type of query to use in a specific situation is not always trivial, and some of the query types are guaranteed to be equivalent under certain conditions. We define these query types formally in Section 5.2 and then discuss them and their relationships in more detail when we go over the various applications in Section 5.3.
The construction of a Bayesian network involves three major steps. First, we must decide on the set of relevant variables and their possible values. Next, we must build the network structure by connecting the variables into a DAG. Finally, we must define the CPT for each network variable. The last step is the quantitative part of this construction process and can be the most involved in certain situations.
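To make the three steps concrete, here is a minimal sketch in plain Python; the variables, structure, and CPT numbers are hypothetical and are not drawn from the applications discussed in this chapter:

# Step 1: decide on the variables and their possible values (hypothetical).
values = {
    "Burglary": [True, False],
    "Alarm":    [True, False],
}

# Step 2: build the structure by connecting variables into a DAG,
# here recorded as a list of parents per variable.
parents = {
    "Burglary": [],
    "Alarm":    ["Burglary"],
}

# Step 3: define a CPT for each variable, mapping each parent
# instantiation to a distribution over the variable's values
# (the numbers below are made up).
cpts = {
    "Burglary": {(): {True: 0.01, False: 0.99}},
    "Alarm": {
        (True,):  {True: 0.95, False: 0.05},    # P(Alarm | Burglary = true)
        (False,): {True: 0.02, False: 0.98},    # P(Alarm | Burglary = false)
    },
}

# Example lookup: P(Alarm = true | Burglary = false)
print(cpts["Alarm"][(False,)][True])             # 0.02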
We propose a novel approach to developing a tractable affective dialogue model for probabilistic frame-based dialogue systems. The affective dialogue model, based on Partially Observable Markov Decision Process (POMDP) and Dynamic Decision Network (DDN) techniques, is composed of two main parts: the slot-level dialogue manager and the global dialogue manager. It has two new features: (1) being able to deal with a large number of slots and (2) being able to take into account some aspects of the user's affective state in deriving the adaptive dialogue strategies. Our implemented prototype dialogue manager can handle hundreds of slots, where each individual slot might have hundreds of values. Our approach is illustrated through a route navigation example in the crisis management domain. We conducted various experiments to evaluate our approach and to compare it with approximate POMDP techniques and handcrafted policies. The experimental results showed that the DDN–POMDP policy outperforms three handcrafted policies when the user's action error is induced by stress as well as when the observation error increases. Further, the performance of the one-step look-ahead DDN–POMDP policy after optimizing its internal reward is close to that of state-of-the-art approximate POMDP counterparts.
Having located a misspelling, a spellchecker generally offers some suggestions for the intended word. Even without using context, a spellchecker can draw on various types of information in ordering its suggestions. A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, and making successive enhancements to take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency. The improvement in the ordering produced by each enhancement is measured on a large corpus of misspellings. The final version is tested on other corpora against a widely used commercial spellchecker and a research prototype.
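As a generic illustration of this kind of corrector (not the one evaluated in the paper), the sketch below generates dictionary words that are one simple error away from a misspelling and orders them by word frequency; the toy dictionary and its counts are invented:

# Candidate generation by reversing single simple errors (deletion,
# transposition, substitution, insertion), ordered by corpus frequency.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# A hypothetical dictionary with made-up frequency counts.
FREQ = {"there": 5000, "three": 3000, "theme": 800, "threw": 600}

def single_error_edits(word):
    """All strings reachable from `word` by one simple error."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def suggestions(misspelling):
    """Dictionary words one error away, most frequent first."""
    candidates = single_error_edits(misspelling) & FREQ.keys()
    return sorted(candidates, key=lambda w: -FREQ[w])

print(suggestions("thre"))    # ['there', 'three', 'threw']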
An important problem in knowledge discovery from text is the automatic extraction of semantic relations. This paper addresses the automatic classification of the semantic relations expressed by English genitives. A learning model is introduced based on the statistical analysis of the distribution of genitives' semantic relations in a corpus. The semantic and contextual features of the genitive's noun phrase constituents play a key role in the identification of the semantic relation. The algorithm was trained and tested on a corpus of approximately 20,000 sentences and achieved an f-measure of 79.80 per cent for of-genitives, far better than the 40.60 per cent obtained using a Decision Trees algorithm, the 50.55 per cent obtained using a Naive Bayes algorithm, or the 72.13 per cent obtained using a Support Vector Machines algorithm on the same corpus using the same features. The results were similar for s-genitives: 78.45 per cent using Semantic Scattering, 47.00 per cent using Decision Trees, 43.70 per cent using Naive Bayes, and 70.32 per cent using a Support Vector Machines algorithm. The results demonstrate the importance of word sense disambiguation and semantic generalization/specialization for this task. They also demonstrate that different patterns (in our case the two types of genitive constructions) encode different semantic information and should be treated differently in the sense that different models should be built for different patterns.
Human-generated summaries are a blend of content and style, bound by the task restrictions, but are ‘subject to subjectiveness’ of the individuals summarising the documents. We study the impact on human-authored summaries of various facets that cause subjectivity, such as brevity, information content and information coverage. The scale of subjectivity is quantitatively measured among various summaries using a question–answer-based cross-comprehension test. The test evaluates summaries for meaning rather than exact wording, using questions framed by the summary authors and derived from the summary. The number of questions that cannot be answered after reading the candidate summary reflects its subjectivity. The qualitative analysis of the outcome of the cross-comprehension test shows the relationship between the length of a summary, information content and nature of questions framed by the summary author.
Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.
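To illustrate the margin-based active learning idea in isolation (using a standard SVM from scikit-learn on synthetic data, rather than the SVMUM model or the IE corpora from the paper), a minimal uncertainty-sampling loop might look like this:

# Uncertainty sampling with an SVM: repeatedly label the pool example
# closest to the decision boundary (synthetic data, standard SVC).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(20, 5))
y_labeled = (X_labeled.sum(axis=1) > 0).astype(int)      # stand-in labels
X_pool = rng.normal(size=(500, 5))                       # unlabeled pool

for _ in range(10):                                      # a few learning rounds
    clf = SVC(kernel="linear").fit(X_labeled, y_labeled)
    distances = np.abs(clf.decision_function(X_pool))
    i = int(np.argmin(distances))                        # most uncertain example
    new_x, new_y = X_pool[i], int(X_pool[i].sum() > 0)   # "annotator" labels it
    X_labeled = np.vstack([X_labeled, new_x])
    y_labeled = np.append(y_labeled, new_y)
    X_pool = np.delete(X_pool, i, axis=0)

print(len(y_labeled))                                    # 30 labeled examples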