Ascription of an intention to an agent is especially important in law. In criminal law the intent to commit a criminal act, called mens rea or the guilty mind, is the key element needed to prosecute a defendant for a crime. For example, in order to prove that a defendant has committed the crime of theft of an object, it needs to be established that the defendant had the intention never to return the object to its owner. Studying examples of how intention is proved in law is an important resource, giving us clues about how reasoning to an intention should be carried out. Intention is also fundamentally important in ethical reasoning, where there are problems about how the end can justify the means.
This chapter introduces the notion of inference to the best explanation, often called abductive reasoning, and presents recent research on evidential reasoning that uses the concept of a so-called script or story as a central component. Introducing these two argumentation tools shows how they help move toward a solution to the longstanding problem of analyzing how practical reasoning from circumstantial evidence can be used to support or undermine a hypothesis that an agent has a particular intention. Legal examples are used to show that even though ascribing an intention to an agent is an evaluation procedure that combines argumentation and explanation, it can be rationally carried out by using a practical reasoning model that accounts for the weighing of factual evidence on both sides of a disputed case.
The examples studied in this chapter involve cases where practical reasoning is used as the glue that combines argumentation with explanation. Section 1 considers a simple example of a message on the Internet advising how to mount a flagpole bracket to a house. The example tells the reader how to take the required steps to attach a bracket to the house in order to mount a flagpole, so that the reader can show his patriotism by displaying a flag on his house. The example text is clearly an instance of practical reasoning: the author of the message presumes that the reader has a goal and tells the reader how to fulfill that goal by carrying out a sequence of actions.
Logic Forms (LF) are simple, first-order logic knowledge representations of natural language sentences. Each noun, verb, adjective, adverb, pronoun, preposition and conjunction generates a predicate. LF systems usually identify the syntactic function of each constituent by means of syntactic rules, but this approach is difficult to apply to languages with highly flexible and ambiguous syntax, such as Spanish. In this study, we present a mixed method for deriving the LF of sentences in Spanish that combines hard-coded rules with a classifier inspired by semantic role labeling. The main novelty of our proposal is the way the classifier is applied to generate the predicates of the verbs, while rules are used to translate the rest of the predicates, which are more straightforward and unambiguous than the verbal ones. The proposed mixed system uses a supervised classifier to integrate syntactic and semantic information in order to help overcome the inherent ambiguity of Spanish syntax; this task is accomplished in a way similar to the semantic role labeling task. We use features extracted from the AnCora-ES corpus to train the classifier. A rule-based system is used to obtain the LF of the rest of the phrase; the rules are obtained by exploring the syntactic tree of the phrase and encoding the syntactic production rules. The LF algorithm has been evaluated using shallow parsing on some straightforward Spanish phrases. The verb argument labeling task achieves 84% precision, and the proposed mixed LFi method surpasses a rules-only system by 11%.
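To make the representation concrete, here is a minimal, hypothetical sketch of the mixed idea in Python: rules introduce a predicate for each non-verbal content word, while a (stubbed) classifier decides the arguments of the verbal predicates. The predicate notation word:POS(args), the toy tagset, and the function names are illustrative assumptions, not the system described in the abstract.

```python
# Toy Logic Form (LF) generation in the spirit of the mixed method above.
# Illustrative assumptions: Penn-style tags, word:POS(args) notation, stub classifier.

def lf_for_tokens(tokens, argument_classifier):
    """tokens: list of (word, pos) pairs; argument_classifier: callable that,
    given a verb, the tokens and the entity variables, returns the argument variables."""
    predicates, variables = [], {}
    # Rule-based part: every non-verbal content word introduces an entity variable.
    for i, (word, pos) in enumerate(tokens):
        if pos.startswith(("NN", "JJ", "RB", "PRP")):
            var = f"x{i}"
            variables[i] = var
            predicates.append(f"{word}:{pos}({var})")
    # Classifier-based part: verb predicates take an event variable plus the
    # argument variables selected by the (here, hypothetical) classifier.
    for i, (word, pos) in enumerate(tokens):
        if pos.startswith("VB"):
            args = argument_classifier(word, tokens, variables)
            predicates.append(f"{word}:{pos}(e{i}, {', '.join(args)})")
    return " & ".join(predicates)

# Trivial stand-in "classifier" that simply takes all entity variables in order:
toy = lambda verb, toks, vars_: list(vars_.values())
print(lf_for_tokens([("Juan", "NNP"), ("conduce", "VBZ"), ("coche", "NN")], toy))
# -> Juan:NNP(x0) & coche:NN(x2) & conduce:VBZ(e1, x0, x2)
```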
We investigate the problem of improving performance in distributional word similarity systems trained on sparse data, focusing on a family of similarity functions we call Dice-family functions (Dice 1945, Ecology 26(3): 297–302), including the similarity function introduced in Lin (1998, Proceedings of the 15th International Conference on Machine Learning, 296–304) and Curran (2004, PhD thesis, University of Edinburgh, School of Informatics), as well as a generalized version of the Dice coefficient used in data mining applications (Strehl 2000, 55). We propose a generalization of the Dice-family functions which uses a weight parameter α to make the similarity functions asymmetric. We show that this generalized family of functions (α systems) all belong to the class of asymmetric models first proposed in Tversky (1977, Psychological Review 84: 327–352), and in a multi-task evaluation of ten word similarity systems, we show that α systems have the best performance across word ranks. In particular, we show that α-parameterization substantially improves the correlations of all Dice-family functions with human judgements on three word sets, including the Miller–Charles/Rubenstein–Goodenough word set (Miller and Charles 1991, Language and Cognitive Processes 6(1): 1–28; Rubenstein and Goodenough 1965, Communications of the ACM 8: 627–633).
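As a rough illustration of the idea (not the article's implementation), a Tversky-style asymmetric weighting of the Dice denominator can be sketched in a few lines of Python; the exact α-parameterization evaluated in the article may differ.

```python
# Sketch of an asymmetric, Tversky-style generalisation of the Dice coefficient
# over weighted feature vectors. alpha interpolates between emphasising word1's
# feature mass (alpha = 1) and word2's (alpha = 0); alpha = 0.5 recovers a
# symmetric Dice-like measure.

def alpha_dice(feats1, feats2, alpha=0.5):
    """feats1, feats2: dicts mapping context features to non-negative weights."""
    shared = sum(min(feats1[f], feats2[f]) for f in feats1.keys() & feats2.keys())
    mass1, mass2 = sum(feats1.values()), sum(feats2.values())
    denom = alpha * mass1 + (1.0 - alpha) * mass2
    return shared / denom if denom > 0 else 0.0

# Invented toy feature vectors for illustration only.
dog = {"bark": 3.0, "pet": 2.0, "run": 1.0}
puppy = {"bark": 1.0, "pet": 2.0}
print(alpha_dice(dog, puppy, alpha=0.5))   # symmetric (Dice-like) case
print(alpha_dice(dog, puppy, alpha=0.8))   # asymmetric variant
```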
‘Deep-syntactic’ dependency structures that capture the argumentative, attributive and coordinative relations between the full words of a sentence have great potential for a number of NLP applications. Their degree of abstraction lies between the output of a syntactic dependency parser (connected trees defined over all words of a sentence and language-specific grammatical functions) and the output of a semantic parser (forests of trees defined over individual lexemes or phrasal chunks and abstract semantic role labels, which capture the frame structures of predicative elements and drop all attributive and coordinative dependencies). We propose a parser that produces deep-syntactic structures. The parser has been tested on Spanish, English and Chinese.
This article presents silhouette–attraction (Sil–Att), a simple and effective method for text clustering based on two main concepts: the silhouette coefficient and the idea of attraction. The combination of both principles yields a general technique that can be used either as a boosting method, which improves the results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that Sil–Att obtains high-quality results on text corpora with very different characteristics. Furthermore, its stable performance on all the corpora considered indicates that it is a very robust method. This is a clear advantage of Sil–Att over the other algorithms used in the experiments, whose performance depends heavily on specific characteristics of the corpora being considered.
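For readers unfamiliar with the first ingredient, the following scikit-learn sketch computes the silhouette coefficient on a toy clustering; it illustrates only the silhouette concept, not the Sil–Att algorithm itself, and all data and parameters are placeholders.

```python
# Illustrative sketch of the silhouette coefficient that Sil-Att builds on.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score

X = np.random.RandomState(0).rand(100, 20)            # stand-in for TF-IDF document vectors
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

print("mean silhouette:", silhouette_score(X, labels))
per_doc = silhouette_samples(X, labels)                # s(i) = (b(i) - a(i)) / max(a(i), b(i))
# Documents with a low or negative s(i) are weakly attached to their cluster and
# are natural candidates for reassignment by an attraction-style refinement step.
print("worst-placed documents:", np.argsort(per_doc)[:5])
```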
With this comprehensive guide you will learn how to apply Bayesian machine learning techniques systematically to solve various problems in speech and language processing. A range of statistical models is detailed, from hidden Markov models to Gaussian mixture models, n-gram models and latent topic models, along with applications including automatic speech recognition, speaker verification, and information retrieval. Approximate Bayesian inference methods based on MAP, evidence, asymptotic, VB, and MCMC approximations are provided, as well as full derivations of calculations, useful notation, formulas, and rules. The authors address the difficulties of straightforward applications and provide detailed examples and case studies to demonstrate how you can successfully use practical Bayesian inference methods to improve the performance of information systems. This is an invaluable resource for students, researchers, and industry practitioners working in machine learning, signal processing, and speech and language processing.
We propose a language-independent word normalisation method and exemplify it on modernising historical Slovene words. Our method relies on character-level statistical machine translation (CSMT) and uses only shallow knowledge. We present relevant data on historical Slovene, consisting of two (partially) manually annotated corpora and the lexicons derived from these corpora, containing historical word–modern word pairs. The two lexicons are disjoint, with one serving as the training set containing 40,000 entries, and the other as a test set with 20,000 entries. The data spans the years 1750–1900, and the lexicons are split into fifty-year slices, with all the experiments carried out separately on the three time periods. We perform two sets of experiments. In the first one – a supervised setting – we build a CSMT system using the lexicon of word pairs as training data. In the second one – an unsupervised setting – we simulate a scenario in which word pairs are not available. We propose a two-step method where we first extract a noisy list of word pairs by matching historical words with cognate modern words, and then train a CSMT system on these pairs. In both sets of experiments, we also optionally make use of a lexicon of modern words to filter the modernisation hypotheses. While we show that both methods produce significantly better results than the baselines, their accuracy, and which method works best, strongly correlate with the age of the texts, meaning that the choice of the best method will depend on the properties of the historical language which is to be modernised. As an extrinsic evaluation, we also compare the quality of part-of-speech tagging and lemmatisation directly on historical text and on its modernised words. We show that, depending on the age of the text, annotation on modernised words also produces significantly better results than annotation on the original text.
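As a minimal sketch of what character-level training data looks like (an assumed format, not the authors' pipeline), a lexicon of historical–modern word pairs can be rewritten as space-separated character sequences, so that a standard phrase-based SMT toolkit translates characters rather than words; the word pairs below are invented for illustration.

```python
# Turn a lexicon of (historical, modern) word pairs into character-level
# parallel text for a CSMT system: each word becomes a sequence of characters.

def to_char_corpus(pairs):
    src_lines, tgt_lines = [], []
    for historical, modern in pairs:
        src_lines.append(" ".join(historical))   # e.g. "p e r v i g a"
        tgt_lines.append(" ".join(modern))       # e.g. "p r v e g a"
    return src_lines, tgt_lines

# Hypothetical historical-modern Slovene pairs, for illustration only.
pairs = [("perviga", "prvega"), ("svojiga", "svojega")]
src, tgt = to_char_corpus(pairs)
print(src[0])  # p e r v i g a
print(tgt[0])  # p r v e g a
```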
With NLP services now widely available via cloud APIs, tasks like named entity recognition and sentiment analysis are virtually commodities. We look at what's on offer, and make some suggestions for how to get rich.
Ontologising is the task of associating terms in text with an ontological representation of their meaning in an ontology. In this article, we revisit algorithms that have previously been used to ontologise the arguments of semantic relations in a relationless thesaurus, resulting in a wordnet. For increased flexibility, the algorithms do not use the extraction context when selecting the most adequate synsets for each term argument. Instead, they exploit a term-based lexical network, which can be established from knowledge extracted automatically or obtained from the resource the relations are being ontologised to. Building on the latter idea, we carried out several experiments and conclude that the algorithms can be used both for wordnet creation and for wordnet enrichment. Besides describing the algorithms in some detail, we report and discuss these experiments, which target both English and Portuguese, and their results.
This chapter focuses on basic statistical models (Gaussian mixture models (GMM), hidden Markov models (HMM), n-gram models and latent topic models), which are widely used in speech and language processing. These are well-known generative models: probabilistic models that can generate speech and language features based on their likelihood functions. We also provide parameter-learning schemes based on maximum likelihood (ML) estimation, derived according to the expectation–maximization (EM) algorithm (Dempster et al. 1977). The following chapters extend these statistical models from ML schemes to Bayesian schemes. These models are fundamental for speech and language processing. We specifically build an automatic speech recognition (ASR) system based on these models and extend them to deal with different problems in speaker clustering, speaker verification, speech separation and other natural language processing systems.
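As a small, self-contained illustration of ML estimation with the EM algorithm (toy code, not the book's implementation), the following Python sketch fits a one-dimensional, two-component GMM to synthetic data.

```python
# A few EM iterations of maximum-likelihood estimation for a 1-D, 2-component GMM.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

# Initial parameters: mixture weights, means, variances.
w = np.array([0.5, 0.5]); mu = np.array([-1.0, 1.0]); var = np.array([1.0, 1.0])

for _ in range(50):
    # E-step: posterior responsibility of each component for each sample.
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the expected sufficient statistics.
    n_k = gamma.sum(axis=0)
    w = n_k / len(x)
    mu = (gamma * x[:, None]).sum(axis=0) / n_k
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / n_k

print(w, mu, var)   # should recover roughly (0.6, 0.4), (-2, 3), (1, 0.25)
```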
In this chapter, Section 3.1 first introduces the probabilistic approach to ASR, which aims to find the most likely word sequence W corresponding to the input speech feature vectors O. Bayes decision theory provides a theoretical basis for building a speech recognition system on the posterior distribution p(W|O) of the word sequence given the speech feature vectors O. The Bayes theorem then decomposes the problem based on p(W|O) into two problems based on two generative models: one of speech features, p(O|W) (the acoustic model), and one of language features, p(W) (the language model). The Bayes theorem thus changes the original problem into these two independent generative model problems.
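Written out in equations (the notation follows the chapter; this is only a restatement of the decomposition described above):

```latex
% Restatement of the Bayes decomposition above (amsmath assumed).
\begin{align}
  \hat{W} &= \operatorname*{arg\,max}_{W} p(W \mid O)
           = \operatorname*{arg\,max}_{W} \frac{p(O \mid W)\, p(W)}{p(O)}
           = \operatorname*{arg\,max}_{W}
             \underbrace{p(O \mid W)}_{\text{acoustic model}} \,
             \underbrace{p(W)}_{\text{language model}},
\end{align}
```

where the last step holds because the evidence p(O) does not depend on the word sequence W.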
Next, Section 3.2 introduces the HMM, with the corresponding likelihood function, as a generative model of speech features. The section first describes the discrete HMM, which has a multinomial distribution as its state observation distribution, and Section 3.2.4 introduces the GMM as the state observation distribution of the continuous density HMM for acoustic modeling. The GMM by itself is also used as a powerful statistical model in other speech processing approaches in later chapters. Section 3.3 presents the basic forward–backward and Viterbi algorithms. In Section 3.4, ML estimation of the HMM parameters is derived according to the EM algorithm, which deals efficiently with the latent variables included in the HMM. Thus, we provide the conventional ML treatment of basic statistical models for acoustic modeling based on the HMM.
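As an illustrative sketch of the second of these algorithms (not the book's code), the following Python function implements Viterbi decoding for a discrete HMM with initial probabilities pi, transition matrix A and observation matrix B, all of which are toy placeholders.

```python
# Viterbi decoding for a discrete HMM: most likely state sequence for an observation sequence.
import numpy as np

def viterbi(pi, A, B, obs):
    """pi: initial state probs (N,); A: transition probs (N, N);
    B: observation probs (N, M); obs: list of observation symbol indices."""
    N, T = len(pi), len(obs)
    logdelta = np.full((T, N), -np.inf)
    backptr = np.zeros((T, N), dtype=int)
    logdelta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = logdelta[t - 1][:, None] + np.log(A)     # (from_state, to_state)
        backptr[t] = scores.argmax(axis=0)
        logdelta[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(logdelta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1], logdelta[-1].max()

# Tiny two-state example with three observation symbols (all numbers invented).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi(pi, A, B, [0, 1, 2]))
```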
Maximum a-posteriori (MAP) approximation is a well-known and widely used approximation for Bayesian inference. The approximation covers all variables including model parameters Θ, latent variables Z, and classification categories C (word sequence W in the automatic speech recognition case). For example, the Viterbi algorithm (arg max_Z p(Z|O)) in the continuous density hidden Markov model (CDHMM), as discussed in Section 3.3.2, corresponds to the MAP approximation of latent variables, while the forward–backward algorithm, as discussed in Section 3.3.1, corresponds to an exact inference of these variables. As another example, the MAP decision rule (arg max_C p(C|O)) in Eq. (3.2) also corresponds to the MAP approximation of inferring the posterior distribution of classification categories. Since the final goal of automatic speech recognition is to output the word sequence, the MAP approximation of the word sequence matches the final goal. Thus, the MAP approximation can be applied to all probabilistic variables in speech and language processing as an essential technique.
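In equation form, the MAP approximation replaces a posterior distribution by a point estimate at its mode; shown here for the latent variables Z, as a restatement of the example above:

```latex
% MAP approximation of the latent-variable posterior (amsmath assumed;
% \delta denotes the Kronecker delta for discrete Z).
\begin{align}
  p(Z \mid O) \;\approx\; \delta_{Z,\hat{Z}},
  \qquad
  \hat{Z} = \operatorname*{arg\,max}_{Z} p(Z \mid O),
\end{align}
```

and analogously for Θ and C, whereas the forward–backward algorithm keeps the full posterior p(Z|O).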
This chapter starts to discuss the MAP approximation of Bayesian inference in detail, but limits the discussion only to model parameters Θ in Section 4.1. In the MAP approximation for model parameters, the prior distributions work as a regularization of these parameters, which makes the estimation of the parameters more robust than that of the maximum likelihood (ML) approach. Another interesting property of the MAP approximation for model parameters is that we can easily involve the inference of latent variables by extending the EM algorithm from ML to MAP estimation. Section 4.2 describes the general EM algorithm with the MAP approximation by following the ML-based EM algorithm, as discussed in Section 3.4. Based on the general MAP–EM algorithm, Section 4.3 provides MAP–EM solutions for CDHMM parameters, and introduces well-known applications to speaker adaptation. Section 4.5 describes the parameter smoothing method in discriminative training of the CDHMM, which actually corresponds to the MAP solution for discriminative parameter estimation. Section 4.6 focuses on the MAP estimation of GMM parameters, which is a subset of the MAP estimation of CDHMM parameters. It is used to construct speaker GMMs that are used for speaker verification. Section 4.7 provides an MAP solution for n-gram parameters that leads to one instance of interpolation smoothing, as discussed in Section 3.6.2. Finally, Section 4.8 deals with the adaptive MAP estimation of latent topic model parameters.
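To give the flavour of the results derived in Section 4.3 (a minimal sketch; the exact notation and priors in the book may differ), the MAP–EM update of a Gaussian mean under a conjugate prior N(μ0, σ²/τ) interpolates between the prior mean and the ML estimate computed from the adaptation data:

```latex
% MAP-EM re-estimation of a Gaussian mean with a conjugate prior N(mu_0, sigma^2/tau);
% gamma_t are the state/component occupation probabilities from the E-step.
\begin{align}
  \hat{\mu}^{\mathrm{MAP}}
  = \frac{\tau\, \mu_0 + \sum_{t} \gamma_t\, o_t}{\tau + \sum_{t} \gamma_t}.
\end{align}
```

With little adaptation data the estimate stays close to the prior mean μ0, and with more data it approaches the ML estimate, which is what makes MAP estimation attractive for speaker adaptation.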