Search results for Artificial Intelligence and Natural Language Processing

Editorial note
Ruslan Mitkov
Journal:

Natural Language Engineering / Volume 26 / Issue 1 / January 2020

Published online by Cambridge University Press:

27 December 2019, p. 1

NLE volume 26 issue 1 Cover and Back matter
Journal:

Natural Language Engineering / Volume 26 / Issue 1 / January 2020

Published online by Cambridge University Press:

27 December 2019, pp. b1-b2

Voice assistance in 2019
Robert Dale
Journal:

Natural Language Engineering / Volume 26 / Issue 1 / January 2020

Published online by Cambridge University Press:

27 December 2019, pp. 129-136
The end of the calendar year always seems like a good time to pause for breath and reflect on what’s been happening over the last 12 months, and that’s as true in the world of commercial NLP as it is in any other domain. In particular, 2019 has been a busy year for voice assistance, thanks to the focus placed on this area by all the major technology players. So, we take this opportunity to review a number of key themes that have defined recent developments in the commercialization of voice technology.

It all starts with entities: A Salient entity topic model
Chuan Wu, Evangelos Kanoulas, Maarten de Rijke
Journal:

Natural Language Engineering / Volume 26 / Issue 5 / September 2020

Published online by Cambridge University Press:

22 November 2019, pp. 531-549
Entities play an essential role in understanding textual documents, regardless of whether the documents are short, such as tweets, or long, such as news articles. In short textual documents, all entities mentioned are usually considered equally important because of the limited amount of information. In long textual documents, however, not all entities are equally important: some are salient and others are not. Traditional entity topic models (ETMs) focus on ways to incorporate entity information into topic models to better explain the generative process of documents. However, entities are usually treated equally, without considering whether they are salient or not. In this work, we propose a novel ETM, Salient Entity Topic Model, to take salient entities into consideration in the document generation process. In particular, we model salient entities as a source of topics used to generate words in documents, in addition to the topic distribution of documents used in traditional topic models. Qualitative and quantitative analysis is performed on the proposed model. Application to entity salience detection demonstrates the effectiveness of our model compared to the state-of-the-art topic model baselines.

Horacio Saggion, Automatic Text Simplification. Synthesis lectures on human language technologies, April 2017. 137 pages, ISBN:1627058680 9781627058681
Carolina Scarton
Journal:

Natural Language Engineering / Volume 26 / Issue 4 / July 2020

Published online by Cambridge University Press:

18 November 2019, pp. 489-492

Keyword extraction: Issues and methods
Nazanin Firoozeh, Adeline Nazarenko, Fabrice Alizon, Béatrice Daille
Journal:

Natural Language Engineering / Volume 26 / Issue 3 / May 2020

Published online by Cambridge University Press:

11 November 2019, pp. 259-291
Due to the considerable growth of the volume of text documents on the Internet and in digital libraries, manual analysis of these documents is no longer feasible. Having efficient approaches to keyword extraction in order to retrieve the ‘key’ elements of the studied documents is now a necessity. Keyword extraction has been an active research field for many years, covering various applications in Text Mining, Information Retrieval, and Natural Language Processing, and meeting different requirements. However, it is not a unified domain of research. In spite of the existence of many approaches in the field, there is no single approach that effectively extracts keywords from different data sources. This shows the importance of having a comprehensive review, which discusses the complexity of the task and categorizes the main approaches of the field based on the features and methods of extraction that they use. This paper presents a general introduction to the field of keyword/keyphrase extraction. Unlike the existing surveys, different aspects of the problem along with the main challenges in the field are discussed. This mainly includes the unclear definition of ‘keyness’, complexities of targeting proper features for capturing desired keyness properties and selecting efficient extraction methods, and also the evaluation issues. By classifying a broad range of state-of-the-art approaches and analysing the benefits and drawbacks of different features and methods, we provide a clearer picture of them. This review is intended to help readers find their way around all the works related to keyword extraction and guide them in choosing or designing a method that is appropriate for the application they are targeting.

Deep Learning in Natural Language Processing, edited by Li Deng and Yang Liu. Singapore: Springer, 2018. ISBN 9789811052088. XVII + 329 pages
Haoda Feng, Feng Shi
Journal:

Natural Language Engineering / Volume 27 / Issue 3 / May 2021

Published online by Cambridge University Press:

08 November 2019, pp. 373-375

NLE volume 25 issue 6 Cover and Front matter
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

11 October 2019, pp. f1-f2

Editorial note
Ruslan Mitkov
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

11 October 2019, p. 675

NLE volume 25 issue 6 Cover and Back matter
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

11 October 2019, pp. b1-b2

Mining, analyzing, and modeling text written on mobile devices
K. Vertanen, P.O. Kristensson
Journal:

Natural Language Engineering / Volume 27 / Issue 1 / January 2021

Published online by Cambridge University Press:

10 October 2019, pp. 1-33
We present a method for mining the web for text entered on mobile devices. Using searching, crawling, and parsing techniques, we locate text that can be reliably identified as originating from 300 mobile devices. This includes 341,000 sentences written on iPhones alone. Our data enables a richer understanding of how users type “in the wild” on their mobile devices. We compare text and error characteristics of different device types, such as touchscreen phones, phones with physical keyboards, and tablet computers. Using our mined data, we train language models and evaluate these models on mobile test data. A mixture model trained on our mined data, Twitter, blog, and forum data predicts mobile text better than baseline models. Using phone and smartwatch typing data from 135 users, we demonstrate our models improve the recognition accuracy and word predictions of a state-of-the-art touchscreen virtual keyboard decoder. Finally, we make our language models and mined dataset available to other researchers.

Five Tips for a Successful API
Robert Dale
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

26 September 2019, pp. 769-772
It’s now remarkably easy to release to the world a cloud-based application programming interface (API) that provides some software function as a service. As a consequence, the cloud API space has become very densely populated, so that even if a particular API offers a service whose potential value is considerable, there are many other factors that play a role in determining whether or not that API will be commercially successful. If you’re thinking about entering the API marketplace with your latest and greatest idea, this post offers some entirely subjective advice on how you might increase the chances of your offering not being lost in all the noise.

Uncovering the language of wine experts
Ilja Croijmans, Iris Hendrickx, Els Lefever, Asifa Majid, Antal Van Den Bosch
Journal:

Natural Language Engineering / Volume 26 / Issue 5 / September 2020

Published online by Cambridge University Press:

23 September 2019, pp. 511-530
Talking about odors and flavors is difficult for most people, yet experts appear to be able to convey critical information about wines in their reviews. This seems to be a contradiction, and wine expert descriptions are frequently received with criticism. Here, we propose a method for probing the language of wine reviews, and thus offer a means to enhance current vocabularies, and as a by-product question the general assumption that wine reviews are gibberish. By means of two different quantitative analyses—support vector machines for classification and Termhood analysis—on a corpus of online wine reviews, we tested whether wine reviews are written in a consistent manner, and thus may be considered informative; and whether reviews feature domain-specific language. First, a classification paradigm was trained on wine reviews from one set of authors for which the color, grape variety, and origin of a wine were known, and subsequently tested on data from a new author. This analysis revealed that, regardless of individual differences in vocabulary preferences, color and grape variety were predicted with high accuracy. Second, using Termhood as a measure of how words are used in wine reviews in a domain-specific manner compared to other genres in English, a list of 146 wine-specific terms was uncovered. These words were compared to existing lists of wine vocabulary that are currently used to train experts. Some overlap was observed, but there were also gaps revealed in the extant lists, suggesting these lists could be improved by our automatic analysis.

Twenty-five years of information extraction
Ralph Grishman
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

20 September 2019, pp. 677-692
Information extraction is the process of converting unstructured text into a structured database containing selected information from the text. It is an essential step in making the information content of the text usable for further processing. In this paper, we describe how information extraction has changed over the past 25 years, moving from hand-coded rules to neural networks, with a few stops on the way. We connect these changes to research advances in NLP and to the evaluations organized by the US Government.

Automatic summarisation: 25 years On
Constantin Orăsan
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

19 September 2019, pp. 735-751
Automatic text summarisation is a topic that has been receiving attention from the research community from the early days of computational linguistics, but it really took off around 25 years ago. This article presents the main developments from the last 25 years. It starts by defining what a summary is and how its definition changed over time as a result of the interest in processing new types of documents. The article continues with a brief history of the field and highlights the main challenges posed by the evaluation of summaries. The article finishes with some thoughts about the future of the field.

Word sense disambiguation using implicit information
Goonjan Jain, D.K. Lobiyal
Journal:

Natural Language Engineering / Volume 26 / Issue 4 / July 2020

Published online by Cambridge University Press:

13 September 2019, pp. 413-432
Humans proficiently interpret the true sense of an ambiguous word by establishing associations among the words in a sentence. The complete sense of a text is also based on implicit information, which is not explicitly mentioned. The absence of this implicit information is a significant problem for a computer program that attempts to determine the correct sense of ambiguous words. In this paper, we propose a novel method to uncover the implicit information that links the words of a sentence. We reveal this implicit information using a graph, which is then used to disambiguate the ambiguous word. The experiments show that the proposed algorithm interprets the correct sense for both homonyms and polysemous words. Our proposed algorithm has performed better than the approaches presented in the SemEval-2013 task for word sense disambiguation and has shown an accuracy of 79.6 percent, which is 2.5 percent better than the best unsupervised approach in SemEval-2007.

Discovering multiword expressions
Aline Villavicencio, Marco Idiart
Journal:

Natural Language Engineering / Volume 25 / Issue 6 / November 2019

Published online by Cambridge University Press:

11 September 2019, pp. 715-733
In this paper, we provide an overview of research on multiword expressions (MWEs), from a natural language processing perspective. We examine methods developed for modelling MWEs that capture some of their linguistic properties, discussing their use for MWE discovery and for idiomaticity detection. We concentrate on their collocational and contextual preferences, along with their fixedness in terms of canonical forms and their lack of word-for-word translatability. We also discuss a sample of the MWE resources that have been used in intrinsic evaluation setups for these methods.

How to evaluate machine translation: A review of automated and human metrics
Eirini Chatzikoumi
Journal:

Natural Language Engineering / Volume 26 / Issue 2 / March 2020

Published online by Cambridge University Press:

11 September 2019, pp. 137-161
This article presents the most up-to-date, influential automated, semiautomated and human metrics used to evaluate the quality of machine translation (MT) output and provides the necessary background for MT evaluation projects. Evaluation is, as repeatedly admitted, highly relevant for the improvement of MT. This article is divided into three parts: the first one is dedicated to automated metrics; the second, to human metrics; and the last, to the challenges posed by neural machine translation (NMT) regarding its evaluation. The first part includes reference translation–based metrics; confidence or quality estimation (QE) metrics, which are used as alternatives for quality assessment; and diagnostic evaluation based on linguistic checkpoints. Human evaluation metrics are classified according to the criterion of whether human judges directly express a so-called subjective evaluation judgment, such as ‘good’ or ‘better than’, or not, as is the case in error classification. The former methods are based on directly expressed judgment (DEJ); therefore, they are called ‘DEJ-based evaluation methods’, while the latter are called ‘non-DEJ-based evaluation methods’. In the DEJ-based evaluation section, tasks such as fluency and adequacy annotation, ranking and direct assessment (DA) are presented, whereas in the non-DEJ-based evaluation section, tasks such as error classification and postediting are detailed, with definitions and guidelines, thus rendering this article a useful guide for evaluation projects. Following the detailed presentation of the previously mentioned metrics, the specificities of NMT are set forth along with suggestions for its evaluation, according to the latest studies. As human translators are the most adequate judges of the quality of a translation, emphasis is placed on the human metrics seen from a translator-judge perspective to provide useful methodology tools for interdisciplinary research groups that evaluate MT systems.

Learning keyphrases from corpora and knowledge models
R. Silveira, V. Furtado, V. Pinheiro
Journal:

Natural Language Engineering / Volume 26 / Issue 3 / May 2020

Published online by Cambridge University Press:

10 September 2019, pp. 293-318
Keyphrase extraction systems traditionally use classification algorithms and do not consider the fact that part of the keyphrases may not be found in the text, reducing the accuracy of such algorithms a priori. In this work, we propose to improve the accuracy of these systems with inferential mechanisms that use a knowledge representation model, including symbolic models of knowledge bases and distributional semantics, to expand the set of keyphrase candidates to be submitted to the classification algorithm with terms that are not in the text (not-in-text terms). Our basic assumption is that not-in-text terms have a semantic relationship with terms that are in the text. To represent this relationship, we have defined two new features to be used as input to the classification algorithms. The first feature refers to the power of discrimination of the inferred not-in-text terms. The intuition behind this is that good candidates for a keyphrase are those that are deduced from various textual terms in a specific document and that are not often deduced in other documents. The other feature represents the descriptive strength of a not-in-text candidate. We argue that not-in-text keyphrases must have a strong semantic relationship with the text and that the strength of this semantic relationship can be measured in a similar way to popular metrics like TFxIDF. The method proposed in this work was compared with state-of-the-art systems using five corpora, and the results show that it significantly improves automatic keyphrase extraction, addressing the limitation of extracting keyphrases absent from the text.
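For readers unfamiliar with the TFxIDF metric mentioned in the abstract above, the following is a minimal sketch of one common TF-IDF variant (relative term frequency times the log of inverse document frequency). It is not the authors' discrimination or descriptive-strength feature, which the abstract does not specify; the function name `tf_idf` and the toy documents are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Hypothetical helper: weight = (count / doc length) * log(N / df),
    one of several common TF-IDF formulations.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Toy corpus (illustrative only, not data from the paper).
docs = [
    "the wine has a fruity aroma".split(),
    "the review praises the wine".split(),
    "the keyboard decoder predicts words".split(),
]
scores = tf_idf(docs)
```

Under this formulation a term that occurs in every document (here, "the") receives a weight of zero, while a term confined to a single document is weighted most heavily, which mirrors the intuition in the abstract that good candidates are strongly tied to one document and rarely associated with others.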

NLE volume 25 issue 5 Cover and Back matter
Journal:

Natural Language Engineering / Volume 25 / Issue 5 / September 2019

Published online by Cambridge University Press:

09 September 2019, pp. b1-b2
