Abstract meaning representation (AMR) is a graph-based sentence-level meaning representation that has become highly popular in recent years. AMR is a knowledge-based meaning representation that relies heavily on frame semantics for linking predicate frames and on entity knowledge bases such as DBpedia for linking named entity concepts. Although it was originally designed for English, it can be adapted to non-English languages by defining language-specific divergences and representations. This article introduces the first AMR representation framework for Turkish, a language that poses diverse challenges for AMR owing to its typological differences from English: it is agglutinative, has free constituent order, and is morphologically very rich, resulting in fewer surface word forms per sentence. The solutions introduced for these peculiarities are expected to guide studies on similar languages and to speed up the construction of a cross-lingual universal AMR framework. Besides this main contribution, the article also presents the construction of the first Turkish AMR corpus of 700 sentences, the first Turkish AMR parser (a tree-to-graph rule-based AMR parser) used for semi-automatic annotation, and the evaluation of the introduced resources.
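For readers unfamiliar with the formalism, the sketch below shows a generic AMR graph in PENMAN notation and parses it programmatically. It uses the canonical "The boy wants to go" example from the AMR specification, not an example from the article, and relies on the third-party penman Python package; the Turkish-specific representations discussed in the article are not reproduced here.

```python
# Minimal illustration of an AMR graph in PENMAN notation (generic example,
# not taken from the article). Requires the third-party `penman` package
# (pip install penman).
import penman

amr_text = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b))
"""

graph = penman.decode(amr_text)   # parse PENMAN text into a graph object
print(graph.instances())          # concept nodes: want-01, boy, go-02
print(graph.edges())              # relations :ARG0 / :ARG1, with b reentrant
```

The reentrant variable b (the boy is the wanter and the goer) is what makes AMR a graph rather than a tree, and it is exactly this tree-to-graph mapping that a rule-based tree-to-graph parser must produce.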
Recent years have seen a growing number of publications that analyse Natural Language Understanding (NLU) datasets for superficial cues, examining whether such cues undermine the complexity of the tasks underlying those datasets and how they affect the models that are optimised and evaluated on this data. This structured survey provides an overview of this evolving research area by categorising weaknesses reported in models and datasets, together with the methods proposed to reveal and alleviate those weaknesses, for the English language. We summarise and discuss the findings and conclude with a set of recommendations for possible future research directions. We hope that the survey will be a useful resource for researchers who propose new datasets, to assess the suitability and quality of their data for evaluating various phenomena of interest, as well as for those who propose novel NLU approaches, to further understand the implications of their improvements with respect to their model's acquired capabilities.
On social media, new forms of communication arise rapidly, many of which are intense, dispersed, and create new communities at a global scale. Such communities can act as distinct information bubbles with their own perspective on the world, and it is difficult for people to find and monitor all these perspectives and relate the different claims made. Within this digital jungle of perspectives on truth, it is difficult to make informed decisions on important things like vaccinations, democracy, and climate change. Understanding and modeling this phenomenon in its full complexity requires an interdisciplinary approach, utilizing the ample data provided by digital communication to offer new insights and opportunities. This interdisciplinary book gives a comprehensive view on social media communication, the different forms it takes, the impact and the technology used to mine it, and defines the roadmap to a more transparent Web.
In the past few years, high-quality automated text-to-speech synthesis has effectively become a commodity, with easy access to cloud-based APIs provided by a number of major players. At the same time, developments in deep learning have broadened the scope of voice synthesis functionalities that can be delivered, leading to a growth in the range of commercially viable use cases. We take a look at the technology features and use cases that have attracted attention and investment in the past few years, identifying the major players and recent start-ups in the space.
We propose a novel approach for sentence boundary detection in text datasets in which boundaries are not evident (e.g., sentence fragments). Although detecting sentence boundaries without punctuation marks has rarely been explored in written text, current real-world textual data suffer from a widespread lack of proper start/stop signaling. Herein, we annotate a dataset with linguistic information, such as part-of-speech and named entity labels, to boost the sentence boundary detection task. In our experiments, we obtained F1 scores of up to 98.07% using the proposed multitask neural model, including a score of 89.41% for sentences completely lacking punctuation marks. We also present an ablation study and provide a detailed analysis to demonstrate the effectiveness of the proposed multitask learning method.
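The abstract does not describe the architecture beyond calling it a multitask neural model, so the sketch below is only a generic illustration of the idea: a shared encoder feeding both a boundary-detection head and an auxiliary part-of-speech head. The BiLSTM encoder, layer sizes, and label counts are assumptions made for the example, not the authors' design.

```python
# A minimal sketch (not the authors' architecture) of multitask token
# classification for sentence boundary detection: a shared BiLSTM encoder
# with one head predicting whether each token ends a sentence and one head
# predicting an auxiliary label such as part of speech.
import torch
import torch.nn as nn

class MultitaskSBD(nn.Module):
    def __init__(self, vocab_size, n_pos_tags, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.boundary_head = nn.Linear(2 * hidden, 2)      # does this token end a sentence?
        self.pos_head = nn.Linear(2 * hidden, n_pos_tags)  # auxiliary POS-tagging task

    def forward(self, token_ids):
        h, _ = self.encoder(self.emb(token_ids))
        return self.boundary_head(h), self.pos_head(h)

# Toy usage: the joint loss would be a weighted sum of per-token cross-entropies.
model = MultitaskSBD(vocab_size=10_000, n_pos_tags=17)
tokens = torch.randint(0, 10_000, (2, 30))            # toy batch of 2 x 30 token ids
boundary_logits, pos_logits = model(tokens)
print(boundary_logits.shape, pos_logits.shape)        # (2, 30, 2) and (2, 30, 17)
```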
Word embeddings have become important building blocks that are used extensively in natural language processing (NLP). Despite their several advantages, word embeddings can unintentionally encode gender- and ethnicity-based biases present in the corpora they are trained on. Ethical concerns have therefore been raised, since word embeddings are extensively used in several high-level algorithms. Studying such biases and debiasing embeddings has recently become an important research endeavor. Various studies have been conducted to measure the extent of bias that word embeddings capture and to eradicate it. Concurrently, as another subfield that has started to gain traction recently, the applications of NLP in the field of law have started to increase and develop rapidly. As law has a direct and profound effect on people's lives, bias issues for NLP applications in the legal domain are certainly important. However, to the best of our knowledge, bias issues have not yet been studied in the context of legal corpora. In this article, we approach the gender bias problem from the perspective of legal text processing. Word embedding models trained on corpora composed of legal documents and legislation from different countries have been used to measure and eliminate gender bias in legal documents. Several methods have been employed to reveal the degree of gender bias and observe its variation across countries. Moreover, a debiasing method has been used to neutralize unwanted bias. The preservation of semantic coherence of the debiased vector space has also been demonstrated using high-level tasks. Finally, the overall results and their implications are discussed in the scope of NLP in the legal domain.
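The abstract does not name the measurement or debiasing method used. The sketch below illustrates one widely used family of techniques, projection-based ("hard") debiasing in the spirit of Bolukbasi et al. (2016): a gender direction is estimated from definitional word pairs and removed from other vectors. The word list and the random vectors are purely illustrative stand-ins for trained legal-domain embeddings.

```python
# A minimal sketch of projection-based ("hard") debiasing; the article may
# use a different procedure. Toy random vectors stand in for real embeddings.
import numpy as np

def gender_direction(emb, pairs=(("he", "she"), ("man", "woman"))):
    """Estimate a gender direction as the mean difference of definitional pairs."""
    diffs = [emb[a] - emb[b] for a, b in pairs if a in emb and b in emb]
    d = np.mean(diffs, axis=0)
    return d / np.linalg.norm(d)

def debias(vec, direction):
    """Remove the component of `vec` that lies along the bias direction."""
    return vec - np.dot(vec, direction) * direction

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["he", "she", "man", "woman", "judge"]}
d = gender_direction(emb)
emb["judge"] = debias(emb["judge"], d)
print(np.dot(emb["judge"], d))   # ~0: no remaining projection onto the gender direction
```

The projection of a profession word onto the gender direction before debiasing is also a common proxy for measuring how strongly the embedding associates that word with one gender.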
Arabic presents many challenges for automatic processing. Although several research studies have addressed some issues, electronic resources for processing Arabic remain relatively rare or not widely available. In this paper, we propose a tree-adjoining grammar with a syntax-semantic interface. It is applied to Modern Standard Arabic, but it can easily be adapted to other languages. This grammar, named "ArabTAG V2.0" (Arabic Tree-Adjoining Grammar), is semi-automatically generated by means of an abstract representation called a meta-grammar. To support its development, ArabTAG V2.0 benefits from a grammar testing environment that uses a corpus of linguistic phenomena. Further experiments were performed to check the coverage of the grammar as well as the syntax-semantic analysis. The results showed that ArabTAG V2.0 can cover the majority of syntactic structures and a range of linguistic phenomena with a precision rate of 88.76%. Moreover, we were able to semantically analyze sentences and build their semantic representations with a precision rate of about 95.63%.
Named entities (NEs) are among the most relevant types of information that can be used to properly index digital documents and thus easily retrieve them. It has long been observed that NEs are key to accessing the contents of digital library portals, as they are contained in most user queries. However, most digitized documents are indexed through their optical character recognition (OCR) output, which includes numerous errors. Although OCR engines have improved considerably over the last few years, OCR errors still significantly impact document access. Previous work evaluated the impact of OCR errors on named entity recognition (NER) and named entity linking (NEL) separately. In this article, we experimented with a variety of OCRed documents with different levels and types of OCR noise to assess in depth the impact of OCR on named entity processing. We provide a detailed analysis of the OCR errors that affect the performance of NER and NEL. We then present the resulting exhaustive study and subsequent recommendations on the suitable document types, OCR quality levels, and post-OCR correction strategies required to perform reliable NER and NEL.
We present novel methods for detecting lexical entailment in a fully rule-based and explainable fashion, by automatic construction of semantic graphs, in any language for which a crowd-sourced dictionary with sufficient coverage and a dependency parser of sufficient accuracy are available. We experiment and evaluate on both the SemEval-2020 lexical entailment task (Glavaš et al. (2020). Proceedings of the Fourteenth Workshop on Semantic Evaluation, pp. 24–35) and the SherLIiC lexical inference dataset of typed predicates (Schmitt and Schütze (2019). Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 902–914). Combined with top-performing systems, our method achieves improvements over the previous state-of-the-art on both benchmarks. As a standalone system, it offers a fully interpretable model of lexical entailment that makes detailed error analysis possible, uncovering future directions for improving both the semantic parsing method and the inference process on semantic graphs. We release all components of our system as open source software.
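The authors' graphs are built from dictionary definitions and dependency parses; the toy snippet below only conveys the general intuition behind graph-based entailment checks, namely that entailment can be approximated as containment of one labelled graph in another. The IS_A triples and the containment rule are made up for illustration and are not the system's actual graph construction or inference procedure.

```python
# A toy illustration (not the authors' system) of graph-based lexical
# entailment: predict entailment if every labelled edge of the hypothesis
# graph also appears in the premise graph.
import networkx as nx

def build_graph(triples):
    g = nx.DiGraph()
    for head, rel, dep in triples:
        g.add_edge(head, dep, rel=rel)
    return g

def entails(premise, hypothesis):
    """Naive containment check over labelled directed edges."""
    return all(
        premise.has_edge(u, v) and premise[u][v]["rel"] == d["rel"]
        for u, v, d in hypothesis.edges(data=True))

premise = build_graph([("terrier", "IS_A", "dog"), ("dog", "IS_A", "animal")])
hypothesis = build_graph([("dog", "IS_A", "animal")])
print(entails(premise, hypothesis))   # True: hypothesis edges are contained in the premise
```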
This paper presents MHeTRep, a multilingual medical terminology, and the methodology followed for its compilation. The multilingual terminology is organised into one vocabulary per language. All the terms in the collection are semantically tagged with a tagset corresponding to the top categories of the SNOMED CT ontology. Where possible, individual terms are linked to their equivalents in the other languages. Even though many NLP resources and tools claim to be domain independent, their performance degrades notably when they are applied to tasks outside the domains for which they were built, so tuning to the new environment is needed. Having a domain terminology usually facilitates and accelerates the adaptation of general-domain NLP applications to a new domain. This is particularly important in medicine, a domain undergoing great expansion. The proposed method takes SNOMED CT as its starting point. From this point, using 13 multilingual resources covering the most relevant medical concepts, such as drugs, anatomy, clinical findings, and procedures, we built a large resource covering seven languages and totalling more than two million semantically tagged terms. The resulting collection has been intensively evaluated in several ways for the languages and domain categories involved. Our hypothesis is that MHeTRep can be used advantageously over the original resources for a number of NLP use cases and can likely be extended to other languages.