Graph structures naturally model connections. In natural language processing (NLP), connections are ubiquitous, at anything from small to web scale. We find them between words – as grammatical, collocational or semantic relations – contributing to the overall meaning and maintaining the cohesive structure of the text and the unity of the discourse. We find them between concepts in ontologies and other knowledge repositories: since the early days of artificial intelligence, associative or semantic networks have been proposed and used as knowledge stores, because they naturally capture language units and the relations between them, and support a variety of inference and reasoning processes, simulating some of the functionalities of the human mind. We find them between complete texts or web pages, and between entities in a social network, where they model relations at web scale. Beyond the more commonly encountered ‘regular’ graphs, hypergraphs have also appeared in our field, modeling relations among more than two units.
Graphs are a powerful representation formalism that can be applied to many aspects of language processing. We provide an overview of how natural language processing problems have been projected into the graph framework, focusing in particular on graph construction – a crucial step in modeling the data so as to emphasize the targeted phenomena.
In this work, we present a novel type of graph for natural language processing (NLP): textual entailment graphs (TEGs). We describe the complete methodology we developed for constructing such graphs and provide baselines for this task by evaluating relevant state-of-the-art technology. We situate our research in the context of text exploration, since it was motivated by joint work with industrial partners in the text analytics area. Accordingly, we present our motivating scenario and the first gold-standard dataset of TEGs. However, while our own motivation and the dataset focus on the text exploration setting, we suggest that TEGs have other uses as well, and that the automatic creation of such graphs is an interesting task for the community.
The emergence of knowledge repositories in a variety of domains provides a valuable opportunity for the semantic interpretation of high-dimensional datasets. Previous research has investigated the use of concepts instead of words as the core semantic features for incorporating knowledge from an ontology into the representation of documents. In machine learning and information retrieval, however, data objects are represented as flat feature vectors. The inconsistency between the structural nature of knowledge repositories and this flat representation of features has led researchers to neglect the structure of the knowledge base and to treat concepts as isolated semantic features, an approach known as bag-of-concepts. Although using concepts has some advantages over words, neglecting the relations between concepts leaves the problem of vocabulary mismatch in force. In this paper, a novel semantic kernel is proposed that is capable of incorporating the relatedness between conceptual features. This kernel leverages clique theory to map data objects to a novel feature space in which complex data objects become comparable. The proposed kernel is relevant to any application with prior knowledge about the relatedness between features. We concentrate on representing text documents and words using Wikipedia and WordNet, respectively. Experimental results over a set of benchmark datasets reveal that the proposed kernel significantly improves the representation of both words and texts in the application of semantic relatedness.
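To make the general idea behind such kernels concrete – though not the paper's clique-based construction – a minimal semantic smoothing kernel of the form k(x, y) = xᵀSy compares bag-of-concepts vectors through a relatedness matrix S, so that documents sharing no concept directly can still score as similar. The concept inventory and relatedness values below are invented for illustration:

```python
# Minimal sketch of a semantic smoothing kernel, k(x, y) = x^T S y,
# where S encodes pairwise concept relatedness. The concepts and the
# relatedness values are illustrative assumptions, not from the paper.

concepts = ["car", "automobile", "bank", "finance"]

# S[i][j]: relatedness between concept i and concept j (1.0 on the diagonal).
S = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.7],
    [0.0, 0.0, 0.7, 1.0],
]

def semantic_kernel(x, y, S):
    """Compute x^T S y for two bag-of-concepts vectors x and y."""
    n = len(x)
    return sum(x[i] * S[i][j] * y[j] for i in range(n) for j in range(n))

# Two documents with no directly shared concept ...
doc1 = [1, 0, 0, 0]   # mentions "car"
doc2 = [0, 1, 0, 0]   # mentions "automobile"

# ... still come out related, because S links their concepts.
print(semantic_kernel(doc1, doc2, S))  # 0.9
```

Under a plain dot product (S = identity) the two documents would score 0; the off-diagonal entries of S are what mitigate the vocabulary-mismatch problem the abstract describes.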
In this paper, we present a new method based on co-occurrence graphs for performing Cross-Lingual Word Sense Disambiguation (CLWSD). The proposed approach comprises the automatic generation of bilingual dictionaries, and a new technique for the construction of a co-occurrence graph used to select the most suitable translations from the dictionary. Different algorithms that combine both the dictionary and the co-occurrence graph are then used to select the final translations: techniques based on sub-graphs (communities) containing clusters of words with related meanings, on distances between nodes representing words, and on the relative importance of each node in the whole graph. The initial output of the system is enhanced with translation probabilities provided by a statistical bilingual dictionary. The system is evaluated on datasets from two competitions: task 3 of SemEval 2010 and task 10 of SemEval 2013. Results obtained by the different disambiguation techniques are analysed and compared to those obtained by the systems participating in the competitions. Our system achieves the best results among unsupervised systems in most of the experiments, and even outperforms supervised systems in some cases.
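A minimal sketch of the co-occurrence-graph idea (not the paper's full system, which adds bilingual dictionaries, communities, and distance measures): build a graph whose edge weights count sentence-level co-occurrences, then use a node-importance measure – here simple weighted degree, standing in for more elaborate centrality scores – to prefer one candidate translation over another. The toy corpus and candidate words are illustrative:

```python
# Sketch: a word co-occurrence graph over a toy corpus, with candidate
# translations ranked by weighted degree (a simple node-importance proxy).
# Corpus, candidates, and scoring are illustrative assumptions.

from collections import defaultdict
from itertools import combinations

sentences = [
    ["river", "bank", "water"],
    ["bank", "money", "loan"],
    ["money", "loan", "interest"],
]

# Edge weight = number of sentences in which both words co-occur.
graph = defaultdict(lambda: defaultdict(int))
for sent in sentences:
    for u, v in combinations(sorted(set(sent)), 2):
        graph[u][v] += 1
        graph[v][u] += 1

def weighted_degree(word):
    """Sum of edge weights incident to a node."""
    return sum(graph[word].values())

# Prefer the candidate that is more central in the co-occurrence graph.
candidates = ["money", "water"]
best = max(candidates, key=weighted_degree)
print(best)  # "money": degree 4 vs. 2 for "water"
```

In a real CLWSD setting the candidates would come from a bilingual dictionary, and importance could be computed with PageRank or community structure rather than raw degree; the graph-construction step, however, follows the same pattern.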
In this paper, we propose an unsupervised and automated method to identify noun sense changes based on rigorous analysis of time-varying text data available in the form of millions of digitized books and millions of tweets posted per day. We construct distributional-thesauri-based networks from data at different time points and cluster each of them separately to obtain word-centric sense clusters corresponding to the different time points. Subsequently, we propose a split/join based approach to compare the sense clusters at two different time points to find if there is ‘birth’ of a new sense. The approach also helps us to find if an older sense was ‘split’ into more than one sense, a newer sense has been formed from the ‘join’ of older senses, or a particular sense has undergone ‘death’. We use this completely unsupervised approach (a) within the Google books data to identify word sense differences within a single medium, and (b) across Google books and Twitter data to identify differences in word sense distribution across media. We conduct a thorough evaluation of the proposed methodology, both manually and through comparison with WordNet.
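The split/join comparison can be sketched as cluster-overlap bookkeeping between two time points: a new cluster with no sufficiently overlapping old cluster signals a ‘birth’, one with several parents a ‘join’, an old cluster with several children a ‘split’, and one with none a ‘death’. The clusters, overlap measure, and threshold below are illustrative assumptions, not the paper's exact procedure:

```python
# Sketch of split/join detection between sense clusters at two time points.
# Clusters are sets of neighbour words; overlap measure and threshold are
# illustrative assumptions.

def overlap(a, b):
    """Fraction of the smaller cluster shared with the other."""
    return len(a & b) / (min(len(a), len(b)) or 1)

def compare_senses(old, new, threshold=0.5):
    """Label sense-change events between two lists of sense clusters."""
    events = []
    for nc in new:
        parents = [oc for oc in old if overlap(oc, nc) >= threshold]
        if not parents:
            events.append(("birth", nc))      # no ancestor: new sense
        elif len(parents) > 1:
            events.append(("join", nc))       # several ancestors merged
    for oc in old:
        children = [nc for nc in new if overlap(oc, nc) >= threshold]
        if not children:
            events.append(("death", oc))      # no descendant: sense lost
        elif len(children) > 1:
            events.append(("split", oc))      # one ancestor, several heirs
    return events

# Toy example: "mouse" gains a computing-related sense at the later time.
old = [{"rodent", "rat", "animal"}]
new = [{"rodent", "rat"}, {"computer", "click", "device"}]
print(compare_senses(old, new))
```

The stable rodent sense survives unchanged, while the computing cluster has no parent at the earlier time point and is therefore reported as a ‘birth’.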