Natural Language Engineering: Volume 23

One, no one and one hundred thousand events: Defining and processing events in an inter-disciplinary perspective *
R. SPRUGNOLI, S. TONELLI
Published online by Cambridge University Press: 25 October 2016, pp. 485-506
- Article
We present an overview of event definition and processing spanning 25 years of research in NLP. We first provide linguistic background to the notion of event, and then present past attempts to formalize this concept in annotation standards to foster the development of benchmarks for event extraction systems. These range from MUC-3 in 1991 to the Time and Space Track challenge at SemEval 2015. We also shed light on other disciplines in which the notion of event plays a crucial role, with a focus on the historical domain. Our goal is to provide a comprehensive study of event definitions and to investigate what potential past efforts in the NLP community may have in a different research domain. We present the results of a questionnaire in which the notion of event for historians is put in relation to the NLP perspective.

Multilingual native language identification
SHERVIN MALMASI, MARK DRAS
Published online by Cambridge University Press: 02 December 2015, pp. 163-215
- Article
We present the first comprehensive study of Native Language Identification (NLI) applied to text written in languages other than English, using data from six languages. NLI is the task of predicting an author’s first language using only their writings in a second language, with applications in Second Language Acquisition and forensic linguistics. Most research to date has focused on English but there is a need to apply NLI to other languages, not only to gauge its applicability but also to aid in teaching research for other emerging languages. With this goal, we identify six typologically very different sources of non-English second language data and conduct six experiments using a set of commonly used features. Our first two experiments evaluate our features and corpora, showing that the features perform well and at similar rates across languages. The third experiment compares non-native and native control data, showing that they can be discerned with 95 per cent accuracy. Our fourth experiment provides a cross-linguistic assessment of how the degree of syntactic data encoded in part-of-speech tags affects their efficiency as classification features, finding that most differences between first language groups lie in the ordering of the most basic word categories. We also tackle two questions that have not previously been addressed for NLI. Other work in NLI has shown that ensembles of classifiers over feature types work well and in our final experiment we use such an oracle classifier to derive an upper limit for classification accuracy with our feature set. We also present an analysis examining feature diversity, aiming to estimate the degree of overlap and complementarity between our chosen features employing an association measure for binary data. Finally, we conclude with a general discussion and outline directions for future work.

Editorial note
RUSLAN MITKOV
Published online by Cambridge University Press: 16 December 2016, pp. 1-2
- Article
In one of my previous editorial notes I promised that the positive developments of the Journal of Natural Language Engineering (JNLE) would be a continuous and common practice. I am proud to report that I have been able to keep this promise. JNLE has enjoyed another very successful year. The impact factor of the journal increased for the second consecutive year, with the journal listed in both the Linguistics and Computer Science categories. From 2016 onwards, JNLE is offering six 160-page issues per year, which by far exceeds the four 96-page issues from less than 10 years ago!

Generating natural language descriptions using speaker-dependent information †
THIAGO C. FERREIRA, IVANDRÉ PARABONI
Published online by Cambridge University Press: 27 February 2017, pp. 813-834
- Article
This paper discusses the issue of human variation in natural language referring expression generation. We introduce a model of content selection that takes speaker-dependent information into account to produce descriptions that closely resemble those produced by each individual, as seen in a number of reference corpora. Results show that our speaker-dependent referring expression generation model outperforms alternatives that do not take human variation into account, or which do so less extensively, and suggest that the use of machine-learning methods may be an ideal approach to mimic complex referential behaviour.

Deep-neural network approaches for speech recognition with heterogeneous groups of speakers including children †
ROMAIN SERIZEL, DIEGO GIULIANI
Published online by Cambridge University Press: 12 April 2016, pp. 325-350
- Article
This paper introduces deep neural network (DNN)–hidden Markov model (HMM)-based methods to tackle speech recognition in heterogeneous groups of speakers including children. We target three speaker groups consisting of children, adult males and adult females. Two different kinds of approaches are introduced here: approaches based on DNN adaptation and approaches relying on vocal-tract length normalisation (VTLN). First, the recent approach that consists in adapting a general DNN to domain/language specific data is extended to target age/gender groups in the context of DNN–HMM. Then, VTLN is investigated by training a DNN–HMM system using either mel frequency cepstral coefficients normalised with standard VTLN or acoustic features derived from mel frequency cepstral coefficients combined with the posterior probabilities of the VTLN warping factors. In this latter, novel approach, the posterior probabilities of the warping factors are obtained with a separate DNN and decoding can be performed in a single pass, whereas the standard VTLN approach requires two decoding passes. Finally, the different approaches presented here are combined to take advantage of their complementarity. The combination of several approaches is shown to improve the baseline phone error rate performance by thirty per cent to thirty-five per cent relative and the baseline word error rate performance by about ten per cent relative.

Natural language processing in mental health applications using non-clinical texts †
RAFAEL A. CALVO, DAVID N. MILNE, M. SAZZAD HUSSAIN, HELEN CHRISTENSEN
Published online by Cambridge University Press: 30 January 2017, pp. 649-685
- Article
  - Open access
Natural language processing (NLP) techniques can be used to make inferences about people’s mental states from what they write on Facebook, Twitter and other social media. These inferences can then be used to create online pathways to direct people to health information and assistance and also to generate personalized interventions. Regrettably, the computational methods used to collect, process and utilize online writing data, as well as the evaluations of these techniques, are still dispersed in the literature. This paper provides a taxonomy of data sources and techniques that have been used for mental health support and intervention. Specifically, we review how social media and other data sources have been used to detect emotions and identify people who may be in need of psychological assistance; the computational techniques used in labeling and diagnosis; and finally, we discuss ways to generate and personalize mental health interventions. The overarching aim of this scoping review is to highlight areas of research where NLP has been applied in the mental health literature and to help develop a common language that draws together the fields of mental health, human-computer interaction and NLP.

A transformation-driven approach for recognizing textual entailment †
ROBERTO ZANOLI, SILVIA COLOMBO
Published online by Cambridge University Press: 16 June 2016, pp. 507-534
- Article
Textual Entailment is a directional relation between two text fragments. The relation holds whenever the truth of one text fragment, called Hypothesis (H), follows from another text fragment, called Text (T). Up until now, using machine learning approaches for recognizing textual entailment has been hampered by the limited availability of data. We present an approach based on syntactic transformations and machine learning techniques which is designed to fit well with a new type of available data sets that are larger but less complex than data sets used in the past. The transformations are not predefined, but calculated from the data sets, and then used as features in a supervised learning classifier. The method has been evaluated using two data sets: the SICK data set and the EXCITEMENT English data set. While both data sets are of a larger order of magnitude than data sets such as RTE-3, they are also of lower levels of complexity, each in its own way. SICK consists of pairs created by applying a predefined set of syntactic and lexical rules to its T and H pairs, which can be accurately captured by our transformations. The EXCITEMENT English data contains short pieces of text that do not require a high degree of text understanding to be annotated. The resulting AdArte system is simple to understand and implement, but also effective when compared with other existing systems. AdArte has been made freely available with the EXCITEMENT Open Platform, an open source platform for textual inference.

Translating text into pictographs
VINCENT VANDEGHINSTE, INEKE SCHUURMAN, LEEN SEVENS, FRANK VAN EYNDE
Published online by Cambridge University Press: 11 November 2015, pp. 217-244
- Article
We describe and evaluate a text-to-pictograph translation system that is used in an online platform for Augmentative and Alternative Communication, which is intended for people who are not able to read and write, but who still want to communicate with the outside world. The system is set up to translate from Dutch into Sclera and Beta, two publicly available pictograph sets consisting of several thousand pictographs each. We have linked large numbers of these pictographs to synsets or combinations of synsets of Cornetto, a lexical-semantic database for Dutch similar to WordNet. In the translation system, the Dutch input text undergoes shallow linguistic analysis and the synsets of the content words are looked up. The system looks for the nearest pictographs in the lexical-semantic database and displays the message in pictographs. We evaluated the system and results showed a large improvement over the baseline system, which consisted of straightforward string-matching between the input text and the filenames of the pictographs.
Our system provides a clear improvement in the communication possibilities of illiterate people. Nevertheless, there is room for further improvement.

Social media text normalization for Turkish
GÜLŞEN ERYİǦİT, DİLARA TORUNOǦLU-SELAMET
Published online by Cambridge University Press: 02 June 2017, pp. 835-875
- Article
Text normalization is an indispensable stage in processing noncanonical language from natural sources, such as speech, social media or short text messages. Research in this field is very recent and mostly on English. As is known from different areas of natural language processing, morphologically rich languages (MRLs) pose many different challenges when compared to English. Turkish is a strong representative of MRLs and has particular normalization problems that may not be easily solved by a single-stage pure statistical model. This article introduces the first work on the social media text normalization of an MRL and presents the first complete social media text normalization system for Turkish. The article conducts an in-depth analysis of the error types encountered in Web 2.0 Turkish texts, categorizes them into seven groups and provides solutions for each of them by dividing the candidate generation task into separate modules working in a cascaded architecture. For the first time in the literature, two manually normalized Web 2.0 datasets are introduced for Turkish normalization studies. The exact match scores of the overall system on the provided datasets are 70.40 per cent and 67.37 per cent (77.07 per cent with a case insensitive evaluation).

Improving mention detection for Basque based on a deep error analysis
ANDER SORALUZE, OLATZ ARREGI, XABIER ARREGI, ARANTZA DÍAZ DE ILARRAZA
Published online by Cambridge University Press: 12 July 2016, pp. 351-384
- Article
This paper presents the improvement process of a mention detector for Basque. The system is rule-based and takes into account the characteristics of mentions in Basque. A classification of error types is proposed based on the errors that occur during mention detection. A deep error analysis distinguishing error types and causes is presented and improvements are proposed. At the final stage, the system obtains an F-measure of 74.57% under the Exact Matching protocol and of 80.57% under Lenient Matching. We also show the performance of the mention detector with gold standard data as input, in order to omit errors caused by the previous stages of linguistic processing. In this scenario, we obtain an F-measure of 85.89% with Strict Matching and of 89.06% with Lenient Matching, i.e., a difference of 11.32 and 8.49 percentage points, respectively. Finally, how improvements in mention detection affect coreference resolution is analysed.
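The percentage-point differences quoted in this abstract can be re-derived from the four reported F-measures. The following is only an illustrative sketch: the dictionary layout and function name are assumptions, and it treats "Strict Matching" as the same protocol as "Exact Matching", which the abstract does not state explicitly.

```python
# F-measures (%) as reported in the abstract. The layout is illustrative,
# and "exact" is assumed to cover both "Exact" and "Strict" Matching.
f_measure = {
    "automatic": {"exact": 74.57, "lenient": 80.57},
    "gold": {"exact": 85.89, "lenient": 89.06},
}


def gain_from_gold(scores):
    """Percentage-point gain per protocol when gold-standard input is used."""
    return {
        protocol: round(scores["gold"][protocol] - scores["automatic"][protocol], 2)
        for protocol in scores["automatic"]
    }


gains = gain_from_gold(f_measure)
print(gains)
```

The gap between the two settings is the share of the error attributable to the earlier stages of linguistic processing rather than to the mention detector itself.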

Classifying news versus opinions in newspapers: Linguistic features for domain independence
K. R. KRÜGER, A. LUKOWIAK, J. SONNTAG, S. WARZECHA, M. STEDE
Published online by Cambridge University Press: 21 February 2017, pp. 687-707
- Article
Newspaper text can be broadly divided into the classes ‘opinion’ (editorials, commentary, letters to the editor) and ‘neutral’ (reports). We describe a classification system for performing this separation, which uses a set of linguistically motivated features. Working with various English newspaper corpora, we demonstrate that it significantly outperforms bag-of-lemma and PoS-tag models. We conclude that the linguistic features constitute the best method for achieving robustness against change of newspaper or domain.

Can machine translation systems be evaluated by the crowd alone
YVETTE GRAHAM, TIMOTHY BALDWIN, ALISTAIR MOFFAT, JUSTIN ZOBEL
Published online by Cambridge University Press: 16 September 2015, pp. 3-30
- Article
Crowd-sourced assessments of machine translation quality allow evaluations to be carried out cheaply and on a large scale. It is essential, however, that the crowd's work be filtered to avoid contamination of results through the inclusion of false assessments. One method is to filter via agreement with experts, but even amongst experts agreement levels may not be high. In this paper, we present a new methodology for crowd-sourcing human assessments of translation quality, which allows individual workers to develop their own individual assessment strategy. Agreement with experts is no longer required, and a worker is deemed reliable if they are consistent relative to their own previous work. Individual translations are assessed in isolation from all others in the form of direct estimates of translation quality. This allows more meaningful statistics to be computed for systems and enables significance to be determined on smaller sets of assessments. We demonstrate the methodology's feasibility in large-scale human evaluation through replication of the human evaluation component of the Workshop on Statistical Machine Translation shared translation task for two language pairs, Spanish-to-English and English-to-Spanish. Results for measurement based solely on crowd-sourced assessments show system rankings in line with those of the original evaluation. Comparison of results produced by the relative preference approach and the direct estimate method described here demonstrates that the direct estimate method has a substantially increased ability to identify significant differences between translation systems.

Lemaza: An Arabic why-question answering system *
AQIL M. AZMI, NOUF A. ALSHENAIFI
Published online by Cambridge University Press: 24 August 2017, pp. 877-903
- Article
Question answering systems retrieve information from documents in response to queries. Most of the questions are who- and what-type questions that deal with named entities. A less common and more challenging question to deal with is the why-question. In this paper, we introduce Lemaza (Arabic for why), a system for automatically answering why-questions for Arabic texts. The system is composed of four main components that make use of the Rhetorical Structure Theory. To evaluate Lemaza, we prepared a set of why-question–answer pairs whose answer can be found in a corpus that we compiled out of Open Source Arabic Corpora. Lemaza performed best when the stop-words were not removed. The performance measure was 72.7%, 79.2% and 78.7% for recall, precision and c@1, respectively.

Sentiment analysis in Turkish at different granularity levels
RAHIM DEHKHARGHANI, BERRIN YANIKOGLU, YUCEL SAYGIN, KEMAL OFLAZER
Published online by Cambridge University Press: 21 October 2016, pp. 535-559
- Article
Sentiment analysis has attracted a lot of research interest in recent years, especially in the context of social media. While most of this research has focused on English, there is ample data and interest in the topic for many other languages, as well. In this article, we propose a comprehensive sentiment analysis system for Turkish. We cover different levels of sentiment analysis such as aspect, sentence, and document levels as well as some linguistic issues such as conjunction and intensification in Turkish sentiment analysis. Our system is evaluated on Turkish movie reviews and the obtained accuracies range from sixty per cent to seventy-nine per cent in ternary and binary classification tasks at different levels of analysis.

A scalable architecture for data-intensive natural language processing †
ZUHAITZ BELOKI, XABIER ARTOLA, AITOR SOROA
Published online by Cambridge University Press: 09 May 2017, pp. 709-731
- Article
Computational power needs have greatly increased in recent years, and this is also the case in the Natural Language Processing (NLP) area, where thousands of documents must be processed, i.e., linguistically analyzed, in a reasonable time frame. These computing needs have implied a radical change in the computing architectures and big-scale text processing techniques used in NLP. In this paper, we present a scalable architecture for distributed language processing. The architecture uses Storm to combine diverse NLP modules into a processing chain, which carries out the linguistic analysis of documents. Scalability requires designing solutions that are able to run distributed programs in parallel and across large machine clusters. Using the architecture presented here, it is possible to integrate a set of third-party NLP modules into a single processing chain which can be deployed onto a distributed environment, i.e., a cluster of machines, thus allowing the language-processing modules to run in parallel. No restrictions are placed a priori on the NLP modules apart from being able to consume and produce linguistic annotations following a given format. We show the feasibility of our approach by integrating two linguistic processing chains for English and Spanish. Moreover, we provide several scripts that allow building from scratch a whole distributed architecture that can then be easily installed and deployed onto a cluster of machines. The scripts and the NLP modules used in the paper are publicly available and distributed under free licenses. In the paper, we also describe a series of experiments carried out in the context of the NewsReader project with the goal of testing how the system behaves in different scenarios.

A classification approach for detecting cross-lingual biomedical term translations
H. HAKAMI, D. BOLLEGALA
Published online by Cambridge University Press: 14 December 2015, pp. 31-51
- Article
Finding translations for technical terms is an important problem in machine translation. In particular, in highly specialized domains such as biology or medicine, it is difficult to find bilingual experts to annotate sufficient cross-lingual texts in order to train machine translation systems. Moreover, new terms are constantly being generated in the biomedical community, which makes it difficult to keep the translation dictionaries up to date for all language pairs of interest. Given a biomedical term in one language (source language), we propose a method for detecting its translations in a different language (target language). Specifically, we train a binary classifier to determine whether two biomedical terms written in two languages are translations. Training such a classifier is often complicated due to the lack of common features between the source and target languages. We propose several feature space concatenation methods to successfully overcome this problem. Moreover, we study the effectiveness of contextual and character n-gram features for detecting term translations. Experiments conducted using a standard dataset for biomedical term translation show that the proposed method outperforms several competitive baseline methods in terms of mean average precision and top-k translation accuracy.

Supervised approach to recognise Polish temporal expressions and rule-based interpretation of timexes †
JAN KOCOŃ, MICHAŁ MARCIŃCZUK
Published online by Cambridge University Press: 27 September 2016, pp. 385-418
- Article
A key challenge of Information Extraction in Natural Language Processing is the ability to recognise and classify temporal expressions (timexes). They are a crucial source of information about when something happens, how often something occurs or how long something lasts. Timexes extracted automatically from text play a major role in many Information Extraction systems, such as question answering or event recognition. We prepared a broad specification of Polish timexes – PLIMEX. It is based on the state-of-the-art annotation guidelines for English, mainly TIMEX2 and TIMEX3 (a part of TimeML – Markup Language for Temporal and Event Expressions). We have expanded our specification with a description of the local meaning of timexes, based on LTIMEX annotation guidelines for English. Temporal description supports further event identification and extends the event description model, focussing on anchoring events in time, event ordering and reasoning about the persistence of events. We prepared the specification, which is designed to address these issues, and we annotated all documents in the Polish Corpus of Wroclaw University of Technology (KPWr) using our annotation guidelines. We also adapted our Liner2 machine learning system to recognise Polish timexes and we propose a two-phase method to select a subset of features for the Conditional Random Fields sequence labelling method. This article presents the whole process of corpus annotation, evaluation of inter-annotator agreement, extending the Liner2 system with new features and evaluation of the recognition models before and after feature selection with the analysis of statistical significance of differences. Liner2 with the presented models is available as open source software under the GNU General Public License.

Developing, evaluating, and refining an automatic generator of diagnostic multiple choice cloze questions to assess children's comprehension while reading *
JACK MOSTOW, YI-TING HUANG, HYEJU JANG, ANDERS WEINSTEIN, JOE VALERI, DONNA GATES
Published online by Cambridge University Press: 14 April 2016, pp. 245-294
- Article
We describe the development, pilot-testing, refinement, and four evaluations of Diagnostic Question Generator (DQGen), which automatically generates multiple choice cloze (fill-in-the-blank) questions to test children's comprehension while reading a given text. Unlike previous methods, DQGen tests comprehension not only of an individual sentence but of the context preceding it. To test different aspects of comprehension, DQGen generates three types of distractors: ungrammatical distractors test syntax; nonsensical distractors test semantics; and locally plausible distractors test inter-sentential processing.
(1) A pilot study of DQGen 2012 evaluated its overall questions and individual distractors, guiding its refinement into DQGen 2014.
(2) Twenty-four elementary students generated 200 responses to multiple choice cloze questions that DQGen 2014 generated from forty-eight stories. In 130 of the responses, the child chose the correct answer. We define the distractiveness of a distractor as the frequency with which students choose it over the correct answer. The incorrect responses were consistent with expected distractiveness: twenty-seven were plausible, twenty-two were nonsensical, fourteen were ungrammatical, and seven were null.
(3) To compare DQGen 2014 against DQGen 2012, five human judges categorized candidate choices without knowing their intended type or whether they were the correct answer or a distractor generated by DQGen 2012 or DQGen 2014. The percentage of distractors categorized as their intended type was significantly higher for DQGen 2014.
(4) We evaluated DQGen 2014 against human performance based on 1,486 similarly blind categorizations by twenty-seven judges of sixteen correct answers, forty-eight distractors generated by DQGen 2014, and 504 distractors authored by twenty-one humans. Surprisingly, DQGen 2014 did significantly better than humans at generating ungrammatical distractors and marginally better than humans at generating nonsensical distractors, albeit slightly worse at generating plausible distractors. Moreover, vetting DQGen 2014's output and writing distractors only when necessary would halve the time to write them all, and produce higher quality distractors.
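The response counts in evaluation (2) can be tallied to see how the reported ordering of distractor types arises. This is a minimal sketch, not the authors' analysis: the names are hypothetical, and normalising each count by the total number of responses is an assumption, since the abstract defines distractiveness only as the frequency with which a distractor is chosen over the correct answer.

```python
from collections import Counter

# Counts reported for evaluation (2) in the abstract.
TOTAL_RESPONSES = 200
CORRECT_RESPONSES = 130
incorrect_by_type = Counter(
    {"plausible": 27, "nonsensical": 22, "ungrammatical": 14, "null": 7}
)


def choice_rates(incorrect, total):
    """Share of all responses falling in each incorrect category.

    Dividing by the total number of responses is an assumption made
    for illustration; the abstract does not give the normalisation.
    """
    return {kind: count / total for kind, count in incorrect.items()}


# The incorrect categories should account for every non-correct response.
incorrect_total = sum(incorrect_by_type.values())
assert incorrect_total == TOTAL_RESPONSES - CORRECT_RESPONSES

rates = choice_rates(incorrect_by_type, TOTAL_RESPONSES)
ranking = sorted(rates, key=rates.get, reverse=True)
for kind in ranking:
    print(f"{kind:<14} {rates[kind]:.1%}")
```

Under this normalisation the categories rank as plausible, nonsensical, ungrammatical, null, matching the order given in the abstract.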

A scaffolding approach to coreference resolution integrating statistical and rule-based models
HEEYOUNG LEE, MIHAI SURDEANU, DAN JURAFSKY
Published online by Cambridge University Press: 21 March 2017, pp. 733-762
- Article
We describe a scaffolding approach to the task of coreference resolution that incrementally combines statistical classifiers, each designed for a particular mention type, with rule-based models (for sub-tasks well-matched to determinism). We motivate our design by an oracle-based analysis of errors in a rule-based coreference resolution system, showing that rule-based approaches are poorly suited to tasks that require a large lexical feature space, such as resolving pronominal and common-noun mentions. Our approach combines many advantages: it incrementally builds clusters integrating joint information about entities, uses rules for deterministic phenomena, and integrates rich lexical, syntactic, and semantic features with random forest classifiers well-suited to modeling the complex feature interactions that are known to characterize the coreference task. We demonstrate that all these decisions are important. The resulting system achieves 63.2 F1 on the CoNLL-2012 shared task dataset, outperforming the rule-based starting point by over seven F1 points. Similarly, our system outperforms an equivalent sieve-based approach that relies on logistic regression classifiers instead of random forests by over four F1 points. Lastly, we show that by changing the coreference resolution system from relying on constituent-based syntax to using dependency syntax, which can be generated in linear time, we achieve a runtime speedup of 550 per cent without considerable loss of accuracy.

POS-tagging Arabic texts: A novel approach based on ant colony
CHIRAZ ZRIBI BEN OTHMANE, FERIEL BEN FRAJ, ICHRAF LIMAM
Published online by Cambridge University Press: 11 February 2016, pp. 419-439
- Article
The specificities of the Arabic language, mainly agglutination and vocalization, make the task of POS-tagging more difficult than for Indo-European languages. Consequently, POS-tagging texts with good accuracy remains a challenging problem for Arabic language processing applications. In this work, we consider the task of POS-tagging as an optimization problem modeled as a graph whose nodes correspond to all possible grammatical tags given by a morphological analyzer for words in a sentence, and the goal is to find the best path (sequence of tags) in this graph. To solve this problem, we propose a novel approach based on ant colony. Ant colony-based algorithms are among the most efficient methods to solve optimization problems modeled as a graph. The collaboration of ants having various knowledge creates a collective intelligence and increases efficiency. We have performed experiments on both vocalized and non-vocalized texts and tested two different tagsets containing fine and coarse grained composite tags. The obtained results showed good accuracy rates and hence the benefits of swarm intelligence for the POS-tagging problem.

Natural Language Engineering, Volume 23 - May 2017