Detecting errors in English article usage by non-native speakers

NA-RAE HAN; MARTIN CHODOROW; CLAUDIA LEACOCK

doi:10.1017/S1351324906004190

Abstract

One of the most difficult challenges faced by non-native speakers of English is mastering the system of English articles. We trained a maximum entropy classifier to select among a/an, the, or zero article for noun phrases (NPs), based on a set of features extracted from the local context of each. When the classifier was trained on 6 million NPs, its performance on published text was about 83% correct. We then used the classifier to detect article errors in the TOEFL essays of native speakers of Chinese, Japanese, and Russian. These writers made such errors in about one out of every eight NPs, or almost once in every three sentences. The classifier's agreement with human annotators was 85% (kappa = 0.48) when it selected among a/an, the, or zero article. Agreement was 89% (kappa = 0.56) when it made a binary (yes/no) decision about whether the NP should have an article. Even with these levels of overall agreement, precision and recall in error detection were only 0.52 and 0.80, respectively. However, when the classifier was allowed to skip cases where its confidence was low, precision rose to 0.90, with 0.40 recall. Additional improvements in performance may require features that reflect general knowledge to handle phenomena such as indirect prior reference. In August 2005, the classifier was deployed as a component of Educational Testing Service's Criterion$^{SM}$ Online Writing Evaluation Service.
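The evaluation measures reported above (agreement, kappa, and the precision/recall trade-off obtained by skipping low-confidence cases) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how confidence was measured, so the sketch assumes it is the probability of the classifier's top-ranked class; the function names, the `min_confidence` parameter, and the toy data are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the two raters' marginal proportions per class.
    expected = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


def flag_errors(writer_choices, class_probs, min_confidence=0.0):
    """Flag an NP when the classifier's top class differs from the writer's choice.

    Assumption: "confidence" is the top class probability. NPs whose top
    probability falls below min_confidence are skipped (never flagged).
    """
    flags = []
    for written, probs in zip(writer_choices, class_probs):
        best = max(probs, key=probs.get)
        flags.append(best != written and probs[best] >= min_confidence)
    return flags


def precision_recall(flags, gold_errors):
    """Precision and recall of error flags against human error annotations."""
    tp = sum(f and g for f, g in zip(flags, gold_errors))
    fp = sum(f and not g for f, g in zip(flags, gold_errors))
    fn = sum(g and not f for f, g in zip(flags, gold_errors))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented for illustration): the writer's article choice for five NPs,
# hypothetical classifier probabilities, and whether annotators marked an error.
writer = ["zero", "the", "a/an", "the", "zero"]
probs = [
    {"a/an": 0.05, "the": 0.90, "zero": 0.05},
    {"a/an": 0.10, "the": 0.80, "zero": 0.10},
    {"a/an": 0.30, "the": 0.55, "zero": 0.15},
    {"a/an": 0.20, "the": 0.35, "zero": 0.45},
    {"a/an": 0.02, "the": 0.03, "zero": 0.95},
]
gold = [True, False, True, False, False]

for threshold in (0.0, 0.8):
    p, r = precision_recall(flag_errors(writer, probs, threshold), gold)
    print(f"min_confidence={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

On the toy data, raising `min_confidence` removes the uncertain flags, so precision rises while recall falls. This is the same direction of trade-off the abstract reports, though the numbers here carry no empirical weight.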

