
Annotating argumentative structure in English-as-a-Foreign-Language learner essays

Published online by Cambridge University Press:  26 August 2021

Jan Wira Gotama Putra*
Affiliation:
School of Computing, Tokyo Institute of Technology, Tokyo, Japan
Simone Teufel
Affiliation:
School of Computing, Tokyo Institute of Technology, Tokyo, Japan Tokyo Tech World Research Hub Initiative (WRHI), Tokyo Institute of Technology, Tokyo, Japan Department of Computer Science and Technology, University of Cambridge, Cambridge CB2 1TN, UK
Takenobu Tokunaga
Affiliation:
School of Computing, Tokyo Institute of Technology, Tokyo, Japan
*
*Corresponding author. E-mail: gotama.w.aa@m.titech.ac.jp

Abstract

Argument mining (AM) aims to explain how individual argumentative discourse units (e.g. sentences or clauses) relate to each other and what roles they play in the overall argumentation. The automatic recognition of argumentative structure is attractive because it benefits various downstream tasks, such as text assessment, text generation, text improvement, and summarization. Existing studies have focused on analyzing well-written texts produced by proficient authors. However, most English speakers in the world are non-native, and their texts are often poorly structured, particularly while they are still learning the language. Yet there has been no prior study specifically on argumentative structure in non-native texts. In this article, we present the first corpus containing argumentative structure annotation for English-as-a-foreign-language (EFL) essays, together with a specially designed annotation scheme. The resulting annotated corpus, called “ICNALE-AS”, contains 434 essays written by EFL learners from various Asian countries. The corpus is particularly useful for the education domain: on the basis of an analysis of argumentation-related problems in EFL essays, educators can formulate ways to improve them so that they more closely resemble native-level productions. Our argument annotation scheme is demonstrably stable, achieving good inter-annotator agreement and near-perfect intra-annotator agreement. We also propose a set of novel document-level agreement metrics that quantify structural agreement across various aspects of argumentation, providing a more holistic assessment of the quality of the argumentative structure annotation. The metrics are evaluated in a crowd-sourced meta-evaluation experiment, achieving moderate to good correlation with human judgments.

Information

Type
Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Figure 1. Closure over the restatement relation. Solid links are explicit; dashed links are implicit. (a) Annotation A. (b) Annotation B.


Figure 2. Argumentative discourse structure annotation of example text from page 19.


Figure 3. Example of restatement closures. Solid links are explicit; dashed links are implicit. (a) Annotation A. (b) Annotation B. (c) Closure of A. (d) Closure of B.


Figure 4. Example of descendant set matching between annotation A (left) and B (right). Exact-matching scores are shown in red to the left of each node; partial-matching scores in green to the right. Gray nodes represent non-ACs.


Figure 5. Illustration of an “AMT task.”


Table 1. Evaluation result of structure-based inter-annotator agreement metrics


Table 2. Intra-annotator agreement of annotator A


Table 3. Confusion matrix of annotator A in intra-annotator agreement study


Table 4. Inter-annotator agreement results


Table 5. Confusion matrix between annotators A and B in the inter-annotator agreement study


Table 6. Statistics of the final corpus. Sentences and tokens are automatically segmented using nltk (Bird, Klein, and Loper 2009)


Table 7. Distribution of relation direction


Figure 6. An excerpt of annotation for essay “W_PAK_SMK0_022_B1_1_EDIT”. (a) Original essay. (b) A potential improvement for (a).


Figure 7. An excerpt of annotation for essay “W_CHN_SMK0_045_A2_0_EDIT”. (a) Original essay. (b) A potential improvement for (a).