Interlingual annotation of parallel text corpora: a new framework for annotation and evaluation


This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to annotate multiple translations of foreign-language texts with interlingual content. Three levels of representation are introduced: deep syntactic dependencies (IL0), intermediate semantic representations (IL1), and a normalized representation that unifies conversives, nonliteral language, and paraphrase (IL2). The resulting annotated, multilingually induced, parallel corpora will be useful as an empirical basis for a wide range of research, including the development and evaluation of interlingual NLP systems and paraphrase-extraction systems as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines.

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering