Design and realization of a modular architecture for textual entailment

  • SEBASTIAN PADÓ (a1), TAE-GIL NOH (a2), ASHER STERN (a3), RUI WANG (a4) and ROBERTO ZANOLI (a5)...
Abstract

A key challenge at the core of many Natural Language Processing (NLP) tasks is the ability to determine which conclusions can be inferred from a given natural language text. This problem, called the Recognition of Textual Entailment (RTE), has initiated the development of a range of algorithms, methods, and technologies. Unfortunately, research on Textual Entailment (TE), like semantics research more generally, is fragmented into studies focussing on various aspects of semantics such as world knowledge, lexical and syntactic relations, or more specialized kinds of inference. This fragmentation has problematic practical consequences. Notably, interoperability among the existing RTE systems is poor, and reuse of resources and algorithms is mostly infeasible. This also makes systematic evaluations very difficult to carry out. Finally, the field presents potential end users with a wide array of approaches but little guidance on which to pick. Our contribution to this situation is the novel EXCITEMENT architecture, which was developed to enable and encourage the consolidation of methods and resources in the textual entailment area. It decomposes RTE into components with strongly typed interfaces. We specify (a) a modular linguistic analysis pipeline and (b) a decomposition of the ‘core’ RTE methods into top-level algorithms and subcomponents. We identify four major subcomponent types, including knowledge bases and alignment methods. The architecture was developed with a focus on generality, supporting all major approaches to RTE and encouraging language independence. We illustrate the feasibility of the architecture by constructing mappings of major existing systems onto the architecture. The practical implementation of this architecture forms the EXCITEMENT open platform. It is a suite of textual entailment algorithms and components which contains three of these existing systems, includes linguistic-analysis pipelines for three languages (English, German, and Italian), and comprises a number of linguistic resources. By addressing the problems outlined above, the platform provides a comprehensive and flexible basis for research and experimentation in textual entailment and is available as open source software under the GNU General Public License.
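
The decomposition described in the abstract can be pictured with a small sketch. This is not the EXCITEMENT platform's API: every name below (`Pair`, `analyse`, `KnowledgeBase`, `Aligner`, `DictKnowledgeBase`, `LexicalAligner`, `CoverageAlgorithm`) is hypothetical, the "linguistic analysis" is plain lowercasing and whitespace splitting, and only two of the four subcomponent types (knowledge bases and alignment methods) are shown because those are the only two the abstract names. The point is only to illustrate how typed interfaces might let a top-level algorithm be combined with interchangeable subcomponents.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Pair:
    """A text/hypothesis pair after (here: trivial) linguistic analysis."""
    text: tuple[str, ...]
    hypothesis: tuple[str, ...]


def analyse(text: str, hypothesis: str) -> Pair:
    """Hypothetical stand-in for an analysis pipeline: lowercase and split."""
    return Pair(tuple(text.lower().split()), tuple(hypothesis.lower().split()))


class KnowledgeBase(Protocol):
    """Subcomponent type: says whether `lhs` licenses `rhs`."""
    def entails(self, lhs: str, rhs: str) -> bool: ...


class Aligner(Protocol):
    """Subcomponent type: links hypothesis tokens to text tokens."""
    def align(self, pair: Pair) -> dict[str, str]: ...


class DictKnowledgeBase:
    """Toy knowledge base backed by a dictionary of lexical rules."""
    def __init__(self, rules: dict[str, set[str]]) -> None:
        self._rules = rules

    def entails(self, lhs: str, rhs: str) -> bool:
        return lhs == rhs or rhs in self._rules.get(lhs, set())


class LexicalAligner:
    """Toy aligner: links each hypothesis token to the first licensing text token."""
    def __init__(self, kb: KnowledgeBase) -> None:
        self._kb = kb

    def align(self, pair: Pair) -> dict[str, str]:
        links: dict[str, str] = {}
        for h in pair.hypothesis:
            for t in pair.text:
                if self._kb.entails(t, h):
                    links[h] = t
                    break
        return links


class CoverageAlgorithm:
    """Toy top-level algorithm: entailment iff enough hypothesis tokens align."""
    def __init__(self, aligner: Aligner, threshold: float = 1.0) -> None:
        self._aligner = aligner
        self._threshold = threshold

    def decide(self, pair: Pair) -> bool:
        distinct = set(pair.hypothesis)
        if not distinct:
            return True  # an empty hypothesis is trivially covered
        covered = len(self._aligner.align(pair))
        return covered / len(distinct) >= self._threshold


kb = DictKnowledgeBase({"bought": {"acquired"}})
algorithm = CoverageAlgorithm(LexicalAligner(kb))
entailed = algorithm.decide(analyse("Google bought YouTube", "Google acquired YouTube"))
not_entailed = algorithm.decide(analyse("Google bought YouTube", "Google sold YouTube"))
```

Because `CoverageAlgorithm` depends only on the `Aligner` interface, and `LexicalAligner` only on `KnowledgeBase`, either subcomponent could be replaced (for instance, by a resource for another language) without touching the top-level algorithm. That is the kind of reuse the abstract argues for.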

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering