
A method based on rules and machine learning for logic form identification in Spanish


Logic Forms (LF) are simple, first-order logic knowledge representations of natural language sentences. Each noun, verb, adjective, adverb, pronoun, preposition and conjunction generates a predicate. LF systems usually identify syntactic function by means of syntactic rules, but this approach is difficult to apply to languages with highly flexible and ambiguous syntax, such as Spanish. In this study, we present a mixed method for deriving the LF of Spanish sentences that combines hard-coded rules with a classifier inspired by semantic role labeling. The main novelty of our proposal is the way the classifier is applied to generate the predicates of the verbs, while rules are used to translate the remaining predicates, which are more straightforward and unambiguous than the verbal ones. The proposed mixed system uses a supervised classifier to integrate syntactic and semantic information in order to help overcome the inherent ambiguity of Spanish syntax; this task is accomplished in a similar way to semantic role labeling. We train the classifier on features extracted from the AnCora-ES corpus. A rule-based system obtains the LF for the rest of the phrase; its rules are obtained by exploring the syntactic tree of the phrase and encoding the syntactic production rules. The LF algorithm has been evaluated using shallow parsing on some straightforward Spanish phrases. The verb argument labeling task achieves 84% precision, and the proposed mixed LF identification (LFi) method outperforms a purely rule-based system by 11%.
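As a rough illustration of the idea (not the authors' implementation), the sketch below builds a logic form for a short Spanish phrase. Everything in it is an assumption made for the example: the `Token` class, the `logic_form` function, the `lemma:POS(vars)` notation, and the coarse tags `NN`, `JJ` and `VB`. The verb's argument indices are supplied by hand here; in the mixed method described above they would come from the trained classifier, while the noun and adjective predicates stand in for the rule-based part.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Token:
    """Hypothetical token record; field names are illustrative only."""
    lemma: str
    pos: str                       # coarse tag: "NN", "JJ" or "VB"
    head: Optional[int] = None     # for adjectives: index of the modified noun
    args: Tuple[int, ...] = ()     # for verbs: indices of argument nouns


def logic_form(tokens):
    """Return a flat conjunction of predicates, one per content word."""
    # Rule-like step: every noun introduces a fresh entity variable.
    var = {}
    for i, tok in enumerate(tokens):
        if tok.pos == "NN":
            var[i] = f"x{len(var) + 1}"

    preds = []
    events = 0
    for i, tok in enumerate(tokens):
        if tok.pos == "NN":
            preds.append(f"{tok.lemma}:NN({var[i]})")
        elif tok.pos == "JJ":
            # An adjective shares the variable of the noun it modifies.
            preds.append(f"{tok.lemma}:JJ({var[tok.head]})")
        elif tok.pos == "VB":
            # A verb gets an event variable plus its argument variables.
            # The argument indices are assumed to be given (e.g. by a classifier).
            events += 1
            slots = [f"e{events}"] + [var[a] for a in tok.args]
            preds.append(f"{tok.lemma}:VB({', '.join(slots)})")
    return " & ".join(preds)


# Content words of "El perro negro persigue un gato" (hand-annotated).
tokens = [
    Token("perro", "NN"),
    Token("negro", "JJ", head=0),
    Token("perseguir", "VB", args=(0, 3)),
    Token("gato", "NN"),
]
print(logic_form(tokens))
# perro:NN(x1) & negro:JJ(x1) & perseguir:VB(e1, x1, x2) & gato:NN(x2)
```

The hard part in a language with flexible word order is deciding which nouns fill the verb's argument slots, which is why the example takes `args` as given rather than deriving it from position.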

This work has been partially funded by the ATTOS project (TIN2012-38536-C03-01) from the Spanish Government and the AORESCU project (TIC 07684) from the Andalucía Government.


Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering