Interpreting compound nouns with kernel methods


This paper presents a classification-based approach to noun–noun compound interpretation within the statistical learning framework of kernel methods. In this framework, the primary modelling task is to define measures of similarity between data items, formalised as kernel functions. We consider the different sources of information that are useful for understanding compounds and proceed to define kernels that compute similarity between compounds in terms of these sources. In particular, these kernels implement intuitive notions of lexical and relational similarity and can be computed using distributional information extracted from text corpora. We report performance on classification experiments with three semantic relation inventories at different levels of granularity, demonstrating in each case that combining lexical and relational information sources is beneficial and gives better performance than either source taken alone. The data used in our experiments are taken from general English text, but our methods are also applicable to other domains and potentially to other languages where noun–noun compounding is frequent and productive.
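To make the idea of combining information sources concrete, the following is a minimal sketch and not the paper's implementation. It assumes a plain linear kernel on L1-normalised co-occurrence counts; the function names, the toy counts, and the equal 0.5/0.5 weighting are all hypothetical choices made for illustration. The one fact it relies on is that a non-negative weighted sum of valid kernels is itself a valid kernel, so a lexical and a relational kernel can be added and passed to any kernel classifier.

```python
import math

def l1_normalise(counts):
    """Turn raw co-occurrence counts into a probability distribution."""
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}

def linear_kernel(u, v):
    """Dot product of two sparse vectors stored as dicts (assumed kernel choice)."""
    return sum(w * v.get(f, 0.0) for f, w in u.items())

def lexical_kernel(c1, c2, word_vectors):
    """Lexical similarity: compare modifier with modifier and head with head."""
    (m1, h1), (m2, h2) = c1, c2
    return (linear_kernel(word_vectors[m1], word_vectors[m2])
            + linear_kernel(word_vectors[h1], word_vectors[h2]))

def relational_kernel(c1, c2, pair_contexts):
    """Relational similarity: compare contexts in which both constituents co-occur."""
    return linear_kernel(pair_contexts[c1], pair_contexts[c2])

def combined_kernel(c1, c2, word_vectors, pair_contexts, weight=0.5):
    """Convex combination of the two kernels (the weight is an illustrative assumption)."""
    return (weight * lexical_kernel(c1, c2, word_vectors)
            + (1.0 - weight) * relational_kernel(c1, c2, pair_contexts))

# Hypothetical toy counts, invented purely for this example.
word_vectors = {
    "steel": l1_normalise({"metal": 3, "strong": 1}),
    "iron": l1_normalise({"metal": 2, "strong": 2}),
    "kitchen": l1_normalise({"room": 1}),
    "knife": l1_normalise({"cut": 4}),
    "blade": l1_normalise({"cut": 3, "sharp": 1}),
}
pair_contexts = {
    ("steel", "knife"): l1_normalise({"made of": 2}),
    ("iron", "blade"): l1_normalise({"made of": 1, "forged from": 1}),
    ("kitchen", "knife"): l1_normalise({"used in": 3}),
}

steel_knife = ("steel", "knife")
sim_iron_blade = combined_kernel(steel_knife, ("iron", "blade"),
                                 word_vectors, pair_contexts)
sim_kitchen_knife = combined_kernel(steel_knife, ("kitchen", "knife"),
                                    word_vectors, pair_contexts)
```

In this toy setting "steel knife" comes out closer to "iron blade" than to "kitchen knife", even though the latter shares a head noun, because the relational contexts also contribute. In practice the resulting kernel values would be assembled into a Gram matrix over the training compounds and handed to a kernel classifier such as a support vector machine.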


Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering